How Prime Factorizations govern the Collatz Conjecture
By applying some optimizations to the Collatz Conjecture, we can reduce noise that obscures the patterns that produce the sequences. It becomes apparent that the sequences generated are related to prime factorizations: the prime factorization of the number itself for decending, whereby all 2's are removed; or the prime factorization of the next higher even number for ascending, whereby twos and threes are traded off.Herein, I'll lay out the math, and show the interesting things it tells us about the Conjecture.
Contents
1. Get Rid of Noise
The Collatz is iterative. The usual rule to get the next value is, for any positive integer :
The conjecture is that if we repeat this enough times, we will always arrive at 1.
An odd always produces an even number, so subsequently we must always execute the "even" half of the equation at least once. So, as any good engineer would do, let's optimize by getting rid of the unnecessary iteration and the noise it generates:
With a bit of factoring, this becomes:In this last form, the "odd" portion of the equation always produces fractional half, before adding ½. (An odd times an odd is always odd, so the division results in the fractional ½). We can get rid of this by using as the input and making the appropriate adjustments. Note: since we're focusing on the the odd portion of the equation, I'm going to only show that portion through here.
Note that the last term is . If the successive iteration is also odd, the first step of which is to add 1, these effectively cancel out.
That is, we can redefine "odd" algorithmically:
- Add 1.
- While this value is even, divide by 2 and multiply by 3.
- Subtract 1.
Or, to write it mathematically, if we knew how many times to repeat n:
So how do we know the value of n?- is odd → is even, and thus the prime factorization of contains at least one 2.
- is even → is whole.
- Looking at the prime factorization if , multiplying by in effect replaces a 2 with a 3.
- → is even, since there were two 2s in the prime factorization of , leaving one remaining.
- → .
- → .
- ...and so forth.
So, generally, n = # of 2s in prime factorization of .
1.1. So what's this telling us
- For any the number governs whether it can ascend, and how far. I will call this the "control number".
- An ascending Collatz series is related by the prime factorizations of the corresponding control numbers, . These prime factorizations will all be the same overall size, and will contain the same set of numbers, with only the numbers of 2s and 3s varying.
- The combined number of 2s and 3s in the prime factorization tells us the length of this "ascending chain".
- The numbers of 2s in the prime factorization tells us how many links are left to reach the upper end of the ascending chain. If there are no 2s, we are at the upper bound of the ascending chain.
- If there are neither 2s nor 3s in , then is not part of an ascending chain.
1.2. Example
For clarity, let's review some examples:
15 Collatz Control Ctrl factors 15: 16 = (start) 2^4 23: 24 = (Inc,3) 2^3 * 3^1 35: 36 = (Inc,3) 2^2 * 3^2 53: 54 = (Inc,3) 2^1 * 3^3 80: 81 = (Inc,3) 3^4 40: 41 = (dec ) 41^1 20: 21 = (Dec,3) 3^1 * 7^1 10: 11 = (dec ) 11^1 5: 6 = (Dec,3) 2^1 * 3^1 8: 9 = (Inc,3) 3^2 4: 5 = (dec ) 5^1 2: 3 = (Dec,3) 3^1 1: 2 = (dec ) 2^1With the "standard" Collatz, the sequence from 15 would be 15, 46, 23, 70, 35, 106, 53, 160, 80, 40, 20, 10, etc. Since we debounced it, we eliminate the unnecessary and confusing waypoints, leaving us with 15,
Lest you think this is cherry-picked data, let's look at some output from a random number 5-digit number I picked. In this variation, we won't do a full prime factorization, instead just showing the 2s, 3s, and whatever is leftover. To see some long chains, look for 335, 319, and 911.
17835 Collatz Control incr 17835 17836 = 2^2 * 4459^1 26753 26754 = 2^1 * 3^1 * 4459^1 40130 40131 = 3^2 * 4459^1 decr 40130 40130 = 2^1 * 20065^1 20065 20065 = 20065^1 incr 20065 20066 = 2^1 * 10033^1 30098 30099 = 3^1 * 10033^1 decr 30098 30098 = 2^1 * 15049^1 15049 15049 = 15049^1 incr 15049 15050 = 2^1 * 7525^1 22574 22575 = 3^1 * 7525^1 decr 22574 22574 = 2^1 * 11287^1 11287 11287 = 11287^1 incr 11287 11288 = 2^3 * 1411^1 16931 16932 = 2^2 * 3^1 * 1411^1 25397 25398 = 2^1 * 3^2 * 1411^1 38096 38097 = 3^3 * 1411^1 decr 38096 38096 = 2^4 * 2381^1 19048 19048 = 2^3 * 2381^1 9524 9524 = 2^2 * 2381^1 4762 4762 = 2^1 * 2381^1 2381 2381 = 2381^1 incr 2381 2382 = 2^1 * 3^1 * 397^1 3572 3573 = 3^2 * 397^1 decr 3572 3572 = 2^2 * 893^1 1786 1786 = 2^1 * 893^1 893 893 = 893^1 incr 893 894 = 2^1 * 3^1 * 149^1 1340 1341 = 3^2 * 149^1 decr 1340 1340 = 2^2 * 335^1 670 670 = 2^1 * 335^1 335 335 = 335^1 incr 335 336 = 2^4 * 3^1 * 7^1 503 504 = 2^3 * 3^2 * 7^1 755 756 = 2^2 * 3^3 * 7^1 1133 1134 = 2^1 * 3^4 * 7^1 1700 1701 = 3^5 * 7^1 decr 1700 1700 = 2^2 * 425^1 850 850 = 2^1 * 425^1 425 425 = 425^1 incr 425 426 = 2^1 * 3^1 * 71^1 638 639 = 3^2 * 71^1 decr 638 638 = 2^1 * 319^1 319 319 = 319^1 incr 319 320 = 2^6 * 5^1 479 480 = 2^5 * 3^1 * 5^1 719 720 = 2^4 * 3^2 * 5^1 1079 1080 = 2^3 * 3^3 * 5^1 1619 1620 = 2^2 * 3^4 * 5^1 2429 2430 = 2^1 * 3^5 * 5^1 3644 3645 = 3^6 * 5^1 decr 3644 3644 = 2^2 * 911^1 1822 1822 = 2^1 * 911^1 911 911 = 911^1 incr 911 912 = 2^4 * 3^1 * 19^1 1367 1368 = 2^3 * 3^2 * 19^1 2051 2052 = 2^2 * 3^3 * 19^1 3077 3078 = 2^1 * 3^4 * 19^1 4616 4617 = 3^5 * 19^1 decr 4616 4616 = 2^3 * 577^1 2308 2308 = 2^2 * 577^1 1154 1154 = 2^1 * 577^1 577 577 = 577^1 incr 577 578 = 2^1 * 289^1 866 867 = 3^1 * 289^1 decr 866 866 = 2^1 * 433^1 433 433 = 433^1 incr 433 434 = 2^1 * 217^1 650 651 = 3^1 * 217^1 decr 650 650 = 2^1 * 325^1 325 325 = 325^1 incr 325 326 = 2^1 * 163^1 488 489 = 3^1 * 163^1 decr 488 488 = 2^3 * 61^1 244 244 = 2^2 * 61^1 122 122 = 2^1 * 61^1 61 61 = 61^1 incr 61 62 = 2^1 * 31^1 92 93 = 3^1 * 31^1 decr 92 92 = 2^2 * 23^1 46 46 = 2^1 * 23^1 23 23 = 23^1 incr 23 24 = 2^3 * 3^1 35 36 = 2^2 * 3^2 53 54 = 2^1 * 3^3 80 81 = 3^4 decr 80 80 = 2^4 * 5^1 40 40 = 2^3 * 5^1 20 20 = 2^2 * 5^1 10 10 = 2^1 * 5^1 5 5 = 5^1 incr 5 6 = 2^1 * 3^1 8 9 = 3^2 decr 8 8 = 2^3 4 4 = 2^2 2 2 = 2^1 1 1 =
1.3. Looking at this in reverse
I said earlier that the length of an ascending chain is equal to the combined 2s and 3s in the factorization of the control number , and that the number of 2s indicates how many steps remain to ascend.Conversely, the 3s represent the number of prior (i.e. lower) links in the chain.
Any number of course "comes from" . Until now, however, there was no way to tell if also "comes from" except by trial-and-error: that portion of the Collatz only went forward.
By looking at the factorization of control numbers , it is evident whether there are earlier links, and how many of them, from which it is possible to "come from":
- If the factorization of contains no threes, then is not arrived at from a lower number.
- If the factorization of does contain threes, then the number of threes indicates the number of lower values which ascend to this one.
- You can find these lower control numbers by changing 3s in the factorization of to twos and multiplying out; subtract 1 for the corresponding Collatz points.
So, again picking a number at random: 18935. Our "control" number will thus be 189361 , whose factorization is .
Changing the 3s to 2s, we have , and changing the 2s to 3s, we have . This ascension chain will be length 5, starting at Collatz value 8415 and ending at 63908.
8415 8415: 8416 = (start) 2^5 * 263^1 12623: 12624 = (Inc,3) 2^4 * 3^1 * 263^1 18935: 18936 = (Inc,3) 2^3 * 3^2 * 263^1 28403: 28404 = (Inc,3) 2^2 * 3^3 * 263^1 42605: 42606 = (Inc,3) 2^1 * 3^4 * 263^1 63908: 63909 = (Inc,3) 3^5 * 263^1
2. Throw it away and start again
Now that we understand the problem better, let's look at it in a fresh light.
- Elements of an ascending chain have prime factorizations that have identical contents for numbers >= 4, and have the same sum of 2s and 3s.
- Elements of a descending chain have the identical prime factorization, except for the number of 2s.
What we really have is an interesting game of playing with prime factorizations. Instead of seeing all the individual steps, we should look at what happens when our modified Collatz switches between ascending and descending modes.
- If the starting number is odd, go to step 3.
- Apply a descending chain by taking the minimum of the related descending chain values: the one with no 2s in the prime factorization.
- If the number is 1, stop.
- Swap out the prime factors for new ones by adding 1.
- Apply an ascending chain by taking the maximum of the related ascending chain values, the one with no 2s in the prime factorization.
- Swap out the prime factors for new ones by subtracting 1.
- Go to step 2.
- For any number , the closest number sharing any single value of its prime factorization will be .
- Since 1 is never part of a prime factorization, the primary factorizations of and are mutually exclusive; they cannot contain any of the same numbers.
- Moreover, the prime factorizations of and cannot be modified in a consistent way that they will occur next to eachother ever again. Looking at 15:16, for example ([3,5]:[2^4]), if we make the same change---say, adding another 2--- the resulting values, 30:32, are no longer next to each other. 2
- There will be exactly one value in ascending chain with a prime factorization devoid of 3s, and similarly exactly one devoid of 2s.
- There will be 1 or more values in ascending chains with a prime factorization containing 3s, and 1 or more containing 2s.
- There are thus at least 2 values in ascending chains.
- Since the end value of an ascending chain always contains at least one 3, the prime factorization of the subsequent descending chain will never contain 3.
There might be a proof by induction that the Collatz is acyclic using this. It is above my mathematical skills to do so.
A. Programs
3.1. collatz.c
This program follows the optimized Collatz sequence and prints the factorizations of the next-higher values while both ascending and descending.
Usage: collatz <number>...
/* Collatz conjecture research. Perette Barella. 2018-03-22. Please ensure I'm credited if this leads anywhere. */ #include <stdio.h> #include <string.h> #include <stdlib.h> #include <assert.h> void prime_factor (int v) { int n = 2; char multiply = ' '; for (int n = 2; v != 1; n += 1) { int count = 0; while (v % n == 0) { count++; v = v / n; } if (count) { printf ("%c %2d^%-4d", multiply, n, count); multiply = '*'; } } } void collatz (int n) { printf ("%d\n", n); int c = n; const char *action = "start"; while (1) { printf ("%9d: %9d = (%-5s) ", c, c+1, action); prime_factor (c + 1); printf ("\n"); if (c == 1) break; if (c & 1) { c = (c * 3 + 1) >> 1; action = "inc"; if ((c + 1) % 3 == 0) action = "Inc,3"; else assert (!"This doesn't happen!"); } else { c = c >> 1; action = "dec"; if ((c + 1) % 3 == 0) action = "Dec,3"; } } } int main (int argc, char **argv) { while (*++argv) { collatz (strtol (*argv, NULL, 10)); } return 0; }
3.2. coder.c
This program follows the optimized Collatz sequence. When ascending, for a given Collatz point the factorization of . When descending, the factorization of is shown. For the factorizations, 2s, 3s are prime factored out, and the rest left as a non-prime factor (unless it just happens to be prime).Usage: coder [-f] <number>...
The -f
flag causes only peaks and troughs to be
shown, skipping intermediate values in ascents and descents.
/* Collatz conjecture research. Perette Barella. 2018-03-22. Please ensure I'm credited if this leads anywhere. */ #include <stdio.h> #include <string.h> #include <stdlib.h> #include <stdbool.h> #include <assert.h> void factor (int v) { int n = 2; char multiply = ' '; for (int n = 2; v != 1; n += 1) { int count = 0; /* Only really interested in 2s, 3's, and "the rest". */ if (n == 4) n = v; while (v % n == 0) { count++; v = v / n; } if (count == 0) { printf ("%c %2s %-4s", ' ', "", ""); } else { printf ("%c %2d^%-4d", multiply, n, count); multiply = '*'; } } } void print (const char *mode, int c, int r) { printf (" %-5s %-10d %10d = ", mode, c, r); factor (r); printf ("\n"); } int next_collatz (int n) { if (n & 1) return n * 3 + 1; return n >> 1; } void collatz (int n, bool filter_noise) { printf ("%d\n", n); int c = n; int r = n; bool show; while (r != 1) { assert (c == r); r++; print ("incr", c, r); while ((r & 1) == 0) { c = next_collatz (next_collatz (c)); r += r >> 1; assert (r - 1 == c); if (!filter_noise) print ("", c, r); } if (filter_noise) print ("", c, r); r--; print ("decr", c, r); show = true; while ((r & 1) == 0) { c = next_collatz (c); r = r >> 1; assert (r == c); if (!filter_noise) print ("", c, r); } if (filter_noise) print ("", c, r); } } int main (int argc, char **argv) { bool filter_noise = false; if (argc > 1 && strcmp (argv [1], "-f") == 0) { filter_noise = true; argv++; } while (*++argv) { collatz (strtol (*argv, NULL, 10), filter_noise); } return 0; }