# How Prime Factorizations govern the Collatz Conjecture

By applying some optimizations to the Collatz Conjecture, we can reduce noise that obscures the patterns that produce the sequences. It becomes apparent that the sequences generated are related to prime factorizations: that of the number itself for decending, whereby all 2's are removed; and that of the next higher even number for ascending, whereby twos and threes are traded off.Herein, I'll lay out the math, and show the interesting things it tells us about the Conjecture.

## Contents

## 1. Get Rid of Noise

The Collatz is iterative. The usual rule to get the next value is, for any positive integer $c$:

$Collatz\left(c\right)=\left(\{,\begin{array}{cc}3c+1& ifcis\mathrm{odd}\\ \frac{c}{2}& ifcis\mathrm{even}\end{array}\right)$

The conjecture is that if we repeat this enough times, we will always arrive at 1.

An odd $c$ *always* produces an even number, so subsequently we
must *always* execute the "even" half of the equation at
least once. So, as any good engineer would do, let's optimize by
getting rid of the unnecessary iteration and the noise it
generates:

$Collatz\left(c\right)=\left(\{,\begin{array}{cc}\frac{3c+1}{2}& ifcis\mathrm{odd}\\ \frac{c}{2}& ifcis\mathrm{even}\end{array}\right)$

This means we always end up with a fractional half, before adding ½. (An odd times an odd is always odd, so the division results in the fractional ½). We can get rid of this by using $c+1$ as the input and making the appropriate adjustments:

$\mathrm{odd(c)}=\frac{3}{2}c+\frac{1}{2}=\left(\frac{3}{2}\right)\xb7(c+1)+\frac{1}{2}-\frac{3}{2}=\frac{3}{2}(c+1)-1$

Note that the last term is $-1$. If the successive iteration is also odd, the first step of which is to add 1, these effectively cancel out.

$\mathrm{odd(odd(c))}=\frac{3}{2}(\frac{3}{2}(c+1)-1+1)-1=\frac{3}{2}\left(\frac{3}{2}(c+1)\right)-1={\left(\frac{3}{2}\right)}^{2}\xb7(c+1)-1$

That is, we can redefine "odd" algorithmically:

- Add 1.
- While this value is even, divide by 2 and multiply by 3.
- Subtract 1.

Or, to write it mathematically, if we knew how many times to
repeat *n*:

$\mathrm{odd(c)}={\left(\frac{3}{2}\right)}^{n}\xb7(c+1)-1$So how do we know the value of

*n*?

- $c$ is odd → $c+1$ is even, and thus its prime factorization contains at least one 2.
- $c+1$ is even → $\frac{3}{2}(c+1)$ is whole.
- Looking at the prime factorization if $c+1$, multiplying by $\frac{3}{2}$ in effect replaces a 2 with a 3.
- $(c+1)mod4=0$ → $\frac{3}{2}(c+1)$ is even, since there were two 2s in the prime factorization of $c+1$, leaving one remaining.
- $(c+1)mod8=0$ → $\frac{3}{2}(c+1)mod4=0$.
- $(c+1)mod16=0$ → $\frac{3}{2}(c+1)mod8=0$.
- ...and so forth.

So, generally, *n* = # of 2s in prime factorization of
$c+1$.

### 1.1. So what's this telling us

- For any $c$ the number $c+1$ governs whether it can ascend, and how far. I will call this the "control number".
- An ascending Collatz series is related by the prime factorizations of the corresponding control numbers. These prime factorizations will all be the same overall size, and will contain the same set of numbers, with only the numbers of 2s and 3s varying.
- The combined number of 2s and 3s in the prime factorization tells us the length of this "ascending chain".
- The numbers of 2s in the prime factorization tells us how many links are left to reach the upper end of the chain. If there are no 2s, we are at the upper bound of the chain.
- If there are neither 2s nor 3s, the number is not part of a chain.
- Reminder: these represent the
*control number*. The corresponding Collatz points are 1 less.

### 1.2. Example

For clarity, let's review some examples:

15 Collatz Control Ctrl factors 15: 16 = (start) 2^4 23: 24 = (Inc,3) 2^3 * 3^1 35: 36 = (Inc,3) 2^2 * 3^2 53: 54 = (Inc,3) 2^1 * 3^3 80: 81 = (Inc,3) 3^4 40: 41 = (dec ) 41^1 20: 21 = (Dec,3) 3^1 * 7^1 10: 11 = (dec ) 11^1 5: 6 = (Dec,3) 2^1 * 3^1 8: 9 = (Inc,3) 3^2 4: 5 = (dec ) 5^1 2: 3 = (Dec,3) 3^1 1: 2 = (dec ) 2^1With the "standard" Collatz, the sequence from 15 would be 15, 46, 23, 70, 35, 106, 53, 160, 80, 40, 20, 10, etc. Since we debounced it, we eliminate the unnecessary and confusing waypoints, leaving us with 15,

Lest you think this is cherry-picked data, let's look at some output from a random number 5-digit number I picked. In this variation, we won't do a full prime factorization, instead just showing the 2s, 3s, and whatever is leftover. To see some long chains, look for 335, 319, and 911.

17835 Collatz Control incr 17835 17836 = 2^2 * 4459^1 26753 26754 = 2^1 * 3^1 * 4459^1 40130 40131 = 3^2 * 4459^1 decr 40130 40130 = 2^1 * 20065^1 20065 20065 = 20065^1 incr 20065 20066 = 2^1 * 10033^1 30098 30099 = 3^1 * 10033^1 decr 30098 30098 = 2^1 * 15049^1 15049 15049 = 15049^1 incr 15049 15050 = 2^1 * 7525^1 22574 22575 = 3^1 * 7525^1 decr 22574 22574 = 2^1 * 11287^1 11287 11287 = 11287^1 incr 11287 11288 = 2^3 * 1411^1 16931 16932 = 2^2 * 3^1 * 1411^1 25397 25398 = 2^1 * 3^2 * 1411^1 38096 38097 = 3^3 * 1411^1 decr 38096 38096 = 2^4 * 2381^1 19048 19048 = 2^3 * 2381^1 9524 9524 = 2^2 * 2381^1 4762 4762 = 2^1 * 2381^1 2381 2381 = 2381^1 incr 2381 2382 = 2^1 * 3^1 * 397^1 3572 3573 = 3^2 * 397^1 decr 3572 3572 = 2^2 * 893^1 1786 1786 = 2^1 * 893^1 893 893 = 893^1 incr 893 894 = 2^1 * 3^1 * 149^1 1340 1341 = 3^2 * 149^1 decr 1340 1340 = 2^2 * 335^1 670 670 = 2^1 * 335^1335 335 = 335^1 incr 335 336 = 2^4 * 3^1 * 7^1 503 504 = 2^3 * 3^2 * 7^1 755 756 = 2^2 * 3^3 * 7^1 1133 1134 = 2^1 * 3^4 * 7^1 1700 1701 = 3^5 * 7^1 decr 1700 1700 = 2^2 * 425^1850 850 = 2^1 * 425^1 425 425 = 425^1 incr 425 426 = 2^1 * 3^1 * 71^1 638 639 = 3^2 * 71^1 decr 638 638 = 2^1 * 319^1319 319 = 319^1 incr 319 320 = 2^6 * 5^1 479 480 = 2^5 * 3^1 * 5^1 719 720 = 2^4 * 3^2 * 5^1 1079 1080 = 2^3 * 3^3 * 5^1 1619 1620 = 2^2 * 3^4 * 5^1 2429 2430 = 2^1 * 3^5 * 5^1 3644 3645 = 3^6 * 5^1 decr 3644 3644 = 2^2 * 911^11822 1822 = 2^1 * 911^1911 911 = 911^1 incr 911 912 = 2^4 * 3^1 * 19^1 1367 1368 = 2^3 * 3^2 * 19^1 2051 2052 = 2^2 * 3^3 * 19^1 3077 3078 = 2^1 * 3^4 * 19^1 4616 4617 = 3^5 * 19^1 decr 4616 4616 = 2^3 * 577^12308 2308 = 2^2 * 577^1 1154 1154 = 2^1 * 577^1 577 577 = 577^1 incr 577 578 = 2^1 * 289^1 866 867 = 3^1 * 289^1 decr 866 866 = 2^1 * 433^1 433 433 = 433^1 incr 433 434 = 2^1 * 217^1 650 651 = 3^1 * 217^1 decr 650 650 = 2^1 * 325^1 325 325 = 325^1 incr 325 326 = 2^1 * 163^1 488 489 = 3^1 * 163^1 decr 488 488 = 2^3 * 61^1 244 244 = 2^2 * 61^1 122 122 = 2^1 * 61^1 61 61 = 61^1 incr 61 62 = 2^1 * 31^1 92 93 = 3^1 * 31^1 decr 92 92 = 2^2 * 23^1 46 46 = 2^1 * 23^1 23 23 = 23^1 incr 23 24 = 2^3 * 3^1 35 36 = 2^2 * 3^2 53 54 = 2^1 * 3^3 80 81 = 3^4 decr 80 80 = 2^4 * 5^1 40 40 = 2^3 * 5^1 20 20 = 2^2 * 5^1 10 10 = 2^1 * 5^1 5 5 = 5^1 incr 5 6 = 2^1 * 3^1 8 9 = 3^2 decr 8 8 = 2^3 4 4 = 2^2 2 2 = 2^1 1 1 =

### 1.3. Looking at this in reverse

I said earlier that the length of an ascending chain is equal to the combined 2s and 3s in the factorization of the control number, and that the number of 2s indicates how many steps remain to ascend.Conversely, the 3s represent the number of *prior* (i.e.
lower) links in the chain.

Any number $c$ of course "comes from" $2c$. Until now, however, there was no way to tell if $c$ also "comes from" $(2c-1)\xf73$ except by trial-and-error: that portion of the Collatz only went forward.

By looking at the factorization of control numbers $c+1$, it is evident whether there are earlier links, and how many of them, from which it is possible to "come from":

- If the factorization of $c+1$ contains
*no*threes, then $c$ is not arrived at from a lower number. - If the factorization
*does*contain threes, then the number of threes indicates the number of lower values which ascend to this one. - You can find these lower control numbers by changing 3s in the factorization to twos and multiplying out; subtract 1 for the corresponding Collatz points.

So, again picking a number at random: 18935. Our "control" number will thus be 189361 , whose factorization is ${2}^{3}\xb7{3}^{2}\xb7263$.

Changing the 3s to 2s, we have ${2}^{5}\xb7263=8416$, and changing the 2s to 3s, we have ${3}^{5}\xb7263=63909$. This ascension chain will be length 5, starting at Collatz value 8415 and ending at 63908.

8415 8415: 8416 = (start) 2^5 * 263^1 12623: 12624 = (Inc,3) 2^4 * 3^1 * 263^1 18935: 18936 = (Inc,3) 2^3 * 3^2 * 263^1 28403: 28404 = (Inc,3) 2^2 * 3^3 * 263^1 42605: 42606 = (Inc,3) 2^1 * 3^4 * 263^1 63908: 63909 = (Inc,3) 3^5 * 263^1

## Programs

### 2.1. collatz.c

This program follows the optimized Collatz sequence and prints the factorizations of the next-higher values while both ascending and descending.

Usage: `collatz <number>...`

/* Collatz conjecture research. Perette Barella. 2018-03-22. Please ensure I'm credited if this leads anywhere. */ #include <stdio.h> #include <string.h> #include <stdlib.h> #include <assert.h> void prime_factor (int v) { int n = 2; char multiply = ' '; for (int n = 2; v != 1; n += 1) { int count = 0; while (v % n == 0) { count++; v = v / n; } if (count) { printf ("%c %2d^%-4d", multiply, n, count); multiply = '*'; } } } void collatz (int n) { printf ("%d\n", n); int c = n; const char *action = "start"; while (1) { printf ("%9d: %9d = (%-5s) ", c, c+1, action); prime_factor (c + 1); printf ("\n"); if (c == 1) break; if (c & 1) { c = (c * 3 + 1) >> 1; action = "inc"; if ((c + 1) % 3 == 0) action = "Inc,3"; else assert (!"This doesn't happen!"); } else { c = c >> 1; action = "dec"; if ((c + 1) % 3 == 0) action = "Dec,3"; } } } int main (int argc, char **argv) { while (*++argv) { collatz (strtol (*argv, NULL, 10)); } return 0; }

### 2.2. coder.c

This program follows the optimized Collatz sequence. When ascending, for a given Collatz point $c$ the factorization of $c+1$. When descending, the factorization of $c$ is shown. For the factorizations, 2s, 3s are prime factored out, and the rest left as a non-prime factor (unless it just happens to be prime).Usage: `coder [-f] <number>...`

The `-f`

flag causes only peaks and troughs to be
shown, skipping intermediate values in ascents and descents.

/* Collatz conjecture research. Perette Barella. 2018-03-22. Please ensure I'm credited if this leads anywhere. */ #include <stdio.h> #include <string.h> #include <stdlib.h> #include <stdbool.h> #include <assert.h> void factor (int v) { int n = 2; char multiply = ' '; for (int n = 2; v != 1; n += 1) { int count = 0; /* Only really interested in 2s, 3's, and "the rest". */ if (n == 4) n = v; while (v % n == 0) { count++; v = v / n; } if (count == 0) { printf ("%c %2s %-4s", ' ', "", ""); } else { printf ("%c %2d^%-4d", multiply, n, count); multiply = '*'; } } } void print (const char *mode, int c, int r) { printf (" %-5s %-10d %10d = ", mode, c, r); factor (r); printf ("\n"); } int next_collatz (int n) { if (n & 1) return n * 3 + 1; return n >> 1; } void collatz (int n, bool filter_noise) { printf ("%d\n", n); int c = n; int r = n; bool show; while (r != 1) { assert (c == r); r++; print ("incr", c, r); while ((r & 1) == 0) { c = next_collatz (next_collatz (c)); r += r >> 1; assert (r - 1 == c); if (!filter_noise) print ("", c, r); } if (filter_noise) print ("", c, r); r--; print ("decr", c, r); show = true; while ((r & 1) == 0) { c = next_collatz (c); r = r >> 1; assert (r == c); if (!filter_noise) print ("", c, r); } if (filter_noise) print ("", c, r); } } int main (int argc, char **argv) { bool filter_noise = false; if (argc > 1 && strcmp (argv [1], "-f") == 0) { filter_noise = true; argv++; } while (*++argv) { collatz (strtol (*argv, NULL, 10), filter_noise); } return 0; }

## Footnotes:

- 1. Quite a lucky first pick, too, occuring squarely in the middle of a sequence.