Factoring stalls on certain inputs #187

mhostetter · 2021-10-18T15:37:41Z

As discovered in #185, factoring 2^256 - 1 stalls out which prevents testing if degree-256 polynomials over GF(2) are primitive. This would likely (?) be fixed once the quadratic sieve (#146) is added.

In [1]: import galois                                                                                                                        

# Takes forever... and may never return?
In [2]: galois.factors(2**256 - 1) 
^C---------------------------------------------------------------------------
KeyboardInterrupt                         Traceback (most recent call last)
<ipython-input-3-6806de99fcaa> in <module>
----> 1 galois.factors(2**256 - 1)

~/repos/galois/galois/_polymorphic.py in factors(value)
    514     """
    515     if isinstance(value, (int, np.integer)):
--> 516         return integer_factor.factors(value)
    517     elif isinstance(value, Poly):
    518         return poly_factor.factors(value)

~/repos/galois/galois/_factor.py in factors(n)
    250     # Step 4
    251     while n > 1 and not is_prime(n):
--> 252         f = pollard_rho(n)  # A non-trivial factor
    253         while f is None:
    254             # Try again with a different random function f(x)

~/repos/galois/galois/_factor.py in pollard_rho(n, c)
    571         a = f(a)
    572         b = f(f(b))
--> 573         d = math.gcd(a - b, n)
    574 
    575     if d == n:

KeyboardInterrupt:

The text was updated successfully, but these errors were encountered:

pivis · 2022-12-12T08:47:48Z

Apologies if all of this is obvious to you or already known/considered. I got here specifically because I really liked this project and was playing with larger GF(2**n) fields.

It looks like 2**256-1 factorization as implemented currently gets to the point where it needs to factorize 340282366920938463463374607431768211457==2**128+1 which is 59649589127497217 * 5704689200685129054721. Unfortunately, it looks like Pollard's rho algorithm is getting much more than usual time for this number for offset c=1 (not sure why - square root of the smallest factor is ~24M but it seems that it takes around 455M steps for Pollard's rho to find the factor). That said running pollard_rho with c=3 finds the factor after 19M iterations which is close to regular working times of Pollard's rho.

I only see 2 things can be done:

Have as many as possible p**n-1 factorizations hardcoded (or in a sqlite db similar to conway_polys.db). A reasonable trick would be to introduce a hardcoded list of primes to check as factors where the list specifically contains large-ish primes that are known factors of 2**n-1 with n up to some value (maybe p**n-1 for other p too).
Use more advanced factorization algorithms so they can get a bit further).

For the 1st approach there seems to be some very interesting public materials regarding mersenne numbers factorizations such as Factorization tables from Cunningham book.

I got a bit confused at first with the book only containing 2**n-1 factorization for odd values of n, but that turned out to be just because otherwise 2**(2k)-1 = (2**k-1) * (2**k +1) and there is a separate section of the book for the 2**n+1 factorization where n=4k. E.g. you can find 59649589127497217 there in factorization of 2**128+1.

So if just prime factors from 2**n-1 and 2**n+1 factorizations (tables called 2- and 2+) are taken from the book and saved aside to be checked e.g. after all factors under 10M and before running more advanced algorithms this should be enough to factorize 2**n-1 for some higher n numbers.

More about the notations they are using and such here. There are some things that are not very convenient (e.g. P40 meaning 40-digit prime number without specifying it, and C200 for not-factored number). The primes hidden under P40 notation are present in separate tables it seems, but they would not be necessarily needed here.

pivis · 2022-12-12T08:48:57Z

Specifically the link for the notations explanations: https://homes.cerias.purdue.edu/~ssw/cun/notat.txt

mhostetter · 2022-12-12T15:09:34Z

@pivis thanks for looking into this and responding. I've been hoping someone would help out here -- the prime factorization has always been a weakness in this library. Prime factorization algorithms were new to me when I started this library, so I'm open to review / suggestions / improvements to the factorization algorithms used, and their deployment order in factors().

I got here specifically because I really liked this project and was playing with larger GF(2**n) fields.

Taking too long / forever to factor 2^n and 2^n - 1 is a problem, and has been reported before. In fact, I made and documented some workarounds to avoid those factorizations, if necessary in a pinch. See the "Speed up creation of large finite field classes" note. However, this is only a stopgap and a better solution is definitely needed.

My thoughts on your ideas:

Have as many as possible p**n-1 factorizations hardcoded (or in a sqlite db similar to conway_polys.db)

I think this is a good idea. Finite fields with order 2^n seem to be particularly popular, so adding a database of factorizations for 2^n and 2^n - 1 seems wise. I can work on incorporating that. Thanks for the links. I will definitely use them to construct the database.

Potentially in the future, we can add more p, as you mention. Although, which ones and how many degrees? That gets large quickly...

Use more advanced factorization algorithms so they can get a bit further).

I tried implementing the Quadratic Sieve algorithm twice before, but for various reasons never got it fully working / merged. If you're aware of more powerful algorithms and are interested in contributing, I'd love the help.

I also recently noticed that the perfect power detection (Step 2 in the factors()) seems way too slow (or perhaps I don't have an appropriate expectation), see #443. My suspicion is that iroot() is the culprit. Any help in debugging / diagnosing issues there would be helpful too.

mhostetter · 2022-12-18T21:14:46Z

@pivis the parsing of those tables was more annoying than anticipated! But I added a database of all the prime factorizations in #452. This will really speed things up!

I'm also patching perfect_power() in #454 to speed up factoring powers of small primes.

Thanks for commenting here, the suggestions, and feedback. I'll release 0.3.2 with the improvement today.

mhostetter · 2022-12-19T01:10:43Z

I'm closing this. This will now be tracked in the consolidated #456.

mhostetter added the bug Something isn't working label Oct 18, 2021

mhostetter self-assigned this Oct 18, 2021

mhostetter mentioned this issue Oct 18, 2021

Incorrect is_primitive_poly attribute of "large" finite fields #188

Closed

mhostetter mentioned this issue Dec 23, 2021

Field Trace function crash on specific irreducible polynomial #212

Closed

mhostetter added the help wanted Extra attention is needed label Jan 23, 2022

mhostetter pinned this issue Mar 14, 2022

mhostetter unpinned this issue Mar 14, 2022

mhostetter mentioned this issue Nov 8, 2022

Lagrange interpolation speed problem #427

Closed

mhostetter pinned this issue Dec 9, 2022

mhostetter mentioned this issue Dec 12, 2022

Fix a bug in factorization function causing infinite loops in rare cases #450

Merged

This was referenced Dec 12, 2022

Add database of prime factorizations #451

Closed

Pollard rho has too many loops? #453

Open

mhostetter closed this as completed Dec 19, 2022

mhostetter unpinned this issue Dec 19, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Factoring stalls on certain inputs #187

Factoring stalls on certain inputs #187

mhostetter commented Oct 18, 2021 •

edited

Loading

pivis commented Dec 12, 2022

pivis commented Dec 12, 2022

mhostetter commented Dec 12, 2022

mhostetter commented Dec 18, 2022

mhostetter commented Dec 19, 2022

Factoring stalls on certain inputs #187

Factoring stalls on certain inputs #187

Comments

mhostetter commented Oct 18, 2021 • edited Loading

pivis commented Dec 12, 2022

pivis commented Dec 12, 2022

mhostetter commented Dec 12, 2022

mhostetter commented Dec 18, 2022

mhostetter commented Dec 19, 2022

mhostetter commented Oct 18, 2021 •

edited

Loading