NAME

Math::Prime::Util - Utilities related to prime numbers, including fast sieves and factoring

VERSION

Version 0.74

SYNOPSIS

# Nothing is exported by default.  List the functions, or use :all.
use Math::Prime::Util ':all';  # import all functions

# The ':rand' tag replaces srand and rand (not done by default)
use Math::Prime::Util ':rand';  # import srand, rand, irand, irand64


# Get a big array reference of many primes
my $aref = primes( 100_000_000 );

# All the primes between 5k and 10k inclusive
$aref = primes( 5_000, 10_000 );

# If you want them in an array instead
my @primes = @{primes( 500 )};

# You can do something for every prime in a range.  Twin primes to 10k:
forprimes { say if is_prime($_+2) } 10000;
# Or for the composites in a range
forcomposites { say if is_strong_pseudoprime($_,2) } 10000, 10**6;

# For non-bigints, is_prime and is_prob_prime will always be 0 or 2.
# They return 0 (composite), 2 (prime), or 1 (probably prime)
my $n = 1000003;  # for example
say "$n is prime"  if is_prime($n);
say "$n is ", (qw(composite maybe_prime? prime))[is_prob_prime($n)];

# Strong pseudoprime test with multiple bases, using Miller-Rabin
say "$n is a prime or 2/7/61-psp" if is_strong_pseudoprime($n, 2, 7, 61);

# Standard and strong Lucas-Selfridge, and extra strong Lucas tests
say "$n is a prime or lpsp"   if is_lucas_pseudoprime($n);
say "$n is a prime or slpsp"  if is_strong_lucas_pseudoprime($n);
say "$n is a prime or eslpsp" if is_extra_strong_lucas_pseudoprime($n);

# step to the next prime (the result is a bigint if it would overflow a native int)
$n = next_prime($n);

# step back (returns undef if given input 2 or less)
$n = prev_prime($n);


# Return Pi(n) -- the number of primes <= n.
my $primepi = prime_count( 1_000_000 );
$primepi = prime_count( 10**14, 10**14+1000 );  # also does ranges

# Quickly return an approximation to Pi(n)
my $approx_number_of_primes = prime_count_approx( 10**17 );

# Lower and upper bounds.  lower <= Pi(n) <= upper for all n
die unless prime_count_lower($n) <= prime_count($n);
die unless prime_count_upper($n) >= prime_count($n);


# Return p_n, the nth prime
say "The ten thousandth prime is ", nth_prime(10_000);

# Return a quick approximation to the nth prime
say "The one trillionth prime is ~ ", nth_prime_approx(10**12);

# Lower and upper bounds.   lower <= nth_prime(n) <= upper for all n
die unless nth_prime_lower($n) <= nth_prime($n);
die unless nth_prime_upper($n) >= nth_prime($n);


# Get the prime factors of a number
my @prime_factors = factor( $n );

# Return ([p1,e1],[p2,e2], ...) for $n = p1^e1 * p2^e2 * ...
my @pe = factor_exp( $n );

# Get all divisors including 1 and n
my @divisors = divisors( $n );
# Or just apply a block for each one
my $sum = 0; fordivisors  { $sum += $_ + $_*$_ }  $n;

# Euler phi (Euler's totient) on a large number
use bigint;  say euler_phi( 801294088771394680000412 );
say jordan_totient(5, 1234);  # Jordan's totient

# Moebius function used to calculate Mertens
$sum = 0; $sum += moebius($_) for (1..200); say "Mertens(200) = $sum";
# Mertens function directly (more efficient for large values)
say mertens(10_000_000);
# Exponential of Mangoldt function
say "lambda(49) = ", log(exp_mangoldt(49));
# Some more number theoretical functions
say liouville(4292384);
say chebyshev_psi(234984);
say chebyshev_theta(92384234);
say partitions(1000);
# Show all prime partitions of 25
forpart { say "@_" unless scalar grep { !is_prime($_) } @_ } 25;
# List all 3-way combinations of an array
my @cdata = qw/apple bread curry donut eagle/;
forcomb { say "@cdata[@_]" } @cdata, 3;
# or all permutations
forperm { say "@cdata[@_]" } @cdata;

# divisor sum
my $sigma  = divisor_sum( $n );       # sum of divisors
my $sigma0 = divisor_sum( $n, 0 );    # count of divisors
my $sigmak = divisor_sum( $n, $k );
my $sigmaf = divisor_sum( $n, sub { log($_[0]) } ); # arbitrary func

# primorial n#, primorial p(n)#, and lcm
say "The product of primes up to 47 is ",      primorial(47);
say "The product of the first 47 primes is ", pn_primorial(47);
say "lcm(1..1000) is ", consecutive_integer_lcm(1000);

# Ei, li, and Riemann R functions
my $ei   = ExponentialIntegral($x);   # $x a real: $x != 0
my $li   = LogarithmicIntegral($x);   # $x a real: $x >= 0
my $R    = RiemannR($x);              # $x a real: $x > 0
my $Zeta = RiemannZeta($x);           # $x a real: $x >= 0


# Precalculate a sieve, possibly speeding up later work.
prime_precalc( 1_000_000_000 );

# Free any memory used by the module.
prime_memfree;

# Alternate way to free.  When this leaves scope, memory is freed.
use Math::Prime::Util::MemFree;
my $mf = Math::Prime::Util::MemFree->new;


# Random primes
my($rand_prime);
$rand_prime = random_prime(1000);        # random prime <= limit
$rand_prime = random_prime(100, 10000);  # random prime within a range
$rand_prime = random_ndigit_prime(6);    # random 6-digit prime
$rand_prime = random_nbit_prime(128);    # random 128-bit prime
$rand_prime = random_safe_prime(192);    # random 192-bit safe prime
$rand_prime = random_strong_prime(256);  # random 256-bit strong prime
$rand_prime = random_maurer_prime(256);  # random 256-bit provable prime
$rand_prime = random_shawe_taylor_prime(256);  # as above

DESCRIPTION

A module for number theory in Perl. This includes prime sieving, primality tests, primality proofs, integer factoring, counts / bounds / approximations for primes, nth primes, and twin primes, random prime generation, and much more.

This module is the fastest on CPAN for almost all operations it supports. It outperforms Math::Prime::XS, Math::Prime::FastSieve, Math::Factor::XS, Math::Prime::TiedArray, Math::Big::Factors, Math::Factoring, and Math::Primality (when the GMP module is available). For numbers in the 10-20 digit range, it is often orders of magnitude faster. Typically it is faster than Math::Pari for 64-bit operations.

All operations support both Perl UV's (32-bit or 64-bit) and bignums. If you want high performance with big numbers (larger than Perl's native 32-bit or 64-bit size), you should install Math::Prime::Util::GMP and Math::BigInt::GMP. This will be a recurring theme throughout this documentation -- while all bignum operations are supported in pure Perl, most methods will be much slower than the C+GMP alternative.

The module is thread-safe and allows concurrency between Perl threads while still sharing a prime cache. It is not itself multi-threaded. See the Limitations section if you are using Win32 and threads in your program. Also note that Math::Pari is not thread-safe (and will crash as soon as it is loaded in threads), so if you use Math::BigInt::Pari rather than Math::BigInt::GMP or the default backend, things will go pear-shaped.

Two scripts are also included and installed by default:

primes.pl displays primes between start and end values or expressions, with many options for filtering (e.g. twin, safe, circular, good, lucky, etc.). Use --help to see all the options.
factor.pl operates similarly to the GNU factor program. It supports bigint and expression inputs.

ENVIRONMENT VARIABLES

There are three environment variables that affect operation. These are typically used for validation of the different methods or to simulate systems that have different support.

MPU_NO_XS

If set to 1, everything is run in pure Perl. No C functions are loaded or used, as XSLoader is not even called. All top-level XS functions are replaced by a pure Perl layer (the PPFE.pm module that supplies a "Pure Perl Front End").

Caveat: This does not change whether the GMP backend is used. For as much pure Perl as possible, you will need to set both MPU_NO_XS and MPU_NO_GMP.

If this variable is not set or set to anything other than 1, the module operates normally.

MPU_NO_GMP

If set to 1, the Math::Prime::Util::GMP backend is not loaded, and operation will be exactly as if it was not installed.

If this variable is not set or set to anything other than 1, the module operates normally.
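As a minimal sketch of the two variables above, the following forces the pure Perl path from inside a script. It assumes the variables are read when the module is loaded (hence the BEGIN block), and it uses prime_get_config with its xs and gmp keys, which are not described in this section and should be treated as an assumption here.

```perl
use strict;
use warnings;

# The variables must be set before the "use" line below is compiled,
# which is why they are assigned inside a BEGIN block.
BEGIN {
    $ENV{MPU_NO_XS}  = 1;   # do not load the C (XS) code
    $ENV{MPU_NO_GMP} = 1;   # do not load Math::Prime::Util::GMP
}
use Math::Prime::Util qw(is_prime prime_get_config);

# Report which backends ended up active (assumed config keys: xs, gmp).
my $config = prime_get_config();
printf "xs=%d gmp=%d\n", $config->{xs}, $config->{gmp};

# Results should be unchanged; only the code path differs.
my $result = is_prime(1000003);
print "is_prime(1000003) = $result\n";
```

Setting the variables in the shell before starting perl has the same effect and avoids the BEGIN block.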

MPU_DEVNAMES

If set to 1, the PP package will be loaded on startup rather than on demand, and the package aliases MPU, PP, GMP will be used for the main, Perl, and GMP packages respectively. Normally you wouldn't want this both for aggressive namespace pollution and for performance (there is often no need to load the huge PP module). But it is convenient if one wants to call the different paths explicitly.

Regarding performance, on a 2020 MacBook M1, normal startup time is about 10 milliseconds. With this option set it rises to about 45 milliseconds, which is why the PP code is normally loaded only when needed. For many purposes this amount of time is trivial, but it matters on slower computers and in short, time-critical applications.

BIGNUM SUPPORT

By default all functions support bigints. For performance, you should install Math::Prime::Util::GMP which will be automatically used as a backend.

The default bigint class is Math::BigInt, which is not particularly speedy but is available by default in all Perl distributions, and is well tested. If you want to try something different, you can install and use Math::GMPz or Math::GMP which will be much faster. You can have this module use and return them using, for example:

prime_set_config(bigint => 'Math::GMPz');
my $n = next_prime(~0);
say "$n ",ref($n);
# 18446744073709551629 Math::GMPz

If you use Math::BigInt, I highly recommend also installing one of Math::BigInt::GMPz, Math::BigInt::GMP, or Math::BigInt::LTM.

If you are using bigints, here are some performance suggestions:

Install a recent version of Math::Prime::Util::GMP, as that will vastly increase the speed of many of the functions. This does require the GMP library be installed on your system, but this increasingly comes pre-installed or easily available using the OS vendor package installation tool.
Install and use Math::BigInt::GMP (or GMPz or LTM), then add "use bigint try => 'GMP,GMPz,LTM,Pari'" to your script, or on the command line use e.g. -Mbigint=lib,GMP. Large modular exponentiation is much faster using the better backends, as are the math and approximation functions when called with very large inputs.
I have run these functions on many versions of Perl, and my experience is that if you're using anything older than Perl 5.14, I would recommend you upgrade if you are using bignums a lot. There are some brittle behaviors on 5.12.4 and earlier with bignums. For example, the default BigInt backend in older versions of Perl will sometimes convert small results to doubles, resulting in corrupted output.

PRIMALITY TESTING

This module provides three functions for general primality testing, as well as numerous specialized functions. The three main functions are: "is_prob_prime" and "is_prime" for general use, and "is_provable_prime" for proofs. For inputs below 2^64 the functions are identical and fast deterministic testing is performed. That is, the results will always be correct and should take at most a few microseconds for any input. This is hundreds to thousands of times faster than other CPAN modules. For inputs larger than 2^64, an extra-strong BPSW test is used. See the "PRIMALITY TESTING NOTES" section for more discussion.

Following the semantics used by Pari/GP, all primality test functions allow a negative primary argument, but will return false. All inputs must be integers or an error is raised.

FUNCTIONS

is_prime

print "$n is prime" if is_prime($n);

Given an integer n, returns 0 if the number is composite, 1 if it is probably prime, and 2 if it is definitely prime. For numbers smaller than 2^64 it will only return 0 (composite) or 2 (definitely prime), as this range has been exhaustively tested and has no counterexamples. For larger numbers, an extra-strong BPSW test is used. If Math::Prime::Util::GMP is installed, some additional primality tests are also performed, and a quick attempt is made to perform a primality proof, so it will return 2 for many other inputs.

Also see the "is_prob_prime" function, which will never do additional tests, and the "is_provable_prime" function which will construct a proof that the input is prime and returns 2 for almost all primes (at the expense of speed).

For native precision numbers (anything smaller than 2^64), all three functions are identical and use a deterministic set of tests (selected Miller-Rabin bases or BPSW). For larger inputs both "is_prob_prime" and "is_prime" return probable prime results using the extra-strong Baillie-PSW test, which has had no counterexample found since it was published in 1980.
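A short illustration of the native-size behavior described above, using one prime and one composite chosen for this example (1000001 = 101 * 9901). For inputs this small, all three functions are expected to agree and return only 0 or 2.

```perl
use strict;
use warnings;
use Math::Prime::Util qw(is_prime is_prob_prime is_provable_prime);

# 1000003 is prime; 1000001 = 101 * 9901 is composite.
my @results;
for my $n (1000003, 1000001) {
    # Collect the three return codes for this input.
    push @results, [ is_prime($n), is_prob_prime($n), is_provable_prime($n) ];
    print "$n: @{ $results[-1] }\n";
}
```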

For cryptographic key generation, you may want even more testing for probable primes (NIST recommends some additional M-R tests). This can be done using a different test (e.g. "is_frobenius_underwood_pseudoprime") or using additional M-R tests with random bases with "miller_rabin_random". Even better, make sure Math::Prime::Util::GMP is installed and use "is_provable_prime", which should be reasonably fast for sizes under 2048 bits. Another possibility is to use "random_maurer_prime" or "random_shawe_taylor_prime", which construct random provable primes.

primes

Returns all the primes between the lower and upper limits (inclusive), with a lower limit of 2 if none is given.

An array reference is returned (with large lists this is much faster and uses less memory than returning an array directly).

my $aref1 = primes( 1_000_000 );
my $aref2 = primes( 1_000_000_000_000, 1_000_000_001_000 );

my @primes = @{ primes( 500 ) };

print "$_\n" for @{primes(20,100)};

Sieving will be done if required. The algorithm used will depend on the range and whether a sieve result already exists. Possibilities include primality testing (for very small ranges), a Sieve of Eratosthenes using wheel factorization, or a segmented sieve.

next_prime

$n = next_prime($n);

Returns the next prime greater than the input number. The result will be a bigint if it can not be exactly represented in the native int type (larger than 4,294,967,291 in 32-bit Perl; larger than 18,446,744,073,709,551,557 in 64-bit).

prev_prime

$n = prev_prime($n);

Returns the prime preceding the input number (i.e. the largest prime that is strictly less than the input). undef is returned if the input is 2 or lower.

The behavior of the previous prime function varies between programs. Pari/GP and Math::Pari return the input if it is prime, as does "nearest_le" in Math::Prime::FastSieve. When given an input such that the return value would be the first prime less than 2, Math::Prime::FastSieve, Math::Pari, Pari/GP, and older versions of MPU will return 0. Math::Primality and the current MPU will return undef. WolframAlpha returns -2. Maple gives a range error.
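A small sketch of the strict inequality used by both functions: a prime input is never returned as its own neighbor. The last call checks only for a false value, since the text above notes that older releases returned 0 where the current one returns undef.

```perl
use strict;
use warnings;
use Math::Prime::Util qw(next_prime prev_prime);

my $next = next_prime(13);   # strictly greater than 13
my $prev = prev_prime(13);   # strictly less than 13
my $none = prev_prime(2);    # there is no prime below 2

print "next_prime(13) = $next\n";
print "prev_prime(13) = $prev\n";
print "prev_prime(2)  = ", (defined $none ? $none : "undef"), "\n";
```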

forprimes

forprimes { say } 100,200;                  # print primes from 100 to 200

$sum=0;  forprimes { $sum += $_ } 100000;   # sum primes to 100k

forprimes { say if is_prime($_+2) } 10000;  # print twin primes to 10k

Given a block and either an end count or a start and end pair, calls the block for each prime in the range. Compared to getting a big array of primes and iterating through it, this is more memory efficient and perhaps more convenient. This will almost always be the fastest way to loop over a range of primes. Nesting and use in threads are allowed.

Math::BigInt objects may be used for the range.

For some uses an iterator ("prime_iterator", "prime_iterator_object") or a tied array (Math::Prime::Util::PrimeArray) may be more convenient. Objects can be passed to functions, and allow early loop exits.

forcomposites

forcomposites { say } 1000;
forcomposites { say } 2000,2020;

Given a block and either an end number or a start and end pair, calls the block for each composite in the inclusive range. The composites, OEIS A002808, are the numbers greater than 1 which are not prime: 4, 6, 8, 9, 10, 12, 14, 15, ....

foroddcomposites

Similar to "forcomposites", but skipping all even numbers. The odd composites, OEIS A071904, are the numbers greater than 1 which are not prime and not divisible by two: 9, 15, 21, 25, 27, 33, 35, ....

forsemiprimes

Similar to "forcomposites", but only giving composites with exactly two factors. The semiprimes, OEIS A001358, are the products of two primes: 4, 6, 9, 10, 14, 15, 21, 22, 25, ....

This is essentially equivalent to:

forcomposites { if (is_semiprime($_)) { ... } }

foralmostprimes

foralmostprimes  { say }  3,  1000,2000;  # 3-almost-primes in [1000,2000]

Similar to "forprimes", "forsemiprimes", etc. but takes an additional first argument k and loops through the inclusive range for only those numbers with exactly k factors. If k=1 these are the primes, if k=2 these are the semiprimes, if k=3 these are the integers in the range with exactly 3 prime factors, etc.

This is functionally equivalent to:

for ($a .. $b) { if (is_almost_prime($k,$_)) { ... } }
# or
for ($a .. $b) { if (prime_bigomega($_) == $k) { ... } }

though it is significantly faster and avoids issues with large loop variables.

forfactored

forfactored { say "$_: @_"; } 100;

Given a block and either an end number or start/end pair, calls the block for each number in the inclusive range. $_ is set to the number while @_ holds the factors. Especially for small inputs or large ranges, this can be faster than calling "factor" on each sequential value.

As with the arrays passed to the block by functions such as "forpart", the values in @_ are read-only. Any attempt to modify them will result in undefined behavior.

This corresponds to the Pari/GP 2.10 forfactored function.

forsquarefree

Similar to "forfactored", but skipping numbers in the range that have a repeated factor. Inside the block, the moebius function can be cheaply computed as ((scalar(@_) & 1) ? -1 : 1) or similar.

This corresponds to the Pari/GP 2.10 forsquarefree function.
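The parity trick mentioned above can be checked against "moebius" directly. This is a minimal sketch over the range 2 to 100 (an arbitrary choice for the example): a square-free number with an even number of prime factors has Moebius value 1, and one with an odd number has -1.

```perl
use strict;
use warnings;
use Math::Prime::Util qw(forsquarefree moebius);

my $checked    = 0;
my $mismatches = 0;
forsquarefree {
    # @_ holds the prime factors of the square-free number in $_,
    # so only the parity of the factor count is needed.
    my $mu = (scalar(@_) & 1) ? -1 : 1;
    $checked++;
    $mismatches++ if $mu != moebius($_);
} 2, 100;

print "checked $checked square-free integers, $mismatches mismatches\n";
```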

forsquarefreeint

Similar to "forsquarefree", but only sieves for square-free integers in the range (in segments so very large ranges still use little memory). No factoring information is returned: the @_ variable is not set. In return it is 2 to 20 times faster.

As with range functions such as "foralmostprimes" this can be much faster than calling "is_square_free" for each integer in a large range.

fordivisors

fordivisors { $prod *= $_ } $n;

Given a block and a non-negative number n, the block is called with $_ set to each divisor in sorted order. Also see "divisor_sum".

forpart

forpart { say "@_" } 25;           # unrestricted partitions
forpart { say "@_" } 25,{n=>5};    # ... with exactly 5 values
forpart { say "@_" } 25,{nmax=>5}; # ... with <=5 values

Given a non-negative number n, the block is called with @_ set to the array of additive integer partitions. The operation is very similar to the forpart function in Pari/GP 2.6.x, though the ordering is different. The ordering is lexicographic. Use "partitions" to get just the count of unrestricted partitions.

An optional hash reference may be given to produce restricted partitions. Each value must be a non-negative integer. The allowable keys are:

n       restrict to exactly this many values
amin    all elements must be at least this value
amax    all elements must be at most this value
nmin    the array must have at least this many values
nmax    the array must have at most this many values
prime   all elements must be prime (non-zero) or non-prime (zero)

Like forcomb and forperm, the partition return values are read-only. Any attempt to modify them will result in undefined behavior.
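To illustrate the restriction keys, here is a small sketch that counts partitions of 10 (an arbitrary value for the example) with and without options, and compares the prime option with filtering unrestricted partitions by hand.

```perl
use strict;
use warnings;
use Math::Prime::Util qw(forpart partitions is_prime);

my ($all, $three_parts, $prime_parts, $filtered) = (0, 0, 0, 0);

forpart { $all++ }         10;                  # unrestricted
forpart { $three_parts++ } 10, { n => 3 };      # exactly 3 values
forpart { $prime_parts++ } 10, { prime => 1 };  # all values prime

# The same prime restriction done by filtering every partition.
forpart { $filtered++ unless grep { !is_prime($_) } @_ } 10;

print "all=$all three=$three_parts prime=$prime_parts filtered=$filtered\n";
```

Letting the iterator apply the restriction avoids generating partitions that would only be thrown away, which matters as n grows.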

forcomp

Similar to "forpart", but iterates over integer compositions rather than partitions. This can be thought of as all orderings of partitions, or alternately partitions may be viewed as an ordered subset of compositions. The ordering is lexicographic. All options from "forpart" may be used.

The number of unrestricted compositions of n is 2^(n-1).
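The count above is easy to confirm for a small case. This sketch uses n = 6, chosen only for illustration, and compares the number of compositions with the number of partitions.

```perl
use strict;
use warnings;
use Math::Prime::Util qw(forcomp partitions);

my $n = 6;
my $compositions = 0;
forcomp { $compositions++ } $n;       # one call per composition of $n

my $expected = 2 ** ($n - 1);          # closed form for the count
my $parts    = partitions($n);         # order-insensitive count
print "compositions=$compositions expected=$expected partitions=$parts\n";
```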

forcomb

Given non-negative arguments n and k, the block is called with @_ set to the k element array of values from 0 to n-1 representing the combinations in lexicographical order. While the "binomial" function gives the total number, this function can be used to enumerate the choices.

Rather than give a data array as input, an integer is used for n. A convenient way to map to array elements is:

forcomb { say "@data[@_]" } @data, 3;

where the block maps the combination array @_ to array values, the argument for n is given the array since it will be evaluated as a scalar and hence give the size, and the argument for k is the desired size of the combinations.

Like forpart and forperm, the index return values are read-only. Any attempt to modify them will result in undefined behavior.

If the second argument k is not supplied, then all k-subsets are returned starting with the smallest set k=0 and continuing to k=n. Each k-subset is in lexicographical order. This is the power set of n.

This corresponds to the Pari/GP 2.10 forsubset function.
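A brief sketch of the index mapping described above, with a made-up data array. It collects every 3-element combination and checks the number of calls against "binomial".

```perl
use strict;
use warnings;
use Math::Prime::Util qw(forcomb binomial);

my @data = qw(apple bread curry donut eagle);
my @subsets;

# @data is evaluated as a scalar (its size, 5) for the argument n;
# @_ holds the chosen indices, which the slice maps back to values.
forcomb { push @subsets, "@data[@_]" } @data, 3;

print "$_\n" for @subsets;
print scalar(@subsets), " of ", binomial(scalar(@data), 3), " expected\n";
```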

forperm

Given non-negative argument n, the block is called with @_ set to the n element array of values from 0 to n-1 representing permutations in lexicographical order. The total number of calls will be n!.

Rather than give a data array as input, an integer is used for n. A convenient way to map to array elements is:

forperm { say "@data[@_]" } @data;

where the block maps the permutation array @_ to array values, and the argument for n is given the array since it will be evaluated as a scalar and hence give the size.

Like forpart and forcomb, the index return values are read-only. Any attempt to modify them will result in undefined behavior.

forderange

Similar to forperm, but iterates over derangements. This is the set of permutations excluding any that map an element to its original position.
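As a sketch of the definition, the following compares forderange with filtering the output of forperm by hand for n = 4 (an arbitrary small value). A permutation is kept only if no index i holds the value i.

```perl
use strict;
use warnings;
use Math::Prime::Util qw(forperm forderange);

my $n = 4;
my ($perms, $filtered, $deranges) = (0, 0, 0);

forperm {
    $perms++;
    # Keep the permutation only if it has no fixed point.
    $filtered++ unless grep { $_[$_] == $_ } 0 .. $#_;
} $n;

forderange { $deranges++ } $n;

print "perms=$perms filtered=$filtered deranges=$deranges\n";
```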

formultiperm

# Show all anagrams of 'serpent':
formultiperm { say join("",@_) } [split(//,"serpent")];

Similar to "forperm" but takes an array reference as an argument. This is treated as a multiset, and the block will be called with each multiset permutation. While the standard permutation iterator takes a scalar and returns index permutations, this takes the set itself.

If all values are unique, then the results will be the same as a standard permutation. Otherwise, the results will be similar to a standard permutation removing duplicate entries. While generating all permutations and filtering out duplicates works, it is very slow for large sets. This iterator will be much more efficient.

There is no ordering requirement for the input array reference. The results will be in lexicographic order.

forsetproduct

forsetproduct { say "@_" } [1,2,3],[qw/a b c/],[qw/@ $ !/];

Takes zero or more array references as arguments and iterates over the set product (i.e. Cartesian product or cross product) of the lists. The given subroutine is repeatedly called with @_ set to the current list. Since no de-duplication is done, this is not literally a set product.

While zero or one array references are valid, the result is not very interesting. If any array reference is empty, the product is empty, so no subroutine calls are performed.

The subroutine is given an array whose values are aliased to the inputs, and are not set to read-only. Hence modifying the array inside the subroutine will cause side-effects.

As with other iterators, the lastfor function will cause an early exit.

lastfor

forprimes { lastfor,return if $_ > 1000; $sum += $_; } 1e9;

Calling lastfor requests that the current for... loop stop after this call. Ideally this would act exactly like a last inside a loop, but technical reasons mean it does not exit the block early, hence one typically adds a return if needed.

prime_iterator

my $it = prime_iterator;
$sum += $it->() for 1..100000;

Returns a closure-style iterator. The start value defaults to the first prime (2) but an initial value may be given as an argument, which will result in the first value returned being the next prime greater than or equal to the argument. For example, this:

my $it = prime_iterator(200);  say $it->();  say $it->();

will return 211 followed by 223, as those are the next primes >= 200. On each call, the iterator returns the current value and increments to the next prime.

Other options include "forprimes" (more efficiency, less flexibility), Math::Prime::Util::PrimeIterator (an iterator with more functionality), or Math::Prime::Util::PrimeArray (a tied array).

prime_iterator_object

my $it = prime_iterator_object;
while ($it->value < 100) { say $it->value; $it->next; }
$sum += $it->iterate for 1..100000;

Returns a Math::Prime::Util::PrimeIterator object. This is a shortcut that loads the package if needed, calls new, and returns the object. See the documentation for that package for details. This object has more features than the simple one above (e.g. the iterator is bi-directional), and also handles iterating across bigints.

prime_count

my $primepi = prime_count( 1_000 );
my $pirange = prime_count( 1_000, 10_000 );

Returns the Prime Count function Pi(n), also called primepi in some math packages. When given two arguments, it returns the count of primes in the inclusive range. E.g. (13,17) returns 2, (14,17) and (13,16) return 1, (14,16) returns 0.
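The range examples above can be reproduced directly. This small sketch evaluates the four ranges from the text plus one single-argument call.

```perl
use strict;
use warnings;
use Math::Prime::Util qw(prime_count);

# Both endpoints are included when they are prime.
my @ranges = ([13, 17], [14, 17], [13, 16], [14, 16]);
my @counts = map { prime_count($_->[0], $_->[1]) } @ranges;

print "($ranges[$_][0],$ranges[$_][1]) -> $counts[$_]\n" for 0 .. $#ranges;

my $pi_1000 = prime_count(1000);   # single argument: count from 2 to n
print "Pi(1000) = $pi_1000\n";
```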

The current implementation decides based on the ranges whether to use a segmented sieve with a fast bit count, or the extended LMO algorithm. The former is preferred for small sizes as well as small ranges. The latter is much faster for large ranges.

The segmented sieve is very memory efficient and is quite fast even with large base values. Its complexity is approximately O(sqrt(a) + (b-a)), where the first term is typically negligible below ~ 10^11. Memory use is proportional only to sqrt(a), with total memory use under 1MB for any base under 10^14.

The extended LMO method has complexity approximately O(b^(2/3)) + O(a^(2/3)), and also uses low memory. A calculation of Pi(10^14) completes in a few seconds, Pi(10^15) in well under a minute, and Pi(10^16) in about one minute. In contrast, even parallel primesieve would take over a week on a similar machine to determine Pi(10^16).

Also see the function "prime_count_approx" which gives a very good approximation to the prime count, and "prime_count_lower" and "prime_count_upper" which give tight bounds to the actual prime count. These functions return quickly for any input, including bigints.

prime_count_upper

Returns a proven upper bound on the number of primes up to n. See "prime_count_lower" for details common to both functions.

prime_count_lower

my $lower_limit = prime_count_lower($n);
my $upper_limit = prime_count_upper($n);
#   $lower_limit  <=  prime_count(n)  <=  $upper_limit

Returns a proven lower bound on the number of primes up to n. These are analytical routines, so will take a fixed amount of time and no memory. The actual prime_count will always be equal to or between these numbers.

A common place these would be used is sizing an array to hold the primes up to $n. It may be desirable to use a bit more memory than is necessary, to avoid calling prime_count.

These routines use verified tight limits for inputs below a threshold that is at least 2^35. For larger inputs various methods are used including Dusart (2010), Büthe (2014,2015), and Axler (2014). These bounds do not assume the Riemann Hypothesis. If the configuration option assume_rh has been set (it is off by default), then the Schoenfeld (1976) bounds can be used for very large values.
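One way the bounds might be used for sizing, sketched under the assumption that the goal is a buffer holding every prime up to n: pre-extend the array to the upper bound, fill it, then trim. The value of n and the variable names are chosen only for this example.

```perl
use strict;
use warnings;
use Math::Prime::Util qw(forprimes prime_count_lower prime_count_upper);

my $n     = 1_000_000;
my $lower = prime_count_lower($n);
my $upper = prime_count_upper($n);

# Pre-extend to the upper bound so no exact count is needed up front.
my @buffer;
$#buffer = $upper - 1;

my $i = 0;
forprimes { $buffer[$i++] = $_ } $n;
$#buffer = $i - 1;                      # trim the unused slack

printf "%d <= %d <= %d\n", $lower, scalar(@buffer), $upper;
```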

prime_count_approx

print "there are about ",
      prime_count_approx( 10 ** 18 ),
      " primes below one quintillion.\n";

Returns an approximation to the prime_count function, without having to generate any primes. For values under 10^36 this uses the Riemann R function, which is quite accurate: an error of less than 0.0005% is typical for input values over 2^32, and decreases as the input gets larger.

A slightly faster but much less accurate answer can be obtained by averaging the upper and lower bounds.

is_prime_power

Given an integer n, returns k if n = p^k for some prime p, and zero otherwise.

If a second argument is present, it must be a scalar reference. If the return value is non-zero, then it will be set to p.

This corresponds to Pari/GP's isprimepower function. It is related to Mathematica's PrimePowerQ[n] function. These all return zero/false for n=1.

This is the OEIS series A246655.
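A short example of the return value and the optional scalar reference, with inputs picked for illustration: 243 = 3^5 is a prime power, 7 is a prime (exponent 1), and 12 and 1 are not prime powers.

```perl
use strict;
use warnings;
use Math::Prime::Util qw(is_prime_power);

my $p;
my $k   = is_prime_power(243, \$p);   # 243 = 3^5, so $p is set to the base
my $k7  = is_prime_power(7);          # a prime is p^1
my $k12 = is_prime_power(12);         # 12 = 2^2 * 3 is not a prime power
my $k1  = is_prime_power(1);          # 1 is not counted as a prime power

print "243 = $p^$k\n";
print "7 -> $k7, 12 -> $k12, 1 -> $k1\n";
```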

prime_powers

my $aref = prime_powers( 10**4 );

Given either two non-negative limits lo, hi, or one non-negative limit hi, returns an array reference with all prime powers between the limits (inclusive). With only one input, the lower limit is 2.

The array reference values will be all p^e where lo <= p^e <= hi with p prime and e >= 1. Hence this includes the primes as well as higher powers of primes.

See also "primes" and "prime_power_count".

next_prime_power

Given an integer n, returns the smallest prime power greater than |n|. Similar to "next_prime", but also includes powers of primes.

prev_prime_power

Given an integer n, returns the greatest prime power less than |n|. Similar to "prev_prime", but also includes powers of primes. If given |n| less than 3, undef will be returned.

prime_power_count

Given a single non-negative integer n, returns the count of prime powers less than or equal to n. If given two non-negative integers lo and hi, returns the count of prime powers between lo and hi inclusive.

These are prime powers with exponent greater than 0. I.e. the prime powers not including 1. This is OEIS series A025528.

prime_power_count_approx

Given a non-negative integer n, quickly returns a good estimate of the count of prime powers less than or equal to n.

prime_power_count_lower

Given a non-negative integer n, quickly returns a lower bound of the count of prime powers less than or equal to n. The actual count will always be greater than or equal to the result.

prime_power_count_upper

Given a non-negative integer n, quickly returns an upper bound of the count of prime powers less than or equal to n. The actual count will always be less than or equal to the result.

nth_prime_power

Given a non-negative integer n, returns the n-th prime power.

nth_prime_power_approx

Given a non-negative integer n, quickly returns a good estimate of the n-th prime power.

nth_prime_power_lower

Given a non-negative integer n, quickly returns a lower bound of the n-th prime power. The actual value will always be greater than or equal to the result.

nth_prime_power_upper

Given a non-negative integer n, quickly returns an upper bound of the n-th prime power. The actual value will always be less than or equal to the result.

twin_primes

Returns the lesser of twin primes between the lower and upper limits (inclusive), with a lower limit of 2 if none is given. This is OEIS A001359. Given a twin prime pair (p,q) with q = p + 2, p prime, and q prime, this function uses p to represent the pair. Hence the bounds need to include p, and the returned list will have p but not q.

This works just like the "primes" function, though only the first primes of twin prime pairs are returned. Like that function, an array reference is returned.

twin_prime_count

Similar to prime count, but returns the count of twin primes (primes p where p+2 is also prime). Takes either a single number indicating a count from 2 to the argument, or two numbers indicating a range.

The primes being counted are the first value, so a range of (3,5) will return a count of two, because both 3 and 5 are counted as twin primes. A range of (12,13) will return a count of zero, because neither 12+2 nor 13+2 is prime. In contrast, primesieve requires all elements of a constellation to be within the range to be counted, so would return one for the first example (5 is not counted because its pair 7 is not in the range).
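The counting convention above can be seen in a few calls. This sketch checks the two ranges from the text and lists the lesser twin primes up to 100 with "twin_primes".

```perl
use strict;
use warnings;
use Math::Prime::Util qw(twin_primes twin_prime_count);

my $count_3_5   = twin_prime_count(3, 5);     # 3 and 5 both start a pair
my $count_12_13 = twin_prime_count(12, 13);   # no pair starts in this range

# Only the lesser member p of each pair (p, p+2) is listed, so 71
# appears because 73 is prime even though 73 itself is not shown.
my $list      = twin_primes(100);
my $count_100 = twin_prime_count(100);

print "(3,5) -> $count_3_5, (12,13) -> $count_12_13\n";
print "lesser twin primes to 100: @$list\n";
```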

There is no useful formula known for this, unlike prime counts. We sieve for the answer, using some small table acceleration.

twin_prime_count_approx

Returns an approximation to the twin prime count of n. This returns quickly and has a very small error for large values. The method used is conjecture B of Hardy and Littlewood 1922, as stated in Sebah and Gourdon 2002. For inputs under 10M, a correction factor is additionally applied to reduce the mean squared error.

semi_primes

Returns an array reference to semiprimes between the lower and upper limits (inclusive), with a lower limit of 4 if none is given. This is OEIS A001358. The semiprimes are composite integers which are products of exactly two primes.

This works just like the "primes" function. Like that function, an array reference is returned.

semiprime_count

Similar to prime count, but returns the count of semiprimes (composites with exactly two prime factors, counted with multiplicity). Takes either a single number indicating a count from 2 to the argument, or two numbers indicating a range.

A fast method that requires computation only to the square root of the range end is used, unless the range is so small that walking it is faster.

semiprime_count_approx

Returns an approximation to the semiprime count of n. This returns quickly and is square root accurate for native size inputs.

The series of Crisan and Erban (2020) is used with a maximum of 19 terms. Truncation is performed at empirically good crossovers. Clamping is done as needed at crossovers to ensure monotonic results.

almost_primes

my $ref_to_3_almost_primes = almost_primes(3, 1000, 2000);

This works just like the "primes" function. Like that function, an array reference is returned.

With k=1 these are the primes (OEIS A000040). With k=2 these are the semiprimes (OEIS A001358). With k=3 these are the 3-almost-primes (OEIS A014612). With k=4 these are the 4-almost-primes (OEIS A014613). OEIS sequences can be found through k=20.

almost_prime_count

say almost_prime_count(3,10000); # number of 3-almost-primes <= 10000

Given non-negative integers k and n, returns the count of k-almost-prime numbers up to and including n. With k=1 this is the standard prime count. With k=2 this is the semiprime count. In general, this is the count of all integers through n that have exactly k prime factors.

The implementation uses nested prime count sums, and caching along with LMO prime counts to get quite reasonable speeds.

almost_prime_count_approx

A fast approximation of the k-almost-prime count of n.

The current implementation for n greater than 64-bit is not well tested.

almost_prime_count_lower

Quickly returns a lower bound for the k-almost-prime count of n. The actual count will be greater than or equal to this result.

The current implementation for n greater than 64-bit is not well tested.

almost_prime_count_upper

Quickly returns an upper bound for the k-almost-prime count of n. The actual count will be less than or equal to this result.

The current implementation for n greater than 64-bit is not well tested.

omega_primes

Takes a non-negative integer argument k and either one or two additional non-negative integer arguments indicating the upper limit or lower and upper limits. The limits are inclusive. The k-omega-primes are positive integers which have exactly k distinct prime factors, with possible multiplicity. Hence these numbers are divisible by exactly k different primes.

The k-omega-primes (not a common term) are exactly those integers where prime_omega(n) == k. Compare to k-almost-primes where prime_bigomega(n) == k.

With k=1 these are the prime powers. With k=2 these are OEIS A007774. With k=3 these are OEIS A033992.
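The difference between counting distinct prime factors and counting with multiplicity can be seen in a small trial-division sketch. The function naive_omegas is a hypothetical helper for illustration only; it is not how the module computes prime_omega or prime_bigomega.

```perl
use strict;
use warnings;

# Return (number of distinct prime factors, number of prime factors counted
# with multiplicity) for n >= 1, using trial division.
sub naive_omegas {
    my ($n) = @_;
    my ($distinct, $total) = (0, 0);
    for (my $p = 2; $p * $p <= $n; $p++) {
        next if $n % $p;
        $distinct++;
        while ($n % $p == 0) { $n /= $p; $total++; }
    }
    if ($n > 1) { $distinct++; $total++; }   # leftover prime factor
    return ($distinct, $total);
}

my @two_omega  = grep { (naive_omegas($_))[0] == 2 } 1 .. 30;
my @two_almost = grep { (naive_omegas($_))[1] == 2 } 1 .. 30;
print "2-omega-primes:  @two_omega\n";   # 6 10 12 14 15 18 20 21 22 24 26 28
print "2-almost-primes: @two_almost\n";  # 4 6 9 10 14 15 21 22 25 26
```

For example, 12 = 2*2*3 is a 2-omega-prime (two distinct primes) but a 3-almost-prime (three prime factors with multiplicity).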

omega_prime_count

Given non-negative integers k and n, returns the count of k-omega-prime numbers from 1 up to and including n. This is the count of all positive integers through n that are divisible by exactly k different primes.

The implementation uses nested loops over prime powers.

Though we have defined prime_omega(0) = 1, it is not included.

ramanujan_primes

Returns the Ramanujan primes R_n between the lower and upper limits (inclusive), with a lower limit of 2 if none is given. This is OEIS A104272. R_n is the smallest integer such that "prime_count"(x) - "prime_count"(x/2) >= n for all x >= R_n.

This has a similar API to the "primes" and "twin_primes" functions, and like them, returns an array reference.

Generating Ramanujan primes takes some effort, including overhead to cover a range. This will be substantially slower than generating standard primes.

ramanujan_prime_count

Similar to prime count, but returns the count of Ramanujan primes. Takes either a single number indicating a count from 2 to the argument, or two numbers indicating a range.

While not nearly as efficient as "prime_count", this does use a number of speedups that result in it being much more efficient than generating all the Ramanujan primes.

ramanujan_prime_count_approx

A fast approximation of the count of Ramanujan primes under n.

ramanujan_prime_count_lower

A fast lower limit on the count of Ramanujan primes under n.

ramanujan_prime_count_upper

A fast upper limit on the count of Ramanujan primes under n.

sieve_range

my @candidates = sieve_range(2**1000, 10000, 40000);

Given a start value n, and native unsigned integers width and depth, a sieve of maximum depth depth is done for the width consecutive numbers beginning with n. An array of offsets from the start is returned.

The returned list contains those offsets in the range n to n+width-1 where n + offset has no prime factors smaller than itself and less than or equal to depth. Hence a depth of 2 will remove all even numbers (other than 2 itself if it is in the range). A depth of 3 will remove all numbers divisible by 2 or 3 other than those primes themselves.
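The meaning of the returned offsets can be sketched with a naive loop. The function naive_sieve_range is a hypothetical stand-in written for this illustration; it assumes n >= 2 and small native integers, and does no actual sieving.

```perl
use strict;
use warnings;

# Offsets k in 0 .. width-1 such that n+k has no divisor d with
# 2 <= d <= depth and d < n+k.  Testing every d (not just primes) gives the
# same answer, because any composite divisor has a smaller prime divisor.
sub naive_sieve_range {
    my ($n, $width, $depth) = @_;
    my @offsets;
    OFFSET: for my $k (0 .. $width - 1) {
        my $v = $n + $k;
        for my $d (2 .. $depth) {
            last if $d >= $v;              # a value is never removed by itself
            next OFFSET if $v % $d == 0;
        }
        push @offsets, $k;
    }
    return @offsets;
}

print join(",", naive_sieve_range(10, 10, 3)), "\n";  # 1,3,7,9 (11,13,17,19)
print join(",", naive_sieve_range(2, 6, 2)), "\n";    # 0,1,3,5 (2,3,5,7)
```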

sieve_prime_cluster

my @s = sieve_prime_cluster(1, 1e9, 2,6,8,12,18,20);

Efficiently finds prime clusters between the first two arguments low and high. The remaining arguments describe the cluster. The cluster values must be even, less than 31 bits, and strictly increasing. Given a cluster set C, the returned values are all primes in the range where p+c is prime for each c in the cluster set C. For returned values under 2^64, all cluster values are definitely prime. Above this range, all cluster values are BPSW probable primes (no counterexamples known).

This function returns an array rather than an array reference. Typically the number of returned values is much lower than for other primes functions, so this uses the more convenient array return. This function has an identical signature to the function of the same name in Math::Prime::Util::GMP.

The cluster is described as offsets from 0, with the implicit prime at 0. Hence an empty list is asking for all primes (the cluster p+0). A list with the single value 2 will find all twin primes (the cluster where p+0 and p+2 are prime). The list 2,6,8 will find prime quadruplets. Note that there is no requirement that the list denote a constellation (a cluster with minimal distance) -- the list 42,92,606 is just fine.
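The cluster semantics can be checked against a brute-force sketch. Both is_small_prime and naive_prime_cluster are hypothetical helpers for illustration; they use trial division and are only suitable for tiny ranges.

```perl
use strict;
use warnings;

# Trial-division primality check for small inputs.
sub is_small_prime {
    my ($n) = @_;
    return 0 if $n < 2;
    for (my $d = 2; $d * $d <= $n; $d++) {
        return 0 if $n % $d == 0;
    }
    return 1;
}

# Return every prime p in [low, high] such that p+c is prime for each
# offset c in the cluster list.  The offset 0 is implicit.
sub naive_prime_cluster {
    my ($low, $high, @cluster) = @_;
    my @found;
    for my $p ($low .. $high) {
        next unless is_small_prime($p);
        next if grep { !is_small_prime($p + $_) } @cluster;
        push @found, $p;
    }
    return @found;
}

print join(" ", naive_prime_cluster(1, 100, 2, 6, 8)), "\n";  # 5 11
print join(" ", naive_prime_cluster(1, 20, 2)), "\n";         # 3 5 11 17
print join(" ", naive_prime_cluster(1, 20)), "\n";            # 2 3 5 7 11 13 17 19
```

As with the twin prime functions, only the first member p has to lie inside the range; the values p+c may exceed the upper limit.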

sum_primes

Returns the summation of primes between the lower and upper limits (inclusive), with a lower limit of 2 if none is given. This is essentially similar to either of:

$sum = 0; forprimes { $sum += $_ } $low,$high;  $sum;
# or
vecsum( @{ primes($low,$high) } );

but is much more efficient.

The current implementation is a mix of small-table-enhanced sieve count for sums that fit in a UV, an efficient sieve count for small ranges, and a Legendre sum method, including XS support for 128-bit results.

While this is fairly efficient, the state of the art is Kim Walisch's primesum. It is recommended for very large values, as it can be hundreds of times faster.

print_primes

print_primes(1_000_000);              # print the primes up to 1 million
print_primes(1000, 2000);             # print primes in range
print_primes(2,1000,fileno(STDERR));  # print to a different descriptor

With a single argument this prints all primes from 2 to n to standard output. With two arguments it prints primes between low and high to standard output. With three arguments it prints primes between low and high to the file descriptor given. If the file descriptor cannot be written to, this will croak with "print_primes write error". It will produce identical output to:

forprimes { say } $low,$high;

The point of this function is just efficiency. It is over 10x faster than using say, print, or printf, though much more limited in functionality. A later version may allow a file handle as the third argument.

nth_prime

say "The ten thousandth prime is ", nth_prime(10_000);

Returns the prime that lies in index n in the array of prime numbers. Put another way, this returns the smallest p such that Pi(p) >= n.

Like most programs with similar functionality, this is one-based. nth_prime(0) returns undef, nth_prime(1) returns 2.

For relatively small inputs (below 1 million or so), this does a sieve over a range containing the nth prime, then counts up to the number. This is fairly efficient in time and memory. For larger values, it creates a low-biased estimate using the inverse logarithmic integral, runs a fast prime count on that estimate, then sieves the small remaining difference.

While this method is thousands of times faster than generating primes, and doesn't involve big tables of precomputed values, it still can take a fair amount of time for large inputs. Calculating the 10^12th prime takes about 1 second, the 10^13th prime takes under 10 seconds, and the 10^14th prime (3475385758524527) takes under 30 seconds. Think about whether a bound or approximation would be acceptable, as they can be computed analytically.

If the result is larger than a native integer size (32-bit or 64-bit), the result will take a very long time. A later version of Math::Prime::Util::GMP may include this functionality which would help for 32-bit machines.
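The one-based indexing and the "smallest p such that Pi(p) >= n" description can be illustrated with a counting loop. The helpers is_small_prime and naive_nth_prime are hypothetical names; this is a slow sketch of the definition, not the estimate-count-sieve method the module uses.

```perl
use strict;
use warnings;

# Trial-division primality check for small inputs.
sub is_small_prime {
    my ($n) = @_;
    return 0 if $n < 2;
    for (my $d = 2; $d * $d <= $n; $d++) {
        return 0 if $n % $d == 0;
    }
    return 1;
}

# Walk upward, counting primes, until the running prime count reaches n.
# The stopping value is the smallest p whose prime count is >= n.
sub naive_nth_prime {
    my ($n) = @_;
    return undef if $n < 1;            # index 0 has no prime
    my ($count, $candidate) = (0, 1);
    while ($count < $n) {
        $candidate++;
        $count++ if is_small_prime($candidate);
    }
    return $candidate;
}

print naive_nth_prime(1), "\n";     # 2
print naive_nth_prime(10), "\n";    # 29
print naive_nth_prime(100), "\n";   # 541
```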

nth_prime_upper

Returns a proven upper bound on the Nth prime. See "nth_prime_lower" for details common to both functions.

nth_prime_lower

my $lower_limit = nth_prime_lower($n);
my $upper_limit = nth_prime_upper($n);
# For all $n:   $lower_limit  <=  nth_prime($n)  <=  $upper_limit

Returns a proven lower bound on the Nth prime. No sieving is done, so these are fast even for large inputs.

For tiny values of n, exact answers are returned. For small inputs, an inverse of the opposite prime count bound is used. For larger values, the Dusart (2010) and Axler (2013) bounds are used.

nth_prime_approx

say "The one trillionth prime is ~ ", nth_prime_approx(10**12);

Returns an approximation to the nth_prime function, without having to generate any primes. For values where the nth prime is smaller than 2^64, the inverse Riemann R function is used. For larger values, the inverse logarithmic integral is used.

The value returned will not necessarily be prime. This applies to all the following nth prime approximations, where the returned value is close to the real value, but no effort is made to coerce the result to the nearest set element.

nth_twin_prime

Returns the Nth twin prime. This is done via sieving and counting, so is not very fast for large values.

nth_twin_prime_approx

Returns an approximation to the Nth twin prime. A curve fit is used for small inputs (under 1200), while for larger inputs a binary search is done on the approximate twin prime count.

nth_semiprime

Returns the Nth semiprime, similar to where a forsemiprimes loop would end after N iterations, but much more efficiently.

nth_semiprime_approx

Returns an approximation to the Nth semiprime. The approximation is orders of magnitude better than the simple n log n / log log n approximation for large n. E.g. for n=10^12 the simple estimate is within 3.6%, but this function is within 0.000012%.

nth_almost_prime

say "500th number with exactly 3 factors: ", nth_almost_prime(3,500);

A k-almost prime is a product of k prime numbers, counted with multiplicity. That is, there are exactly k prime factors (which do not have to be distinct from each other).

Given non-negative integers k and n, returns the n-th k-almost prime. With k=1 this is the nth prime. With k=2 this is the nth semiprime.

The implementation does a binary search lookup with "almost_prime_count" so is reasonably efficient for large values.

undef is returned for n == 0 and for all k == 0 other than n == 1.

nth_almost_prime_approx

A fast approximation of the n-th k-almost prime.

nth_almost_prime_lower

Quickly returns a lower bound for the n-th k-almost prime. The actual nth k-almost-prime will be greater than or equal to this result.

nth_almost_prime_upper

Quickly returns an upper bound for the n-th k-almost prime. The actual nth k-almost-prime will be less than or equal to this result.

nth_omega_prime

Given non-negative integers k and n, returns the n-th k-omega prime. This is the n-th integer divisible by exactly k different primes.

The implementation does a binary search lookup with "omega_prime_count" so is reasonably efficient for large values.

undef is returned for n == 0 and for all k == 0 other than n == 1.

nth_ramanujan_prime

Returns the Nth Ramanujan prime. For reasonable size values of n, e.g. under 10^8 or so, this is relatively efficient for single calls. If multiple calls are being made, it will be much more efficient to get the list once.

nth_ramanujan_prime_approx

A fast approximation of the Nth Ramanujan prime.

nth_ramanujan_prime_lower

A fast lower limit on the Nth Ramanujan prime.

nth_ramanujan_prime_upper

A fast upper limit on the Nth Ramanujan prime.

is_pseudoprime

Given an integer n and zero or more positive bases, returns 1 if n is positive and a probable prime to each base, and returns 0 otherwise. This is the simple Fermat primality test. Removing primes, given base 2 this produces the sequence OEIS A001567.

If no bases are given, base 2 is used. All bases must be 2 or greater.

For practical use, "is_strong_pseudoprime" is a much stronger test with similar or better performance.

Note that there is a set of composites (the Carmichael numbers) that will pass this test for all bases coprime to n. This downside is not shared by the Euler and strong probable prime tests (also called the Solovay-Strassen and Miller-Rabin tests).
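A minimal sketch of the Fermat test, assuming native-size n > 2 and bases between 2 and n-1. The names powmod and naive_fermat_psp are hypothetical and not part of the module, and the products here are only safe from overflow for n below 2^32.

```perl
use strict;
use warnings;

# Modular exponentiation by repeated squaring.
sub powmod {
    my ($base, $exp, $mod) = @_;
    my $result = 1 % $mod;
    $base %= $mod;
    while ($exp > 0) {
        $result = ($result * $base) % $mod if $exp & 1;
        $base = ($base * $base) % $mod;
        $exp >>= 1;
    }
    return $result;
}

# Returns 1 if base^(n-1) == 1 mod n for every base given, else 0.
# Base 2 is used when no bases are supplied.
sub naive_fermat_psp {
    my ($n, @bases) = @_;
    @bases = (2) unless @bases;
    return 0 if $n < 2;
    for my $base (@bases) {
        return 0 unless powmod($base, $n - 1, $n) == 1;
    }
    return 1;
}

print naive_fermat_psp(341), "\n";           # 1: 341 = 11*31 passes base 2
print naive_fermat_psp(341, 3), "\n";        # 0: base 3 exposes it
print naive_fermat_psp(561, 2, 5, 7), "\n";  # 1: Carmichael number
```

The last line shows why adding more bases does not help against Carmichael numbers: 561 = 3*11*17 passes for every base that shares no factor with it.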

is_euler_pseudoprime

Given an integer n and zero or more positive bases, returns 1 if n is positive and an Euler probable prime to each base, and returns 0 otherwise. This is the Euler test, sometimes called the Euler-Jacobi test. Removing primes, given base 2 this produces the sequence OEIS A047713.

If no bases are given, base 2 is used. All bases must be 2 or greater.

If 0 is returned, then the number really is a composite (for bases less than n). If 1 is returned, then it is either a prime or an Euler pseudoprime to all the given bases. Given enough distinct bases, the chances become very high that the number is actually prime.

This test forms the basis of the Solovay-Strassen test, which is a precursor to the Miller-Rabin test (which uses the strong probable prime test). There are no analogues of the Carmichael numbers for this test. For the Euler test, at most 1/2 of witnesses pass for a composite, while at most 1/4 pass for the strong pseudoprime test.
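The Euler test compares base^((n-1)/2) mod n against the Jacobi symbol (base|n). The following is a minimal sketch assuming odd native-size n > 2 (below 2^32) and bases between 2 and n-1; powmod, jacobi, and naive_euler_psp are hypothetical helpers, not the module's implementation.

```perl
use strict;
use warnings;

# Modular exponentiation by repeated squaring.
sub powmod {
    my ($base, $exp, $mod) = @_;
    my $result = 1 % $mod;
    $base %= $mod;
    while ($exp > 0) {
        $result = ($result * $base) % $mod if $exp & 1;
        $base = ($base * $base) % $mod;
        $exp >>= 1;
    }
    return $result;
}

# Jacobi symbol (k|m) for odd positive m, via quadratic reciprocity.
sub jacobi {
    my ($k, $m) = @_;
    my $sign = 1;
    $k %= $m;
    while ($k != 0) {
        while ($k % 2 == 0) {
            $k >>= 1;
            my $r = $m % 8;
            $sign = -$sign if $r == 3 || $r == 5;
        }
        ($k, $m) = ($m, $k);
        $sign = -$sign if $k % 4 == 3 && $m % 4 == 3;
        $k %= $m;
    }
    return $m == 1 ? $sign : 0;
}

# Returns 1 if base^((n-1)/2) == (base|n) mod n for every base, else 0.
sub naive_euler_psp {
    my ($n, @bases) = @_;
    @bases = (2) unless @bases;
    return 0 if $n < 3 || $n % 2 == 0;     # sketch covers odd n > 2 only
    for my $base (@bases) {
        my $j = jacobi($base, $n);
        return 0 if $j == 0;               # base shares a factor with n
        my $expected = $j == 1 ? 1 : $n - 1;
        return 0 unless powmod($base, ($n - 1) >> 1, $n) == $expected;
    }
    return 1;
}

print naive_euler_psp(341), "\n";  # 0: a base-2 Fermat pseudoprime, rejected here
print naive_euler_psp(561), "\n";  # 1: first term of OEIS A047713
```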

is_strong_pseudoprime

my $maybe_prime = is_strong_pseudoprime($n);
my $probably_prime = is_strong_pseudoprime($n, 2, 3, 5, 7, 11, 13, 17);

Given an integer n and zero or more positive bases, returns 1 if n is positive and a strong probable prime to each base, and returns 0 otherwise.

If no bases are given, base 2 is used. All bases must be 2 or greater.

If 0 is returned, then the number really is a composite (for any base). If 1 is returned, then it is either a prime or a strong pseudoprime to all the given bases. Given enough distinct bases, the chances become very high that the number is actually prime.

This is usually used in combination with other tests to make either stronger tests (e.g. the strong BPSW test) or deterministic results for numbers less than some verified limit (e.g. it has long been known that no more than three selected bases are required to give correct primality test results for any 32-bit number). Given the small chances of passing multiple bases, there are some math packages that just use multiple MR tests for primality testing.

Even inputs other than 2 will always return 0 (composite). While the algorithm does run with even input, most sources define it only on odd input. Returning composite for all non-2 even input makes the function match most other implementations including Math::Primality's is_strong_pseudoprime function.

Generally, bases of interest are between 2 and n-2. Bases 1 and n-1 will return 1 for any odd composites. Most sources do not define the test for bases equal to 0 mod n, and many do not for any bases larger than n. We allow all bases, noting that the case base = 0 mod n is defined as 1. This allows primes to return 1 regardless of the base.
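The conventions above (even inputs, default base, and bases that are 0 mod n) can be seen in a compact Miller-Rabin sketch. This assumes native-size n below 2^32 so that products do not overflow; powmod and naive_strong_psp are hypothetical helpers and not the module's optimized code.

```perl
use strict;
use warnings;

# Modular exponentiation by repeated squaring.
sub powmod {
    my ($base, $exp, $mod) = @_;
    my $result = 1 % $mod;
    $base %= $mod;
    while ($exp > 0) {
        $result = ($result * $base) % $mod if $exp & 1;
        $base = ($base * $base) % $mod;
        $exp >>= 1;
    }
    return $result;
}

# Returns 1 if n is a strong probable prime to every base given, else 0.
sub naive_strong_psp {
    my ($n, @bases) = @_;
    @bases = (2) unless @bases;
    return 0 if $n < 2;
    return 1 if $n == 2;
    return 0 if $n % 2 == 0;               # even inputs other than 2
    my ($d, $s) = ($n - 1, 0);             # write n-1 = d * 2^s with d odd
    while ($d % 2 == 0) { $d >>= 1; $s++; }
    BASE: for my $base (@bases) {
        my $x = $base % $n;
        next BASE if $x == 0;              # base = 0 mod n is defined as passing
        $x = powmod($x, $d, $n);
        next BASE if $x == 1 || $x == $n - 1;
        for (1 .. $s - 1) {
            $x = ($x * $x) % $n;
            next BASE if $x == $n - 1;
        }
        return 0;                          # this base proves n composite
    }
    return 1;
}

print naive_strong_psp(2047), "\n";        # 1: 2047 = 23*89 passes base 2
print naive_strong_psp(2047, 2, 3), "\n";  # 0: base 3 is a witness
print naive_strong_psp(561), "\n";         # 0: Carmichael numbers get no free pass
```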

is_lucas_pseudoprime

Given an integer n, returns 1 if n is positive and a standard Lucas probable prime using the Selfridge method of choosing D, P, and Q (some sources call this a Lucas-Selfridge pseudoprime). Removing primes, this produces the sequence OEIS A217120.

is_strong_lucas_pseudoprime

Given an integer n, returns 1 if n is positive and a strong Lucas probable prime using the Selfridge method of choosing D, P, and Q (some sources call this a strong Lucas-Selfridge pseudoprime). This is one half of the BPSW primality test (the Miller-Rabin strong pseudoprime test with base 2 being the other half). Removing primes, this produces the sequence OEIS A217255.

is_extra_strong_lucas_pseudoprime

Given an integer n, returns 1 if n is positive and an extra strong Lucas probable prime as defined in Grantham 2000. This test has more stringent conditions than the strong Lucas test, and produces about 60% fewer pseudoprimes. Performance is typically 20-30% faster than the strong Lucas test.

The parameters are selected using the Baillie-OEIS method: increment P from 3 until jacobi(D,n) = -1. Removing primes, this produces the sequence OEIS A217719.

is_almost_extra_strong_lucas_pseudoprime

This is similar to the "is_extra_strong_lucas_pseudoprime" function, but does not calculate U, so is a little faster, but also weaker. With the current implementations, there is little reason to prefer this unless trying to reproduce specific results. The extra-strong implementation has been optimized to use similar features, removing most of the performance advantage.

An optional second argument (an integer between 1 and 256) indicates the increment amount for P parameter selection. The default value of 1 yields the parameter selection described in "is_extra_strong_lucas_pseudoprime", creating a pseudoprime sequence which is a superset of the latter's pseudoprime sequence OEIS A217719. A value of 2 yields the method used by Pari.

Because the U = 0 condition is ignored, this produces about 5% more pseudoprimes than the extra-strong Lucas test. However this is still only 66% of the number produced by the strong Lucas-Selfridge test. No BPSW counterexamples have been found with any of the Lucas tests described.

is_euler_plumb_pseudoprime

Given an integer n, returns 1 if n is positive and passes Colin Plumb's Euler Criterion primality test, and returns 0 otherwise. Pseudoprimes to this test are a subset of the base 2 Fermat and Euler tests, but a superset of the base 2 strong pseudoprime (Miller-Rabin) test.

The main reason for this test is that it is slightly more efficient than other probable prime tests.

is_perrin_pseudoprime

Given an integer n, returns 1 if n is positive and n divides P(n) where P(n) is the Perrin number of n, and returns 0 otherwise. The Perrin sequence is defined by P(n) = P(n-2) + P(n-3) with P(0) = 3, P(1) = 0, P(2) = 2.

While pseudoprimes are relatively rare (the first two are 271441 and 904631), infinitely many exist. They have significant overlap with the base-2 pseudoprimes and strong pseudoprimes, making the test inferior to the Lucas or Frobenius tests for combined testing. The pseudoprime sequence is OEIS A013998.

The implementation uses modular pre-filters, Montgomery math, and the Adams/Shanks doubling method. This is significantly more efficient than other known implementations.

An optional second argument r indicates whether to run additional tests. With r=1, P(-n) = -1 mod n is also verified, creating the "minimal restricted" test. With r=2, the full signature is also tested using the Adams and Shanks (1982) rules (without the quadratic form test). With r=3, the full signature is tested using the Grantham (2000) test, which additionally does not allow pseudoprimes to be divisible by 2 or 23. The minimal restricted pseudoprime sequence is OEIS A018187.
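The basic (unrestricted) condition can be checked directly from the recurrence. The sketch below iterates the sequence modulo n, which takes n steps and is far slower than the doubling method the module uses; perrin_mod and naive_perrin_test are hypothetical names, and n >= 2 is assumed.

```perl
use strict;
use warnings;

# P(n) mod n using the recurrence P(k) = P(k-2) + P(k-3),
# starting from P(0) = 3, P(1) = 0, P(2) = 2.  Assumes n >= 2.
sub perrin_mod {
    my ($n) = @_;
    my @p = map { $_ % $n } (3, 0, 2);
    for (3 .. $n) {
        push @p, ($p[0] + $p[1]) % $n;   # new term from the two oldest
        shift @p;                        # keep a sliding window of three
    }
    return $p[2];
}

# Returns 1 if n divides P(n), else 0.
sub naive_perrin_test {
    my ($n) = @_;
    return 0 if $n < 2;
    return perrin_mod($n) == 0 ? 1 : 0;
}

print join(" ", grep { naive_perrin_test($_) } 2 .. 30), "\n";
# 2 3 5 7 11 13 17 19 23 29
print naive_perrin_test(271441), "\n";   # 1, although 271441 = 521^2
```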

is_catalan_pseudoprime

Given an integer n, returns 1 if n is positive and (-1)^{(n-1)/2} * C_{(n-1)/2} is congruent to 2 mod n, where C_n is the nth Catalan number, and returns 0 otherwise. The nth Catalan number is equal to binomial(2n,n)/(n+1). All odd primes satisfy this condition, and only three known composites.

The pseudoprime sequence is OEIS A163209.

There is no known efficient method to perform the Catalan primality test, so it is a curiosity rather than a practical test. The implementation uses a method from Charles Greathouse IV (2015) and results from Aebi and Cairns (2008) to produce results many orders of magnitude faster than other known implementations, but it is still vastly slower than other compositeness tests.

is_frobenius_pseudoprime

Given an integer n and two optional integer parameters a and b, returns 1 if n is positive and a Frobenius probable prime with respect to the polynomial x^2 - ax + b, and returns 0 otherwise. Without the parameters, b = 2 and a is the least positive odd number such that (a^2-4b|n) = -1. This selection has no pseudoprimes below 2^64 and none known. In any case, the discriminant a^2-4b must not be a perfect square.

Some authors use the Fibonacci polynomial x^2-x-1 corresponding to (1,-1) as the default method for a Frobenius probable prime test. This creates a weaker test than most other parameter choices (e.g. over twenty times more pseudoprimes than (3,-5)), so is not used as the default here. With the (1,-1) parameters the pseudoprime sequence is OEIS A212424.

The Frobenius test is a stronger test than the Lucas test. Any Frobenius (a,b) pseudoprime is also a Lucas (a,b) pseudoprime but the converse is not true, as any Frobenius (a,b) pseudoprime is also a Fermat pseudoprime to the base |b|. We can see that with the default parameters this is similar to, but somewhat weaker than, the BPSW test used by this module (which uses the strong and extra-strong versions of the probable prime and Lucas tests respectively).

Also see the more efficient "is_frobenius_khashin_pseudoprime" and "is_frobenius_underwood_pseudoprime" which have no known counterexamples and run quite a bit faster.

is_frobenius_underwood_pseudoprime

Given an integer n, returns 1 if n is positive and passes the efficient Frobenius test of Paul Underwood, and returns 0 otherwise. This selects a parameter a as the least non-negative integer such that (a^2-4|n)=-1, then verifies that (x+2)^(n+1) = 2a + 5 mod (x^2-ax+1,n). This combines a Fermat and Lucas test with a cost of only slightly more than 2 strong pseudoprime tests. This makes it similar to, but faster than, a regular Frobenius test.

There are no known pseudoprimes to this test and extensive computation has shown no counterexamples under 2^50. This test also has no overlap with the BPSW test, making it a very effective method for adding additional certainty. Performance at 1e12 is about 60% slower than BPSW.

is_frobenius_khashin_pseudoprime

Given an integer n, returns 1 if n is positive and passes the Frobenius test of Sergey Khashin, and returns 0 otherwise. The test verifies n is not a perfect square, selects the parameter c as the smallest odd prime such that (c|n)=-1, then verifies that (1+D)^n = (1-D) mod n where D = sqrt(c) mod n.

There are no known pseudoprimes to this test and Khashin (2018) shows there are no counterexamples under 2^64. Performance at 1e12 is about 40% slower than BPSW.

miller_rabin_random

Given an integer n and a positive integer k, returns 1 if n is positive and passes k Miller-Rabin tests using uniform random bases selected between 2 and n-2.

This should not be used in place of "is_prob_prime", "is_prime", or "is_provable_prime". Those functions will be faster and provide better results than running k Miller-Rabin tests. This function can be used if one wants more assurances for non-proven primes, such as for cryptographic uses where the size is large enough that proven primes are not desired.

is_prob_prime

my $prob_prime = is_prob_prime($n);
# Returns 0 (composite), 2 (prime), or 1 (probably prime)

Given an integer n, returns 0 (composite), 2 (definitely prime), or 1 (probably prime).

For 64-bit input (native or bignum), this uses either a deterministic set of Miller-Rabin tests (1, 2, or 3 tests) or a strong BPSW test consisting of a single base-2 strong probable prime test followed by a strong Lucas test. This has been verified with Jan Feitsma's 2-PSP database to produce no false results for 64-bit inputs. Hence the result will always be 0 (composite) or 2 (prime).

For inputs larger than 2^64, an extra-strong Baillie-PSW primality test is performed (also called BPSW or BSW). This is a probabilistic test, so only 0 (composite) and 1 (probably prime) are returned. There is a possibility that composites may be returned marked prime, but since the test was published in 1980, not a single BPSW pseudoprime has been found, so it is extremely likely to be prime. While we believe (Pomerance 1984) that an infinite number of counterexamples exist, there is a weak conjecture (Martin) that none exist under 10000 digits.

is_bpsw_prime

Given an integer n, returns 0 (composite), 2 (definitely prime), or 1 (probably prime), using the BPSW primality test (extra-strong variant). Normally one of the "is_prime" in Math::Prime::Util or "is_prob_prime" in Math::Prime::Util functions will suffice, but those functions do pre-tests to find easy composites. If you know this is not necessary, then calling "is_bpsw_prime" may save a small amount of time.

is_provable_prime

say "$n is definitely prime" if is_provable_prime($n) == 2;

Given an integer n, returns 0 (composite), 2 (definitely prime), or 1 (probably prime). This gives it the same return values as "is_prime" and "is_prob_prime". Note that numbers below 2^64 are considered proven by the deterministic set of Miller-Rabin bases or the BPSW test. Both of these have been tested for all small (64-bit) composites and do not return false positives.

Using the Math::Prime::Util::GMP module is highly recommended for doing primality proofs, as it is much, much faster. The pure Perl code is just not fast for this type of operation, nor does it have the best algorithms. It should suffice for proofs of up to 40 digit primes, while the latest MPU::GMP works for primes of hundreds of digits (thousands with an optional larger polynomial set).

The pure Perl implementation uses theorem 5 of BLS75 (Brillhart, Lehmer, and Selfridge's 1975 paper), an improvement on the Pocklington-Lehmer test. This requires n-1 to be factored to (n/2)^(1/3). This is often fast, but as n gets larger, it takes exponentially longer to find factors.

Math::Prime::Util::GMP implements both the BLS75 theorem 5 test as well as ECPP (elliptic curve primality proving). It will typically try a quick n-1 proof before using ECPP. Certificates are available with either method. This results in proofs of 200-digit primes in under 1 second on average, and many hundreds of digits are possible. This makes it significantly faster than Pari 2.1.7's is_prime(n,1) which is the default for Math::Pari.

prime_certificate

my $cert = prime_certificate($n);
say verify_prime($cert) ? "proven prime" : "not prime";

Given an integer n, returns a primality certificate as a multi-line string. If we could not prove n prime, an empty string is returned (n may or may not be composite). This may be examined or given to "verify_prime" for verification. The latter function contains the description of the format.

is_provable_prime_with_cert

Given an integer n, returns a two element array containing the result of "is_provable_prime":

0  definitely composite
1  probably prime
2  definitely prime

and a primality certificate like "prime_certificate". The certificate will be an empty string if the first element is not 2.

verify_prime

my $cert = prime_certificate($n);
say verify_prime($cert) ? "proven prime" : "not prime";

Given a primality certificate, returns either 0 (not verified) or 1 (verified). Most computations are done using pure Perl with Math::BigInt, so you probably want to install and use Math::BigInt::GMP. ECPP certificates will verify faster when Math::Prime::Util::GMP is available for the elliptic curve computations.

If the certificate is malformed, the routine will carp a warning in addition to returning 0. If the verbose option is set (see "prime_set_config") then if the validation fails, the reason for the failure is printed in addition to returning 0. If the verbose option is set to 2 or higher, then a message indicating success and the certificate type is also printed.

A certificate may have arbitrary text before the beginning (the primality routines from this module will not have any extra text, but this way verbose output from the prover can be safely stored in a certificate). The certificate begins with the line:

[MPU - Primality Certificate]

All lines in the certificate beginning with # are treated as comments and ignored, as are blank lines. A version number may follow, such as:

Version 1.0

For all inputs, base 10 is the default, but at any point this may be changed with a line like:

Base 16

where allowed bases are 10, 16, and 62. This module will only use base 10, so its routines will not output Base commands.

Next, we look for (using "100003" as an example):

Proof for:
N 100003

where the text Proof for: indicates we will read an N value. Skipping comments and blank lines, the next line should be "N " followed by the number.

After this, we read one or more blocks. Each block is a proof of the form:

If Q is prime, then N is prime.

Some of the blocks have more than one Q value associated with them, but most only have one. Each block has its own set of conditions which must be verified, and this can be done completely self-contained. That is, each block is independent of the other blocks and may be processed in any order. To be a complete proof, each block must successfully verify. The block types and their conditions are shown below.

Finally, when all blocks have been read and verified, we must ensure we can construct a proof tree from the set of blocks. The root of the tree is the initial N, and for each node (block), all Q values must either have a block using that value as its N or Q must be less than 2^64 and pass BPSW.

Some other certificate formats (e.g. Primo) use an ordered chain, where the first block must be for the initial N, a single Q is given which is the implied N for the next block, and so on. This simplifies validation implementation somewhat, and removes some redundant information from the certificate, but has no obvious way to add proof types such as Lucas or the various BLS75 theorems that use multiple factors. I decided that the most general solution was to have the certificate contain the set in any order, and let the verifier do the work of constructing the tree.
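The outer layout of a certificate can be read with a few lines of Perl. This is a rough sketch under stated assumptions: parse_certificate is a hypothetical helper, it handles base 10 only, it recognizes blocks by a leading "Type" line, and it performs no verification. The sample certificate is assembled by hand from the pieces described in this section.

```perl
use strict;
use warnings;

# Split a certificate into its N value and a list of blocks.  Each block is
# a hash with a 'Type' key plus the "NAME value" pairs that follow it.
sub parse_certificate {
    my ($text) = @_;
    my @lines = split /\n/, $text;

    # Arbitrary text may precede the header line.
    shift @lines while @lines && $lines[0] ne '[MPU - Primality Certificate]';
    return unless @lines;
    shift @lines;

    my ($n, @blocks);
    my $expect_n = 0;
    for my $line (@lines) {
        next if $line =~ /^\s*$/ || $line =~ /^#/;   # blank lines and comments
        if ($line =~ /^Proof for:/) {
            $expect_n = 1;
        } elsif ($expect_n && $line =~ /^N\s+(\d+)/) {
            ($n, $expect_n) = ($1, 0);
        } elsif ($line =~ /^Type\s+(\S+)/) {
            push @blocks, { Type => $1 };
        } elsif (@blocks && $line =~ /^(\S+)\s+(\S+)/) {
            $blocks[-1]{$1} = $2;
        }
    }
    return ($n, @blocks);
}

my $cert = join "\n",
    'verbose prover output may appear here',
    '[MPU - Primality Certificate]',
    'Version 1.0',
    '',
    '# comments are ignored',
    'Proof for:',
    'N 5791',
    '',
    'Type Small',
    'N 5791',
    '';

my ($n, @blocks) = parse_certificate($cert);
print "N = $n, blocks: ", join(",", map { $_->{Type} } @blocks), "\n";
# N = 5791, blocks: Small
```

A real verifier would go on to check each block's conditions and then confirm that the blocks form a proof tree rooted at N.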

The blocks begin with the text "Type ..." where ... is the type. One or more values follow. The defined types are:

Small

Type Small
N 5791

N must be less than 2^64 and be prime (use BPSW or deterministic M-R).

BLS3

Type BLS3
N  2297612322987260054928384863
Q  16501461106821092981
A  5

A simple n-1 style proof using BLS75 theorem 3. This block verifies if:

a  Q is odd
b  Q > 2
c  Q divides N-1
.  Let M = (N-1)/Q
d  MQ+1 = N
e  M > 0
f  2Q+1 > sqrt(N)
g  A^((N-1)/2) mod N = N-1
h  A^(M/2) mod N != N-1
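The BLS3 conditions translate almost line for line into Math::BigInt code. The following is a sketch, not the module's verifier: check_bls3_block is a hypothetical name, and Q is taken on trust here, whereas a full verification must also establish that Q is prime through another block or a BPSW test.

```perl
use strict;
use warnings;
use Math::BigInt;

# Return 1 if the (N, Q, A) triple satisfies BLS3 conditions a-h, else 0.
sub check_bls3_block {
    my ($n_str, $q_str, $a_str) = @_;
    my ($N, $Q, $A) = map { Math::BigInt->new($_) } ($n_str, $q_str, $a_str);
    my $nm1 = $N - 1;

    return 0 unless $Q->is_odd;                                # a
    return 0 unless $Q > 2;                                    # b
    return 0 unless ($nm1 % $Q)->is_zero;                      # c
    my $M = $nm1 / $Q;
    return 0 unless $M * $Q + 1 == $N;                         # d
    return 0 unless $M > 0;                                    # e
    my $t = 2 * $Q + 1;
    return 0 unless $t * $t > $N;                              # f, squared to avoid a root
    return 0 unless $A->copy->bmodpow($nm1 / 2, $N) == $nm1;   # g
    return 0 if     $A->copy->bmodpow($M / 2,   $N) == $nm1;   # h
    return 1;
}

print check_bls3_block(11, 5, 2), "\n";    # 1: 2^5 mod 11 = 10, 2^1 mod 11 = 2
print check_bls3_block(15, 7, 14), "\n";   # 0: condition h rejects A = N-1

# The example block from the text, passed as strings to keep full precision:
print check_bls3_block('2297612322987260054928384863',
                       '16501461106821092981', '5'), "\n";
```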

Pocklington

Type Pocklington
N  2297612322987260054928384863
Q  16501461106821092981
A  5

A simple n-1 style proof using generalized Pocklington. This is more restrictive than BLS3 and much more than BLS5. This is Primo's type 1, and this module does not currently generate these blocks. This block verifies if:

a  Q divides N-1
.  Let M = (N-1)/Q
b  M > 0
c  M < Q
d  MQ+1 = N
e  A > 1
f  A^(N-1) mod N = 1
g  gcd(A^M - 1, N) = 1

BLS15

Type BLS15
N  8087094497428743437627091507362881
Q  175806402118016161687545467551367
LP 1
LQ 22

A simple n+1 style proof using BLS75 theorem 15. This block verifies if:

a  Q is odd
b  Q > 2
c  Q divides N+1
.  Let M = (N+1)/Q
d  MQ-1 = N
e  M > 0
f  2Q-1 > sqrt(N)
.  Let D = LP*LP - 4*LQ
g  D != 0
h  Jacobi(D,N) = -1
.  Note: V_{k} indicates the Lucas V sequence with LP,LQ
i  V_{M/2} mod N != 0
j  V_{(N+1)/2} mod N == 0

BLS5

Type BLS5
N  8087094497428743437627091507362881
Q[1]  98277749
Q[2]  3631
A[0]  11
----

A more sophisticated n-1 proof using BLS theorem 5. This requires N-1 to be factored only to (N/2)^(1/3). While this looks much more complicated, it really isn't much more work. The biggest drawback is just that we have multiple Q values to chain rather than a single one. This block verifies if:

a  N > 2
b  N is odd
.  Note: the block terminates on the first line starting with a "-".
.  Let Q[0] = 2
.  Let A[i] = 2 if Q[i] exists and A[i] does not
c  For each i (0 .. maxi):
c1   Q[i] > 1
c2   Q[i] < N-1
c3   A[i] > 1
c4   A[i] < N
c5   Q[i] divides N-1
.  Let F = the product of the Q[i], each raised to the highest power that divides N-1
.  Let R = (N-1)/F
d  F is even
e  gcd(F, R) = 1
.  Let s = integer part (quotient) of R / 2F
.  Let r = remainder of R / 2F, so that R = 2F*s + r
.  Let P = (F+1) * (2*F*F + (r-1)*F + 1)
f  N < P
g  s = 0  OR  r^2-8s is not a perfect square
h  For each i (0 .. maxi):
h1   A[i]^(N-1) mod N = 1
h2   gcd(A[i]^((N-1)/Q[i])-1, N) = 1

ECPP

Type ECPP
N  175806402118016161687545467551367
A  96642115784172626892568853507766
B  111378324928567743759166231879523
M  175806402118016177622955224562171
Q  2297612322987260054928384863
X  3273750212
Y  82061726986387565872737368000504

An elliptic curve primality block, typically generated with an Atkin/Morain ECPP implementation, but this should be adequate for anything using the Atkin-Goldwasser-Kilian-Morain style certificates. Some basic elliptic curve math is needed for these. This block verifies if:

.  Note: A and B are allowed to be negative, with -1 not uncommon.
.  Let A = A % N
.  Let B = B % N
a  N > 0
b  gcd(N, 6) = 1
c  gcd(4*A^3 + 27*B^2, N) = 1
d  Y^2 mod N = X^3 + A*X + B mod N
e  M >= N - 2*sqrt(N) + 1
f  M <= N + 2*sqrt(N) + 1
g  Q > (N^(1/4)+1)^2
h  Q < N
i  M != Q
j  Q divides M
.  Note: EC(A,B,N,X,Y) is the point (X,Y) on Y^2 = X^3 + A*X + B, mod N
.        All values work in affine coordinates, but in theory other
.        representations work just as well.
.  Let POINT1 = (M/Q) * EC(A,B,N,X,Y)
.  Let POINT2 = M * EC(A,B,N,X,Y)  [ = Q * POINT1 ]
k  POINT1 is not the identity
l  POINT2 is the identity

is_aks_prime

say "$n is definitely prime" if is_aks_prime($n);

Given an integer n, returns 1 if n is positive and passes the Agrawal-Kayal-Saxena (AKS) primality test, and returns 0 otherwise. This is a deterministic unconditional primality test which runs in polynomial time for general input.

While this is an important theoretical algorithm, and makes an interesting example, it is hard to overstate just how impractically slow it is in practice. It is not used for any purpose in non-theoretical work, as it is literally millions of times slower than other algorithms. From R.P. Brent, 2010: "AKS is not a practical algorithm. ECPP is much faster." This module also has ECPP, and indeed it is much faster.

This implementation uses theorem 4.1 from Bernstein (2003). It runs substantially faster than the original, v6 revised paper with Lenstra improvements, or the late 2002 improvements of Voloch and Bornemann. The GMP implementation uses a binary segmentation method for modular polynomial multiplication (see Bernstein's 2007 Quartic paper), which reduces to a single scalar multiplication, at which GMP excels. Because of this, the GMP implementation is likely to be faster once the input is larger than 2^33.

is_mersenne_prime

say "2^607-1 (M607) is a Mersenne prime" if is_mersenne_prime(607);

Given an integer p, returns 1 if p is positive and the Mersenne number 2^p-1 is prime, and returns 0 otherwise. Since an enormous effort has gone into testing these, a list of known Mersenne primes is used to accelerate this. Beyond the highest sequential Mersenne prime (currently 37,156,667) this performs pretesting followed by the Lucas-Lehmer test.

The Lucas-Lehmer test is a deterministic unconditional test that runs very fast compared to other primality methods for numbers of comparable size, and vastly faster than any known general-form primality proof methods. While this test is fast, the GMP implementation is not nearly as fast as specialized programs such as prime95. Additionally, since we use the table for "small" numbers, testing via this function call will only occur for numbers with over 9.8 million digits. At this size, tools such as prime95 are greatly preferred.

is_ramanujan_prime

Given an integer n, returns 1 if n is positive and is a Ramanujan prime, and returns 0 otherwise. Therefore, numbers that can be produced by the functions "ramanujan_primes" and "nth_ramanujan_prime" will return 1, while all other numbers will return 0.

There is no simple function for this predicate, so Ramanujan primes through at least n are generated, then a search is performed for n. This is not efficient for multiple calls.

is_gaussian_prime

say is_gaussian_prime(3,0);  # "2"  :  3   => 3 mod 4 => prime
say is_gaussian_prime(1,1);  # "2"  :  1+i => norm 2  => prime
say is_gaussian_prime(5,0);  # "0"  :  5   => 1 mod 4 => (2+i)(2-i)

Given two integers a and b, returns either 0, 1, or 2 to indicate whether n = a+bi is, respectively, a Gaussian composite, probable Gaussian prime, or definite Gaussian prime. n is a Gaussian prime if and only if one of the following holds:

a = 0 and |b| is a prime congruent to 3 modulo 4.
b = 0 and |a| is a prime congruent to 3 modulo 4.
a and b are nonzero and a^2 + b^2 is prime.
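The three cases above translate directly into code. This is a minimal sketch for small native integers only; small_is_prime and gaussian_prime_sketch are hypothetical helpers using trial division, and the sketch returns only 0 or 1 rather than the 0/1/2 result of is_gaussian_prime.

```perl
use strict;
use warnings;

# Trial division; only suitable for small illustrative inputs.
sub small_is_prime {
  my ($n) = @_;
  return 0 if $n < 2;
  for (my $d = 2; $d * $d <= $n; $d++) {
    return 0 if $n % $d == 0;
  }
  return 1;
}

# Hypothetical helper: 1 if re + im*i is a Gaussian prime, 0 otherwise.
sub gaussian_prime_sketch {
  my ($re, $im) = map { abs($_) } @_;
  my $is_p;
  if    ($re == 0) { $is_p = $im % 4 == 3 && small_is_prime($im); }
  elsif ($im == 0) { $is_p = $re % 4 == 3 && small_is_prime($re); }
  else             { $is_p = small_is_prime($re * $re + $im * $im); }
  return $is_p ? 1 : 0;
}

# 3, 1+i, 5, 2+i
print join(" ", map { gaussian_prime_sketch(@$_) } [3,0], [1,1], [5,0], [2,1]), "\n";  # prints "1 1 0 1"
```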

is_delicate_prime

Given an integer n, returns 1 if n is positive and is a digitally delicate prime, and returns 0 otherwise. These are numbers which are prime, but changing any single base-10 digit always produces a composite number.

An optional second argument gives the base, which must be at least 2. This is the base used for changing digits to check for compositeness.

These are variously called "weakly prime" or "digitally delicate prime" numbers. Note that the first digit can be changed to a zero.

Variations not considered here include restricting changes of the first digit to non-zero values (OEIS A158124) and allowing leading zero digits to be changed ("widely DDPs").

This is the OEIS series A050249. With different bases, this is OEIS series A186995.
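As an illustration of the definition, the following sketch tests the base-10 case by brute force: it changes each digit to every other value (including changing the first digit to zero) and checks that no result is prime. The helpers small_is_prime and delicate_prime_sketch are hypothetical, use trial division, and are only practical for small inputs.

```perl
use strict;
use warnings;

# Trial division; only suitable for small illustrative inputs.
sub small_is_prime {
  my ($n) = @_;
  return 0 if $n < 2;
  for (my $d = 2; $d * $d <= $n; $d++) {
    return 0 if $n % $d == 0;
  }
  return 1;
}

# Hypothetical helper: 1 if n is a digitally delicate prime in base 10.
sub delicate_prime_sketch {
  my ($n) = @_;
  return 0 unless small_is_prime($n);
  my @digits = split //, $n;
  for my $i (0 .. $#digits) {
    for my $d (0 .. 9) {
      next if $d == $digits[$i];
      my @changed = @digits;
      $changed[$i] = $d;
      # "+ 0" drops a leading zero, e.g. "094001" becomes 94001
      return 0 if small_is_prime(join('', @changed) + 0);
    }
  }
  return 1;
}

print delicate_prime_sketch(294001), "\n";   # prints 1 (294001 is the first term of A050249)
```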

is_odd

Given an integer n, returns 1 if n is odd and 0 otherwise.

is_even

Given an integer n, returns 1 if n is even and 0 otherwise.

is_divisible

Given integers n and d, returns 1 if |n| is exactly divisible by |d|, and 0 otherwise.

This corresponds to the GMP function mpz_divisible_p. This includes its semantics with d=0 which returns 0 unless n=0.

More than one divisor can be given, e.g. is_divisible(1001,2,3,5,7), allowing one to test multiple divisors with one call. The result is 1 if n is exactly divisible by any of the d values, and 0 if it is divisible by none of them.

is_congruent

Given integers n, c, and d, returns 1 if n is congruent to c modulo |d|, and 0 otherwise.

This corresponds to the GMP function mpz_congruent_p. This includes its semantics with d=0 which returns 0 unless n=c.

is_perfect_number

Given an integer n, returns 1 if n is a positive integer equal to the sum of its divisors excluding the number itself (that is, equal to its aliquot sum), and returns 0 otherwise.

is_power

say "$n is a perfect square" if is_power($n, 2);
say "$n is a perfect cube" if is_power($n, 3);
say "$n is a ", is_power($n), "-th power";

Given a single integer input n, returns k if n = r^k for some integer r > 1, k > 1, and 0 otherwise. The k returned is the largest possible. This can be used in a boolean statement to determine if n is a perfect power.

If given an integer n and a non-negative integer k, returns 1 if n is a k-th power, and 0 otherwise. For example, if k=2 then this detects perfect squares. Setting k=0 gives behavior like the first case (the largest root is found and its value is returned).

If a third argument is given, it must be a scalar reference. If n is a k-th power, then this will be set to the k-th root of n. For example:

my $n = 222657534574035968;
if (my $pow = is_power($n, 0, \my $root)) { say "$n = $root^$pow" }
# prints:  222657534574035968 = 2948^5

This corresponds to Pari/GP's ispower function with integer arguments.

is_square

Given an integer n, returns 1 if n is a perfect square, and returns 0 otherwise. This is identical to is_power(n,2).

This corresponds to Pari/GP's issquare function.

is_sum_of_squares

Given an integer n and an optional positive integer number of squares k, returns 1 if |n| can be represented as the sum of exactly k squares. k defaults to 2. All positive integers can be represented by 4 or more squares, so only k == 2 and k == 3 are interesting cases.

With k == 2 this produces the sequence OEIS A001481. With k == 3 this produces the sequence OEIS A000378.

is_powerfree

Given an integer n and an optional non-negative integer k, returns 1 if |n| has no divisor d^k, and returns 0 otherwise. This determines if |n| has any k-th (or higher) powers in the prime factorization. k defaults to 2.

With k == 2 this produces the sequence of square-free integers OEIS A005117. With k == 3 this produces the sequence of cube-free integers OEIS A004709. With k == 4 this produces the sequence of biquadrate-free integers OEIS A046100.

powerfree_count

Given an integer n and an optional non-negative integer k, returns the number of k-powerfree positive integers less than or equal to n. k defaults to 2.

With k == 2 this produces the sequence OEIS A013928. With k == 3 this produces the sequence OEIS A060431.

nth_powerfree

Given a non-negative integer n and an optional non-negative integer k, returns the n-th k-powerfree number. If k is omitted, k=2 is used. Returns undef if k is less than 2 or n=0. Returns 1 for n=1.

With k == 2 this produces the sequence OEIS A005117. With k == 3 this produces the sequence OEIS A004709.

powerfree_sum

Given an integer n and an optional non-negative integer k, returns the sum of k-powerfree positive integers less than or equal to n. k defaults to 2.

With k == 2 this produces the sequence OEIS A066779.

powerfree_part

Given an integer n and an optional non-negative integer k, returns the k-powerfree part of n. This is done via removing "excess" powers, i.e. in the prime factorization of n, we reduce any exponents E from P^E to P^(E % k). Alternately we can say all k-th powers are divided out. k defaults to 2.

When k == 2, this is also sometimes called core(n). It is the unique square-free integer d such that n/d is a square.

With k == 2 this produces the sequence OEIS A007913. With k == 3 this produces the sequence OEIS A050985.

With k == 2 (the default), this corresponds to Pari/GP's core function and Sage's squarefree_part function.
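The exponent reduction described above can be sketched with trial-division factoring. The helper powerfree_part_sketch is hypothetical and assumes a small positive native integer n and k >= 2; it is meant only to show the P^E to P^(E % k) step.

```perl
use strict;
use warnings;

# Hypothetical helper: k-powerfree part of a small positive integer n (k >= 2).
sub powerfree_part_sketch {
  my ($n, $k) = @_;
  $k = 2 unless defined $k;
  my $part = 1;
  for (my $p = 2; $p * $p <= $n; $p++) {
    my $e = 0;
    while ($n % $p == 0) { $n /= $p; $e++; }   # e = exponent of p in n
    $part *= $p ** ($e % $k);                  # keep only the reduced exponent
  }
  $part *= $n if $n > 1;   # leftover prime has exponent 1, and 1 % k = 1
  return $part;
}

print powerfree_part_sketch(72), " ", powerfree_part_sketch(72, 3), "\n";   # prints "2 9"
```

Here 72 = 2^3 * 3^2, so with k=2 the exponents become (1,0) giving 2, and with k=3 they become (0,2) giving 9.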

powerfree_part_sum

Given an integer n and an optional non-negative integer k, returns the sum of k-powerfree parts of all positive integers <= n. This is equivalent to

vecsum(map { powerfree_part($_,$k) } 1..$n)

but substantially faster.

With k == 2 this produces the sequence OEIS A069891.

squarefree_kernel

Given an integer n, returns the square-free kernel of n. This is also known as the integer radical. It is the largest square-free divisor of n, which is also the product of the distinct primes dividing n.

We choose to accept negative inputs, with the result matching the input sign.

This is the OEIS series A007947.

sqrtint

Given a non-negative integer input n, returns the integer square root. For native integers, this is equal to int(sqrt(n)).

This corresponds to Pari/GP's sqrtint function.

rootint

Given a non-negative integer n and positive exponent k, returns the integer k-th root of n. This is the largest integer r such that r^k <= n.

If a third argument is present, it must be a scalar reference. It will be set to r^k.

Technically if n is negative and k is odd, the root exists and is equal to sign(n) * rootint(abs(n),k). It was decided to follow the behavior of Pari/GP and Math::BigInt and disallow negative n.

This corresponds to Pari/GP's sqrtnint function.

logint

say "decimal digits: ", 1+logint($n, 10);
say "digits in base 12: ", 1+logint($n, 12);
my $be; my $e = logint(1000, 2, \$be);
say "largest power of 2 less than or equal to 1000:  2^$e = $be";

Given a positive integer n and an integer base b greater than 1, returns the largest integer e such that b^e <= n.

If a third argument is present, it must be a scalar reference. It will be set to b^e.

This corresponds to Pari/GP's logint function.
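The definition can be sketched as a simple loop over powers of the base. The helper logint_sketch is hypothetical and works only while b^(e+1) fits in a native integer; unlike logint, it returns both e and b^e as a list rather than setting a scalar reference.

```perl
use strict;
use warnings;

# Hypothetical helper: returns (e, b^e) with e the largest integer such that b^e <= n.
# Assumes n >= 1, base >= 2, and values small enough for native integers.
sub logint_sketch {
  my ($n, $base) = @_;
  my ($e, $pow) = (0, 1);
  while ($pow * $base <= $n) {
    $pow *= $base;
    $e++;
  }
  return ($e, $pow);
}

my ($e, $be) = logint_sketch(1000, 2);
print "2^$e = $be\n";   # prints "2^9 = 512"
```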

lshiftint

Given an integer n and an optional integer number of bits k, perform a left shift of n by k bits. If the second argument is not provided, it is assumed to be 1. This is equivalent to multiplying by 2^k.

With negative n, this behaves as described above. This is similar to how Perl behaves with use integer or use bigint, but raw Perl coerces the argument into an unsigned before left shifting, which is unlikely to ever be what is wanted.

If k is negative, a right shift is performed by |k| bits.

This corresponds to Pari/GP's shift function with a positive number of bits, and Mathematica's BitShiftLeft function.

rshiftint

Given an integer n and an optional integer number of bits k, perform a right shift of n by k bits. If the second argument is not provided, it is assumed to be 1. This is equivalent to truncated division by 2^k.

With a negative n, the result is equal to -rshiftint(-n,k). This means it is not "arithmetic right shift" or "logical right shift" as commonly used with fixed-width registers in a particular bit format, but instead treated as sign and magnitude, where the magnitude is right shifted.

If k is negative, a left shift is performed by |k| bits.

For an interesting discussion of arithmetic right shift, see Guy Steele's 1977 article "Arithmetic Shift Considered Harmful".

This corresponds to Pari/GP's shift function with a negative number of bits, and Mathematica's BitShiftRight function. The result is equal to dividing by the power of 2 using "tdivrem" or GMP's mpz_tdiv_q_2exp.

rashiftint

Given an integer n and an optional integer number of bits k, perform a signed arithmetic right shift of n by k bits. If the second argument is not provided, it is assumed to be 1. This is equivalent to floor division by 2^k.

If k is negative, a left shift is performed by |k| bits.

For non-negative n, this is always equal to "rshiftint". With negative arguments it is similar to Math::BigInt#brsft, Python, and Java's BigInteger, which use floor division by 2^k. The result is equal to dividing by the power of 2 using "divint" or GMP's mpz_fdiv_q_2exp.
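The difference between the two right shifts only shows up for negative n. The following sketch, for small native integers and non-negative k, expresses both in terms of a shift on the magnitude; rshift_sketch and rashift_sketch are hypothetical helper names, not functions of this module.

```perl
use strict;
use warnings;

# Truncated division by 2^k: shift the magnitude, then restore the sign.
sub rshift_sketch {
  my ($n, $k) = @_;
  return $n >= 0 ? $n >> $k : -((-$n) >> $k);
}

# Floor division by 2^k: for negative n, round the magnitude up before shifting.
sub rashift_sketch {
  my ($n, $k) = @_;
  return $n >= 0 ? $n >> $k : -(((-$n) + (1 << $k) - 1) >> $k);
}

printf "%d %d\n", rshift_sketch(-5, 1), rashift_sketch(-5, 1);   # prints "-2 -3"
printf "%d %d\n", rshift_sketch(-1, 3), rashift_sketch(-1, 3);   # prints "0 -1"
```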

signint

Given an integer n, returns the sign of n. Returns -1, 0, or 1 if n is negative, zero, or positive respectively.

This corresponds to Pari/GP's sign function, GMP's mpz_sgn function, Raku's sign method, and Math::BigInt's sign method. Some of those extend to non-integers.

cmpint

Given integers a and b, returns -1, 0, or 1 if a is respectively less than, equal to, or greater than b.

The main value of this is to ensure Perl never silently converts the values to floating point, which can give wrong results, and also avoid having to manually convert everything to bigints.

This corresponds to Pari/GP's cmp function, GMP's mpz_cmp function, Math::BigInt's bcmp method, and Perl's <=> operator. Prior to version 6.2, GMP could return negative or positive values other than -1 and 1.

addint

Given integers a and b, returns a + b.

These integer arithmetic functions (addint, subint, mulint, add1int, sub1int, absint, negint) exist to offer exact integer arithmetic without overflow or NV conversion, while returning native integers when they fit, and bigints only when needed. Other choices include:

Perl native operations. This is fine with small numbers, but once large enough, values will be converted to floating point (NV), which means incorrect results. Values larger than 64-bit are completely unsupported. One might expect 2^53 to be the usual point for "large enough", but the NV type is platform dependent, and very old 64-bit Perl will aggressively convert values to NV starting at 2^49 even with NV being an IEEE-754 double.
use integer. Gives exact integer math as if we were using IV types in C. We are still left with 32-bit versus 64-bit platform differences, being restricted to signed type, and no support for larger values.
Math::BigInt, Math::GMPz, etc. If one knows large values will be used, this is a good idea. Use bigint objects for all values, and all operations are methods on the objects and give correct results. This is functionally a good solution, but it will be 10 to 500 times slower and use more memory.

All these functions accept native integers (IV/UV), bigints, and string representations of integers. Results will be in native types if possible, and as objects of the chosen bigint class otherwise. Best performance will still be had by native operations within range, or by using fast classes like Math::GMPz if most operations need it. We give correct behavior while only paying the performance penalty when needed, although there is still some overhead since we are not built into the language like Raku or Python.

subint

Given integers a and b, returns a - b.

add1int

Given integer n, returns n + 1.

sub1int

Given integer n, returns n - 1.

mulint

Given integers a and b, returns a * b.

powint

Given an integer a and a non-negative integer b, returns a^b. 0^0 will return 1.

The exponent b is converted into an unsigned long.

divint

Given integers a and b, returns the quotient a / b.

Floor division is used, so q is rounded towards -inf and the remainder has the same sign as the divisor b. This is the same as modern "bdiv" in Math::BigInt, GMP fdiv functions, and Python's integer division.

For negative inputs, this will not be identical to native Perl division, which oddly uses a truncated quotient and floored remainder. More importantly, consistent and correct 64-bit integer division in Perl is problematic.

Pari/GP's \ integer division operator uses Euclidean division, which matches their divrem function. Our divint and modint operators both use floor division, which matches Raku and Python. We also have Euclidean, truncated, and ceiling division available via "divrem", "tdivrem", and "cdivrem" respectively.

modint

Given integers a and b, returns the modulo a % b.

r = a - b * floor(a / b)

Floor division is used, so q is rounded towards -inf and r has the same sign as the divisor b. This is the same as modern "bmod" in Math::BigInt and the GMP fdiv functions.

Like with divint, we use floor division, while Pari/GP uses Euclidean for their % integer remainder operator.

cdivint

Given integers a and b, returns the quotient a / b.

Ceiling division is used, so q is rounded towards +inf and the remainder has the opposite sign from the divisor b.

divrem

my($quo, $rem) = divrem($a, $b);

Given integers a and b, returns a list of two items: the Euclidean quotient and the Euclidean remainder. The remainder is always non-negative (0 <= r < |b|), and the quotient is chosen to satisfy a = b*q + r.

This corresponds to Pari/GP's divrem function. There is no explicit function in Math::BigInt that gives this division method for signed inputs.

tdivrem

Given integers a and b, returns a list of two items: the truncated quotient and the truncated remainder.

The resulting pair will match "btdiv" in Math::BigInt and "btmod" in Math::BigInt. This matches C99 "truncation toward zero" semantics as well.

fdivrem

Given integers a and b, returns a list of two items: the floored quotient and the floored remainder. The results will match the individual "divint" and "modint" functions, since they also use floored division.

This corresponds to Python's builtin divmod function, and Raku's builtin div and mod functions. The resulting pair will match "bdiv" in Math::BigInt and "bmod" in Math::BigInt.

cdivrem

Given integers a and b, returns a list of two items: the ceiling quotient (rounded towards +inf) and the ceiling remainder. The remainder has the opposite sign from the divisor b. This allows one to perform division with rounding up.
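The four conventions (floored, truncated, ceiling, Euclidean) differ only when the remainder is non-zero and signs are involved. The sketch below, for small native integers, derives each from Perl's % operator, whose result takes the sign of the right operand when use integer is not in effect. The *_sketch names are hypothetical helpers and are not how this module implements these functions.

```perl
use strict;
use warnings;

# Floored: remainder has the sign of the divisor (this is what Perl's % gives).
sub fdivrem_sketch {
  my ($n, $d) = @_;
  my $r = $n % $d;
  return (($n - $r) / $d, $r);
}

# Truncated: quotient rounded toward zero; adjust when the signs differ.
sub tdivrem_sketch {
  my ($n, $d) = @_;
  my ($q, $r) = fdivrem_sketch($n, $d);
  ($q, $r) = ($q + 1, $r - $d) if $r != 0 && (($n < 0) xor ($d < 0));
  return ($q, $r);
}

# Ceiling: quotient rounded toward +inf; remainder has the opposite sign from the divisor.
sub cdivrem_sketch {
  my ($n, $d) = @_;
  my ($q, $r) = fdivrem_sketch($n, $d);
  ($q, $r) = ($q + 1, $r - $d) if $r != 0;
  return ($q, $r);
}

# Euclidean: remainder is always in 0 .. |d|-1.
sub divrem_sketch {
  my ($n, $d) = @_;
  my ($q, $r) = fdivrem_sketch($n, $d);
  ($q, $r) = ($q + 1, $r - $d) if $r < 0;
  return ($q, $r);
}

for my $case ([-7, 2], [7, -2]) {
  my ($n, $d) = @$case;
  printf "%2d / %2d  floor=(%d,%d) trunc=(%d,%d) ceil=(%d,%d) euclid=(%d,%d)\n",
    $n, $d, fdivrem_sketch($n, $d), tdivrem_sketch($n, $d),
    cdivrem_sketch($n, $d), divrem_sketch($n, $d);
}
# -7 /  2  floor=(-4,1) trunc=(-3,-1) ceil=(-3,-1) euclid=(-4,1)
#  7 / -2  floor=(-4,-1) trunc=(-3,1) ceil=(-3,1) euclid=(-3,1)
```

In every case the pair satisfies a = b*q + r; only the rounding direction of q changes.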

absint

Given integer n, return |n|, i.e. the absolute value of n.

negint

Given integer n, return -n.

lucasu

say "Fibonacci($_) = ", lucasu(1,-1,$_) for 0..100;

Given integers P, Q, and the non-negative integer k, computes U_k for the Lucas sequence defined by P,Q. These include the Fibonacci numbers (1,-1), the Pell numbers (2,-1), the Jacobsthal numbers (1,-2), the Mersenne numbers (3,2), and more.

Also see "lucasumod" for fast computation mod n.

This corresponds to OpenPFGW's lucasU function and gmpy2's lucasu function.

lucasv

say "Lucas($_) = ", lucasv(1,-1,$_) for 0..100;

Given integers P, Q, and the non-negative integer k, computes V_k for the Lucas sequence defined by P,Q. These include the Lucas numbers (1,-1).

Also see "lucasvmod" for fast computation mod n.

This corresponds to OpenPFGW's lucasV function and gmpy2's lucasv function.

lucasuv

($U, $V) = lucasuv(1,-2,17); # 17-th Jacobsthal, Jacobsthal-Lucas.

Given integers P, Q, and the non-negative integer k, computes both U_k and V_k for the Lucas sequence defined by P,Q. Generating both values is typically not much more time than one.

Also see "lucasuvmod" for fast computation mod n.
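For reference, both sequences follow the same recurrence X_k = P*X_{k-1} - Q*X_{k-2}, with U_0 = 0, U_1 = 1 and V_0 = 2, V_1 = P. The sketch below applies that recurrence directly with native integers; lucasuv_sketch is a hypothetical helper that takes O(k) steps and is far slower than the functions documented here.

```perl
use strict;
use warnings;

# Hypothetical helper: returns (U_k, V_k) for the Lucas sequence with parameters P, Q.
sub lucasuv_sketch {
  my ($P, $Q, $k) = @_;
  my ($u0, $u1) = (0, 1);     # U_0, U_1
  my ($v0, $v1) = (2, $P);    # V_0, V_1
  for (1 .. $k) {
    ($u0, $u1) = ($u1, $P * $u1 - $Q * $u0);
    ($v0, $v1) = ($v1, $P * $v1 - $Q * $v0);
  }
  return ($u0, $v0);
}

# Fibonacci numbers are U_k with (P,Q) = (1,-1)
print join(" ", map { (lucasuv_sketch(1, -1, $_))[0] } 0 .. 10), "\n";
# prints "0 1 1 2 3 5 8 13 21 34 55"
```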

gcd

Given a list of integers, returns the greatest common divisor. This is often used to test for coprimality.

Each input n is treated as |n|.

lcm

Given a list of integers, returns the least common multiple. Note that we follow the semantics of Mathematica, Pari, and Raku, namely:

lcm(0, n) = 0              Any zero in list results in zero return
lcm(n,-m) = lcm(n, m)      We use the absolute values
lcm() = 1                  lcm of empty list returns 1

gcdext

Given two integers x and y, returns u,v,d such that d = gcd(x,y) and u*x + v*y = d. This uses the extended Euclidean algorithm to compute the values satisfying Bézout's Identity.

This corresponds to Pari's gcdext function, which was renamed from bezout in Pari 2.6. The results will hence match "bezout" in Math::Pari.
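A minimal sketch of the extended Euclidean algorithm for small native integers follows. The helper gcdext_sketch is hypothetical; the particular u,v it returns satisfy the identity but are not guaranteed to be the same pair this module or Pari returns, since Bézout coefficients are not unique.

```perl
use strict;
use warnings;

# Hypothetical helper: returns (u, v, d) with u*x + v*y = d = gcd(x,y), d >= 0.
sub gcdext_sketch {
  my ($x, $y) = @_;
  my ($u0, $u1) = (1, 0);
  my ($v0, $v1) = (0, 1);
  while ($y != 0) {
    my $r = $x % $y;            # floored remainder
    my $q = ($x - $r) / $y;     # matching exact quotient
    ($x, $y)   = ($y, $r);
    ($u0, $u1) = ($u1, $u0 - $q * $u1);
    ($v0, $v1) = ($v1, $v0 - $q * $v1);
  }
  ($u0, $v0, $x) = (-$u0, -$v0, -$x) if $x < 0;   # normalize to d >= 0
  return ($u0, $v0, $x);
}

my ($u, $v, $d) = gcdext_sketch(240, 46);
print "$u*240 + $v*46 = $d\n";   # prints "-9*240 + 47*46 = 2"
```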

chinese

say chinese( [14,643], [254,419], [87,733] );  # 87041638

Solves a system of simultaneous congruences using the Chinese Remainder Theorem (with extension to non-coprime moduli). A list of [a,n] pairs is taken as input, each representing an equation x ≡ a mod |n|. If no solution exists, undef is returned. If a solution is returned, the modulus is equal to the lcm of all the given moduli (see "lcm"). In the standard case where all values of n are coprime, this is just the product. The a values must be integers, while the n values must be non-zero integers. Like other mod functions, we use abs(n).
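One way to handle non-coprime moduli is to merge the congruences one pair at a time: a solution exists only if the difference of the remainders is divisible by the gcd of the moduli, and the combined modulus is their lcm. The following is a sketch of that approach for small native integers; egcd_sketch and chinese_sketch are hypothetical helpers, and this is not necessarily the method used by this module.

```perl
use strict;
use warnings;

# Extended Euclid: returns (u, v, g) with u*x + v*y = g.
sub egcd_sketch {
  my ($x, $y) = @_;
  my ($u0, $u1) = (1, 0);
  my ($v0, $v1) = (0, 1);
  while ($y != 0) {
    my $r = $x % $y;
    my $q = ($x - $r) / $y;
    ($x, $y)   = ($y, $r);
    ($u0, $u1) = ($u1, $u0 - $q * $u1);
    ($v0, $v1) = ($v1, $v0 - $q * $v1);
  }
  return ($u0, $v0, $x);
}

# Hypothetical helper: takes [a,n] pairs, returns x in 0 .. lcm-1, or undef.
sub chinese_sketch {
  my ($x, $m) = (0, 1);                  # invariant: solution so far is x mod m
  for my $pair (@_) {
    my ($rem, $n) = ($pair->[0], abs($pair->[1]));
    my ($u, undef, $g) = egcd_sketch($m, $n);   # u*m + v*n = g
    my $diff = $rem - $x;
    return undef if $diff % $g != 0;     # incompatible congruences
    my $n_g = $n / $g;
    my $t = (($diff / $g) * $u) % $n_g;  # m*t = diff (mod n)
    $x += $m * $t;
    $m *= $n_g;                          # m becomes lcm(m, n)
    $x %= $m;
  }
  return $x;
}

print chinese_sketch([14,643], [254,419], [87,733]), "\n";   # prints 87041638
```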

Comparison to similar functions in other software:

Math::ModInt::ChineseRemainder:
  cr_combine( mod(a1,m1), mod(a2,m2), ... )

Pari/GP:
  chinese( [Mod(a1,m1), Mod(a2,m2), ...] )

Mathematica:
  ChineseRemainder[{a1, a2, ...}, {m1, m2, ...}]

SAGE:
  crt( [a1,m1], [a2,m2], ... )
  crt(a1,m1,a2,m2,...)
  CRT_list( [a1,a2,...], [m1,m2,...] )

chinese2

Functions like "chinese" but returns two items: the remainder and the modulus. If a solution exists, the second value (the final modulus) is equal to the lcm of the absolute values of all the given moduli.

If no solution exists, both return values will be undef.

frobenius_number

Finds the Frobenius number of a set of positive integers. This is the largest positive integer that cannot be represented as a non-negative linear combination of the input set. Each set element must be positive (all elements greater than zero) and setwise coprime: gcd(a1,a2,...,an) = 1.

This is sometimes called the "coin problem".

This corresponds to Mathematica's FrobeniusNumber function. Matching their API, we return -1 if any set element is 1.

vecsum

say "Totient sum 500,000: ", vecsum(euler_phi(0,500_000));

Returns the sum of all arguments, each of which must be an integer. This is similar to the "sum0" function in List::Util, but has a very important difference. List::Util turns all inputs into doubles and returns a double, which will mean incorrect results with large integers. vecsum sums (signed) integers and returns the untruncated result.

Processing is done on native integers while possible, including using a 128-bit running sum in the C code.

vecprod

say "Totient product 5,000: ", vecprod(euler_phi(1,5_000));

Returns the product of all arguments, each of which must be an integer. This is similar to the "product" function in List::Util, but keeps all results as integers and automatically switches to bigints if needed.

vecmin

say "Smallest Totient 100k-200k: ", vecmin(euler_phi(100_000,200_000));

Returns the minimum of all arguments, each of which must be an integer. This is similar to the "min" function in List::Util, but has a very important difference. List::Util turns all inputs into doubles and returns a double, which gives incorrect results with large integers. vecmin validates and compares all results as integers. The validation step will make it a little slower than "min" in List::Util but this prevents accidental and unintentional use of floats.

vecmax

say "Largest Totient 100k-200k: ", vecmax(euler_phi(100_000,200_000));

Returns the maximum of all arguments, each of which must be an integer. This is similar to the "max" function in List::Util, but has a very important difference. List::Util turns all inputs into doubles and returns a double, which gives incorrect results with large integers. vecmax validates and compares all results as integers. The validation step will make it a little slower than "max" in List::Util but this prevents accidental and unintentional use of floats.

vecreduce

say "Count of non-zero elements: ", vecreduce { $a + !!$b } (0,@v);
my $checksum = vecreduce { $a ^ $b } @{twin_primes(1000000)};

Does a reduce operation via left fold. Takes a block and a list as arguments. The block uses the special local variables $a and $b representing the accumulation and next element respectively, with the result of the block being used for the new accumulation. No initial element is used, so undef will be returned with an empty list.

The interface is exactly the same as "reduce" in List::Util. This was done to increase portability and minimize confusion. See chapter 7 of Higher Order Perl (or many other references) for a discussion of reduce with empty or single-element lists. It is often a good idea to give an identity element as the first list argument.

While operations like "vecmin", "vecmax", "vecsum", "vecprod", etc. can be fairly easily done with this function, it will not be as efficient. There are a wide variety of other functions that can be easily made with reduce, making it a useful tool.

vecany

Returns true if any element of a list satisfies a block. See "vecfirst".

vecall

Returns true if all elements of a list satisfy a block. See "vecfirst".

vecnone

Returns true if no element of a list satisfies a block. See "vecfirst".

vecnotall

Returns true if not all elements of a list satisfy a block. See "vecfirst".

vecfirst

say "all values are Carmichael" if vecall { is_carmichael($_) } @n;

Short circuit evaluations of a block over a list. Takes a block and a list as arguments. The block is called with $_ set to each list element, and evaluation on list elements is done until either all list values have been evaluated or the result condition can be determined. For instance, in the example of vecall above, evaluation stops as soon as any value returns false.

The interface is exactly the same as the any, all, none, notall, and first functions in List::Util. This was done to increase portability and minimize confusion. Unlike other vector functions like vecmin, vecmax, vecsum, etc. there is no added value to using these versus the ones from List::Util. They are here for convenience.

These operations can fairly easily be mapped to scalar(grep {...} @n), but that does not short-circuit and is less obvious.

vecfirstidx

say "first Carmichael is index ", vecfirstidx { is_carmichael($_) } @n;

Returns the index of the first element in a list that evaluates to true. Just like vecfirst, but returns the index instead of the value. Returns -1 if the item could not be found.

This interface matches firstidx and first_index from List::MoreUtils.

vecextract

say "Power set: ", join(" ",vecextract(\@v,$_)) for 0..2**scalar(@v)-1;
@word = vecextract(["a".."z"], [15, 17, 8, 12, 4]);

Extracts elements from an array reference based on a mask, with the result returned as an array. The mask is either an unsigned integer which is treated as a bit mask, or an array reference containing integer indices.

If the second argument is an integer, each bit set in the mask results in the corresponding element from the array reference being returned. Bits are read from the right, so a mask of 1 returns the first element, while 5 will return the first and third. The mask may be a bigint.

If the second argument is an array reference, then its elements will be used as zero-based indices into the first array. Duplicate values are allowed and the ordering is preserved. Given that Perl has fully functional array slices in the language, this is for completeness with Pari/GP. These are equivalent:

vecextract($aref, $iref);
@$aref[@$iref];
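Both mask forms can be sketched in a few lines of plain Perl. The helper vecextract_sketch is hypothetical and only handles native integer masks (so arrays of at most 64 elements for the bit mask form on a 64-bit Perl).

```perl
use strict;
use warnings;

# Hypothetical helper mirroring the two mask forms described above.
sub vecextract_sketch {
  my ($aref, $mask) = @_;
  # Array reference mask: plain slice by zero-based index.
  return @{$aref}[@$mask] if ref($mask) eq 'ARRAY';
  # Integer mask: bit i (counting from the right) selects element i.
  return map { $aref->[$_] } grep { ($mask >> $_) & 1 } 0 .. $#{$aref};
}

print join("", vecextract_sketch(["a".."e"], 5)), "\n";                    # prints "ac"
print join("", vecextract_sketch(["a".."z"], [15, 17, 8, 12, 4])), "\n";   # prints "prime"
```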

vecuniq

my @vec = vecuniq(1,2,3,2,-10,-100,1);  # returns (1,2,3,-10,-100)

Given an array of integers, returns an array with all duplicate entries removed. The original ordering is preserved. All values must be defined.

This is similar to List::Util::uniqint (the integer comparison version of List::Util::uniq). Unlike the more generic List::Util::uniq and List::MoreUtils::XS::uniq, all inputs must be integers. With native integers, our function is 2-10x faster.

vecfreq

# Produce frequency hash:
my %h = vecfreq(1,2,2,2,3,1,4);   #  (1=>2, 2=>3, 3=>1, 4=>1)
# Print most common value:
say vecreduce { $h{$a} > $h{$b} ? $a : $b } keys %h;

Given an array of items, returns a hash with each key containing the unique items, with the associated value being the occurrence count in the array.

This is identical to List::MoreUtils::frequency. It is typically faster when given only native integers.

This is very similar to the Pari/GP function matreduce for vectors, and to Python's Counter.

vecsingleton

my @solo = vecsingleton(1,4,17,1,17,-8);  # (4,-8)
# Same but slower:
my %h = vecfreq(@n);
my @onlyuniqs = grep { $h{$_} == 1 } @n;

Given an array of items, returns an array with all entries removed that appear more than once in the list. The original ordering is preserved.

This is identical to List::MoreUtils::singleton. When given only native integers, it is typically 2 to 10x faster.

vecsort

my @sorted = vecsort(1,2,3,2,-10,-100,1);   # returns (-100,-10,1,1,2,2,3)
@sorted = vecsort([1,2,3,2,-10,-100,1]);    # same

Numerically (ascending) sort a list of integers. The input is either a list or a single array reference which holds the list.

All values must be defined and integers. They may be any mix of native IV, native UV, strings, bigints.

Perl's built-in numerical sort can sometimes give incorrect results for typical cases we encounter. Prior to version 5.26 (2017), large 64-bit integers were turned into NV (floating point) types. With all current versions of Perl, strings are turned into NV types even if they are the text of a 64-bit integer.

In scalar context, vecsort returns the number of items without sorting (but after input validation). This should be expected and what we typically want. E.g. if we only want the number of divisors, we call in scalar context and get the number without requiring actual sorting. Having the same results from $x = vecsort(5,6,7) and @v = vecsort(5,6,7); $x=@v; is what we want. This contrasts with Perl's built-in sort which in scalar context has undefined behaviour (in all current versions of perl it returns undef). In particular this forces all programs to use a workaround if they want to return the results of sorting an array. See Perl 5 issue 12803 for some discussion with no resolution.

Using an array reference as input is slightly faster.

This is almost always faster than Perl's built-in numerical sort: @a = sort { $a <=> $b } @a. See the performance section for more information.

vecsorti

my @arr = map { irand } 1..100000;
vecsorti \@arr;

Given an array reference of integers, numerically (ascending) sorts the integers in-place. The array reference is also returned for convenience.

This is more efficient than "vecsort". Perl's sort has this optimization built-in when doing straightforward sorting on non-references.

vecequal

my $is_equal = vecequal( [1,2,-3,[4,5,undef]], [1,2,-3,[4,5,undef]] );

Compare two arrays for equality, including nested arrays. The values inside the two input array references must be either an array reference, a scalar, or undef. Simple integers are tested with integer comparison, while other scalars use string comparison.

This is a vector comparison, not set comparison, so ordering is important. For the sake of wider applicability, non-integers are allowed. Types other than integers and strings (e.g. floating point values) are not guaranteed to have consistent results.

No circular reference detection is performed.

Performance with XS is 3x to 100x faster than perl looping or modules like Array::Compare, Data::Cmp, match::smart, List::Compare, and Algorithm::Diff. Those modules have additional functionality so this is not a complete comparison.

vecmex

my $minimum_excluded = vecmex(0,1,2,4,6);  # returns 3

Given a list of non-negative integers, returns the smallest non-negative integer that is not in the list. mex is short for "minimum excluded". The list can be seen as a set, and the return value is the minimum of the set complement. Repeated values are allowed in the list.

vecmex() = 0. vecmex(0,1,2,...,w) = w+1.
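
As an illustration of the definition, a minimal pure-Perl sketch follows. The name mex_sketch is hypothetical and not part of this module; it only mirrors the documented behaviour for small non-negative integer inputs and makes no claim about how the module computes the result.

```perl
use strict;
use warnings;

# Hypothetical helper: smallest non-negative integer not in the list.
sub mex_sketch {
    my %seen = map { $_ => 1 } @_;   # treat the list as a set
    my $m = 0;
    $m++ while $seen{$m};            # stop at the first value not present
    return $m;
}

print mex_sketch(0,1,2,4,6), "\n";   # 3
print mex_sketch(), "\n";            # 0
```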

vecpmex

my $minimum_excluded = vecpmex(1,2,4,6);  # returns 3

Given a list of positive integers, returns the smallest positive integer that is not in the list. mex is short for "minimum excluded". The list can be seen as a set, and the return value is the minimum of the set complement. Repeated values are allowed in the list.

vecpmex() = 1. vecpmex(1,2,...,w) = w+1.

vecslide

@pairsum = vecslide {$a+$b} 1..5;  # returns (1+2,2+3,3+4,4+5)

say for vecslide { "$a->[0] $b->[1]" }
    (["hello","world"], ["goodbye","friends"], ["love","hate"]);
# hello friends
# goodbye hate

Given a code block and a list, calls the code block for each pair in the list, setting the local $a and $b to the values in each pair.

There is no restriction on what the list contains, as seen in the second example.

This is identical to List::MoreUtils::slide.

toset

my $set = toset(52,-6,14,-6,0);  # $set = [-6,0,14,52]
say "number of elements in set: ",scalar(@$set);
say "smallest value: ",$set->[0];
say "largest value: ",$set->[-1];

Given a list of integers, returns an array reference representing the integer set. The result is numerically sorted with duplicates removed. The input array must only contain integers (signed integers, bigints, objects that evaluate to integers, strings representing integers are all ok). This "set form" is optimal for the set operations.

After the set is in this form, the size of the set is simply the length. Similarly the set minimum and maximum are trivial. All values in the output will be typed as either native integers (IV or UV) or bigints.

setinsert

my $s=[-10..-1,1..10];
setinsert($s, 0);            # $s is now [-10..10]
setinsert($s, [5,10,15,20]); # $s is now [-10..10,15,20]

Given two array references of integers in set form, inserts all elements of the second set into the first set and returns the number of elements that were inserted.

Given an array reference of integers in set form, followed by zero or more integer scalars (possibly unordered and containing duplicates), inserts all list values into the first set and returns the number of elements that were inserted. This is essentially the same as wrapping the list in "toset" but convenient and possibly more efficient.

This may be viewed as an in-place "setunion".

The one or two sets (array references) must be in set form (numerically sorted with no duplicates) or the results are undefined.

setremove

my $s=[-10..10];
setremove($s, 0);            # $s is now [-10..-1,1..10]
setremove($s, [5,10,15,20]); # $s is now [-10..-1,1..4,6..9]

Given two array references of integers in set form, removes all elements of the second set from the first set and returns the number of elements that were removed.

Given an array reference of integers in set form, followed by zero or more integer scalars (possibly unordered and containing duplicates), removes all list values from the first set and returns the number of elements that were removed. This is essentially the same as wrapping the list in "toset" but convenient and possibly more efficient.

This may be viewed as an in-place "setminus".

The one or two sets (array references) must be in set form (numerically sorted with no duplicates) or the results are undefined.

setinvert

my $s=[-10..10];
setinvert($s, 0);            # $s is now [-10..-1,1..10]
setinvert($s, [5,10,15,20]); # $s is now [-10..-1,1..4,6..9,15,20]

Given two array references of integers in set form, inverts the containment status in the first set for each element of the second set. That is, for each element of the second set, inserts into the first set if not an element, and removes from the first set if it is an element.

Given an array reference of integers in set form, followed by zero or more integer scalars (possibly unordered and containing duplicates), does the same as if the list was wrapped in "toset".

An integer value is returned indicating how many values were inserted, minus the number of values deleted.

This may be viewed as an in-place "setdelta".

The one or two sets (array references) must be in set form (numerically sorted with no duplicates) or the results are undefined.
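
The toggle semantics and the return value can be illustrated with a small pure-Perl sketch. The name set_toggle_sketch is hypothetical, the hash-based approach is chosen only for clarity, and it assumes small native integers. It is not the module's implementation.

```perl
use strict;
use warnings;

# Hypothetical helper: toggle membership of each value, in place.
# Returns (number inserted) - (number removed).
sub set_toggle_sketch {
    my ($set, @vals) = @_;
    my %in = map { $_ => 1 } @$set;
    my (%done, $delta);
    $delta = 0;
    for my $v (@vals) {
        next if $done{$v}++;              # duplicate inputs count once
        if (delete $in{$v}) { $delta--; } # was present: remove it
        else { $in{$v} = 1; $delta++; }   # was absent: insert it
    }
    @$set = sort { $a <=> $b } keys %in;  # restore set form
    return $delta;
}

my $s  = [-10 .. 10];
my $d1 = set_toggle_sketch($s, 0);              # removes 0
my $d2 = set_toggle_sketch($s, 5, 10, 15, 20);  # removes 5,10; inserts 15,20
print "@$s\n";
print "$d1 $d2\n";                              # -1 0
```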

setcontains

my $has_element = setcontains( [-12,1..20], 15 );
my $is_subset   = setcontains( [-12,1..20], [-12,5,10,15] );

Given two sets (array references of numerically sorted de-duplicated integers), returns either 1 or 0 indicating whether the second argument is a subset of the first set (i.e. if all elements from the second argument are members of the first set).

Given a set and zero or more integers in any form (possibly unordered and can contain duplicates), does the same as if the list was wrapped in "toset".

If the first array reference is not in set form (numerically sorted with no duplicates, and no string forms), the result is undefined. It is unlikely to give a correct answer. Use "toset" to convert an arbitrary integer list into set form.

setcontainsany

# True if there is any intersection between the two sets
my $intersects = setcontainsany($set1,$set2);
my $has_one_of = setcontainsany( [-12,1..20], -14,0,1,100 );  # true

Given two sets (array references of numerically sorted de-duplicated integers), returns either 1 or 0 indicating whether any element of the second set is an element of the first set.

Alternately, a set followed by a list of unordered integers will do the same, as if the list was wrapped in "toset".

There is some functionality duplication, e.g. checking for disjoint sets can be done with any of these:

my $dj1 = set_is_disjoint($set1, $set2);
my $dj2 = scalar(@{setintersect($set1, $set2)}) == 0;
my $dj3 = !setcontainsany($set1, $set2);

This function requires the array reference inputs be in set form or the result is undefined. In return it can be thousands of times faster for large sets.

setbinop

my $sumset = setbinop { $a + $b } [1,2,3], [2,3,4];  # [3,4,5,6,7]
my $difset = setbinop { $a - $b } [1,2,3], [2,3,4];  # [-3,-2,-1,0,1]
my $setsum = setbinop { $a + $b } [1,2,3];           # [2,3,4,5,6]

Given a code block and two array references containing integers, treats them as integer sets and constructs a new set by applying the block to every pair in the cross product. If only one array reference is given, it will be used with itself.

The result will be in set form (numerically sorted, no duplicates). The input sets are not aliased inside the block (modifying $a and $b has no effect outside the block).

This corresponds to Pari's setbinop function. Our function uses much less memory, as of Pari 2.18.1.

sumset

Given two array references of integers, treats them as integer sets and returns the sumset as a set (a sorted de-duplicated array reference).

If only one array reference is given, it will be used for both. It is common to see sumset applied to a single set.

This is equivalent to:

my %r;  my @A=(2,4,6,8);  my @B=(3,5,7);
forsetproduct { $r{vecsum(@_)}=undef; } \@A,\@B;
my $sumset = [vecsort(keys %r)];

my $sumset1 = setbinop { addint($a,$b) } [1,2,3];
my $sumset2 = setbinop { addint($a,$b) } [1,2,3], [2,3,4];

In Mathematica one can use Total[Tuples[{A,B}],{2}]. In Pari/GP one can use setbinop((a,b)->a+b,X,Y).
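
For small native integers, the same result can also be sketched in plain Perl without any module functions. The name sumset_sketch is hypothetical, and the sketch does no overflow handling, so it is only an illustration of the definition.

```perl
use strict;
use warnings;

# Hypothetical helper: all pairwise sums, sorted and de-duplicated.
sub sumset_sketch {
    my ($A, $B) = @_;
    $B = $A unless defined $B;        # one argument: sum the set with itself
    my %sums;
    for my $x (@$A) {
        $sums{$x + $_} = 1 for @$B;
    }
    return [ sort { $a <=> $b } keys %sums ];
}

print "@{ sumset_sketch([1,2,3], [2,3,4]) }\n";   # 3 4 5 6 7
print "@{ sumset_sketch([1,2,3]) }\n";            # 2 3 4 5 6
```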

setunion

Given exactly two array references of integers, treats them as sets and returns the union as a set. The returned set will have all elements that appear in either input set.

This is more efficient if the input is in set form (numerically sorted, no duplicates). The result will be in set form.

This corresponds to Pari's setunion function, Mathematica's Union function, and Sage's union function on Set objects.

setintersect

my $commonset = setintersect($set1,$set2);
my $is_disjoint = 0 == @$commonset;  # scalar size of the intersection

Given exactly two array references of integers, treats them as sets and returns the intersection as a set. The returned set will have all elements that appear in both input sets.

This is more efficient if the input is in set form (numerically sorted, no duplicates). The result will be in set form.

This corresponds to Pari's setintersect function, Mathematica's Intersection function, and Sage's intersection function on Set objects.

setminus

Given exactly two array references of integers, treats them as sets and returns the difference as a set. The returned set will have all elements that appear in the first set but not in the second.

This is more efficient if the input is in set form (numerically sorted, no duplicates). The result will be in set form.

This corresponds to Pari's setminus function, Mathematica's Complement function, and Sage's difference function on Set objects.

setdelta

Given exactly two array references of integers, treats them as sets and returns the symmetric difference as a set. The returned set will have all elements that appear in only one of the two input sets.

This is more efficient if the input is in set form (numerically sorted, no duplicates). The result will be in set form.

This corresponds to Pari's setdelta function, Mathematica's SymmetricDifference function, and Sage's symmetric_difference function on Set objects.

is_sidon_set

Given an array reference of integers, treats it as a set and returns 1 if it is a Sidon set (sometimes called Sidon sequence), and 0 otherwise. To be a Sidon set, all elements must be non-negative and all pair-wise sums a_i + a_j (i >= j) are unique.

All finite Sidon sets are Golomb rulers, and all Golomb rulers are Sidon.

is_sumfree_set

Given an array reference of integers, treats it as a set and returns 1 if it is a sum-free set, and 0 otherwise. A sum-free set is one where no sum of two elements from the set is equal to any element of the set. That is, the set and its sumset are disjoint.

set_is_disjoint

Given two array references of integers, treats them as sets and returns 1 if the sets have no elements in common, 0 otherwise.

This corresponds to Mathematica's DisjointQ function.

set_is_equal

Given two array references of integers in set form, returns 1 if the sets have all elements in common, 0 otherwise.

This function works even if the inputs are not sorted. If they are sorted (proper set form) then "vecequal" can be used and is typically much faster.

set_is_subset

Given two array references of integers in set form, returns 1 if the first set also contains all elements of the second set, 0 otherwise.

The "setcontains" function can be used equivalently, and does not require the second list to be in set form.

This corresponds to Mathematica's SubsetQ function (is B a subset of A).

set_is_proper_subset

Given two array references of integers in set form, returns 1 if the first set contains all elements of the second set and the two sets are not equal, 0 otherwise. The size of the first set must be strictly larger than the second.

set_is_superset

Given two array references of integers in set form, returns 1 if the second set also contains all elements of the first set, 0 otherwise.

The "setcontains" function can be used equivalently (with reversed arguments).

set_is_proper_superset

Given two array references of integers in set form, returns 1 if the second set contains all elements of the first set and the two sets are not equal, 0 otherwise. The size of the second set must be strictly larger than the first.

set_is_proper_intersection

Given two array references of integers in set form, returns 1 if the two sets have at least one element in common, and each of the two sets have at least one element not present in the other set. Returns 0 otherwise.

todigits

say "product of digits of n: ", vecprod(todigits($n));

Given an integer n, return an array of digits of |n|. An optional second integer argument specifies a base (default 10). For example, given a base of 2, this returns an array of binary digits of n. An optional third argument specifies a length for the returned array. The result will either have upper digits truncated or have leading zeros added. This is most often used with base 2, 8, or 16.

The values returned may be read-only. todigits(0) returns an empty array. The base must be at least 2, and is limited to an int. Length must be at least zero and is limited to an int.

This corresponds to Pari's digits and binary functions, and Mathematica's IntegerDigits function.

todigitstring

# arguments are:  input integer, base (optional), truncate (optional)
say "decimal 456 in hex is ", todigitstring(456, 16);
say "last 4 bits of $n are: ", todigitstring($n, 2, 4);

Similar to "todigits" but returns a string. For bases <= 10, this is equivalent to joining the array returned by "todigits".

The first argument n is the input integer. The sign is ignored. If no other arguments are given, this just returns the string of n. An optional second argument gives the base, which must be between 2 and 36. No prefix such as "0x" will be added, and all bases over 10 use lower case a to z for the digits above 9.

An optional third argument k requires the result to be exactly k digits. This truncates to the last k digits if the result has more than k digits, or zero extends if the result has fewer digits.

This corresponds to Mathematica's IntegerString function.

fromdigits

say "hex 1c8 in decimal is ", fromdigits("1c8", 16);
say "Base 3 array to number is: ", fromdigits([0,1,2,2,2,1,0],3);

This takes either a string or array reference, and an optional base (default 10). With a string, each character will be interpreted as a digit in the given base, with both upper and lower case letters denoting values 10 through 35. With an array reference, the values indicate the entries in that location, and values larger than the base are allowed (results are carried). The result is a number (either a native integer or a bigint).

This corresponds to Pari's fromdigits function and Mathematica's FromDigits function.
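
The digit conversions and the carrying behaviour for array input can be illustrated with a short pure-Perl sketch. The names todigits_sketch and fromdigits_sketch are hypothetical. The sketch assumes small native integers (well below 2^53) and is not how the module is implemented.

```perl
use strict;
use warnings;

# Hypothetical helper: digits of |n| in the given base, most significant first.
sub todigits_sketch {
    my ($n, $base) = @_;
    $base = 10 unless defined $base;
    $n = abs($n);
    my @d;
    while ($n > 0) {
        my $r = $n % $base;
        unshift @d, $r;
        $n = ($n - $r) / $base;       # exact division
    }
    return @d;                        # empty list for n = 0
}

# Hypothetical helper: Horner's rule. Entries larger than the base simply
# carry into the higher positions.
sub fromdigits_sketch {
    my ($digits, $base) = @_;
    $base = 10 unless defined $base;
    my $n = 0;
    $n = $n * $base + $_ for @$digits;
    return $n;
}

print join(",", todigits_sketch(456, 16)), "\n";        # 1,12,8
print fromdigits_sketch([0,1,2,2,2,1,0], 3), "\n";      # 480
print fromdigits_sketch([1,12], 10), "\n";              # 22
```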

tozeckendorf

say tozeckendorf(24);                     #  "1000100"
say fromdigits(tozeckendorf(24),2);       #  68

Given a non-negative integer n, return the Zeckendorf representation as a binary string. This represents n as a sum of nonconsecutive Fibonacci numbers. Each set bit indicates summing the corresponding Fibonacci number, e.g. 24 = 21+3 = F(8)+F(4). F(0)=0 and F(1)=1 are not used. This is sometimes also called Fibbinary or the Fibonacci base.

The restriction that consecutive values are not used ("11" cannot appear) is required to create a unique mapping to the positive integers. A simple greedy algorithm suffices to construct the encoding.

say reverse(tozeckendorf($_)).'1'  for  1..20

shows the first twenty Fibonacci C1 codes (Fraenkel and Klein, 1996). This is an example of a self-synchronizing variable length code.

This corresponds to Mathematica's ZeckendorfRepresentation[n] function. Also see Math::NumSeq::Fibbinary and Data::BitStream::Code::Fibonacci.

fromzeckendorf

say fromzeckendorf("1000100");            #  24
say fromzeckendorf(todigitstring(68,2));  #  24

Given a binary string in Zeckendorf representation, return the corresponding integer. The string may not contain anything other than the characters 0 and 1, and must not contain 11. The resulting number is the sum of the Fibonacci numbers at the positions of the set bits, counting from the right (the Fibonacci index is offset by two, as F(0)=0 and F(1)=1 are not used).
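
The greedy encoding and the matching decoder can be sketched in pure Perl as follows. The names tozeck_sketch and fromzeck_sketch are hypothetical, the encoder assumes a positive native integer, and no input validation is done. This is an illustration, not the module's code.

```perl
use strict;
use warnings;

# Hypothetical helper: greedy Zeckendorf encoding for n >= 1.
sub tozeck_sketch {
    my ($n) = @_;
    my @fib = (1, 2);                               # F(2), F(3)
    push @fib, $fib[-1] + $fib[-2] while $fib[-1] + $fib[-2] <= $n;
    pop @fib if $fib[-1] > $n;                      # only happens for n = 1
    my $s = "";
    for my $f (reverse @fib) {                      # largest term first
        if ($f <= $n) { $s .= "1"; $n -= $f; }
        else          { $s .= "0"; }
    }
    return $s;
}

# Hypothetical helper: sum F(i+2) for each set bit at position i from the right.
sub fromzeck_sketch {
    my ($s) = @_;
    my ($f1, $f2, $n) = (1, 2, 0);
    for my $bit (reverse split //, $s) {
        $n += $f1 if $bit;
        ($f1, $f2) = ($f2, $f1 + $f2);
    }
    return $n;
}

print tozeck_sketch(24), "\n";            # 1000100
print fromzeck_sketch("1000100"), "\n";   # 24
```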

sumdigits

# Sum digits of primes to 1 million.
my $s=0; forprimes { $s += sumdigits($_); } 1e6; say $s;

Given an input n, return the sum of the digits of n. Any non-digit characters of n are ignored (including negative signs and decimal points). This is similar to the command vecsum(split(//,$n)) but faster, allows non-positive-integer inputs, and can sum in other bases.

An optional second argument indicates the base of the input number. This defaults to 10, and must be between 2 and 36. Any character that is outside the range 0 to base-1 will be ignored.

If no base is given and the input number n begins with 0x or 0b then it will be interpreted as a string in base 16 or 2 respectively.

Regardless of the base, the output sum is a decimal number.

This is similar but not identical to Pari's sumdigits function from version 2.8 and later. The Pari/GP function always takes the input as a decimal number, uses the optional base as a base to first convert to, then sums the digits. This can be done with either vecsum(todigits($n, $base)) or sumdigits(todigitstring($n,$base)). Math::BigInt version 1.999818 has a similar digitsum function.

valuation

say "$n is divisible by 2 ", valuation($n,2), " times.";

Given integer n and integer k greater than 1, returns the number of times n is divisible by k, i.e. the largest e such that k^e divides n. This is a very limited version of the algebraic valuation -- here it is just applied to integers.

|n| is used, |n| = 0 returns undef, and |n| = 1 returns zero.

This corresponds to Pari and SAGE's valuation function.
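
A minimal pure-Perl sketch of this definition, for small native integers, is shown here. The name valuation_sketch is hypothetical and the loop is only meant to make the edge cases concrete.

```perl
use strict;
use warnings;

# Hypothetical helper: number of times k divides |n|, for k > 1.
sub valuation_sketch {
    my ($n, $k) = @_;
    $n = abs($n);
    return undef if $n == 0;          # every power of k divides 0
    my $v = 0;
    while ($n % $k == 0) {
        $n /= $k;                     # exact division
        $v++;
    }
    return $v;
}

print valuation_sketch(96, 2), "\n";    # 5, since 96 = 2^5 * 3
print valuation_sketch(-54, 3), "\n";   # 3, since 54 = 2 * 3^3
```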

hammingweight

Given an integer n, returns the binary Hamming weight of abs(n). This is also called the population count, and is the number of 1s in the binary representation. This corresponds to Pari's hammingweight function for t_INT arguments.

is_square_free

say "$n has no repeating factors" if is_square_free($n);

Given integer n, returns 1 if |n| has no repeated factor.

is_cyclic

Given integer n, returns 1 if n is positive and cyclic in the number theory sense, and returns 0 otherwise. A cyclic number n is one for which there is only one group of order n. Equivalently, n and φ(n) are relatively prime.

This is the OEIS series A003277.

is_carmichael

for (1..1e6) { say if is_carmichael($_) } # Carmichaels under 1,000,000

Given an integer n, returns 1 if n is positive and a Carmichael number, and returns 0 otherwise. These are composites that satisfy b^(n-1) ≡ 1 mod n for all 1 < b < n relatively prime to n. Alternately Korselt's theorem says these are composites such that n is square-free and p-1 divides n-1 for all prime divisors p of n.

For inputs larger than 50 digits after removing very small factors, this uses a probabilistic test since factoring the number could take unreasonably long. The first 150 primes are used for testing. Any that divide n are checked for square-free-ness and the Korselt condition, while those that do not divide n are used as the pseudoprime base. The chances of a non-Carmichael passing this test are less than 2^-150.

This is the OEIS series A002997.
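
For small inputs, Korselt's criterion can be checked directly with trial division. The sketch below is a pure-Perl illustration under that assumption; is_carmichael_sketch is a hypothetical name and this is not the method the module uses for large inputs.

```perl
use strict;
use warnings;

# Hypothetical helper: Korselt's criterion via trial division.
sub is_carmichael_sketch {
    my ($n) = @_;
    return 0 if $n < 561 || $n % 2 == 0;    # smallest is 561; all are odd
    my ($m, $nfactors) = ($n, 0);
    for (my $p = 3; $p * $p <= $m; $p += 2) {
        next if $m % $p;
        $m /= $p;
        return 0 if $m % $p == 0;           # p^2 divides n: not square-free
        return 0 if ($n - 1) % ($p - 1);    # p-1 must divide n-1
        $nfactors++;
    }
    if ($m > 1) {                           # one prime factor may remain
        return 0 if ($n - 1) % ($m - 1);
        $nfactors++;
    }
    return $nfactors >= 3 ? 1 : 0;          # at least three prime factors
}

print join(" ", grep { is_carmichael_sketch($_) } 1 .. 3000), "\n";
# 561 1105 1729 2465 2821
```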

is_quasi_carmichael

Given an integer n, returns 0 if n is negative or not a quasi-Carmichael number, and returns the number of bases otherwise. These are square-free composites that satisfy p+b divides n+b for all prime factors p of n and for one or more non-zero integer b.

This is the OEIS series A257750.

is_semiprime

Given an integer n, returns 1 if n is positive and a semiprime, and returns 0 otherwise. A semiprime is the product of exactly two primes.

The boolean result is the same as scalar(factor(n)) == 2, but this function performs shortcuts that can greatly speed up the operation.

is_almost_prime

say is_almost_prime(6,2169229601);  # True if n has exactly 6 factors

Given non-negative integers k and n, returns 1 if n has exactly k prime factors, and 0 otherwise. With k=1, this is a standard primality test. With k=2, this is the same as "is_semiprime".

Functionally identical but possibly faster than prime_bigomega(n) == k.

is_omega_prime

say is_omega_prime(6,2169229601);  # True if n has 6 distinct factors

Given non-negative integers k and n, returns 1 if n has exactly k distinct prime factors (not counting multiplicity), and 0 otherwise. With k=1, this is the same as "is_prime_power".

Functionally identical but possibly faster than prime_omega(n) == k.

is_chen_prime

Given a non-negative integer n, returns 1 if n is a Chen prime, and 0 otherwise. That is, n is prime and n+2 is either a prime or a semiprime.

is_fundamental

Given an integer d, returns 1 if d is a fundamental discriminant, 0 otherwise. We consider 1 to be a fundamental discriminant.

This is the OEIS series A003658 (positive) and OEIS series A003657 (negative).

This corresponds to Pari's isfundamental function.

is_totient

Given an integer n, returns 1 if there exists an integer x where euler_phi(x) == n.

This corresponds to Pari's istotient function, though without the optional second argument to return an x. Math::NumSeq::Totient also has a similar function.

Also see "inverse_totient" which gives the count or list of values that produce a given totient. This function is more efficient than getting the full count or list.

is_pillai

Given a non-negative integer n, if there exists a v where v! % n == n-1 and n % v != 1, then the least such v is returned. Otherwise 0 is returned.

For n prime, non-zero return values give OEIS series A063980. The non-zero values returned produce OEIS series A063828.

is_polygonal

Given an integer x and a positive integer s greater than 2, return 1 if x is an s-gonal number, and return 0 otherwise.

If a third argument is present, it must be a scalar reference. It will be set to n if x is the nth s-gonal number. If the function returns 0, then it will be unchanged.

This corresponds to Pari's ispolygonal function.

is_congruent_number

Given a non-negative integer n, returns 1 if n is the area of a rational right triangle, and 0 otherwise.

This function answers the congruent number problem using Tunnell's theorem, which relies on the Birch and Swinnerton-Dyer conjecture. It uses an extensive filter for known non-congruent families, including the works of Bastien (1915), Lagrange (1974), Monsky (1990), Serf (1991), Iskra (1996), Feng (1996), Reinholz et al. (2013), Cheng and Guo (2018 and 2019), Das and Saikia (2020), and Evink (2021).

cornacchia

Given non-negative integers d and n, finds solutions (x,y) to the equation x^2 + d y^2 = n. undef is returned if no solution exists.

In the case of n a prime, this is done using Cornacchia's algorithm.

For non-prime n, we use a combination of Cornacchia-Smith on all roots, as well as a loop to find solutions in the harder cases. This means we will always return a solution if one exists.

There will often be multiple solutions, but only one is returned.

contfrac

my @CF = contfrac(415,93);
# CF = (4,2,6,7)  =>  4+(1/(2+1/(6+1/7))) = 415/93
#                     ^     ^    ^   ^

Given an integer n and a positive integer d, returns a list with the simple continued fraction representation of the rational n / d.

This corresponds to a subset of Pari's contfrac function, Mathematica's ContinuedFraction[n/d] function, and Sage's continued_fraction function.

from_contfrac

my($N,$D) = from_contfrac(4,2,6,7);  # N = 415, D = 93

Given an array of integers representing the simple continued fraction, returns the rational n / d as two integers.

The first input value represents the whole part, and may be zero or negative. All subsequent input values must be positive.

This corresponds to a subset of Pari's contfracpnqn function, Mathematica's FromContinuedFraction[list] function, and one value of Sage's convergent(n) method.
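
Both directions can be sketched in a few lines of pure Perl: the Euclidean algorithm produces the terms, and folding them from the right rebuilds the fraction. The names contfrac_sketch and from_contfrac_sketch are hypothetical, and the sketch assumes small native integers with d > 0.

```perl
use strict;
use warnings;

# Hypothetical helper: continued fraction terms of n/d via Euclid, d > 0.
sub contfrac_sketch {
    my ($n, $d) = @_;
    my @cf;
    while ($d != 0) {
        my $r = $n % $d;              # in 0 .. d-1, also for negative n
        push @cf, ($n - $r) / $d;     # floor(n/d)
        ($n, $d) = ($d, $r);
    }
    return @cf;
}

# Hypothetical helper: evaluate a0 + 1/(a1 + 1/(a2 + ...)) from the right.
sub from_contfrac_sketch {
    my @cf = @_;
    my ($n, $d) = (1, 0);
    for my $term (reverse @cf) {
        ($n, $d) = ($term * $n + $d, $n);
    }
    return ($n, $d);
}

print join(",", contfrac_sketch(415, 93)), "\n";          # 4,2,6,7
print join("/", from_contfrac_sketch(4, 2, 6, 7)), "\n";  # 415/93
```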

next_calkin_wilf

($n,$d) = next_calkin_wilf($n,$d);

Given two positive coprime integers n and d representing the rational n / d, returns the next value in the breadth-first traversal of the Calkin-Wilf tree of rationals as a two-element list.

The Calkin-Wilf tree has an entry for all positive rationals in lowest form, with each one appearing only once. While it is not a binary search tree over the positive rationals like the Stern-Brocot tree, it is more efficient to traverse in both depth and breadth order.

This corresponds to Julia's Nemo next_calkin_wilf function.

This can efficiently iterate through OEIS series A002487.
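
One well-known way to step through this traversal is Newman's formula, x -> 1/(2*floor(x) - x + 1). Written for x = n/d, it needs only one modulo operation. The sketch below assumes that formula and small native integers; next_cw_sketch is a hypothetical name and this may differ from what the module does internally.

```perl
use strict;
use warnings;

# Hypothetical helper using Newman's formula for x = n/d:
#   next = d / ((2*floor(n/d) + 1)*d - n) = d / (n + d - 2*(n mod d))
sub next_cw_sketch {
    my ($n, $d) = @_;
    return ($d, $n + $d - 2 * ($n % $d));
}

my ($n, $d) = (1, 1);
my @terms = ("$n/$d");
for (1 .. 7) {
    ($n, $d) = next_cw_sketch($n, $d);
    push @terms, "$n/$d";
}
print "@terms\n";   # 1/1 1/2 2/1 1/3 3/2 2/3 3/1 1/4
```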

next_stern_brocot

($n,$d) = next_stern_brocot($n,$d);

Given two positive coprime integers n and d representing the rational n / d, returns the next value in the breadth-first traversal of the Stern-Brocot tree of rationals as a two-element list.

The Stern-Brocot tree has an entry for all positive rationals in lowest form, with each one appearing only once. Read left-to-right on each row, the numbers appear in ascending order. It is a binary search tree over the positive rationals (this was exactly Brocot's motivation). It is not as efficient as "next_calkin_wilf".

This produces OEIS series A007305 (numerators) and OEIS series A047679 (denominators).

calkin_wilf_n

my $idx = calkin_wilf_n($n,$d);

Given two positive coprime integers n and d representing the rational n / d, returns the index in the breadth-first traversal of the Calkin-Wilf tree of rationals.

This corresponds to the xy_to_n method in Math::PlanePath::RationalsTree with tree_type => 'CW'.

stern_brocot_n

my $idx = stern_brocot_n($n,$d);

Given two positive coprime integers n and d representing the rational n / d, returns the index in the breadth-first traversal of the Stern-Brocot tree of rationals.

This corresponds to the xy_to_n method in Math::PlanePath::RationalsTree with tree_type => 'SB'.

nth_calkin_wilf

($n,$d) = nth_calkin_wilf($idx);

Given a positive integer i, returns the rational in the corresponding index in the breadth-first traversal of the Calkin-Wilf tree of rationals.

This corresponds to the n_to_xy method in Math::PlanePath::RationalsTree with tree_type => 'CW'.

nth_stern_brocot

($n,$d) = nth_stern_brocot($idx);

Given a positive integer i, returns the rational in the corresponding index in the breadth-first traversal of the Stern-Brocot tree of rationals.

This corresponds to the n_to_xy method in Math::PlanePath::RationalsTree with tree_type => 'SB'.

nth_stern_diatomic

$n = nth_stern_diatomic($idx);

Given a non-negative integer i, returns the i-th Stern diatomic number. This is sometimes called fusc(i) (Dijkstra), Stern's diatomic series, or the Stern-Brocot sequence. The latter can be easily confused with the Stern-Brocot tree.

This corresponds to Sidef's fusc function. See also "next_calkin_wilf" where the sequence of numerators generates this sequence.

This produces OEIS series A002487.
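
Dijkstra's iterative formulation of fusc gives a compact way to compute single terms. The sketch below follows that formulation for native integers; fusc_sketch is a hypothetical name and the module's own implementation may differ.

```perl
use strict;
use warnings;

# Hypothetical helper: Dijkstra's iterative fusc, scanning the bits of i
# from least significant to most significant.
sub fusc_sketch {
    my ($i) = @_;
    my ($x, $y) = (1, 0);
    while ($i > 0) {
        if ($i % 2) { $y += $x; }     # odd:  y = x + y
        else        { $x += $y; }     # even: x = x + y
        $i >>= 1;
    }
    return $y;
}

print join(",", map { fusc_sketch($_) } 0 .. 9), "\n";   # 0,1,1,2,1,3,2,3,1,4
```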

farey

#    F[3] = 0/1  1/3  1/2  2/3  1/1
#
say scalar farey(3);   #  5
my @F3 = farey(3);     #  ([0,1], [1,3], [1,2], [2,3], [1,1])
my $F33 = farey(3,3);  #  [2,3] = $F3[3]
# Print the list in readable form
say join " ",map { join "/",@$_ } farey(3);

Given a single positive integer n returns the Farey sequence of order n. In scalar context, returns the length without computing terms. In array context, returns a list with each rational as a 2-entry array reference.

Given two values: a positive integer n and a non-negative integer k, returns the k-th entry of the order n Farey sequence. The index starts at zero so it matches using the full list as an array. If k is larger than the number of entries, undef is returned.

This corresponds to Mathematica's FareySequence function (their two argument version is 1-based rather than 0-based).

The lengths are OEIS series A005728. The numerators are OEIS series A006842. The denominators are OEIS series A006843.
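
The whole sequence can be generated with the standard next-term recurrence, which derives each entry from the two before it. The pure-Perl sketch below illustrates this for small n; farey_sketch is a hypothetical name and nothing here is taken from the module's implementation.

```perl
use strict;
use warnings;

# Hypothetical helper: Farey sequence of order n as [numerator, denominator]
# pairs, using the classic next-term recurrence.
sub farey_sketch {
    my ($n) = @_;
    my ($p0, $q0, $p1, $q1) = (0, 1, 1, $n);
    my @F = ([$p0, $q0]);
    while ($p1 <= $n) {
        my $k = int(($n + $q0) / $q1);
        ($p0, $q0, $p1, $q1) = ($p1, $q1, $k * $p1 - $p0, $k * $q1 - $q0);
        push @F, [$p0, $q0];
    }
    return @F;
}

print join(" ", map { join "/", @$_ } farey_sketch(3)), "\n";
# 0/1 1/3 1/2 2/3 1/1
```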

next_farey

my $next = next_farey(9,[5,9]);  # returns [4,7]

Given a positive integer n and a 2-element array reference containing a non-negative integer p and a positive integer q, returns the next rational appearing after p/q in the order n Farey sequence. Returns undef if p/q is greater than or equal to one.

farey_rank

my $rank = farey_rank(9,[5,9]);  # returns 15

Given a positive integer n and a 2-element array reference containing a non-negative integer p and a positive integer q, returns the number of rationals less than p/q in the order n Farey sequence. The given fraction does not need to be an entry in the sequence, nor does it need to be in reduced form.

A fraction equal to one (e.g. [1,1]) will return the totient sum of n, which is one less than the length of the sequence. Any fraction greater than one will return the length of the order n sequence, as expected.

Many OEIS sequences can be produced from this, including OEIS series A005728 (<= 1), OEIS series A049806 (<= 1/2), OEIS series A049807 (<= 1/3), OEIS series A049808 (<= 1/4), ..., OEIS series A049805 (<= 1/k).

prime_bigomega

say "$n has ", prime_bigomega($n), " total factors";

Given a non-negative integer n, returns Ω(|n|), the prime Omega function. This is the total number of prime factors of n including multiplicities. The result is identical to scalar(factor($n)). The return value is a read-only constant.

This corresponds to Pari's bigomega function and Mathematica's PrimeOmega[n] function.

prime_omega

say "$n has ", prime_omega($n), " distinct factors";

Given a non-negative integer n, returns ω(|n|), the prime omega function. This is the number of distinct prime factors of n. The result is identical to scalar(factor_exp($n)). The return value is a read-only constant.

This corresponds to Pari's omega function and Mathematica's PrimeNu[n] function.

moebius

say "$n is square free" if moebius($n) != 0;
$sum += moebius($_) for (1..200); say "Mertens(200) = $sum";
say "Mertens(2000) = ", vecsum(moebius(0,2000));

Given a single integer n, returns μ(|n|), the Möbius function (also known as the Moebius, Mobius, or MoebiusMu function). This function is 1 if n = 1, 0 if n is not square-free (i.e. n has a repeated factor), and (-1)^t if n is a product of t distinct primes. This is an important function in prime number theory. Like SAGE, we define moebius(0) = 0 for convenience.

If given two integers low and high, they define a range, and the function returns an array with the value of the Möbius function for every |n| from low to high inclusive. Large values of high will result in a lot of memory use. The algorithm used for ranges is Deléglise and Rivat (1996) algorithm 4.1, which is a segmented version of Lioen and van de Lune (1994) algorithm 3.2.

Negative ranges are possible, e.g. moebius(-30,-20) will return moebius(|n|) for -30, -29, -28, ..., -20.

The return values are read-only constants. This should almost never come up, but it means trying to modify aliased return values will cause an exception (modifying the returned scalar or array is fine).

mertens

say "Mertens(10M) = ", mertens(10_000_000);   # = 1037

Given a non-negative integer n, return M(n), the Mertens function. This function is defined as sum(moebius(1..n)), but calculated more efficiently for large inputs. For example, computing Mertens(100M) takes:

time    approx mem
  0.01s     0.1MB   mertens(100_000_000)
  1.3s    880MB     vecsum(moebius(1,100_000_000))
 16s        0MB     $sum += moebius($_) for 1..100_000_000

The summation of individual terms via factoring is quite expensive in time, though uses O(1) space. Using the range version of moebius is much faster, but returns a 100M element array which, even though they are shared constants, is not good for memory at this size. In comparison, this function will generate the equivalent output via a sieving method that is relatively memory frugal and very fast. The current method is a simple n^1/2 version of Deléglise and Rivat (1996), which involves calculating all moebius values to n^1/2, which in turn will require prime sieving to n^1/4.

Various algorithms exist for this, using differing quantities of μ(n). The simplest way is to efficiently sum all n values. Benito and Varona (2008) show a clever and simple method that only requires n/3 values. Deléglise and Rivat (1996) describe a segmented method using only n^1/3 values. The current implementation does a simple non-segmented n^1/2 version of their method. Kuznetsov (2011) gives an alternate method that he indicates is even faster. Helfgott and Thompson (2020) give a fast method based on advanced prime count algorithms.

euler_phi

say "The Euler totient of $n is ", euler_phi($n);

Given a single integer n, returns φ(n), the Euler totient function (also called Euler's phi or phi function). This is an arithmetic function which counts the number of positive integers less than or equal to n that are relatively prime to n.

Given the definition used, euler_phi will return 0 for all n < 1. This follows the logic used by SAGE. Mathematica and Pari return euler_phi(-n) for n < 0. Mathematica returns 0 for n = 0, Pari pre-2.6.2 raises an exception, and Pari 2.6.2 and newer returns 2.

If called with two integer arguments low and high, they define an inclusive range. The function returns a list with the totient of every n from low to high inclusive.

inverse_totient

In array context, given a non-negative integer n, returns the complete list of values x where euler_phi(x) = n. This can be a memory intensive operation if there are many values.

In scalar context, returns just the count of values. This is faster and uses substantially less memory. The list/scalar distinction is similar to "factor" and "divisors".

This roughly corresponds to the Maple function InverseTotient, and the hidden Mathematica function EulerPhiInverse. The algorithm used is from Max Alekseyev (2016).

jordan_totient

say "Jordan's totient J_$k($n) is ", jordan_totient($k, $n);

Given non-negative integers k and n, returns Jordan's k-th totient function for n. Jordan's totient is a generalization of Euler's totient, where jordan_totient(1,$n) == euler_phi($n). This counts the number of k-tuples less than or equal to n that form a coprime tuple with n. As with euler_phi, 0 is returned for all n < 1. This function can be used to generate some other useful functions, such as the Dedekind psi function, where psi(n) = J(2,n) / J(1,n).
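The Dedekind psi relation can be sketched in pure Perl, using the product formula J_k(p^a) = p^(k(a-1)) * (p^k - 1) over the prime powers of n. The names jordan_totient_naive and dedekind_psi_naive are hypothetical, and this is a trial-division illustration for small inputs rather than the module's method.

```perl
use strict;
use warnings;

# Hypothetical helper: Jordan's totient J_k(n) via trial-division factoring.
sub jordan_totient_naive {
    my ($k, $n) = @_;
    return 0 if $n < 1;
    my $j = 1;
    for (my $p = 2; $p * $p <= $n; $p++) {
        next if $n % $p;
        my $pk = $p ** $k;
        $j *= $pk - 1;                              # first power of p
        $n /= $p;
        while ($n % $p == 0) { $j *= $pk; $n /= $p; }   # each further power
    }
    $j *= $n ** $k - 1 if $n > 1;                   # leftover prime factor
    return $j;
}

# Hypothetical helper: Dedekind psi as J(2,n) / J(1,n).
sub dedekind_psi_naive {
    my $n = shift;
    return 0 if $n < 1;
    return jordan_totient_naive(2, $n) / jordan_totient_naive(1, $n);
}

print "J_1(36) = ", jordan_totient_naive(1, 36),
      ", psi(12) = ", dedekind_psi_naive(12), "\n";
```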

sumtotient

Given a non-negative integer n, returns the summatory Euler totient function. This function is defined as sum(euler_phi(1..n)), but calculated much more efficiently.

A sub-linear time recursion is implemented, using O(n^2/3) memory. Memory use is restricted so growth becomes approximately linear above 10^13.

This is OEIS series A002088.

ramanujan_sum

Given two non-negative integers k and n, returns Ramanujan's sum. This is the sum of the nth powers of the primitive k-th roots of unity.

Note this is not related to Ramanujan summation for divergent series.

exp_mangoldt

say "exp(lambda($_)) = ", exp_mangoldt($_) for 1 .. 100;

Given a non-negative integer n, returns EXP(Λ(n)), the exponential of the Mangoldt function (also known as von Mangoldt's function). The Mangoldt function is equal to log p if n is prime or a power of a prime, and 0 otherwise. We return the exponential so all results are integers. Hence the return value for exp_mangoldt is:

p   if n = p^m for some prime p and integer m >= 1
1   otherwise.
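As a small illustration of the two cases, here is a pure-Perl sketch that finds the smallest prime factor by trial division and checks whether anything else remains. The name exp_mangoldt_naive is hypothetical and the approach is only sensible for small n.

```perl
use strict;
use warnings;

# Hypothetical helper: p if n is a power of the prime p, otherwise 1.
sub exp_mangoldt_naive {
    my $n = shift;
    return 1 if $n < 2;
    my $p = 2;
    $p++ while $p * $p <= $n && $n % $p;   # stop at the smallest prime factor
    $p = $n if $p * $p > $n;               # no factor found: n itself is prime
    $n /= $p while $n % $p == 0;           # strip every copy of p
    return $n == 1 ? $p : 1;
}

print join(" ", map { exp_mangoldt_naive($_) } 1 .. 10), "\n";
# prints: 1 2 3 2 5 1 7 2 3 1
```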

liouville

Given a non-negative integer n, returns λ(n), the Liouville function. This is -1 raised to Ω(n) (the total number of prime factors).

This corresponds to Mathematica's LiouvilleLambda[n] function. It can be computed in Pari/GP as (-1)^bigomega(n).

sumliouville

Given a non-negative integer n, returns L(n), the summatory Liouville function. This function is defined as sum(liouville(1..n)), but calculated much more efficiently.

There are a number of relations to the "mertens" function.

This is OEIS series A002819.
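One standard relation is L(n) = sum of M(floor(n/d^2)) over all d with d^2 <= n, which holds because the Liouville function is the Dirichlet convolution of the Möbius function with the indicator of the squares. The sketch below checks this numerically with hypothetical pure-Perl helpers; it makes no claim about how this module computes the result.

```perl
use strict;
use warnings;

# Hypothetical helper: (mu(n), lambda(n)) for n >= 1 by trial division.
sub mu_and_lambda {
    my $n = shift;
    my ($mu, $lambda) = (1, 1);
    for (my $p = 2; $p * $p <= $n; $p++) {
        my $e = 0;
        while ($n % $p == 0) { $n /= $p; $e++; }
        next unless $e;
        $mu     = $e > 1 ? 0 : -$mu;      # a square factor forces mu = 0
        $lambda = -$lambda if $e % 2;     # lambda only sees the parity of e
    }
    if ($n > 1) { $mu = -$mu; $lambda = -$lambda; }
    return ($mu, $lambda);
}

# Prefix sums: $M[$n] is Mertens(n), $L[$n] is the summatory Liouville value.
my $limit = 1000;
my @M = (0);
my @L = (0);
for my $n (1 .. $limit) {
    my ($mu, $lambda) = mu_and_lambda($n);
    $M[$n] = $M[$n - 1] + $mu;
    $L[$n] = $L[$n - 1] + $lambda;
}

# Hypothetical helper: L(n) rebuilt from Mertens values only (n <= $limit).
sub sumliouville_from_mertens {
    my $n = shift;
    my $sum = 0;
    for (my $d = 1; $d * $d <= $n; $d++) {
        $sum += $M[ int($n / ($d * $d)) ];
    }
    return $sum;
}

for my $n (10, 100, 1000) {
    print "L($n) = $L[$n], from Mertens = ", sumliouville_from_mertens($n), "\n";
}
```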

chebyshev_theta

say chebyshev_theta(10000);

Given a non-negative integer n, returns θ(n), the first Chebyshev function. This is the sum of the logarithm of each prime where p <= n. Effectively:

my $s = 0;  forprimes { $s += log($_) } $n;  return $s;

but computed more efficiently and accurately.

chebyshev_psi

say chebyshev_psi(10000);

Given a non-negative integer n, returns ψ(n), the second Chebyshev function. This is the sum of the logarithm of each prime power where p^k <= n for an integer k. Effectively:

my $s = 0;  for (1..$n) { $s += log(exp_mangoldt($_)) }  return $s;

but computed more efficiently and accurately. We compute it as a Neumaier sum from k = 1 .. floor(log2(n)) of chebyshev_theta(n^(1/k)).

divisor_sum

say "Sum of divisors of $n:", divisor_sum( $n );
say "sigma_2($n) = ", divisor_sum($n, 2);
say "Number of divisors: sigma_0($n) = ", divisor_sum($n, 0);

Given a single non-negative integer n, returns the sum of the divisors of n, including 1 and itself. We return 0 for n=0.

An optional second non-negative integer k may be given, indicating the sum should use the k-th powers of the divisors.

This is known as the sigma function (see Hardy and Wright section 16.7). The API is identical to Pari/GP's sigma function, and not dissimilar to Mathematica's DivisorSigma[k,n] function. This function is useful for calculating things like aliquot sums, abundant numbers, perfect numbers, etc.

With various k values, the results are the OEIS series A000005 (k=0, number of divisors), A000203 (k=1, sum of divisors), A001157 (k=2, sum of squares of divisors), A001158 (k=3, sum of cubes of divisors), etc.

The second argument may also be a code reference, which is called for each divisor and the results are summed. This allows computation of other functions, but will be less efficient than using the numeric second argument. This corresponds to Pari/GP's sumdiv function.

An example of the 5th Jordan totient (OEIS A059378):

divisor_sum( $n, sub { my $d=shift; $d**5 * moebius($n/$d); } );

though we have a function "jordan_totient" which is more efficient.

For numeric second arguments (sigma computations), the result will be a bigint if necessary. For the code reference case, the user must take care to return bigints if overflow will be a concern.
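The two calling styles (a power k, or a code reference applied to each divisor) can be mimicked with a short pure-Perl sketch. The name divisor_sum_naive is hypothetical; it loops over every candidate divisor, so it only illustrates the semantics and will overflow native integers for large results.

```perl
use strict;
use warnings;

# Hypothetical helper: second argument is a power k (default 1) or a code ref.
sub divisor_sum_naive {
    my ($n, $arg) = @_;
    return 0 if $n == 0;
    $arg = 1 unless defined $arg;
    my $f = ref($arg) eq 'CODE' ? $arg : sub { $_[0] ** $arg };
    my $sum = 0;
    for my $d (1 .. $n) {
        $sum += $f->($d) if $n % $d == 0;   # apply f to each divisor
    }
    return $sum;
}

my $n = 12;
print "sigma($n)   = ", divisor_sum_naive($n), "\n";
print "sigma_0($n) = ", divisor_sum_naive($n, 0), "\n";
# Aliquot sum via a code reference: count proper divisors only.
print "aliquot($n) = ", divisor_sum_naive($n, sub { $_[0] < $n ? $_[0] : 0 }), "\n";
```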

ramanujan_tau

Given an integer n, returns the value of Ramanujan's tau function. The result is a signed integer. Zero is returned for negative n. This corresponds to Pari v2.8's ramanujantau function and Mathematica's RamanujanTau function.

This currently uses a simple method based on divisor sums, which does not have a good computational growth rate. Pari's implementation uses Hurwitz class numbers and is more efficient for large inputs.

primorial

$prim = primorial(11); #        11# = 2*3*5*7*11 = 2310

Given a non-negative integer n, returns the primorial n#, defined as the product of the prime numbers less than or equal to n. This is the OEIS series A034386: primorial numbers second definition.

primorial(0)  == 1
primorial($n) == pn_primorial( prime_count($n) )

The result will be a Math::BigInt object if it is larger than the native bit size.

Be careful about which version (primorial or pn_primorial) matches the definition you want to use. Not all sources agree on the terminology, though they often give a clear definition of which of the two versions they mean. OEIS, Wikipedia, and Mathworld are all consistent, and these functions should match that terminology. This function should return the same result as the mpz_primorial_ui function added in GMP 5.1.

pn_primorial

$prim = pn_primorial(5); #      p_5# = 2*3*5*7*11 = 2310

Given a non-negative integer n, returns the primorial number p_n#, defined as the product of the first n prime numbers (compare to the factorial, which is the product of the first n natural numbers). This is the OEIS series A002110: primorial numbers first definition.

pn_primorial(0)  == 1
pn_primorial($n) == primorial( nth_prime($n) )

The result will be a Math::BigInt object if it is larger than the native bit size.

consecutive_integer_lcm

$lcm = consecutive_integer_lcm($n);

Given a non-negative integer n, returns the least common multiple of all integers from 1 to n. This can be done by manipulation of the primes up to n, resulting in much faster and memory-friendly results than using a factorial.

This is OEIS series A003418. Matching that series, we define consecutive_integer_lcm(0) = 1.
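The idea of working from the primes can be sketched as follows: the result is the product, over each prime p <= n, of the largest power of p that does not exceed n. The name consecutive_integer_lcm_naive is hypothetical, and this version uses native integers only, so it is limited to small n.

```perl
use strict;
use warnings;

# Hypothetical helper: lcm(1..n) from the largest prime powers <= n.
sub consecutive_integer_lcm_naive {
    my $n = shift;
    my $lcm = 1;
    my @composite;                         # simple sieve of Eratosthenes flags
    for my $p (2 .. $n) {
        next if $composite[$p];
        for (my $m = $p * $p; $m <= $n; $m += $p) { $composite[$m] = 1; }
        my $pk = $p;
        $pk *= $p while $pk * $p <= $n;    # largest power of p not above n
        $lcm *= $pk;
    }
    return $lcm;
}

print "lcm(1..10) = ", consecutive_integer_lcm_naive(10), "\n";   # 2520
```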

partitions

Given an integer n, returns the partition function p(n). If n is negative, 0 is returned. This is the number of ways of writing the integer n as a sum of positive integers, without restrictions.

This corresponds to Pari's numbpart function and Mathematica's PartitionsP function. The values produced in order are OEIS series A000041.

This uses a combinatorial calculation, which means it will not be very fast compared to Pari, Mathematica, or FLINT which use the Rademacher formula using multi-precision floating point. In 10 seconds:

           70    Integer::Partition
           90    MPU forpart { $n++ }
       15_000    MPU pure Perl partitions
      280_000    MPU GMP partitions
   35_000_000    Pari 2.6 numbpart
  500_000_000    Jonathan Bober's partitions_c.cc v0.6
1_400_000_000    Pari 2.8 numbpart

If you want the enumerated partitions, see "forpart".
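For reference, one well-known combinatorial approach is Euler's pentagonal number recurrence. The sketch below is a pure-Perl illustration under that assumption; partitions_naive is a hypothetical name, native integers limit it to modest n, and it is not claimed to be the exact calculation used by this module.

```perl
use strict;
use warnings;

# Hypothetical helper: p(n) via Euler's pentagonal number recurrence,
# p(m) = sum_{k>=1} (-1)^(k+1) * ( p(m - k(3k-1)/2) + p(m - k(3k+1)/2) ).
sub partitions_naive {
    my $n = shift;
    return 0 if $n < 0;
    my @p = (1);                                   # p(0) = 1
    for my $m (1 .. $n) {
        my $sum = 0;
        for (my $k = 1; ; $k++) {
            my $g1 = $k * (3 * $k - 1) / 2;        # generalized pentagonal numbers
            last if $g1 > $m;
            my $sign = $k % 2 ? 1 : -1;
            $sum += $sign * $p[$m - $g1];
            my $g2 = $k * (3 * $k + 1) / 2;
            $sum += $sign * $p[$m - $g2] if $g2 <= $m;
        }
        $p[$m] = $sum;
    }
    return $p[$n];
}

print "p(10) = ", partitions_naive(10), ", p(100) = ", partitions_naive(100), "\n";
```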

lucky_numbers

Given a single 32-/64-bit non-negative integer n, returns an array reference of values up to the input n (inclusive) which remain after the lucky number sieve originally defined by Gardiner, Lazarus, Metropolis, and Ulam. This is OEIS series A000959.

If given two non-negative integers lo and hi, returns the sieve results between lo and hi inclusive. This is identical to the above but does not include any numbers less than lo. Currently this saves very little time, but it does use less memory.

A surprising number of asymptotic properties of the primes are shared with this sieve, though the resulting sets are quite different.

There is no current algorithm for efficiently sieving a segment, though the method used here is orders of magnitude faster than those linked on OEIS as of early 2023. CPU time growth is similar to prime sieving, about n log n. Memory use is linear with size and uses about n/25 bytes for the internal sieve.
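For readers unfamiliar with the sieve itself, here is a direct pure-Perl rendering of the definition: start with the odd numbers, then repeatedly take the next survivor s and delete every s-th remaining entry. The name lucky_numbers_naive is hypothetical, and this straightforward version is far slower than the method used by the module.

```perl
use strict;
use warnings;

# Hypothetical helper: lucky numbers <= n straight from the sieve definition.
sub lucky_numbers_naive {
    my $n = shift;
    my @l = grep { $_ % 2 } 1 .. $n;          # the first pass keeps the odd numbers
    for (my $i = 1; $i < @l && $l[$i] <= @l; $i++) {
        my $step = $l[$i];
        my $pos  = 0;
        @l = grep { ++$pos % $step } @l;      # drop every $step-th survivor
    }
    return \@l;
}

print join(",", @{ lucky_numbers_naive(30) }), "\n";
# prints: 1,3,7,9,13,15,21,25
```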

is_lucky

Given an integer n, returns 1 if n is included in the set of lucky numbers and returns 0 otherwise. The process used is analogous to trial division using the lucky numbers less than n/log(n). For inputs not quickly discarded, the performance is essentially the same as generating the nth lucky number nearest to the input.

lucky_count

Given a single non-negative integer n, returns the count of lucky numbers less than or equal to n. If given two non-negative integers lo and hi, returns the count of lucky numbers between lo and hi inclusive.

lucky_count_approx

Given a single non-negative integer n, quickly returns a good estimate of the count of lucky numbers less than or equal to n.

lucky_count_lower

Given a single non-negative integer n, quickly returns a lower bound of the count of lucky numbers less than or equal to n. The actual count will always be greater than or equal to the result.

lucky_count_upper

Given a single non-negative integer n, quickly returns an upper bound of the count of lucky numbers less than or equal to n. The actual count will always be less than or equal to the result.

nth_lucky

Given a non-negative integer n, returns the n-th lucky number. This is done by sieving lucky numbers to n then performing a reverse calculation to determine the value at the nth position. This is much more efficient than generating all the lucky numbers up to the nth position, but is much slower than "nth_prime".

nth_lucky_approx

Given a single non-negative integer n, quickly returns a good estimate of the n-th lucky number.

nth_lucky_lower

Given a single non-negative integer n, quickly returns a lower bound of the n-th lucky number. The actual value will always be greater than or equal to the result.

nth_lucky_upper

Given a single non-negative integer n, quickly returns an upper bound of the n-th lucky number. The actual value will always be less than or equal to the result.

minimal_goldbach_pair

Given a single non-negative integer n, returns the smallest prime p such that p + q = n where q is also prime. Only the single value p is returned, with q = n-p and p <= q.

undef is returned if no such p exists. This will happen for values less than 4 and for all odd n where n != 2+q for a prime q. The Goldbach Conjecture famously states that a p exists for all even n greater than 2.

This function is reasonably fast even for larger values of n as it can terminate after the first pair is found. On a MacBook M1, average time is under 1 microsecond for 32-bit even inputs, under 10 microseconds for 64-bit even inputs, and 1 millisecond for 105-bit even inputs.

goldbach_pair_count

Given a single non-negative integer n, returns the number of pairs of primes p and q where p <= q and p + q = n.

If no such pairs exist, 0 is returned.

goldbach_pairs

Given a single non-negative integer n, returns a list containing each p for all prime pairs p and q where p <= q and p + q = n. The number of elements returned is the same as "goldbach_pair_count".

If no such pairs exist, an empty list is returned.
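The relationship between the three Goldbach functions can be illustrated with a brute-force pure-Perl sketch. The helper names is_prime_naive and goldbach_pairs_naive are hypothetical; the first element of the list plays the role of the minimal pair and the list length plays the role of the pair count.

```perl
use strict;
use warnings;

# Hypothetical helper: primality by trial division (fine for small inputs).
sub is_prime_naive {
    my $n = shift;
    return 0 if $n < 2;
    for (my $d = 2; $d * $d <= $n; $d++) {
        return 0 if $n % $d == 0;
    }
    return 1;
}

# Hypothetical helper: every p with p <= q, p + q = n, and both prime.
sub goldbach_pairs_naive {
    my $n = shift;
    return grep { is_prime_naive($_) && is_prime_naive($n - $_) } 2 .. int($n / 2);
}

for my $n (10, 11, 100) {
    my @p   = goldbach_pairs_naive($n);
    my $min = @p ? $p[0] : 'undef';
    print "$n: minimal p = $min, count = ", scalar(@p), ", all p = (@p)\n";
}
```

Unlike this sketch, a minimal-pair search can stop at the first hit, which is why it stays fast for large inputs.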

is_happy

Given a single non-negative integer n, returns the number of iterations required for the map of sum of squared base-10 digits to converge to 1, or 0 if it does not converge to the value 1.

This returns the height using the OEIS A090425 definition of height, which is zero for non-happy numbers, 1 for n=1, 2 for numbers that produce 1 after a single iteration, etc. This is one more than the definitions used in many papers (e.g. Cai and Zhou 2008) where n=1 is considered to have height 0.

An optional base and exponent may be given (default base 10 exponent 2). The base must be between 2 and 36, and the exponent between 0 and 10. The input n is read as a decimal number, so giving input such as "1001" will be treated as the decimal 1001 regardless of base.

With base 10 and exponent 2, non-zero values produce OEIS series A007770. The values themselves produce OEIS series A090425.
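The height convention described above can be made concrete with a short pure-Perl sketch that iterates the digit map and detects cycles with a hash. The name happy_height_naive is hypothetical; the optional base and exponent arguments mirror the description but are not range-checked here.

```perl
use strict;
use warnings;

# Hypothetical helper: 0 if n never reaches 1, otherwise 1 + number of steps.
sub happy_height_naive {
    my ($n, $base, $exp) = @_;
    $base = 10 unless defined $base;
    $exp  = 2  unless defined $exp;
    my %seen;
    my $height = 1;                           # n = 1 has height 1
    while ($n != 1) {
        return 0 if $seen{$n}++;              # entered a cycle that misses 1
        my $next = 0;
        for (my $t = $n; $t > 0; $t = int($t / $base)) {
            $next += ($t % $base) ** $exp;    # add digit^exp for each digit
        }
        $n = $next;
        $height++;
    }
    return $height;
}

print join(" ", map { happy_height_naive($_) } 1, 4, 7, 10, 13, 19), "\n";
# prints: 1 0 6 2 3 5
```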

is_smooth

my $is_23_smooth = is_smooth($n, 23);

Given two non-negative integer inputs n and k, returns 1 if |n| is k-smooth, and 0 otherwise. This uses the OEIS definition: Returns true if no prime factors of n are larger than k.

The values for n=0 and n=1 use the definition along with noting that factor(0) returns 0 and factor(1) returns an empty list.

The result is identical to:

sub is_smooth { my($n,$k)=@_; return 0+(vecnone { $_ > $k } factor($n)); }

but shortcuts are taken to avoid fully factoring if possible.

This corresponds to Mathematica's SmoothIntegerQ[n] resource function.

is_rough

my $is_23_rough = is_rough($n, 23);

Given two non-negative integer inputs n and k, returns 1 if |n| is k-rough, and 0 otherwise. This uses the OEIS definition: Returns true if no prime factors of n are smaller than k.

The values for n=0 and n=1 use the definition along with noting that factor(0) returns 0 and factor(1) returns an empty list.

The result is identical to:

sub is_rough { my($n,$k)=@_; return 0+(vecnone { $_ < $k } factor($n)); }

but shortcuts are taken to avoid fully factoring if possible.

is_powerful

my $all_factors_cubes_or_higher = is_powerful($n, 3);

Given an integer n and an optional non-negative integer k, returns 1 if n is k-powerful, and 0 otherwise. If k is omitted, k=2 is used.

A k-powerful number is a positive integer where all prime factors appear at least k times. All positive integers are therefore 0- and 1-powerful. n=1 is powerful for all k. 0 is returned for all negative or zero values of n.

With k=2 this corresponds to Pari's ispowerful function for positive values of n. Pari chooses to define 0 as powerful and uses abs(n).

While we can easily code this as a one line function using "vecall" and "factor_exp", this is significantly faster and doesn't need to fully factor the input.
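For comparison, a dependency-free version of the definition looks like the following. The name is_powerful_naive is hypothetical; it factors by trial division, so it shows the semantics (including the n <= 0 and k <= 1 cases) rather than the shortcuts taken by the real function.

```perl
use strict;
use warnings;

# Hypothetical helper: 1 if every prime factor of n appears at least k times.
sub is_powerful_naive {
    my ($n, $k) = @_;
    $k = 2 unless defined $k;
    return 0 if $n < 1;                        # zero and negatives are never powerful
    for (my $p = 2; $p * $p <= $n; $p++) {
        my $e = 0;
        while ($n % $p == 0) { $n /= $p; $e++; }
        return 0 if $e > 0 && $e < $k;         # prime present but too few times
    }
    return ($n == 1 || $k <= 1) ? 1 : 0;       # a leftover prime has exponent 1
}

print join(",", grep { is_powerful_naive($_) } 1 .. 40), "\n";
# prints: 1,4,8,9,16,25,27,32,36
```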

powerful_numbers

my $arrayref_pn1 = powerful_numbers(20);      # 1,4,8,9,16
my $arrayref_pn2 = powerful_numbers(20,40);   # 25,27,32,36
my $arrayref_pn3 = powerful_numbers(1,70,3);  # 1,8,16,27,32,64

Given a single non-negative integer n, returns an array reference with all 2-powerful integers from 1 to n inclusive.

Given two non-negative integers lo and hi, returns an array reference with all 2-powerful integers from lo to hi inclusive.

Given three non-negative integers lo, hi, and k, returns an array reference with all k-powerful integers from lo to hi inclusive.

# Alternate solutions for values 1-n

# Simple, but very slow for high $n.
for (1..$n) { say if is_powerful($_,$k); }

# Not so bad, especially for high $k.
for (1..powerful_count($n,$k)) { say nth_powerful($_,$k); }

# Best by far.
say for @{powerful_numbers(1,$n,$k)};

Note that n <= 0 are non-powerful.

powerful_count

my $npowerful3 = powerful_count(2**32, 3);

Given an integer n and an optional non-negative integer k, returns the total number of k-powerful numbers from 1 to n inclusive. If k is omitted, k=2 is used.

sumpowerful

Given an integer n and an optional non-negative integer k, returns the sum of positive integer k-powerful numbers less than or equal to n. That is, the sum for all x, 1 <= x <= n, where x is a k-powerful number. If k is omitted, k=2 is used.

nth_powerful

Given a non-negative integer n and an optional non-negative integer k, returns the n-th k-powerful number. If k is omitted, k=2 is used. For all k, returns undef for n=0 and 1 for n=1.

is_perfect_power

Given an integer n, returns 1 if n is a perfect power, and 0 otherwise. That is, if n = c^k for some integers c and k with k greater than 1.

The results match the mpz_perfect_power_p(n) function of GMP 4.0+. Following GMP, SAGE, and FLINT, we treat -1, 0, and 1 as perfect powers.

For positive integers, this is OEIS series A001597.

next_perfect_power

Given an integer n, returns the smallest perfect power greater than n. Similar in API to "next_prime", but returns the next perfect power with exponent greater than 1. Starting from 0 this gives the sequence 1,4,8,9,16,25,....

Negative inputs are supported, with the result being the nearest value greater than n where is_perfect_power returns true.

prev_perfect_power

Given an integer n, returns the greatest perfect power less than n. Similar in API to "prev_prime", but returns the previous perfect power with exponent greater than 1.

Negative inputs are supported, with the result being the nearest value less than n where is_perfect_power returns true.

perfect_power_count

Given a non-negative integer n, returns the number of integers not exceeding n which are perfect powers. If given two non-negative integers lo and hi, returns the count of perfect powers between lo and hi inclusive.

By convention, numbers less than 1 are not counted.

This can be calculated extremely quickly (less than 100ns per call for native size integers), so in most cases there is no need for the approximations or bounds.

This is OEIS series A069623.

perfect_power_count_approx

Given a non-negative integer n, quickly returns a good estimate of the count of perfect powers less than or equal to n.

perfect_power_count_lower

Given a non-negative integer n, quickly returns a lower bound of the count of perfect powers less than or equal to n. The actual count will always be greater than or equal to the result.

perfect_power_count_upper

Given a non-negative integer n, quickly returns an upper bound of the count of perfect powers less than or equal to n. The actual count will always be less than or equal to the result.

nth_perfect_power

Given a non-negative integer n, returns the n-th perfect power.

Since the perfect power count can be calculated extremely quickly, inverse interpolation finds the n-th perfect power quite rapidly.

Similar to "perfect_power_count", the convention is to exclude all integers less than 1. Hence n=0 returns undef and n=1 returns 1.

nth_perfect_power_approx

Given a non-negative integer n, quickly returns a good estimate of the n-th perfect power.

nth_perfect_power_lower

Given a non-negative integer n, quickly returns a lower bound of the n-th perfect power. The actual value will always be greater than or equal to the result.

nth_perfect_power_upper

Given a non-negative integer n, quickly returns an upper bound of the n-th perfect power. The actual value will always be less than or equal to the result.

next_chen_prime

Given a non-negative integer n, return the smallest Chen prime strictly greater than n. This will be a prime p: p > n, where p+2 is either a prime or a semiprime.

smooth_count

Given non-negative integer inputs n and k, returns the number of integers between 1 and n inclusive, that have no prime factor larger than k.

For all n >= 1, smooth_count(n,0) = smooth_count(n,1) = 1. For all k, smooth_count(0,k) = 0 and smooth_count(1,k) = 1.

This is equivalent to, but much faster than, vecsum( map { is_smooth($_,$k) } 1..$n ).

rough_count

Given non-negative integer inputs n and k, returns the number of integers between 1 and n inclusive, that have no prime factor less than k.

For all n, rough_count(n,0) = rough_count(n,1) = rough_count(n,2) = n. For all k, rough_count(0,k) = 0 and rough_count(1,k) = 1.

This is equivalent to, but much faster than, vecsum( map { is_rough($_,$k) } 1..$n ).

is_practical

Given an integer n, returns 1 if n is a practical number, and returns 0 otherwise. A practical number is a positive integer n such that all smaller positive integers can be represented as sums of distinct divisors of n. This is OEIS series A005153.

carmichael_lambda

Given a non-negative integer n, returns the Carmichael function (also called the reduced totient function, or Carmichael λ(n)). This is the smallest positive integer m such that a^m = 1 mod n for every integer a coprime to n. This is OEIS series A002322.

This corresponds to Mathematica's CarmichaelLambda[n] function. It can be computed in Pari/GP as lcm(znstar(n)[2]).

kronecker

Given two integers a and n, returns the Kronecker symbol (a|n). The possible return values with their meanings for odd prime n are:

 0   a = 0 mod n
 1   a is a quadratic residue mod n       (x^2 = a mod n for some x)
-1   a is a quadratic non-residue mod n   (no x where x^2 = a mod n)

The Kronecker symbol is an extension of the Jacobi symbol to all integer values of n from the latter's domain of positive odd values of n. The Jacobi symbol is itself an extension of the Legendre symbol, which is only defined for odd prime values of n. This corresponds to Pari's kronecker(a,n) function, Mathematica's KroneckerSymbol[n,m] function, and GMP's mpz_kronecker(a,n), mpz_jacobi(a,n), and mpz_legendre(a,n) functions.

If n is not an odd prime, then the result does not necessarily indicate whether a is a quadratic residue mod n. Using the function "is_qr" will return correct results for any n, though could be slower.

factorial

Given a non-negative integer n, returns the factorial of n, defined as the product of the integers 1 to n with the special case of factorial(0) = 1. This corresponds to Pari's factorial(n) and Mathematica's Factorial[n] functions.

subfactorial

Given a non-negative integer n, returns the subfactorial of n, which is the number of derangements of n objects. This is the number of permutations of n items where each item is not allowed to stay in its starting position.

This is OEIS series A000166. This corresponds to Mathematica's Subfactorial[n] function.

binomial

Given two integers n and k, returns the binomial coefficient n*(n-1)*...*(n-k+1)/k!, also known as the choose function. Negative arguments use the Kronenburg extensions. This corresponds to Pari's binomial(n,k) function, Mathematica's Binomial[n,k] function, and GMP's mpz_bin_ui function.

For negative arguments, this matches Mathematica. Pari does not implement the n < 0, k <= n extension and instead returns 0 for this case. GMP's API does not allow negative k but otherwise matches. Math::BigInt version 1.999816 and later supports negative arguments with similar semantics. Prior to this, n < 0, k > 0 was undefined.

falling_factorial

Given two integers x and n, with n non-negative, returns the falling factorial of x with n terms.

falling_factorial(x,n) = x * (x-1) * (x-2) * ... * (x-(n-1))

This corresponds to Mathematica's FactorialPower[x,n] function.

rising_factorial

Given two integers x and n, with n non-negative, returns the rising factorial of x with n terms.

rising_factorial(x,n) = x * (x+1) * (x+2) * ... * (x+(n-1))

This corresponds to Mathematica's Pochhammer[x,n] function.

powersum

say powersum(100,1);  #     5050 = vecsum(1..100)
say powersum(100,2);  #   338350 = vecsum(map{powint($_,2)} 1..100)
say powersum(100,3);  # 25502500 = vecsum(map{powint($_,3)} 1..100)

Given two non-negative integers n and k, returns the sum of k-th powers of the first n positive integers.

With k=2 this is (OEIS A000330). With k=3 this is (OEIS A000537). With k=4 this is (OEIS A000538). OEIS sequences can be found through k=8.

This corresponds to the faulhaber_sum(n,k) function in Math::AnyNum and Pari's dirpowerssum(n,k) function using integer arguments.

hclassno

Given an integer n, returns 12 times the Hurwitz-Kronecker class number. This will always be an integer due to the pre-multiplication by 12. The result is 0 for negative n and all n congruent to 1 or 2 mod 4.

This is related to Pari's qfbhclassno(n) where hclassno(n) for positive n equals 12 * qfbhclassno(n) in Pari/GP. This is OEIS A259825.

bernfrac

my($num,$den) = bernfrac(12);  # returns (-691,2730)

Returns the Bernoulli number B_n for an integer argument n, as a rational number represented by two integers. B_1 is chosen as 1/2. Pari's bernfrac(n) and Mathematica's BernoulliB instead give -1/2 for n = 1; the values agree for every other n.

Having a modern version of Math::Prime::Util::GMP installed will make a big difference in speed. That module uses a fast Pi/Zeta method. Our pure Perl backend uses the Seidel method as shown by Peter Luschny. This is faster than Math::Pari which uses an older algorithm, but quite a bit slower than modern Pari, Mathematica, or our GMP backend.

This corresponds to Pari's bernfrac function and Mathematica's BernoulliB function.

bernreal

Given a non-negative integer n, returns the Bernoulli number B_n as a Math::BigFloat object using the default precision. An optional second argument may be given specifying the precision to be used.

This corresponds to Pari's bernreal function.

stirling

say "s(14,2) = ", stirling(14, 2);
say "S(14,2) = ", stirling(14, 2, 2);
say "L(14,2) = ", stirling(14, 2, 3);

Given two 32-/64-bit non-negative integers n and k, plus an optional third argument kind (1, 2, or 3, with the default being 1), returns the Stirling number of the given kind. The third kind are the unsigned Lah numbers. This corresponds to Pari's stirling(n,k,{type}) function and Mathematica's StirlingS1 / StirlingS2 functions.

Stirling numbers of the first kind are (-1)^(n-k) times the number of permutations of n symbols with exactly k cycles. Stirling numbers of the second kind are the number of ways to partition a set of n elements into k non-empty subsets. The Lah numbers are the number of ways to split a set of n elements into k non-empty lists.
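All three kinds satisfy simple triangle recurrences, which gives an easy way to cross-check small values. The sketch below is a pure-Perl illustration with the hypothetical name stirling_naive; it uses native integers and makes no claim about the algorithm used by this module.

```perl
use strict;
use warnings;

# Hypothetical helper: kind 1 = signed first kind, 2 = second kind, 3 = Lah.
sub stirling_naive {
    my ($n, $k, $kind) = @_;
    $kind = 1 unless defined $kind;
    my @row = (1);                                   # row 0: value 1 at k = 0
    for my $i (1 .. $n) {
        my @next = (0) x ($i + 1);
        for my $j (1 .. $i) {
            my $above = $j < $i ? $row[$j] : 0;      # entry (i-1, j)
            my $mult  = $kind == 1 ? -($i - 1)       # s(i,j) = s(i-1,j-1) - (i-1) s(i-1,j)
                      : $kind == 2 ? $j              # S(i,j) = S(i-1,j-1) + j S(i-1,j)
                      :              $i - 1 + $j;    # L(i,j) = L(i-1,j-1) + (i-1+j) L(i-1,j)
            $next[$j] = $row[$j - 1] + $mult * $above;
        }
        @row = @next;
    }
    return $k <= $n ? $row[$k] : 0;
}

print "s(4,2) = ", stirling_naive(4, 2),
      ", S(4,2) = ", stirling_naive(4, 2, 2),
      ", L(4,2) = ", stirling_naive(4, 2, 3), "\n";
# prints: s(4,2) = 11, S(4,2) = 7, L(4,2) = 36
```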

fubini

Given a non-negative integer n, returns the Fubini number of n, also called the ordered Bell numbers, or the number of ordered partitions of n. It is the count of rankings of n items allowing for ties.

This is the OEIS series A000670.

harmfrac

my($num,$den) = harmfrac(12);  # returns (86021,27720)

Given a non-negative integer n, returns the Harmonic number H_n as a rational number represented by two integers. The harmonic numbers are the sum of reciprocals of the first n natural numbers: 1 + 1/2 + 1/3 + ... + 1/n.

Binary splitting (Fredrik Johansson's elegant formulation) is used.

This corresponds to Mathematica's HarmonicNumber function.
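To make the (numerator, denominator) return convention concrete, here is a pure-Perl sketch that adds the reciprocals one at a time and reduces by the gcd after each step. It uses plain summation rather than binary splitting, native integers only, and the hypothetical names gcd_naive and harmfrac_naive.

```perl
use strict;
use warnings;

# Hypothetical helper: greatest common divisor by Euclid's algorithm.
sub gcd_naive {
    my ($x, $y) = @_;
    ($x, $y) = ($y, $x % $y) while $y;
    return $x;
}

# Hypothetical helper: H_n as a reduced fraction (numerator, denominator).
sub harmfrac_naive {
    my $n = shift;
    my ($num, $den) = (0, 1);
    for my $i (1 .. $n) {
        ($num, $den) = ($num * $i + $den, $den * $i);   # num/den + 1/i
        my $g = gcd_naive($num, $den);
        ($num, $den) = ($num / $g, $den / $g);          # keep the fraction reduced
    }
    return ($num, $den);
}

my ($num, $den) = harmfrac_naive(12);
print "H_12 = $num/$den\n";   # H_12 = 86021/27720
```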

harmreal

Given a non-negative integer n, returns the Harmonic number H_n as a Math::BigFloat object using the default precision. An optional second integer argument may be given specifying the precision to be used.

For large n values, using a lower precision may result in faster computation as an asymptotic formula may be used. For precisions of 13 or less, native floating point is used for even more speed.

legendre_phi

$phi = legendre_phi(1000000000, 41);

Given two non-negative integers n and a, returns the Legendre phi function (also called the Legendre sum). This is the count of positive integers <= n which are not divisible by any of the first a primes.

This corresponds to the legendre_phi(n,a) function in SAGE, and the --phi n a feature of primecount.
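The function satisfies the classic recurrence phi(x,a) = phi(x,a-1) - phi(floor(x/p_a),a-1) with phi(x,0) = x. The sketch below applies it directly in pure Perl; legendre_phi_naive and the fixed @small_primes table are hypothetical, and the unmemoized recursion is only practical for small a.

```perl
use strict;
use warnings;

# Assumption for this sketch: only the first ten primes are supported.
my @small_primes = (2, 3, 5, 7, 11, 13, 17, 19, 23, 29);

# Hypothetical helper: count of 1..x not divisible by any of the first k primes.
sub legendre_phi_naive {
    my ($x, $k) = @_;
    return $x if $k == 0 || $x == 0;
    die "k too large for this sketch\n" if $k > @small_primes;
    my $p = $small_primes[$k - 1];
    return legendre_phi_naive($x, $k - 1)
         - legendre_phi_naive(int($x / $p), $k - 1);
}

print "phi(100,3) = ", legendre_phi_naive(100, 3), "\n";   # 26
```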

inverse_li

$approx_nth_prime = inverse_li(1000000000);

Given a non-negative integer n, returns the least integer value k such that Li(k) >= n. Since the logarithmic integral Li(n) is a good approximation to the number of primes less than n, this function is a good simple approximation to the nth prime.

inverse_li_nv

$faster_approx_nth_prime = inverse_li_nv(1000000000);

With input x and output both in NV (floating point), computes the inverse of the logarithmic integral. This should be very fast, as everything is done in native long double precision, no Perl bigints or bigfloats are involved, and the computed result is returned as an NV.

The "inverse_li" function uses this to start, then ensures the integer return value is the closest inverse of the integer result of the "LogarithmicIntegral" function. While this is a small amount of extra time for small inputs, once we have to go to Perl and use BigInt / BigFloat, the extra time can be significant.

numtoperm

@p = numtoperm(10,654321);  # @p=(1,8,2,7,6,5,3,4,9,0)

Given a non-negative integer n and integer k, return the rank k lexicographic permutation of n elements. k will be interpreted as mod n!.

This will match iteration number k (zero based) of "forperm".

This corresponds to Pari's numtoperm(n,k) function (Pari 2.6 and later use the same lexicographic ordering).

permtonum

$k = permtonum([1,8,2,7,6,5,3,4,9,0]);  # $k = 654321

Given an array reference containing each integer from 0 to n-1, in some ordering, returns the lexicographic permutation rank of the set. This is the inverse of the "numtoperm" function.

This will match iteration number k (zero based) of "forperm". The result will be between 0 and n!-1.

This corresponds to Pari's permtonum(n) function (Pari 2.6 and later use the same lexicographic ordering).
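The rank and unrank operations are two directions of the factorial number system. The following pure-Perl sketch reproduces the documented example; numtoperm_naive and permtonum_naive are hypothetical names, and native integers restrict it to small n.

```perl
use strict;
use warnings;

# Hypothetical helper: the rank-k lexicographic permutation of 0 .. n-1.
sub numtoperm_naive {
    my ($n, $k) = @_;
    my @fact = (1);
    push @fact, $fact[-1] * $_ for 1 .. $n;
    $k %= $fact[$n];                          # k is taken mod n!
    my @pool = (0 .. $n - 1);
    my @perm;
    for my $i (reverse 0 .. $n - 1) {
        my $idx = int($k / $fact[$i]);        # next factorial-base digit
        $k -= $idx * $fact[$i];
        push @perm, splice(@pool, $idx, 1);   # take the idx-th unused element
    }
    return @perm;
}

# Hypothetical helper: lexicographic rank of a permutation of 0 .. n-1.
sub permtonum_naive {
    my @perm = @{ $_[0] };
    my $rank = 0;
    for my $i (0 .. $#perm) {
        my $smaller = grep { $_ < $perm[$i] } @perm[$i + 1 .. $#perm];
        $rank = $rank * (@perm - $i) + $smaller;   # Horner form of the digits
    }
    return $rank;
}

my @p = numtoperm_naive(10, 654321);
print "(@p) has rank ", permtonum_naive(\@p), "\n";
```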

randperm

@p = randperm(100);   # returns shuffled 0..99
@p = randperm(100,4);  # returns 4 elements from shuffled 0..99
@s = @data[randperm(1+$#data)];    # shuffle an array
@p = @data[randperm(1+$#data,2)];  # pick 2 from an array

Given a single non-negative integer n, returns a random permutation of the integers from 0 to n-1.

Optionally takes a second non-negative integer argument k. The returned list will then have only k elements. This is more efficient than truncating the full shuffled list.

The randomness comes from our CSPRNG.

The slicing technique shown in the last example is similar to "vecsample".

shuffle

@shuffled = shuffle(@data);

Takes a list as input, and returns a random permutation of the list. Like randperm, the randomness comes from our CSPRNG.

This function is functionally identical to the shuffle function in List::Util. The only difference is the random source (ChaCha20 with better randomness, a larger period, and a larger state). This does make it slower.

If the entire shuffled array is desired, this is faster than slicing with "randperm" as shown in its example above. If fewer elements are needed (a "pick" or "sample") then "vecsample" or slicing with "randperm" will be much more efficient.

vecsample

$oneof = vecsample(1,@data);  # Select one random value
@twoof = vecsample(2,@data);  # Select two random values

Takes a non-negative integer k and a list, and returns k randomly selected values from the list. The randomness comes from our CSPRNG.

If the input is exactly two elements (k and one other) and the second value is an array reference, then we will use it as the input list:

$oneof = vecsample(1, $arrayref);
@twoof = vecsample(2, \@data);

This can be a large performance increase if the input list is large (e.g. 2x at 1000 elements, can be 10x with more). While there might be confusion when sampling a list with exactly one element, where that element is an array reference, this is assumed to be a rare case.

This is similar to sample from List::Util, choose_multiple from Rust rand, and Raku's pick.

MODULAR ARITHMETIC

OVERVIEW

Functions for fast modular arithmetic are provided: add, subtract, multiply, divide, power, square root, nth root, inverse. Additionally, fast modular calculation of factorials, binomials, and Lucas sequences is provided. See "MODULAR FUNCTIONS" for more functions that operate mod n.

Semantics mostly follow Pari/GP, though in some cases Pari/GP will indicate an error where we return undef.

We use the absolute value of the modulus.
A modulus of zero returns undef.
A modulus of 1 will return 0.
If a modular result doesn't exist, we return undef.
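
The modulus conventions above can be illustrated with a small pure-Perl sketch. The name addmod_sketch is hypothetical, and unlike the module's functions it makes no attempt to avoid overflow for large native inputs.

```perl
use strict;
use warnings;

# Hypothetical helper showing the modulus conventions only.
sub addmod_sketch {
    my ($x, $y, $n) = @_;
    $n = -$n if $n < 0;        # the absolute value of the modulus is used
    return undef if $n == 0;   # a modulus of zero returns undef
    # A modulus of 1 falls out naturally: anything % 1 is 0.
    # Perl's % yields a non-negative result when the right operand is
    # positive, so negative sums are handled as well.
    return ($x + $y) % $n;
}

print addmod_sketch(5, 9, -7), "\n";   # 0, since 14 mod 7 is 0
print addmod_sketch(-3, 1, 5), "\n";   # 3, since -2 mod 5 is 3
```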

negmod

Given two integers a and n, return -a mod |n|.

This is similar to submod(0,$a,$n) or $n ? modint(-$a,absint($n)) : undef.

addmod

Given three integers a, b, and n, return (a+b) mod |n|. This is particularly useful when dealing with numbers that are larger than a half-word but still native size. No bigint package is needed and this can be 10-200x faster than using one.

submod

Given three integers a, b, and n, return (a-b) mod |n|.

mulmod

Given three integers a, b, and n, return (a*b) mod |n|. This is particularly useful when n fits in a native integer. No bigint package is needed and this can be 10-200x faster than using one.

muladdmod

Given four integers a, b, c, and n, return (a*b+c) mod |n|.

mulsubmod

Given four integers a, b, c, and n, return (a*b-c) mod |n|.

divmod

Given three integers a, b, and n, return (a/b) mod |n|. This is done as (a * (1/b mod |n|)) mod |n|. If no inverse of b mod |n| exists then undef is returned.

powmod

Given three integers a, b, and n, return (a ** b) mod |n|. Typically binary exponentiation is used, so the process is very efficient. With native size inputs, no bigint library is needed.

powmod(a,-b,n) is calculated as powmod(invmod(a,n),b,n). If 1/a mod |n| does not exist, undef is returned.
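
Binary exponentiation can be sketched as follows. This is an illustration under stated assumptions: powmod_sketch is a hypothetical name, only non-negative exponents are handled, and the modulus is assumed small enough that intermediate products stay exact in native arithmetic.

```perl
use strict;
use warnings;

# Hypothetical helper: right-to-left binary exponentiation.
sub powmod_sketch {
    my ($base, $exp, $n) = @_;
    $n = abs($n);
    return undef if $n == 0;
    my $result = 1 % $n;      # 0 when the modulus is 1
    $base %= $n;
    while ($exp > 0) {
        # Multiply in the current power when the low bit is set.
        $result = ($result * $base) % $n if $exp & 1;
        $base = ($base * $base) % $n;
        $exp >>= 1;
    }
    return $result;
}

print powmod_sketch(2, 10, 1000), "\n";   # 24
```

The loop runs once per bit of the exponent, which is why the process is efficient even for large exponents.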

sqrtmod

Given two integers a and n, return the square root of a mod |n|. If no square root exists, undef is returned. If defined, the return value r will always satisfy r^2 = a mod |n|.

If the modulus is prime, the function will always return r, the smaller of the two square roots (the other being -r mod |n|). If the modulus is composite, one of possibly many square roots will be returned, and it will not necessarily be the smallest.

allsqrtmod

Given two integers a and n, returns a sorted list of all modular square roots of a mod |n|. If no square root exists, an empty list is returned.

Some inputs will return very many roots. For example, a = p^4, n = 24 * p^4 for prime p, has many roots, and allsqrtmod(89**8, 24*89**8) would have over 500 million.
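
The growth in the number of roots is easy to observe with a brute-force search on small inputs. The helper allsqrtmod_sketch below is hypothetical, assumes a non-zero modulus, and simply tests every residue, so it is only practical for tiny n.

```perl
use strict;
use warnings;

# Hypothetical helper: every r in 0..|n|-1 with r^2 = a mod |n|, in order.
sub allsqrtmod_sketch {
    my ($val, $n) = @_;
    $n = abs($n);             # assumes n is non-zero
    $val %= $n;
    return grep { ($_ * $_) % $n == $val } 0 .. $n - 1;
}

print join(",", allsqrtmod_sketch(10, 13)), "\n";   # 6,7
print join(",", allsqrtmod_sketch(1, 24)), "\n";    # 1,5,7,11,13,17,19,23
my @roots = allsqrtmod_sketch(5**4, 24 * 5**4);
print scalar(@roots), "\n";                         # 200
```

With p = 5 the pattern a = p^4, n = 24 * p^4 already yields 200 roots: 25 from the prime power, 4 from the factor 8, and 2 from the factor 3.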

rootmod

Given three integers a, k, and n, returns a k-th root of a modulo |n|, or undef if one does not exist. If defined, the return value r will satisfy r^k = a mod |n|. There is no guarantee that the smallest root will be returned.

For some composites with large prime powers this may not be efficient.

rootmod(a,-k,n) is calculated as rootmod(invmod(a,n),k,n). If 1/a mod |n| does not exist, undef is returned.

allrootmod

Given three integers a, k, and n, returns a sorted list of all modular k-th roots of a modulo |n|. If no root exists, an empty list is returned.

Similar to "allsqrtmod", some inputs have millions or billions of roots, so it might not be able to successfully return them all.

invmod

say "The inverse of 42 mod 2017 = ", invmod(42,2017);

Given two integers a and n, return the inverse of a modulo |n|. If not defined, undef is returned. If defined, then the return value multiplied by a equals 1 modulo |n|.

The results correspond to the Pari result of lift(Mod(1/a,n)). The semantics with respect to negative arguments match Pari. Notably, a negative n is negated, which is different from Math::BigInt, but in both cases the return value multiplied by a is still congruent to 1 modulo n as expected.

Mathematica uses PowerMod[a, -1, n], where n must be positive.
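
The semantics can be illustrated with the extended Euclidean algorithm. This is a sketch for small native inputs only: invmod_sketch is a hypothetical name and the module's own implementation may differ.

```perl
use strict;
use warnings;

# Hypothetical helper: modular inverse via the extended Euclidean algorithm.
sub invmod_sketch {
    my ($val, $n) = @_;
    $n = abs($n);
    return undef if $n == 0;
    return 0 if $n == 1;               # a modulus of 1 returns 0
    my ($r0, $r1) = ($n, $val % $n);
    my ($t0, $t1) = (0, 1);
    while ($r1 != 0) {
        my $q = int($r0 / $r1);
        ($r0, $r1) = ($r1, $r0 - $q * $r1);
        ($t0, $t1) = ($t1, $t0 - $q * $t1);
    }
    return undef if $r0 != 1;          # gcd is not 1, so no inverse exists
    return $t0 % $n;                   # normalize into 0..n-1
}

print invmod_sketch(42, 2017), "\n";   # 1969, since 42*1969 = 41*2017 + 1
```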

factorialmod

Given a non-negative integer n and an integer m, returns n! mod |m|. This is much faster than computing the large factorial(n) followed by a mod operation.

While very efficient, this is not state of the art. Currently, Fredrik Johansson's fast multi-point polynomial evaluation method as used in FLINT is the fastest known implementation. This becomes noticeable for n > 10^8 or so, and the O(n^.5) versus O(n) complexity is very apparent with large n.
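
For reference, the straightforward O(n) approach reduces after every multiplication so that no large factorial is ever formed. The sketch below is illustrative only: factorialmod_sketch is a hypothetical name, and it assumes a modulus small enough that products stay exact in native arithmetic.

```perl
use strict;
use warnings;

# Hypothetical helper: n! mod |m| with a reduction at every step.
sub factorialmod_sketch {
    my ($n, $m) = @_;
    $m = abs($m);
    return undef if $m == 0;
    return 0 if $n >= $m;     # m itself appears among the factors of n!
    my $f = 1 % $m;
    $f = ($f * $_) % $m for 2 .. $n;
    return $f;
}

print factorialmod_sketch(10, 1000003), "\n";   # 628791
print factorialmod_sketch(12, 13), "\n";        # 12, i.e. -1 mod 13 (Wilson)
```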

binomialmod

Given integer arguments n, k, and m, returns binomial(n,k) mod |m|. This is much faster than computing the large binomial(n,k) followed by a mod operation.

|m| does not need to be prime. The result is extended to negative n. Negative k will return zero.

This corresponds to Mathematica's BinomialMod[n,m,p] function. It has similar functionality to Max Alekseyev's binomod.gp Pari routine.

lucasumod

Given integers P, Q, the non-negative integer k, and the integer n, efficiently compute lucasu(P,Q,k) mod |n|.

This corresponds to gmpy2's lucasu_mod function.

When (P,Q) = (1,-1) this returns the modular Fibonacci sequence. This corresponds to Sidef's fibmod function.

lucasvmod

Given integers P, Q, the non-negative integer k, and the integer n, efficiently compute lucasv(P,Q,k) mod |n|.

This corresponds to gmpy2's lucasv_mod function.

lucasuvmod

# Compute the 5000-th Fibonacci and Lucas numbers, mod 1001
($U,$V) = lucasuvmod(1, -1, 5000, 1001);

Given integers P, Q, the non-negative integer k, and the integer n, efficiently compute the k-th value of U(P,Q) mod |n| and V(P,Q) mod |n|.

This is similar to the "lucas_sequence" function, but uses a more consistent argument order and does not return Q_k.
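
The values being computed follow the recurrences U_k = P*U_(k-1) - Q*U_(k-2) with U_0 = 0, U_1 = 1, and V_k = P*V_(k-1) - Q*V_(k-2) with V_0 = 2, V_1 = P. The sketch below applies them directly, which takes O(k) steps and is meant only to show the definition; lucasuvmod_sketch is a hypothetical name and the module uses much faster methods.

```perl
use strict;
use warnings;

# Hypothetical helper: U_k and V_k mod |n| by direct recurrence.
sub lucasuvmod_sketch {
    my ($P, $Q, $k, $n) = @_;
    $n = abs($n);             # assumes n is non-zero
    my ($u0, $u1) = (0 % $n, 1 % $n);
    my ($v0, $v1) = (2 % $n, $P % $n);
    for (1 .. $k) {
        # Perl's % keeps results non-negative for a positive modulus.
        ($u0, $u1) = ($u1, ($P * $u1 - $Q * $u0) % $n);
        ($v0, $v1) = ($v1, ($P * $v1 - $Q * $v0) % $n);
    }
    return ($u0, $v0);
}

my ($U, $V) = lucasuvmod_sketch(1, -1, 10, 1001);
print "$U $V\n";   # 55 123: the 10th Fibonacci and Lucas numbers
```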

lucas_sequence

my($U, $V, $Qk) = lucas_sequence($n, $P, $Q, $k);

lucas_sequence() is deprecated. Use lucasuvmod() instead.

Computes U_k, V_k, and Q_k for the Lucas sequence defined by P,Q, modulo |n|. The modular Lucas sequence is used in a number of primality tests and proofs. k must be non-negative, and n must be non-zero.

pisano_period

Given a non-negative integer n, returns the period of the Fibonacci sequence modulo n. The modular Fibonacci numbers can be produced using lucasumod(1,-1,k,n). They are periodic for any integer n, and the Pisano period is the length of the repeating sequence.

This is the OEIS series A001175.
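
The definition can be checked by brute force on small moduli: step the pair (F_k, F_(k+1)) mod n until it returns to (0, 1). The helper pisano_sketch is hypothetical, assumes n is at least 1, and is not how the module computes the period.

```perl
use strict;
use warnings;

# Hypothetical helper: period of the Fibonacci sequence mod n, for n >= 1.
sub pisano_sketch {
    my ($n) = @_;
    return 1 if $n == 1;      # every term is 0 mod 1
    my ($f0, $f1, $period) = (0, 1, 0);
    do {
        ($f0, $f1) = ($f1, ($f0 + $f1) % $n);
        $period++;
    } until ($f0 == 0 && $f1 == 1);
    return $period;
}

print join(",", map { pisano_sketch($_) } 1 .. 5), "\n";   # 1,3,8,6,20
print pisano_sketch(10), "\n";                             # 60
```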

MODULAR FUNCTIONS

OVERVIEW

More functions are provided that operate mod n. They use similar semantics with respect to the modulus: the absolute value is used, and a modulus of 0 will return undef. However the behavior with n = 1 is not always the same.

znlog

$k = znlog($a, $g, $p)

Returns the integer k that solves the equation a = g^k mod |p|, or undef if no solution is found. This is the discrete logarithm problem.

The implementation for native integers first applies Silver-Pohlig-Hellman on the group order to possibly reduce the problem to a set of smaller problems. These are then solved using a mixture of trial, Shanks' BSGS, and Pollard's DLP Rho.

The PP implementation is less sophisticated, with only a memory-heavy BSGS being used.

znorder

$order = znorder(2, next_prime(10**16)-6);

Given two positive integers a and n, returns the multiplicative order of a modulo |n|. This is the smallest positive integer k such that a^k ≡ 1 mod |n|. Returns undef if n = 0, a = 0, or if a and n are not coprime, since no value can result in 1 mod n. Returns 1 if a = 1 or if n = 1.

Note the latter differs from other mod functions, because the return value is a positive integer, not an integer mod n.

This corresponds to Pari's znorder(Mod(a,n)) function and Mathematica's MultiplicativeOrder[a,n] function.
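
The return conventions can be illustrated with a naive search that multiplies by a until the product reaches 1. The names znorder_sketch and gcd_sketch are hypothetical, and the loop is exponentially slower than what the module does for large n.

```perl
use strict;
use warnings;

# Hypothetical helpers: naive multiplicative order of a modulo |n|.
sub gcd_sketch {
    my ($x, $y) = @_;
    ($x, $y) = ($y, $x % $y) while $y;
    return $x;
}

sub znorder_sketch {
    my ($val, $n) = @_;
    $n = abs($n);
    return undef if $n == 0;
    return 1 if $n == 1;                  # a positive integer, not 0
    $val %= $n;
    return undef if gcd_sketch($val, $n) != 1;   # includes a = 0
    my ($k, $x) = (1, $val);
    while ($x != 1) {
        $x = ($x * $val) % $n;
        $k++;
    }
    return $k;
}

print znorder_sketch(2, 7), "\n";   # 3, since 2^3 = 8 = 1 mod 7
print znorder_sketch(3, 7), "\n";   # 6, so 3 is a primitive root mod 7
```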

znprimroot

Given an integer n, where n is treated as |n|, returns the smallest primitive root of (Z/nZ)^*, or undef if no root exists. A root exists when euler_phi($n) == carmichael_lambda($n), which will be true only if n is one of {2, 4, p^k, 2p^k} for odd prime p.

Like other modular functions, if n = 0 the function returns undef.

OEIS A033948 is a sequence of integers where the primitive root exists, while OEIS A046145 is a list of the smallest primitive roots, which is what this function produces.

is_primitive_root

Given two integers a and n, returns 1 if a is a primitive root modulo |n|, and 0 if not. If a is a primitive root, then euler_phi(n) is the smallest e for which a^e = 1 mod n.

Like other modular functions, if n = 0 the function returns undef.

qnr

Given an integer n, returns the least quadratic non-residue modulo |n|. This is the smallest integer a where there does not exist an integer b such that a = b^2 mod |n|.

Like other modular functions, if n = 0 the function returns undef.

This is OEIS A020649. For primes it is OEIS A053760.
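
For small moduli the definition can be checked directly by marking every square and returning the first unmarked value. The helper qnr_sketch is hypothetical and this sketch assumes |n| is at least 3, since for smaller moduli every residue is a square and the module's behavior there is not covered here.

```perl
use strict;
use warnings;

# Hypothetical helper: least quadratic non-residue mod |n|, for |n| >= 3.
sub qnr_sketch {
    my ($n) = @_;
    $n = abs($n);
    return undef if $n == 0;
    my %is_square;
    $is_square{ ($_ * $_) % $n } = 1 for 0 .. $n - 1;
    for my $candidate (0 .. $n - 1) {
        return $candidate unless $is_square{$candidate};
    }
    return undef;   # only reached when |n| < 3 (outside this sketch's scope)
}

print qnr_sketch(7), "\n";    # 3: the squares mod 7 are 0, 1, 2, 4
print qnr_sketch(23), "\n";   # 5
```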

is_qr

Given two integers a and n, returns 1 if a is a quadratic residue modulo |n|, and 0 otherwise. A return value of 1 indicates there exists an x where a = x^2 mod |n|.

For odd primes, this is similar to checking a==0 || kronecker(a,n) == 1.

For all values, this will be equal to defined(sqrtmod(a,n)), with possibly better performance.

Like other modular functions, if n = 0 the function returns undef.

RANDOM NUMBERS

OVERVIEW

Prior to version 5.20, Perl's rand function used the system rand function. This meant it varied by system, and was almost always a poor choice. For 5.20, Perl standardized on drand48 and includes the source so there are no system dependencies. While this was an improvement, drand48 is not a good PRNG. It really only has 32 bits of random values, and fails many statistical tests. See http://www.pcg-random.org/statistical-tests.html for more information.

There are much better choices for standard random number generators, such as the Mersenne Twister, PCG, or Xoroshiro128+. Someday perhaps Perl will get one of these to replace drand48. In the meantime, Math::Random::MTwist provides numerous features and excellent performance, as does this module.

Since we often deal with random primes for cryptographic purposes, we have additional requirements. This module uses a CSPRNG for its random stream. Specifically, it uses ChaCha20, which is the same algorithm used by BSD's arc4random and /dev/urandom on BSD and Linux 4.8+. Seeding is performed at startup using the Win32 Crypto API (on Windows), /dev/urandom, /dev/random, or Crypt::PRNG, whichever is found first.

We use the original ChaCha definition rather than RFC7539. This means a 64-bit counter, resulting in a period of 2^72 bytes or 2^68 calls to "drand" or "irand64". This compares favorably to the 2^48 period of Perl's drand48. It has a 512-bit state which is significantly larger than the 48-bit drand48 state. When seeding, 320 bits (40 bytes) are used. Among other things, this means all 52! permutations of a shuffled card deck are possible, which is not true of "shuffle" in List::Util.

One might think that performance would suffer from using a CSPRNG, but benchmarking shows this does not seem to be the case. The speed of irand, irand64, and drand is within 20% of the fastest existing modules using non-CSPRNG methods, and 2 to 20 times faster than most. While a faster underlying RNG is useful, the Perl call interface overhead is a majority of the time for these calls. Carefully tuning that interface is critical for any module.

For performance on large amounts of data, see the tables in "random_bytes".

Each thread uses its own context, meaning seeding in one thread has no impact on other threads. In addition to improved security, this is better for performance than a single context with locks. If explicit control of multiple independent streams is needed then using a more specific module is recommended. I believe Crypt::PRNG (part of CryptX) and Bytes::Random::Secure are good alternatives.

Using the :rand export option will define rand and srand as similar but improved versions of the system functions of the same name, as well as "irand" and "irand64".

irand

$n32 = irand;     # random 32-bit integer

Returns a random 32-bit integer using the CSPRNG.

irand64

$n64 = irand64;   # random 64-bit integer

Returns a random 64-bit integer using the CSPRNG (on 64-bit Perl). On a 32-bit Perl, it returns the maximum UV bits, which will be only 32.

drand

$f = drand;       # random floating point value in [0,1)
$r = drand(25.33);   # random floating point value in [0,25.33)

Returns a random NV (Perl's native floating point) using the CSPRNG. The API is similar to Perl's rand but giving better results.

The number of actual random bits will be equal to the number of mantissa bits in the NV type. For IEEE-754 doubles, this means 53 bits, and can go to 64 or 113 bits with long double / quadmath support. The "_nvmantbits" function allows seeing how many bits are used.

This gives substantially better quality random numbers than the default Perl rand function. Among other things, on modern Perls, rand uses drand48, which has 32 bits of not-very-good randomness and 16 more bits of obvious patterns (e.g. the 48th bit alternates, the 47th has a period of 4, etc.). Output from rand fails at least 5 tests from the TestU01 SmallCrush suite, while our function easily passes.

With the ":rand" tag, this function is additionally exported as rand.

random_bytes

$str = random_bytes(32);     # 32 random bytes

Given an unsigned number n of bytes, returns a string filled with random data from the CSPRNG. Performance for large quantities:

Module/Method                   Rate   Type
-------------------------  ---------   ----------------------
Math::Prime::Util::GMP     1067 MB/s   CSPRNG - ISAAC
ntheory random_bytes        384 MB/s   CSPRNG - ChaCha20
Crypt::PRNG                 140 MB/s   CSPRNG - Fortuna
Crypt::OpenSSL::Random       32 MB/s   CSPRNG - SHA1 counter
Math::Random::ISAAC::XS      15 MB/s   CSPRNG - ISAAC
ntheory entropy_bytes        13 MB/s   CSPRNG - /dev/urandom
Crypt::Random                12 MB/s   CSPRNG - /dev/urandom
Crypt::Urandom               12 MB/s   CSPRNG - /dev/urandom
Bytes::Random::Secure         6 MB/s   CSPRNG - ISAAC
ntheory pure perl ISAAC       5 MB/s   CSPRNG - ISAAC (no XS)
Math::Random::ISAAC::PP     2.5 MB/s   CSPRNG - ISAAC (no XS)
ntheory pure perl ChaCha    1.0 MB/s   CSPRNG - ChaCha20 (no XS)
Data::Entropy::Algorithms   0.5 MB/s   CSPRNG - AES-CTR

Math::Random::MTwist        927 MB/s   PRNG - Mersenne Twister
Bytes::Random::XS           109 MB/s   PRNG - drand48
pack CORE::rand              25 MB/s   PRNG - drand48 (no XS)
Bytes::Random               2.6 MB/s   PRNG - drand48 (no XS)

entropy_bytes

Similar to random_bytes, but directly using the entropy source. This is not normally recommended as it can consume shared system resources and is typically slow -- on the computer that produced the "random_bytes" chart above, using dd generated the same 13 MB/s performance as our "entropy_bytes" function.

The actual performance will be highly system dependent.

urandomb

$n32 = urandomb(32);    # Classic irand32, returns a UV
$n   = urandomb(1024);  # Random integer less than 2^1024

Given a number of bits b, returns a random unsigned integer less than 2^b. The result will be uniformly distributed between 0 and 2^b-1 inclusive.

urandomm

$n = urandomm(100);    # random integer in [0,99]
$n = urandomm(1024);   # random integer in [0,1023]

Given a positive integer n, returns a random unsigned integer less than n. The results will be uniformly distributed between 0 and n-1 inclusive. Care is taken to prevent modulo bias.
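
Modulo bias arises when a fixed-width random value is reduced mod n and n does not divide the number of possible values. One common remedy is rejection sampling, sketched below. The source does not say which technique the module uses, so this is an assumption for illustration: urandomm_sketch is a hypothetical name, a 64-bit Perl is assumed, n is assumed to be between 1 and 2^32, and core rand stands in for the CSPRNG.

```perl
use strict;
use warnings;

# Hypothetical helper: unbiased integer in 0..n-1 from 32-bit draws.
sub urandomm_sketch {
    my ($n) = @_;
    my $range = 4294967296;                  # 2^32 possible raw values
    my $limit = $range - ($range % $n);      # largest multiple of n <= 2^32
    while (1) {
        # Stand-in 32-bit source; NOT cryptographically secure.
        my $raw = int(rand($range));
        # Values at or above $limit would over-represent small residues,
        # so they are discarded and a new value is drawn.
        return $raw % $n if $raw < $limit;
    }
}

print urandomm_sketch(100), "\n";   # some integer in [0,99]
```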

csrand

Takes a binary string data as input and seeds the internal CSPRNG. This is not normally needed as system entropy is used as a seed on startup. For best security this should be 16-128 bytes of good entropy. No more than 1024 bytes will be used.

With no argument, reseeds using system entropy, which is preferred.

If the secure configuration has been set, then this will croak if given an argument. This allows for control of reseeding with entropy the module gets itself, but not user supplied.

srand

Takes a single UV argument and seeds the CSPRNG with it, as well as returning it. If no argument is given, a new UV seed is constructed. Note that this creates a very weak seed from a cryptographic standpoint. It is useful for testing or simulations, but otherwise "csrand" is recommended, or simply keep using the default seed from system entropy.

The API is nearly identical to the system function srand. It uses a UV which can be 64-bit rather than always 32-bit. The behaviour for undef, empty string, empty list, etc. is slightly different (we treat these as 0).

This function is not exported with the ":all" tag, but is with ":rand".

If the secure configuration has been set, this function will croak. Manual seeding using srand is not compatible with cryptographic security.

rand

An alias for "drand", not exported unless the ":rand" tag is used.

random_factored_integer

my($n, $factors) = random_factored_integer(1000000);

Given a positive integer n, returns a uniform random integer in the range 1 to n, along with an array reference containing the factors.

This uses Kalai's algorithm for generating random integers along with their factorization, and is much faster than the naive method of generating random integers followed by a factorization. A later implementation may use Bach's more efficient algorithm.

RANDOM PRIMES

random_prime

my $small_prime = random_prime(1000);      # random prime <= limit
my $rand_prime = random_prime(100, 10000); # random prime within a range

Returns a pseudo-randomly selected prime that will be greater than or equal to the lower limit and less than or equal to the upper limit. If no lower limit is given, 2 is implied. Returns undef if no primes exist within the range.

The goal is to return a uniform distribution of the primes in the range, meaning for each prime in the range, the chances are equally likely that it will be seen. This removes from consideration such algorithms as PRIMEINC, which although efficient, gives very non-random output. This also implies that the numbers will not be evenly distributed, since the primes are not evenly distributed. Stated differently, the random prime functions return a uniformly selected prime from the set of primes within the range. Hence given random_prime(1000), the numbers 2, 3, 487, 631, and 997 all have the same probability of being returned.

For small ranges, a random index selection is done, which gives ideal uniformity and is very efficient with small inputs. For ranges larger than this threshold (roughly 16 bits) but within the native bit size, a Monte Carlo method is used. This also gives ideal uniformity and can be very fast for reasonably sized ranges. For even larger numbers, we partition the range, choose a random partition, then select a random prime from the partition. This gives some loss of uniformity but results in many fewer bits of randomness being consumed as well as being much faster.
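
The Monte Carlo approach can be sketched as: pick a uniform random integer in the range, return it if prime, otherwise try again. Since every integer is equally likely to be drawn, every prime is equally likely to be returned. The helper names below are hypothetical, trial division and core rand are stand-ins for the module's primality tests and CSPRNG, and the up-front scan for an existing prime is only reasonable for small ranges.

```perl
use strict;
use warnings;

# Hypothetical helper: trial-division primality test for small n.
sub is_prime_td {
    my ($n) = @_;
    return 0 if $n < 2;
    for (my $d = 2; $d * $d <= $n; $d++) {
        return 0 if $n % $d == 0;
    }
    return 1;
}

# Hypothetical helper: uniform random prime in [lo, hi] by rejection.
sub random_prime_sketch {
    my ($lo, $hi) = @_;
    ($lo, $hi) = (2, $lo) unless defined $hi;   # one argument: 2 is implied
    $lo = 2 if $lo < 2;
    # Guard: with no prime in range the loop below would never finish.
    return undef unless grep { is_prime_td($_) } $lo .. $hi;
    while (1) {
        my $candidate = $lo + int(rand($hi - $lo + 1));
        return $candidate if is_prime_td($candidate);
    }
}

print random_prime_sketch(100, 10000), "\n";   # some prime in the range
```

In contrast, PRIMEINC-style methods pick one starting point and step forward to the next prime, which favors primes that follow large gaps.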

random_ndigit_prime

say "My 4-digit prime number is: ", random_ndigit_prime(4);

Selects a random n-digit prime, where the input is an integer number of digits. One of the primes within that range (e.g. 1000 - 9999 for 4-digits) will be uniformly selected.

If the number of digits is greater than or equal to the maximum number of digits of the native integer type, then the result will be returned as a BigInt. However, if the nobigint configuration option is on, then output will be restricted to native size numbers, and requests for more digits than natively supported will result in an error. For better performance with large bit sizes, install Math::Prime::Util::GMP.

random_nbit_prime

my $bigprime = random_nbit_prime(512);

Selects a random n-bit prime, where the input is an integer number of bits. A prime with the nth bit set will be uniformly selected.

For bit sizes of 64 and lower, "random_prime" is used, which gives completely uniform results in this range. For sizes larger than 64, Algorithm 1 of Fouque and Tibouchi (2011) is used, wherein we select a random odd number for the lower bits, then loop selecting random upper bits until the result is prime. This allows a more uniform distribution than the general "random_prime" case while running slightly faster (in contrast, for large bit sizes "random_prime" selects a random upper partition then loops on the values within the partition, which very slightly skews the results towards smaller numbers).

The result will be a BigInt if the number of bits is greater than the native bit size. For better performance with large bit sizes, install Math::Prime::Util::GMP.

random_safe_prime

my $bigprime = random_safe_prime(512);

Produces an n-bit safe prime. This is a prime p where p = 2q+1 and q is also prime.

These types of primes are sometimes useful for discrete logarithm based cryptography, and can be generated more efficiently using simultaneous sieving.

random_strong_prime

my $bigprime = random_strong_prime(512);

Constructs an n-bit strong prime using Gordon's algorithm. We consider a strong prime p to be one where

p is large. This function requires at least 128 bits.
p-1 has a large prime factor r.
p+1 has a large prime factor s.
r-1 has a large prime factor t.

Using a strong prime in cryptography guards against easy factoring with algorithms like Pollard's p-1 and Williams' p+1, which are fast when p-1 or p+1 is smooth. Rivest and Silverman (1999) present a case that using strong primes is unnecessary, and most modern cryptographic systems agree. First, the smoothness does not affect more modern factoring methods such as ECM. Second, modern factoring methods like GNFS are far faster than either method, which makes the point moot. Third, due to key size growth and advances in factoring and attacks, for practical purposes, using large random primes offers security equivalent to strong primes.

Similar to "random_nbit_prime", the result will be a BigInt if the number of bits is greater than the native bit size. For better performance with large bit sizes, install Math::Prime::Util::GMP.

random_proven_prime

my $bigprime = random_proven_prime(512);

Constructs an n-bit random proven prime. Internally this may either apply "is_provable_prime" to the output of "random_nbit_prime", or use "random_maurer_prime", depending on the platform and bit size.

random_proven_prime_with_cert

my($n, $cert) = random_proven_prime_with_cert(512);

Similar to "random_proven_prime", but returns a two-element array containing the n-bit provable prime along with a primality certificate. The certificate is the same as produced by "prime_certificate" or "is_provable_prime_with_cert", and can be parsed by "verify_prime" or any other software that understands MPU primality certificates.

random_maurer_prime

my $bigprime = random_maurer_prime(512);

Construct an n-bit provable prime, using the FastPrime algorithm of Ueli Maurer (1995). This is the same algorithm used by Crypt::Primes. Similar to "random_nbit_prime", the result will be a BigInt if the number of bits is greater than the native bit size.

The performance with Math::Prime::Util::GMP installed is hundreds of times faster, so it is highly recommended.

The differences between this function and that in Crypt::Primes are described in the "SEE ALSO" section.

Internally this additionally runs the BPSW probable prime test on every partial result, and constructs a primality certificate for the final result, which is verified. These provide additional checks that the resulting value has been properly constructed.

If you don't need absolutely proven results, then it is somewhat faster to use "random_nbit_prime" either by itself or with some additional tests, e.g. "miller_rabin_random" and/or "is_frobenius_underwood_pseudoprime". One could also run "is_provable_prime" on the result, but this will be slow.

random_maurer_prime_with_cert

my($n, $cert) = random_maurer_prime_with_cert(512);

As with "random_maurer_prime", but returns a two-element array containing the n-bit provable prime along with a primality certificate. The certificate is the same as produced by "prime_certificate" or "is_provable_prime_with_cert", and can be parsed by "verify_prime" or any other software that understands MPU primality certificates. The proof construction consists of a single chain of BLS3 types.

random_shawe_taylor_prime

my $bigprime = random_shawe_taylor_prime(8192);

Construct an n-bit provable prime, using the Shawe-Taylor algorithm in section C.6 of FIPS 186-4. This uses 512 bits of randomness and SHA-256 as the hash. This is a slightly simpler and older (1986) method than Maurer's 1995 construction. It is a bit faster than Maurer's method, and uses less system entropy for large sizes. The primary reason to use this rather than Maurer's method is to use the FIPS 186-4 algorithm.

Similar to "random_nbit_prime", the result will be a BigInt if the number of bits is greater than the native bit size. For better performance with large bit sizes, install Math::Prime::Util::GMP. Also see "random_maurer_prime" and "random_proven_prime".

random_shawe_taylor_prime_with_cert

my($n, $cert) = random_shawe_taylor_prime_with_cert(4096);

As with "random_shawe_taylor_prime", but returns a two-element array containing the n-bit provable prime along with a primality certificate. The certificate is the same as produced by "prime_certificate" or "is_provable_prime_with_cert", and can be parsed by "verify_prime" or any other software that understands MPU primality certificates. The proof construction consists of a single chain of Pocklington types.

random_semiprime

Takes a positive integer number of bits b and returns a random semiprime of exactly b bits. The result has exactly two prime factors (hence semiprime).

The factors will be of approximately equal size, which is typical for cryptographic use. For example, a 64-bit semiprime of this type is the product of two 32-bit primes. b must be 4 or greater.

Some effort is taken to select uniformly from the universe of b-bit semiprimes. This takes slightly longer than some methods that do not select uniformly.

random_unrestricted_semiprime

Takes a positive integer number of bits b and returns a random semiprime of exactly b bits. The result has exactly two prime factors (hence semiprime).

The result is uniformly selected from the universe of all b-bit semiprimes. This means semiprimes with one factor equal to 2 will be most common, 3 next most common, etc. b must be 3 or greater.

Some effort is taken to select uniformly from the universe of b-bit semiprimes. This takes slightly longer than some methods that do not select uniformly.

UTILITY FUNCTIONS

prime_precalc

prime_precalc( 1_000_000_000 );

Let the module prepare for fast operation up to a specific number. It is not necessary to call this, but it gives you more control over when memory is allocated and gives faster results for multiple calls in some cases. In the current implementation this will calculate a sieve for all numbers up to the specified number.

prime_memfree

prime_memfree;

Frees any extra memory the module may have allocated. Like with prime_precalc, it is not necessary to call this, but if you're done making calls, or want things cleaned up, you can use this. The object method might be a better choice for complicated uses.

Math::Prime::Util::MemFree->new

use Math::Prime::Util::MemFree;
my $mf = Math::Prime::Util::MemFree->new;
# perform operations.  When $mf goes out of scope, memory will be recovered.

This is a more robust way of making sure any cached memory is freed, as it will be handled by the last MemFree object leaving scope. This means if your routines were inside an eval that died, things will still get cleaned up. If you call another function that uses a MemFree object, the cache will stay in place because you still have an object.

prime_get_config

my $cached_up_to = prime_get_config->{'precalc_to'};

# Print all configuration
my $r = prime_get_config();
say "$_ $r->{$_}" for sort keys %$r;

Returns a reference to a hash of the current settings. The hash is a copy of the configuration, so changing it has no effect. The settings include:

verbose         verbose level.  1 or more will result in extra output.
bigintclass     the bigint type name (default Math::BigInt)
precalc_to      primes up to this number are calculated
maxbits         the maximum number of bits for native operations
xs              0 or 1, indicating the XS code is available
gmp             0 or 1, indicating GMP code is available
maxparam        the largest value for most functions, without bigint
maxdigits       the max digits in a number, without bigint
maxprime        the largest representable prime, without bigint
maxprimeidx     the index of maxprime, without bigint
assume_rh       whether to assume the Riemann hypothesis (default 0)
secure          disable ability to manually seed the CSPRNG

prime_set_config

prime_set_config( assume_rh => 1 );

prime_set_config( bigint => 'Math::GMPz' );

Allows setting of some parameters. Currently the only parameters are:

verbose      The default setting of 0 will generate no extra output.
             Setting to 1 or higher results in extra output.  For
             example, at setting 1 the AKS algorithm will indicate
             the chosen r and s values.  At setting 2 it will output
             a sequence of dots indicating progress.  Similarly, for
             random_maurer_prime, setting 3 shows real time progress.
             Factoring large numbers is another place where verbose
             settings can give progress indications.

bigint       You can give either a single object (e.g. a value of the
             class you want), or a comma separated list of class names.
             The first class we can load will be used for all operations
             that use a bigint.
             A warning will be produced if one was not found.

trybigint    Exactly the same behavior as "bigint" but no warning
             will be output if we couldn't load anything from the list.

xs           Allows turning off the XS code, forcing the Pure Perl
             code to be used.  Set to 0 to disable XS, set to 1 to
             re-enable.  You probably will never want to do this.

gmp          Allows turning off the use of Math::Prime::Util::GMP,
             which means using Pure Perl code for big numbers.  Set
             to 0 to disable GMP, set to 1 to re-enable.
             You probably will never want to do this.

assume_rh    Allows functions to assume the Riemann hypothesis is
             true if set to 1.  This defaults to 0.  Currently this
             setting only impacts prime count lower and upper
             bounds, but could later be applied to other areas such
             as primality testing.  A later version may also have a
             way to indicate whether no RH, RH, GRH, or ERH is to
             be assumed.

secure       The CSPRNG may no longer be manually seeded.  Once set,
             this option cannot be disabled.  L</srand> will croak
             if called, and L</csrand> will croak if called with any
             arguments.  L</csrand> with no arguments is still allowed,
             as that will use system entropy without giving anything
             to the caller.  The point of this option is to ensure that
             any called functions do not try to control the RNG.
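
As a usage sketch of the settings above, the following toggles assume_rh around calls to the prime count bound functions and reads the setting back with prime_get_config. The variable names are illustrative, and no particular bound values are claimed.

```perl
use strict;
use warnings;
use Math::Prime::Util qw/prime_set_config prime_get_config
                         prime_count_lower prime_count_upper/;

my $n = 10**12;

# Bounds without assuming the Riemann hypothesis (the default).
prime_set_config( assume_rh => 0 );
my @plain = ( prime_count_lower($n), prime_count_upper($n) );

# Allow the bound functions to assume RH, then read the setting back.
prime_set_config( assume_rh => 1 );
my $rh_on = prime_get_config->{assume_rh};
my @rh    = ( prime_count_lower($n), prime_count_upper($n) );

# Restore the default so later code is unaffected.
prime_set_config( assume_rh => 0 );

print "no RH: $plain[0] .. $plain[1]\n";
print "RH:    $rh[0] .. $rh[1]\n";
```

Restoring the default at the end matters because the configuration is global to the process.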

FACTORING FUNCTIONS

factor

my @factors = factor(3_369_738_766_071_892_021);
# returns (204518747,16476429743)

Produces the prime factors of a positive number input, in numerical order. The product of the returned factors will be equal to the input. n = 1 will return an empty list, and n = 0 will return 0. This matches Pari.

In scalar context, returns Ω(n), the total number of prime factors (OEIS A001222). This corresponds to Pari's bigomega(n) function and Mathematica's PrimeOmega[n] function. This is the same result that we would get if we evaluated the resulting array in scalar context.

The current algorithm does a little trial division, a check for perfect powers, followed by combinations of Pollard's Rho, SQUFOF, and Pollard's p-1. The combination is applied to each non-prime factor found.

Factoring bigints works with pure Perl, and can be very handy on 32-bit machines for numbers just over the 32-bit limit, but it can be very slow for "hard" numbers. Installing the Math::Prime::Util::GMP module will speed up bigint factoring a lot, and all future effort on large number factoring will be in that module. If you do not have that module for some reason, use the GMP or Pari version of bigint if possible (e.g. use bigint try => 'GMP,Pari'), which will run 2-3x faster (though still 100x slower than the real GMP code).
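
A small sketch of the two calling contexts described above, assuming a 64-bit Perl so the input is a native integer. It uses only factor and vecprod from this module.

```perl
use strict;
use warnings;
use Math::Prime::Util qw/factor vecprod/;

my $n = 29513484000;

# List context: prime factors with multiplicity, in ascending order.
my @factors = factor($n);

# Scalar context: the number of prime factors counted with multiplicity.
my $bigomega = factor($n);

# The product of the returned factors reproduces the input.
my $product = vecprod(@factors);

print "factors:  @factors\n";   # 2 2 2 2 2 3 3 3 3 5 5 5 7 7 11 13 13
print "Omega(n): $bigomega\n";  # 17
print "product:  $product\n";   # 29513484000
```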

factor_exp

my @factor_exponent_pairs = factor_exp(29513484000);
# returns ([2,5], [3,4], [5,3], [7,2], [11,1], [13,2])
# factor(29513484000)
# returns (2,2,2,2,2,3,3,3,3,5,5,5,7,7,11,13,13)

Produces pairs of prime factors and exponents in numerical factor order. This is more convenient for some algorithms. This is the same form that Mathematica's FactorInteger[n] and Pari/GP's factorint functions return. Note that Math::Pari transposes the Pari result matrix.

In scalar context, returns ω(n), the number of unique prime factors (OEIS A001221). This corresponds to Pari's omega(n) function and Mathematica's PrimeNu[n] function. This is the same result that we would get if we evaluated the resulting array in scalar context.

The internals are identical to "factor", so all comments there apply. Just the way the factors are arranged is different.

divisors

my @divisors = divisors(30);   # returns (1, 2, 3, 5, 6, 10, 15, 30)

Produces all the divisors of a positive number input, including 1 and the input number. The divisors are a power set of multiplications of the prime factors, returned as a uniqued sorted list. The result is identical to that of Pari's divisors and Mathematica's Divisors[n] functions.

In scalar context this returns the sigma0 function (see Hardy and Wright section 16.7). This is OEIS A000005. The result is identical to evaluating the array in scalar context, but more efficient. This corresponds to Pari's numdiv and Mathematica's DivisorSigma[0,n] functions.

Also see the "fordivisors" function for looping over the divisors.

When n=0 we return the empty set (zero in scalar context).

An optional second positive integer argument k indicates that the results should not include any value larger than k. This is especially useful when the number has thousands of divisors and we may only be interested in the small ones.
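
To illustrate how the divisors arise from the prime factorization, here is a pure Perl sketch that expands [prime, exponent] pairs (the form "factor_exp" returns) into the sorted divisor list. The helper names divisors_from_pairs and divisor_count_from_pairs are hypothetical and this is not the module's implementation.

```perl
use strict;
use warnings;

# Hypothetical helper: expand [prime, exponent] pairs into sorted divisors.
sub divisors_from_pairs {
  my @pairs = @_;
  my @divs = (1);
  for my $pair (@pairs) {
    my ($p, $e) = @$pair;
    my @next = @divs;                 # divisors that use p^0
    my $pk = 1;
    for my $k (1 .. $e) {
      $pk *= $p;
      push @next, map { $_ * $pk } @divs;   # divisors that use p^k
    }
    @divs = @next;
  }
  return sort { $a <=> $b } @divs;
}

# The divisor count is the product of (exponent + 1) over the pairs.
sub divisor_count_from_pairs {
  my $count = 1;
  $count *= $_->[1] + 1 for @_;
  return $count;
}

my @d30 = divisors_from_pairs([2,1], [3,1], [5,1]);   # 30 = 2 * 3 * 5
my @d12 = divisors_from_pairs([2,2], [3,1]);          # 12 = 2^2 * 3

print "@d30\n";   # 1 2 3 5 6 10 15 30
print "@d12\n";   # 1 2 3 4 6 12
```

The count helper shows why the scalar-context result can be computed without building the list.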

trial_factor

my @factors = trial_factor($n);

Produces the prime factors of a positive number input using trial division. The factors will be in numerical order. For large inputs this will be very slow.

An optional second argument will indicate an upper limit for factors. Factors 2, 3, and 5 are always pulled out. Factors larger than the second argument will not be found and hence the last value in the list might be composite.

Like all the specific-algorithm *_factor routines, this is not exported unless explicitly requested.
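
The effect of the upper limit can be seen in a pure Perl sketch of trial division. The helper my_trial_factor is hypothetical and simplified: unlike the module's routine, it does not always pull out 2, 3, and 5 when the limit is smaller than those.

```perl
use strict;
use warnings;

# Hypothetical pure Perl helper; not the module's implementation.
sub my_trial_factor {
  my ($n, $limit) = @_;
  my @factors;
  my $p = 2;
  while ($p * $p <= $n && (!defined $limit || $p <= $limit)) {
    if ($n % $p == 0) {
      push @factors, $p;
      $n = int($n / $p);
    } else {
      $p += ($p == 2) ? 1 : 2;   # after 2, try odd candidates only
    }
  }
  push @factors, $n if $n > 1;   # remaining cofactor (prime unless limited)
  return @factors;
}

my @full    = my_trial_factor(2 * 3 * 101 * 103);
my @limited = my_trial_factor(2 * 3 * 101 * 103, 10);

print "@full\n";      # 2 3 101 103
print "@limited\n";   # 2 3 10403
```

With the limit of 10, the last value 10403 = 101 * 103 is composite, which is the situation the text warns about.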

fermat_factor

my @factors = fermat_factor($n);

Produces factors, not necessarily prime, of the positive number input. The particular algorithm is Knuth's algorithm C. For small inputs this will be very fast, but it slows down quite rapidly as the number of digits increases. It is very fast for inputs with a factor close to the midpoint (e.g. a semiprime p*q where p and q are the same number of digits).
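
The idea behind Fermat's method can be sketched in pure Perl as a search for a difference of squares. This is the plain textbook form for small odd inputs, not Knuth's algorithm C as used by the module, and the helper names isqrt_small and my_fermat_factor are hypothetical.

```perl
use strict;
use warnings;

# Integer square root, corrected for floating point rounding.
sub isqrt_small {
  my $n = shift;
  my $r = int(sqrt($n));
  $r-- while $r * $r > $n;
  $r++ while ($r + 1) * ($r + 1) <= $n;
  return $r;
}

# Hypothetical helper: find r, s with n = r^2 - s^2 = (r - s)(r + s).
sub my_fermat_factor {
  my $n = shift;
  return ($n) if $n < 3 || $n % 2 == 0;   # sketch handles odd n only
  my $r = isqrt_small($n);
  $r++ if $r * $r < $n;                   # r = ceil(sqrt(n))
  while (1) {
    my $s2 = $r * $r - $n;
    my $s  = isqrt_small($s2);
    return ($r - $s, $r + $s) if $s * $s == $s2;
    $r++;
  }
}

my @f = my_fermat_factor(5959);
print "@f\n";   # 59 101
```

The search starts at the square root, so it finishes quickly when the two factors are close together. For a prime input this sketch eventually returns (1, n).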

holf_factor

my @factors = holf_factor($n);

Produces factors, not necessarily prime, of the positive number input. An optional number of rounds can be given as a second parameter. It is possible the function will be unable to find a factor, in which case a single element, the input, is returned. This uses Hart's One Line Factorization with no premultiplier. It is an interesting alternative to Fermat's algorithm, and there are some inputs it can rapidly factor. Overall it has the same advantages and disadvantages as Fermat's method.

lehman_factor

my @factors = lehman_factor($n);

Produces factors, not necessarily prime, of the positive number input. An optional argument, defaulting to 0 (false), indicates whether to run trial division. Without trial division, it is possible the function will be unable to find a factor, in which case a single element, the input, is returned.

This is Warren D. Smith's Lehman core with minor modifications. It is limited to 42-bit inputs: n < 8796393022208.

squfof_factor

my @factors = squfof_factor($n);

Produces factors, not necessarily prime, of the positive number input, using Shanks' square forms factorization (SQUFOF).

prho_factor

Pollard's rho factoring algorithm. See "pbrent_factor" for the shared description of both functions.

pbrent_factor

my @factors = prho_factor($n);
my @factors = pbrent_factor($n);

# Use a very small number of rounds
my @factors = prho_factor($n, 1000);

Produces factors, not necessarily prime, of the positive number input. An optional number of rounds can be given as a second parameter. These attempt to find a single factor using Pollard's Rho algorithm, either the original version or Brent's modified version. These are more specialized algorithms usually used for pre-factoring very large inputs, as they are very fast at finding small factors.
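
For readers unfamiliar with the method, here is a pure Perl sketch of the original Pollard's Rho with Floyd cycle detection and a rounds limit. The helper names are hypothetical, the polynomial x^2+1 and starting point 2 are illustrative choices, and the code assumes n is small enough (below about 2^26) that the squarings stay exact in native arithmetic. Returning the input alone on failure is an assumption made for this sketch.

```perl
use strict;
use warnings;

sub gcd_small {
  my ($x, $y) = @_;
  ($x, $y) = ($y, $x % $y) while $y;
  return $x;
}

# Hypothetical helper; not the module's implementation.
sub my_prho_factor {
  my ($n, $rounds) = @_;
  $rounds = 10_000 unless defined $rounds;
  my ($x, $y) = (2, 2);
  for my $round (1 .. $rounds) {
    $x = ($x * $x + 1) % $n;          # slow sequence: one step
    $y = ($y * $y + 1) % $n;          # fast sequence: two steps
    $y = ($y * $y + 1) % $n;
    my $d = gcd_small(abs($x - $y), $n);
    return ($d, $n / $d) if $d > 1 && $d < $n;
    last if $d == $n;                 # cycle closed without a factor
  }
  return ($n);                        # no factor found
}

my @found  = my_prho_factor(8051);      # 8051 = 83 * 97
my @giveup = my_prho_factor(8051, 1);   # too few rounds

print "@found\n";    # 97 83
print "@giveup\n";   # 8051
```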

pminus1_factor

my @factors = pminus1_factor($n);
my @factors = pminus1_factor($n, 1_000);          # set B1 smoothness
my @factors = pminus1_factor($n, 1_000, 50_000);  # set B1 and B2

Produces factors, not necessarily prime, of the positive number input. This is Pollard's p-1 method, using two stages with default smoothness settings of 1_000_000 for B1, and 10 * B1 for B2. This method can rapidly find a factor p of n where p-1 is smooth (it has no large factors).
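
The role of the B1 bound can be seen in a pure Perl sketch of stage 1 only. The helper names are hypothetical, the base 2 is an illustrative choice, and the code assumes small n so that products of residues stay exact in native arithmetic. It is not the module's two-stage implementation.

```perl
use strict;
use warnings;

sub gcd_small {
  my ($x, $y) = @_;
  ($x, $y) = ($y, $x % $y) while $y;
  return $x;
}

# Modular exponentiation by squaring, for small moduli only.
sub powmod_small {
  my ($base, $exp, $n) = @_;
  my $result = 1;
  $base %= $n;
  while ($exp > 0) {
    $result = ($result * $base) % $n if $exp & 1;
    $base = ($base * $base) % $n;
    $exp >>= 1;
  }
  return $result;
}

# Hypothetical helper: stage 1 of Pollard's p-1 with smoothness bound B1.
sub my_pminus1_stage1 {
  my ($n, $B1) = @_;
  my $w = 2;
  for my $q (2 .. $B1) {
    next if grep { $q % $_ == 0 } 2 .. int(sqrt($q));   # skip composites
    my $qk = $q;
    $qk *= $q while $qk * $q <= $B1;   # largest power of q not above B1
    $w = powmod_small($w, $qk, $n);
  }
  my $d = gcd_small(($w - 1) % $n, $n);
  return ($d > 1 && $d < $n) ? ($d, $n / $d) : ($n);
}

# 299 = 13 * 23.  13-1 = 12 = 2^2 * 3 is 5-smooth; 23-1 = 22 is not.
my @hit  = my_pminus1_stage1(299, 5);
my @miss = my_pminus1_stage1(299, 3);   # bound too small to cover 2^2

print "@hit\n";    # 13 23
print "@miss\n";   # 299
```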

pplus1_factor

my @factors = pplus1_factor($n);
my @factors = pplus1_factor($n, 1_000);          # set B1 smoothness

Produces factors, not necessarily prime, of the positive number input. This is Williams' p+1 method, using one stage and two predefined initial points.

cheb_factor

my @factors = cheb_factor($n);
my @factors = cheb_factor($n, 1_000);          # set B1 smoothness

Produces factors, not necessarily prime, of the positive number input. This uses the properties of Chebyshev polynomials (particularly that T_mn(x) = T_m(T_n(x))) and their relationship with the Lucas sequence, to find factors if p-1 or p+1 is smooth.

This generally works better than our "pplus1_factor", but is slower than our "pminus1_factor".

ecm_factor

my @factors = ecm_factor($n);
my @factors = ecm_factor($n, 100, 400, 10);      # B1, B2, # of curves

Produces factors, not necessarily prime, of the positive number input. This is the elliptic curve method using two stages.

MATHEMATICAL FUNCTIONS

ExponentialIntegral

my $Ei = ExponentialIntegral($x);

Given a non-zero floating point input x, this returns the real-valued exponential integral of x, defined as the integral of e^t/t dt from -infinity to x.

For non-BigFloat inputs, the result should be accurate to at least 14 digits.

For BigFloat inputs, full accuracy and performance is obtained only if Math::Prime::Util::GMP is installed. If this module is not available, then other methods are used and give at least 14 digits of accuracy: continued fractions (x < -1), rational Chebyshev approximation (-1 < x < 0), a convergent series (small positive x), or an asymptotic divergent series (large positive x). The accuracy() setting of the input is used to determine the output accuracy.

LogarithmicIntegral

my $li = LogarithmicIntegral($x);

Given a non-negative floating point input, returns the floating point logarithmic integral of x, defined as the integral of dt/ln t from 0 to x. If given a negative input, the function will croak. The function returns 0 at x = 0, and -infinity at x = 1.

This is often known as li(x). A related function is the offset logarithmic integral, sometimes known as Li(x), which avoids the singularity at 1. It may be defined as Li(x) = li(x) - li(2). Crandall and Pomerance use the term li0 for the function computed here, and define li(x) = li0(x) - li0(2). Due to this terminology confusion, it is important to check which exact definition is being used.

For non-BigFloat objects, the result should be accurate to at least 14 digits.

For BigFloat inputs, full accuracy and performance is obtained only if Math::Prime::Util::GMP is installed. The accuracy() setting of the input is used to determine the output accuracy.
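
A short sketch relating the two definitions, using native floating point inputs. The wrapper name offset_li is ours and not part of the module.

```perl
use strict;
use warnings;
use Math::Prime::Util qw/LogarithmicIntegral prime_count/;

# Offset logarithmic integral Li(x) = li(x) - li(2).
sub offset_li {
  my $x = shift;
  return LogarithmicIntegral($x) - LogarithmicIntegral(2);
}

my $x   = 1_000_000;
my $li  = LogarithmicIntegral($x);
my $Li  = offset_li($x);
my $pix = prime_count($x);

printf "li(%d) = %.4f\n", $x, $li;
printf "Li(%d) = %.4f\n", $x, $Li;
printf "pi(%d) = %d\n",   $x, $pix;
```

The two integrals differ by the constant li(2), roughly 1.045, so either can serve as a rough estimate of the prime count at this size.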

RiemannZeta

my $z = RiemannZeta($s);

Given a non-negative floating point input s, returns the floating point value of ζ(s)-1, where ζ(s) is the Riemann zeta function. One is subtracted to ensure maximum precision for large values of s. The zeta function is the sum from k=1 to infinity of 1 / k^s. This function only uses real arguments, so is more properly the Euler Zeta function.

For non-BigFloat objects, the result should be accurate to at least 14 digits. The XS code uses a rational Chebyshev approximation between 0.5 and 5, and a series for other values. The PP code uses an identical series for all values.

For BigFloat inputs, full accuracy and performance is obtained only if Math::Prime::Util::GMP is installed. If this module is not available, then other methods are used and give at least 14 digits of accuracy: Either Borwein (1991) algorithm 2, or the basic series. Math::BigFloat RT 43692 can produce incorrect high-accuracy computations when GMP is not used. The accuracy() setting of the input is used to determine the output accuracy.
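
Because the return value is ζ(s)-1 rather than ζ(s), a caller usually adds 1 back. The sketch below checks s = 2 against Euler's closed form π^2/6 and shows why the offset helps for large s. Variable names are illustrative.

```perl
use strict;
use warnings;
use Math::Prime::Util qw/RiemannZeta Pi/;

# RiemannZeta returns zeta(s) - 1, so add 1 for the usual value.
my $zeta2    = 1 + RiemannZeta(2);
my $expected = Pi() ** 2 / 6;       # closed form for zeta(2)

# For large s, zeta(s) rounds to exactly 1 in a double, but the
# returned tail (a little more than 2**-40 here) keeps full precision.
my $tail40 = RiemannZeta(40);

printf "zeta(2)    = %.15f\n", $zeta2;
printf "pi^2/6     = %.15f\n", $expected;
printf "zeta(40)-1 = %.6e\n",  $tail40;
```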

RiemannR

my $r = RiemannR($x);

Given a positive floating point input, returns the floating point value of Riemann's R function. Riemann's R function gives a very close approximation to the prime counting function.

For non-BigFloat objects, the result should be accurate to at least 14 digits.

For BigFloat inputs, full accuracy and performance is obtained only if Math::Prime::Util::GMP is installed. If that module is not available, accuracy should be 35 digits. The accuracy() setting of the input is used to determine the output accuracy.

LambertW

Returns the principal branch of the Lambert W function of a real value. Given a value k this solves for W in the equation k = We^W. The input must not be less than -1/e. This corresponds to Pari's lambertw function and Mathematica's ProductLog / LambertW function.

This function handles all real value inputs with non-complex return values from the principal branch. Pari/GP's lambertw prior to 2.15 (2022) was a subset of this. Recent Pari/GP and Mathematica both have more complete functions with both branches, and support for complex arguments and results.

Calculation will be done with C long doubles if the input is a standard scalar, but if the input is a BigFloat type, then extended precision results will be generated. The accuracy() setting of the input is used to determine the output accuracy.

Speed of the native code is about half of the fastest native code (Veberic's C++), and about 10x faster than Pari/GP. However the bignum calculation is slower than Pari/GP.
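
A brief usage sketch with native inputs: each result W is substituted back into W*e^W to confirm it reproduces the input k. The sample inputs are arbitrary values at or above -1/e.

```perl
use strict;
use warnings;
use Math::Prime::Util qw/LambertW/;

for my $k (-0.25, 0, 1, 10, 1e6) {
  my $w    = LambertW($k);
  my $back = $w * exp($w);          # should reproduce k
  printf "k = %-8g  W(k) = %.15g  W*e^W = %.15g\n", $k, $w, $back;
}

# W(1) is the omega constant, approximately 0.567143290409784.
my $omega = LambertW(1);
# W(e) is exactly 1, since 1 * e^1 = e.
my $w_e   = LambertW(exp(1));
```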

Pi

my $tau = 2 * Pi;     # $tau = 6.28318530717959
my $tau = 2 * Pi(40); # $tau = 6.283185307179586476925286766559005768394

With no arguments, returns the value of Pi as an NV. With a positive integer argument, returns the value of Pi with the requested number of digits (including the leading 3). The return value will be an NV if the number of digits fits in an NV (typically 15 or fewer), or a Math::BigFloat object otherwise.

For sizes over 10k digits, having either Math::Prime::Util::GMP or Math::BigInt::GMP installed will help performance. For sizes over 50k, GMP is highly recommended.

PLATFORM INTROSPECTION

OVERVIEW

We include a number of non-exported functions that are useful for internal use but can also be useful for users. These functions are subject to change or deletion in future revisions.

_uvsize

Returns the size of a UV in bytes (typically 4 or 8). This is the size of the basic integer type used in Perl and the C library.

_uvbits

Returns the size of a UV in bits (typically 32 or 64).

_ivsize

Returns the size of an IV in bytes (typically 4 or 8). This is going to be the same as "_uvsize".

_nvsize

Returns the size of an NV in bytes (typically 4, 8, or 16). It's quite possible other sizes could be seen on non-standard configurations. Usually we won't care about this directly.

_nvmantbits

Returns the size of the mantissa of Perl's NV floating point type, in bits. This can vary widely, with 23, 52, 112 all possible from mainstream platforms and other numbers possible.

This gives the actual mantissa bits, not counting the implicit 1. The significand precision is therefore one higher than the value returned by this function. A typical IEEE-754 double will report 52 here, which means integers up to 2^53-1 are able to be accurately stored.

Perl prior to 5.23 did not configure this at build time. We will guess based on the byte size of the NV on an IEEE-754 machine.

_nvmantdigits

Returns the number of full decimal integer digits that can be stored in an NV.

EXAMPLES

Print Fibonacci numbers:

perl -Mntheory=:all -E 'say lucasu(1,-1,$_) for 0..20'

Print strong pseudoprimes to base 17 up to 10M:

# Similar to A001262's isStrongPsp function, but much faster
perl -MMath::Prime::Util=:all -E 'foroddcomposites { say if is_strong_pseudoprime($_,17) } 10000000;'

Print some primes above 64-bit range:

perl -MMath::Prime::Util=:all -Mbigint -E 'my $start=100000000000000000000; say join "\n", @{primes($start,$start+1000)}'

# Another way
perl -MMath::Prime::Util=:all -E 'forprimes { say } "100000000000000000039", "100000000000000000993"'

# Similar using Math::Pari:
# perl -MMath::Pari=:int,PARI,nextprime -E 'my $start = PARI "100000000000000000000"; my $end = $start+1000; my $p=nextprime($start); while ($p <= $end) { say $p; $p = nextprime($p+1); }'

Generate Carmichael numbers (OEIS A002997):

perl -Mntheory=:all -E 'foroddcomposites { say if is_carmichael($_) } 1e6;'

# Less efficient, similar to Mathematica or MAGMA:
perl -Mntheory=:all -E 'foroddcomposites { say if $_ % carmichael_lambda($_) == 1 } 1e6;'

Examining the η3(x) function of Planat and Solé (2011):

sub nu3 {
  my $n = shift;
  my $phix = chebyshev_psi($n);
  my $nu3 = 0;
  foreach my $nu (1..3) {
    $nu3 += (moebius($nu)/$nu)*LogarithmicIntegral($phix**(1/$nu));
  }
  return $nu3;
}
say prime_count(1000000);
say prime_count_approx(1000000);
say nu3(1000000);

Construct and use a Sophie-Germain prime iterator:

sub make_sophie_germain_iterator {
  my $p = shift || 2;
  my $it = prime_iterator($p);
  return sub {
    do { $p = $it->() } while !is_prime(2*$p+1);
    $p;
  };
}
my $sgit = make_sophie_germain_iterator();
print $sgit->(), "\n"  for 1 .. 10000;

Project Euler, problem 3 (Largest prime factor):

use Math::Prime::Util qw/factor/;
use bigint;  # Only necessary for 32-bit machines.
say 0+(factor(600851475143))[-1]

Project Euler, problem 7 (10001st prime):

use Math::Prime::Util qw/nth_prime/;
say nth_prime(10_001);

Project Euler, problem 10 (summation of primes):

use Math::Prime::Util qw/sum_primes/;
say sum_primes(2_000_000);
#  ... or do it a little more manually ...
use Math::Prime::Util qw/forprimes/;
my $sum = 0;
forprimes { $sum += $_ } 2_000_000;
say $sum;
#  ... or do it using a big list ...
use Math::Prime::Util qw/vecsum primes/;
say vecsum( @{primes(2_000_000)} );

Project Euler, problem 21 (Amicable numbers):

use Math::Prime::Util qw/divisor_sum/;
my $sum = 0;
foreach my $x (1..10000) {
  my $y = divisor_sum($x)-$x;
  $sum += $x + $y if $y > $x && $x == divisor_sum($y)-$y;
}
say $sum;
# Or using a pipeline:
use Math::Prime::Util qw/vecsum divisor_sum/;
say vecsum( map { divisor_sum($_) }
            grep { my $y = divisor_sum($_)-$_;
                   $y > $_ && $_==(divisor_sum($y)-$y) }
            1 .. 10000 );

Project Euler, problem 41 (Pandigital prime), brute force command line:

perl -MMath::Prime::Util=primes,vecfirst -E 'say vecfirst { /1/&&/2/&&/3/&&/4/&&/5/&&/6/&&/7/} reverse @{primes(1000000,9999999)};'

Project Euler, problem 47 (Distinct primes factors):

use Math::Prime::Util qw/pn_primorial factor_exp/;
my $n = pn_primorial(4);  # Start with the first 4-factor number
# factor_exp in scalar context returns the number of distinct prime factors
$n++ while (factor_exp($n) != 4 || factor_exp($n+1) != 4 || factor_exp($n+2) != 4 || factor_exp($n+3) != 4);
say $n;

Project Euler, problem 69, stupid brute force solution (about 1 second):

use Math::Prime::Util qw/euler_phi/;
my ($maxn, $maxratio) = (0,0);
foreach my $n (1..1000000) {
  my $ndivphi = $n / euler_phi($n);
  ($maxn, $maxratio) = ($n, $ndivphi) if $ndivphi > $maxratio;
}
say "$maxn  $maxratio";

Here is the right way to do PE problem 69 (under 0.03s):

use Math::Prime::Util qw/pn_primorial/;
my $n = 0;
$n++ while pn_primorial($n+1) < 1000000;
say pn_primorial($n);

Project Euler, problem 187, stupid brute force solution, 1 to 2 minutes:

use Math::Prime::Util qw/forcomposites factor/;
my $nsemis = 0;
forcomposites { $nsemis++ if scalar factor($_) == 2; } int(10**8)-1;
say $nsemis;

Here is one of the best ways for PE187: under 20 milliseconds from the command line. Much faster than Pari, and competitive with Mathematica.

use Math::Prime::Util qw/forprimes prime_count/;
my $limit = shift || int(10**8);
$limit--;
my ($sum, $pc) = (0, 1);
forprimes {
  $sum += prime_count(int($limit/$_)) + 1 - $pc++;
} int(sqrt($limit));
say $sum;

To get the result of "matches" in Math::Factor::XS:

use Math::Prime::Util qw/divisors/;
sub matches {
  my @d = divisors(shift);
  return map { [$d[$_],$d[$#d-$_]] } 1..(@d-1)>>1;
}
my $n = 139650;
say "$n = ", join(" = ", map { "$_->[0]·$_->[1]" } matches($n));

or its matches function with the skip_multiples option:

sub matches {
  my @d = divisors(shift);
  return map { [$d[$_],$d[$#d-$_]] }
         grep { my $div=$d[$_]; !scalar(grep {!($div % $d[$_])} 1..$_-1) }
         1..(@d-1)>>1;
}

Compute OEIS A054903 just like CRG4's Pari example:

use Math::Prime::Util qw/forcomposites divisor_sum/;
forcomposites {
  say if divisor_sum($_)+6 == divisor_sum($_+6)
} 9,1e7;

Construct the table shown in OEIS A046147:

use Math::Prime::Util qw/znprimroot znorder euler_phi gcd/;
foreach my $n (1..100) {
  if (!znprimroot($n)) {
    say "$n -";
  } else {
    my $phi = euler_phi($n);
    my @r = grep { gcd($_,$n) == 1 && znorder($_,$n) == $phi } 1..$n-1;
    say "$n ", join(" ", @r);
  }
}

Find the 7-digit palindromic primes in the first 20k digits of Pi:

use Math::Prime::Util qw/Pi is_prime/;
my $pi = "".Pi(20000);  # make sure we only stringify once
for my $pos (2 .. length($pi)-7) {
  my $s = substr($pi, $pos, 7);
  say "$s at $pos" if $s eq reverse($s) && is_prime($s);
}

# Or we could use the regex engine to find the palindromes:
while ($pi =~ /(([1379])(\d)(\d)\d\4\3\2)/g) {
  say "$1 at ",pos($pi)-7 if is_prime($1)
}

The Bell numbers B_n:

sub B { my $n = shift; vecsum(map { stirling($n,$_,2) } 0..$n) }
say "$_  ",B($_) for 1..50;

Recognizing tetrahedral numbers (OEIS A000292):

sub is_tetrahedral {
  my $n6 = vecprod(6,shift);
  my $k  = rootint($n6,3);
  vecprod($k,$k+1,$k+2) == $n6;
}

Recognizing powerful numbers (e.g. ispowerful from Pari/GP, or our built-in and much faster "is_powerful"):

sub ispowerful { (vecall { $_->[1] > 1 } factor_exp(shift)) ? 1 : 0; }

Convert from binary to hex (3000x faster than Math::BaseConvert):

my $hex_string = todigitstring(fromdigits($bin_string,2),16);

Calculate and print derangements using permutations:

my @data = qw/a b c d/;
forperm { say "@data[@_]" unless vecany { $_[$_]==$_ } 0..$#_ } @data;
# Using forderange directly is faster

Compute the subfactorial of n (OEIS A000166):

sub my_subfactorial { my $n = shift;
  vecsum(map{ vecprod((-1)**($n-$_),binomial($n,$_),factorial($_)) }0..$n);
}

Compute subfactorial (number of derangements) using simple recursion:

sub my_subfactorial { my $n = shift;
  use bigint;
  ($n < 1)  ?  1  :  $n * my_subfactorial($n-1) + (-1)**$n;
}

Recognize Sidon and sum-free sets. We have specific functions "is_sidon_set" and "is_sumfree_set" that are faster.

sub is_sidon { my $set = shift;  my $len = scalar(@$set);
  my $sumset = sumset($set);
  0+(@$sumset==(($len*$len+$len)/2));
}
sub is_sum_free { my $set = shift;
  1 - setcontainsany($set,sumset($set));
}

PRIMALITY TESTING NOTES

Above 2^64, "is_prob_prime" performs an extra-strong BPSW test which is fast (a little less than the time to perform 3 Miller-Rabin tests) and has no known counterexamples. If you trust the primality testing done by Pari, Maple, SAGE, FLINT, etc., then this function should be appropriate for you. "is_prime" will do the same BPSW test as well as some additional testing, making it slightly more time consuming but less likely to produce a false result. This is a little more stringent than Mathematica. "is_provable_prime" constructs a primality proof. If a certificate is requested, then either BLS75 theorem 5 or ECPP is performed. Without a certificate, the method is implementation specific (currently it is identical, but later releases may use APRCL). With Math::Prime::Util::GMP installed, this is quite fast through 300 or so digits.

Math systems 35 years ago typically used Miller-Rabin tests with k bases (usually fixed bases, sometimes random) for primality testing, but these have generally been replaced by some form of BPSW as used in this module. See Pinch's 1993 paper for examples of why using k M-R tests leads to poor results. All common contemporary usage is now some BPSW variant.

libtommath (previous to 1.1.0): As of version 1.1.0 (January 2019), this uses strong BPSW and even adds a base 3 strong pseudoprime test. Raku uses libtommath, so this fixes one of the peeves I had with its design.
GMP/MPIR (previous to 6.2.0): As of version 6.2.0 (January 2020), this uses strong BPSW and typically adds one random-base strong pseudoprime test in addition.
Math::Pari (previous to Pari 2.3.0): Pari 2.1.7 is the default version installed with the Math::Pari module. It uses 10 random M-R bases (the PRNG uses a fixed seed set at compile time) and is highly susceptible to false positives. Pari 2.3.0 was released in May 2006 and it uses BPSW (or the APR-CL proof method), which are still used to this day in modern Pari/GP (a great ECPP implementation was added in 2.10 for even better proofs).

Basically the problem with running k M-R tests is that it is too easy to get counterexamples, forcing one to use a very large number of tests (at least 20) to avoid frequent false results. Using the BPSW test results in no known counterexamples after 45+ years and runs much faster. It can be enhanced with one or more random bases if one desires, and will still be much faster.
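
A concrete counterexample makes the point. The composite 3215031751 = 151 * 751 * 28351 passes the strong pseudoprime (Miller-Rabin) test for each of the fixed bases 2, 3, 5, and 7, yet the module's "is_prime" reports it composite. The sketch uses only functions from this module; variable names are illustrative.

```perl
use strict;
use warnings;
use Math::Prime::Util qw/is_strong_pseudoprime is_prime factor/;

my $n = 3215031751;

# True: four fixed-base Miller-Rabin tests all fail to detect it.
my $passes_mr = is_strong_pseudoprime($n, 2, 3, 5, 7);

# 0: is_prime is deterministic for native-size inputs.
my $prime = is_prime($n);

my @factors = factor($n);

print "passes M-R bases 2,3,5,7: $passes_mr\n";   # 1
print "is_prime: $prime\n";                       # 0
print "factors: @factors\n";                      # 151 751 28351
```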

LIMITATIONS

Perl versions earlier than 5.8.0 have problems doing exact integer math. Some operations will flip signs, and many operations will convert intermediate or output results to doubles, which loses precision on 64-bit systems. This causes numerous functions to not work properly. The test suite will try to determine if your Perl is broken (this only applies to really old versions of Perl compiled for 64-bit when using numbers larger than ~ 2^49). The best solution is updating to a more recent Perl.

The module is thread-safe and should allow good concurrency on all platforms that support Perl threads except Win32. With Win32, either don't use threads or make sure prime_precalc is called before using primes, prime_count, or nth_prime with large inputs. This is only an issue if you use non-Cygwin Win32 and call these routines from within Perl threads.

The block calls like "forprimes", "vecreduce", etc. use MULTICALL. We optimize away the per-call scope if it looks like it isn't needed. This solves the functional and memory problems seen in RT95409 and RT127605, while still allowing higher performance on common simple blocks that don't create temporary variables or pass local references out of scope. Double braces for the function, e.g. forprimes {{ ... }} 50, can be used to force a separate scope.

PERFORMANCE

First, for those looking for the state of the art non-Perl solutions:

Primality testing

For general numbers smaller than 2000 or so digits, MPU is the fastest solution I am aware of (it is faster than Pari 2.7, PFGW, and FLINT). For very large inputs, PFGW is the fastest primality testing software I'm aware of. It has fast trial division, and is especially fast on many special forms. It does not have a BPSW test however, and there are quite a few counterexamples for a given base of its PRP test, so it is commonly used for fast filtering of large candidates. A test such as the BPSW test in this module is then recommended.

Primality proofs

Primo is the best method for open source primality proving for inputs over 1000 digits. Primo also does well below that size, but other good alternatives are David Cleaver's mpzaprcl, the APRCL from the modern Pari package, or the standalone ECPP from this module with large polynomial set.

Factoring

yafu, msieve, and gmp-ecm are all good choices for large inputs. The factoring code in this module (and all other CPAN modules) is very limited compared to those.

Primes

primesieve and yafu are the fastest publicly available code I am aware of. Primesieve will additionally take advantage of multiple cores with excellent efficiency. Tomás Oliveira e Silva's private code may be faster for very large values, but isn't available for testing.

Note that the Sieve of Atkin is not faster than the Sieve of Eratosthenes when both are well implemented. The only Sieve of Atkin that is even competitive is Bernstein's super optimized primegen, which runs on par with the SoE in this module. The SoE's in Pari, yafu, and primesieve are all faster.

Prime Counts and Nth Prime

The gold standard is currently Kim Walisch's fantastic primecount. For single threaded computations with 64-bit n, this module is fairly close in performance.

The fastest solution for small inputs is a hybrid table/sieve method. This module does this for values below 60M. As the inputs get larger, either the tables have to grow exponentially or speed must be sacrificed, so eventually we will use methods like LMO.

PRIME COUNTS

Counting the primes to 800_000_000 (800 million):

Time (s)   Module                      Version  Notes
---------  --------------------------  -------  -----------
     0.001 Math::Prime::Util           0.37     using extended LMO
     0.007 Math::Prime::Util           0.12     using Lehmer's method
     0.27  Math::Prime::Util           0.17     segmented mod-30 sieve
     0.39  Math::Prime::Util::PP       0.24     Perl (Lehmer's method)
     2.9   Math::Prime::FastSieve      0.12     decent odd-number sieve
    11.7   Math::Prime::XS             0.27     0.27 includes a count
    15.0   Bit::Vector                 7.2
    48.9   Math::Prime::Util::PP       0.14     Perl (fastest I know of)
    49.00  Math::Big                   1.16     Uses efficient Perl
   170.0   Faster Perl sieve (net)     2012-01  array of odds
   548.1   RosettaCode sieve (net)     2012-06  simplistic Perl
  3048.1   Math::Primality             0.08     Perl + Math::GMPz

Python's SymPy 1.1 (2017) up to current 1.14.0 (2025) uses Legendre's method. This is vastly preferable to sieving used by earlier versions of SymPy and by MPMATH (as of v1.4.0). It is a little slower than our Lehmer and quite a bit slower than LMO, but is much simpler.

PRIMALITY TESTING

Small inputs: is_prime from 1 to 20M

  2.0s  Math::Prime::Util      (sieve lookup if prime_precalc used)
  2.5s  Math::Prime::FastSieve (sieve lookup)
  3.3s  Math::Prime::Util      (trial + deterministic M-R)
 10.4s  Math::Prime::XS        (trial)
 19.1s  Math::Pari w/2.3.5     (BPSW)
 52.4s  Math::Pari             (10 random M-R)
480s    Math::Primality        (deterministic M-R)

Large native inputs: is_prime from 10^16 to 10^16 + 20M

  4.5s  Math::Prime::Util      (BPSW)
 24.9s  Math::Pari w/2.3.5     (BPSW)
117.0s  Math::Pari             (10 random M-R)
682s    Math::Primality        (BPSW)
30 HRS  Math::Prime::XS        (trial)

These inputs are too large for Math::Prime::FastSieve.

bigints: is_prime from 10^100 to 10^100 + 0.2M

  2.2s  Math::Prime::Util          (BPSW + 1 random M-R)
  2.7s  Math::Pari w/2.3.5         (BPSW)
 13.0s  Math::Primality            (BPSW)
 35.2s  Math::Pari                 (10 random M-R)
 38.6s  Math::Prime::Util w/o GMP  (BPSW)
 70.7s  Math::Prime::Util          (n-1 or ECPP proof)
102.9s  Math::Pari w/2.3.5         (APR-CL proof)

MPU is consistently the fastest solution, and performs the most stringent probable prime tests on bigints.
Math::Primality has a lot of overhead that makes it quite slow for native size integers. With bigints we finally see it work well.
Math::Pari built with 2.3.5 not only has a better primality test versus the default 2.1.7, but runs faster. It still has quite a bit of overhead with native size integers. Pari/GP 2.5.0 takes 11.3s, 16.9s, and 2.9s respectively for the tests above. MPU is still faster, but clearly the time for native integers is dominated by the calling overhead.

FACTORING

Factoring performance depends on the input, and the algorithm choices used are still being tuned. Math::Factor::XS is very fast when given input with only small factors, but it slows down rapidly as the smallest factor increases in size. For numbers larger than 32 bits, Math::Prime::Util can be 100x or more faster (a number with only very small factors will be nearly identical, while a semiprime may be 3000x faster). Math::Pari is much slower with native sized inputs, probably due to calling overhead. For bigints, the Math::Prime::Util::GMP module is needed or performance will be far worse than Math::Pari. With the GMP module, performance is pretty similar from 20 through 70 digits, with the caveat that the current MPU factoring uses more memory for 60+ digit numbers.

This slide presentation has a lot of data on 64-bit and GMP factoring performance I collected in 2009. Assuming you do not know anything about the inputs, trial division and optimized Fermat or Lehman work very well for small numbers (<= 10 digits), while native SQUFOF is typically the method of choice for 11-18 digits (I've seen claims that a lightweight QS can be faster for 15+ digits). Some form of Quadratic Sieve is usually used for inputs in the 19-100 digit range, and beyond that is the General Number Field Sieve. For serious factoring, I recommend looking at yafu, msieve, gmp-ecm, GGNFS, and Pari. The latest yafu should cover most uses, with GGNFS likely only providing a benefit for numbers large enough to warrant distributed processing.

PRIMALITY PROVING

The n-1 proving algorithm in Math::Prime::Util::GMP compares well to the version included in Pari. Both are pretty fast to about 60 digits, and work reasonably well to 80 or so before starting to take many minutes per number on a fast computer. Version 0.09 and newer of MPU::GMP contain an ECPP implementation that, while not state of the art compared to closed source solutions, works quite well. It averages less than a second for proving 200-digit primes including creating a certificate. Times below 200 digits are faster than Pari 2.3.5's APR-CL proof. For larger inputs the bottleneck is a limited set of discriminants, and time becomes more variable. There is a larger set of discriminants on github that help, with 300-digit primes taking ~5 seconds on average and typically under a minute for 500-digits. For primality proving with very large numbers, I recommend Primo.

RANDOM PRIME GENERATION

Seconds per prime for random prime generation on an early 2015 Macbook Pro (2.7 GHz i5) with Math::BigInt::GMP and Math::Prime::Util::GMP installed.

bits    random   +testing   Maurer   Shw-Tylr  CPMaurer
-----  --------  --------  --------  --------  --------
   64    0.00002 +0.000009   0.00004   0.00004    0.019
  128    0.00008 +0.00014    0.00018   0.00012    0.051
  256    0.0004  +0.0003     0.00085   0.00058    0.13
  512    0.0023  +0.0007     0.0048    0.0030     0.40
 1024    0.019   +0.0033     0.034     0.025      1.78
 2048    0.26    +0.014      0.41      0.25       8.02
 4096    2.82    +0.11       4.4       3.0      66.7
 8192   23.7     +0.65      50.8      38.7     929.4

random    = random_nbit_prime  (results pass BPSW)
+testing  = additional time for 3 M-R and a Frobenius test
Maurer    = random_maurer_prime
Shw-Tylr  = random_shawe_taylor_prime
CPMaurer  = Crypt::Primes::maurer
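
The "M-R" entries refer to Miller-Rabin (strong pseudoprime) tests. For readers unfamiliar with the test, the following pure-Perl sketch shows a single round. It is only an illustration: the names powmod_small and is_spsp_small are invented here, the code assumes 64-bit integers and an odd n below 2^31 so that intermediate products do not overflow, and the module itself uses its C and GMP implementations (e.g. "is_strong_pseudoprime") rather than anything like this.

```perl
use strict;
use warnings;

# Modular exponentiation by repeated squaring.  (Hypothetical helper.)
sub powmod_small {
    my ($base, $exp, $mod) = @_;
    my $result = 1;
    $base %= $mod;
    while ($exp > 0) {
        $result = ($result * $base) % $mod if $exp & 1;
        $base   = ($base * $base) % $mod;
        $exp >>= 1;
    }
    return $result;
}

# One Miller-Rabin round.  Assumes $n is odd, $n > 3, and
# 1 < $base < $n - 1.  Returns 1 if $n is a strong probable prime to
# $base, and 0 if $n is definitely composite.
sub is_spsp_small {
    my ($n, $base) = @_;
    my ($d, $s) = ($n - 1, 0);
    while ($d % 2 == 0) { $d >>= 1; $s++ }     # n - 1 = d * 2^s, d odd
    my $x = powmod_small($base, $d, $n);
    return 1 if $x == 1 || $x == $n - 1;
    for (2 .. $s) {
        $x = ($x * $x) % $n;
        return 1 if $x == $n - 1;
    }
    return 0;
}

# 2047 = 23 * 89 fools base 2 but is caught by base 3.
print "2047 base 2: ", is_spsp_small(2047, 2), "\n";
print "2047 base 3: ", is_spsp_small(2047, 3), "\n";
```

Each additional base is one more modular exponentiation, which is why the extra tests add so little time compared to finding the prime in the first place.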

"random_nbit_prime" is reasonably fast, and for most purposes should suffice. For cryptographic purposes, one may want additional tests or a proven prime. Additional tests are quite cheap, as shown by the time for three extra M-R and a Frobenius test. At these bit sizes, the chances a composite number passes BPSW, three more M-R tests, and a Frobenius test is extraordinarily small.

"random_proven_prime" provides a randomly selected prime with an optional certificate, without specifying the particular method. With GMP installed this always uses Maurer's algorithm as it is the best compromise between speed and diversity.

"random_maurer_prime" constructs a provable prime. A primality test is run on each intermediate, and it also constructs a complete primality certificate which is verified at the end (and can be returned). While the result is uniformly distributed, only about 10% of the primes in the range are selected for output. This is a result of the FastPrime algorithm and is usually unimportant.

"random_shawe_taylor_prime" similarly constructs a provable prime. It uses a simpler construction method. It is slightly faster than Maurer's algorithm but provides less diversity (even fewer primes in the range are selected, though for typical cryptographic sizes this is not important). The Perl implementation uses a single large random seed followed by SHA-256 as specified by FIPS 186-4. The GMP implementation uses the same FIPS 186-4 algorithm but uses its own CSPRNG which may not be SHA-256.

"maurer" in Crypt::Primes times are included for comparison. It is reasonably fast for small sizes but gets slow as the size increases. It is 10 to 500 times slower than this module's GMP methods. It does not perform any primality checks on the intermediate results or the final result (I highly recommended running a primality test on the output). Additionally important for servers, "maurer" in Crypt::Primes uses excessive system entropy and can grind to a halt if /dev/random is exhausted (it can take days to return).

CONGRUENT NUMBERS

The "is_congruent_number" function, combined with our "forsquarefreeint" operator to loop over square-free integers in a range, is quite fast compared to most public implementations. For computing many values, it is expected that fast theta series computations, such as demonstrated in Hart et al. (2009) (https://wrap.warwick.ac.uk/id/eprint/41654/), are significantly faster, albeit requiring more memory and disk space.

All congruent numbers less than 300,000 can be identified in under 2 seconds.

Giovanni Resta's list of 213318 square-free and mod 8 <= 4 congruent numbers less than 10^7 can be generated in 19 minutes on a single core of an M1 laptop.

SETS

Measuring the performance of various modules for set operations doesn't give a strict order. Many modules are fast at some operations and slow at others. Some have particular inputs they are very fast or very slow with. Each module has different functionality.

We chose, following Pari and Mathematica, to represent sets as native Perl lists of sorted de-duplicated integers, rather than a dedicated object. This allows flexibility and use for other purposes, but it isn't ideal for general performance, especially with very large sets (100k+ elements) where we spend a large amount of time parsing and manipulating the Perl input array. While an opaque data structure would use 8 or fewer bytes per element, Perl arrays use approximately 32 bytes per integer. Still, this is quite favorable compared to Perl hashes at 120 to 220 bytes per element (e.g. Set::Light, Set::Tiny, Set::Scalar, Set::Functional).
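
One benefit of the sorted, de-duplicated representation is that operations such as intersection can be done with a single linear pass over both lists. The following pure-Perl sketch shows the idea; sorted_intersect is a name made up for this example, and the module's own "setintersect" is implemented in C and is not claimed to work exactly this way.

```perl
use strict;
use warnings;

# Intersect two array references holding sorted, de-duplicated integers
# by walking both lists in step (a linear-time merge).
sub sorted_intersect {
    my ($x, $y) = @_;
    my ($i, $j) = (0, 0);
    my @out;
    while ($i < @$x && $j < @$y) {
        if    ($x->[$i] < $y->[$j]) { $i++ }
        elsif ($x->[$i] > $y->[$j]) { $j++ }
        else  { push @out, $x->[$i]; $i++; $j++ }
    }
    return \@out;
}

# Integers under 1000 divisible by 2 and by 3 respectively.
my $evens  = [ grep { 0 == $_ % 2 } 0 .. 999 ];
my $threes = [ grep { 0 == $_ % 3 } 0 .. 999 ];
my $both   = sorted_intersect($evens, $threes);
print scalar(@$both), " common elements\n";   # the multiples of 6
```

No hash is built, so the extra memory is just the output list, and the result is already in the same sorted form as the inputs.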

For generic set use, I recommend Set::Tiny. The module source is very tiny, unlike this module. It offers an easy API for basic set functions and is fast. It is not limited to integers. On the other hand, with integers our module is typically faster (2-10x) and uses less memory, even with our choice of native Perl sorted arrays.

Finding the sumset size of the first 10,000 primes.

my %r;  my $p = primes(nth_prime(10000));

12.6s   15MB  forsetproduct {$r{vecsum(@_)}=undef;} $p,$p;
              say scalar(keys %r);
 9.4s 3900MB  Pari/GP X=primes(10000); #setbinop((a,b)->a+b,X,X)
 2.4s    3MB  $s=setbinop { $a+$b } $p;  say scalar @$s;
 0.4s    3MB  $s=sumset $p;  say scalar @$s;
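
For reference, the sumset of a set A is the set of distinct sums a+b over all a and b in A, and the figures above time computing its size. A tiny pure-Perl sketch using a hash for de-duplication follows; it is similar in spirit to the first line above but uses no module functions, and the name sumset_pp is made up for this example.

```perl
use strict;
use warnings;

# Return the sorted distinct pairwise sums of a list of integers.
sub sumset_pp {
    my @set = @_;
    my %seen;
    for my $i (0 .. $#set) {
        for my $j ($i .. $#set) {        # a+b == b+a, so j starts at i
            $seen{ $set[$i] + $set[$j] } = undef;
        }
    }
    return sort { $a <=> $b } keys %seen;
}

my @sums = sumset_pp(2, 3, 5, 7);
print scalar(@sums), " distinct sums: @sums\n";
```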

Set intersection of [-1000..100] and [-100..1000], with Perl 5.43.7.

   4 uS  Set::IntSpan::Fast::XS
   5 uS  setintersect                       <===========  this module
   7 uS  Pari/GP 2.17.0
  14 uS  Set::IntSpan::Fast
  61 uS  native Perl hash intersection          /\ /\ /\  Faster
  62 uS  Set::Tiny
  66 uS  Set::Functional
 105 uS  PP::setintersect                       \/ \/ \/  Slower
 200 uS  Array::Set
 310 uS  Set::Object
 332 uS  Set::SortedArray
1508 uS  Set::Scalar

Set intersection of integers under 1000 divisible by 2 and 3 respectively. Sets are [grep{0==$_%2}0..999] and [grep{0==$_%3}0..999]:

   3 uS  setintersect                       <===========  this module
   6 uS  Pari/GP 2.17.0
  31 uS  Set::Tiny
  32 uS  native Perl hash intersection          /\ /\ /\  Faster
  34 uS  Set::Functional
  37 uS  PP::setintersect                       \/ \/ \/  Slower
  64 uS  Set::IntSpan::Fast::XS
  86 uS  Array::Set
 122 uS  Set::SortedArray
 138 uS  Set::Object
 615 uS  Set::Scalar
3090 uS  Set::IntSpan::Fast

Set::IntSpan::Fast is very fast with the first example using single span sets, but gets quite slow with more spans as seen in the second example. The other modules are mostly unaffected by data patterns.

Using our own set objects wrapping a C structure of some sort would be faster and use less memory. In particular, we often spend more time just reading the set values than we do performing the set operation.

SORTING

Perl's built-in sort is a cache-friendly stable merge sort. This is reasonably appropriate for the wide variety of uses expected. When sorting lists of integers, it could be improved. Perl 5.8 brought an in-place optimization, so "@a=sort{$a<=>$b}@a" is done without copying. The numerical sort is recognized and short-cut, so it doesn't actually call the well-known comparison function. However, Perl's old 32-bit legacy lived on until 5.26 as the inputs were turned into doubles, which can lead to subtle bugs with large integers. Inputs that started as strings (e.g. input read from a file) will still get turned into doubles.
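
To see why comparing integers as doubles is a problem, note that an IEEE double has a 53-bit significand, so distinct integers above 2^53 can map to the same double. The sketch below forces two such integers through a C double with pack and unpack. The helper as_double is invented for this illustration, and the first print assumes a Perl built with 64-bit integers.

```perl
use strict;
use warnings;

# Round-trip a value through a C double, mimicking what happens when an
# integer is compared as a floating-point number.  (Hypothetical helper.)
sub as_double { return unpack("d", pack("d", $_[0])) }

my @big = (9007199254740993, 9007199254740992);   # 2^53 + 1 and 2^53

# As native integers the two values differ ...
print "integers differ\n" if $big[0] != $big[1];

# ... but as doubles they collapse to the same value, so a comparison
# carried out on doubles cannot order them.
print "doubles are equal\n" if as_double($big[0]) == as_double($big[1]);
```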

Our vecsort tries to avoid these issues, making sure inputs are processed as only IV, UV, and/or bigints. Integer strings are converted to one of those. All inputs are validated to be integers. There is no need for separate interfaces for signed and unsigned numbers as Perl's representation stores this information explicitly and per-variable rather than per-array.

Input lists that contain bigints, or both negative numbers and positive numbers larger than the maximum IV (2^63-1 for 64-bit), cannot be stored in a native array of a single type, therefore will be sorted using Perl's sort rather than our C code. This is substantially slower, but produces the correct results.

Our sorting for native signed and unsigned integers is a combination of radix sort and quicksort (the latter using median of 9 partitioning, insertion sort for small partitions, and heapsort fallback if we detect repeated poor partitioning). It is quite fast and low overhead.
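
For readers who have not seen one, here is a minimal pure-Perl sketch of a least-significant-digit radix sort on non-negative integers, processing 8 bits per pass. It only illustrates the radix half of the idea: the name radix_sort_pp is made up, the code assumes 64-bit integers, and the module's C implementation (including how it chooses between radix sort and quicksort) is considerably more involved.

```perl
use strict;
use warnings;

# LSD radix sort on non-negative native integers, 8 bits per pass.
# Each pass is a stable distribution into 256 buckets.
sub radix_sort_pp {
    my @v = @_;
    return @v if @v < 2;
    my $max = $v[0];
    for (@v) { $max = $_ if $_ > $max }
    for (my $shift = 0; $shift < 64; $shift += 8) {
        # Stop once no value has any bits left at or above this shift.
        last if $shift > 0 && ($max >> $shift) == 0;
        my @bucket = map { [] } 0 .. 255;
        push @{ $bucket[ ($_ >> $shift) & 0xFF ] }, $_ for @v;
        @v = map { @$_ } @bucket;
    }
    return @v;
}

my @sorted = radix_sort_pp(170, 45, 75, 90, 802, 24, 2, 66);
print "@sorted\n";
```

No comparisons between elements are made, so the running time depends on the number of elements and the number of byte passes rather than on the input order.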

Sort::XS has a variety of algorithms. However, it has no option for unsigned integers (UV), only signed integers (IV). Sort::Key offers a variety of interfaces including unsigned and signed integers, as well as in-place versions. The following table compares sorting random 64-bit unsigned integers and is shown as speedup relative to Perl's sort (higher is faster, v5.43.7).

                            10    100   1000  10000 100000     1M
vecsort                   2.0x   2.3x   4.7x   6.2x   6.9x   9.7x
Sort::Key::Radix usort    1.4x   2.3x   3.4x   3.4x   4.7x   2.7x
Sort::XS::quick_sort      1.2x   1.5x   1.8x   1.9x   2.0x   2.7x
Sort::Key usort           1.2x   1.3x   1.3x   1.3x   1.3x   1.3x
sort                      1.0x   1.0x   1.0x   1.0x   1.0x   1.0x
List::MoreUtils::qsort    0.6x   0.5x   0.4x   0.4x   0.4x   0.3x

The implementation does not currently try to exploit patterns. Regarding the above timing, when given sorted or reverse sorted data, Perl's sort is much faster than with the random values used above, though still not faster than "vecsort" and Sort::Key::Radix::usort (both of which use a radix sort).

List::MoreUtils::qsort has very different goals in mind than standard sorting of integer lists, as mentioned in their documentation. In contrast, this is exactly (and only) what vecsort does, so it should not be a surprise that our function looks good on this benchmark. Different use cases would show things differently.

AUTHORS

Dana Jacobsen <dana@acm.org>

ACKNOWLEDGEMENTS

Eratosthenes of Cyrene provided the elegant and simple algorithm for finding primes.

Terje Mathisen, A.R. Quesada, B. Van Pelt, and Kim Walisch all had useful ideas I used in my wheel sieve.

The SQUFOF implementation being used is a slight modification to the public domain racing version written by Ben Buhrow. Enhancements with ideas from Ben's later code as well as Jason Papadopoulos's public domain implementations are planned for a later version.

The LMO implementation is based on the 2003 preprint from Christian Bau, as well as the 2006 paper from Tomás Oliveira e Silva. I also want to thank Kim Walisch for the many discussions about prime counting.

REFERENCES

Christian Axler, "New bounds for the prime counting function π(x)", September 2014. For large values, improved limits versus Dusart 2010. http://arxiv.org/abs/1409.1780
Christian Axler, "Über die Primzahl-Zählfunktion, die n-te Primzahl und verallgemeinerte Ramanujan-Primzahlen", January 2013. Prime count and nth-prime bounds in more detail. Thesis in German, but first part is easily read. http://docserv.uni-duesseldorf.de/servlets/DerivateServlet/Derivate-28284/pdfa-1b.pdf
Christian Bau, "The Extended Meissel-Lehmer Algorithm", 2003, preprint with example C++ implementation. Very detailed implementation-specific paper which was used for the implementation here. Highly recommended for implementing a sieve-based LMO. http://cs.swan.ac.uk/~csoliver/ok-sat-library/OKplatform/ExternalSources/sources/NumberTheory/ChristianBau/
Manuel Benito and Juan L. Varona, "Recursive formulas related to the summation of the Möbius function", The Open Mathematics Journal, v1, pp 25-34, 2007. Among many other things, shows a simple formula for computing the Mertens function with only n/3 Möbius values (not as fast as Deléglise and Rivat, but really simple). http://www.unirioja.es/cu/jvarona/downloads/Benito-Varona-TOMATJ-Mertens.pdf
John Brillhart, D. H. Lehmer, and J. L. Selfridge, "New Primality Criteria and Factorizations of 2^m +/- 1", Mathematics of Computation, v29, n130, Apr 1975, pp 620-647. http://www.ams.org/journals/mcom/1975-29-130/S0025-5718-1975-0384673-1/S0025-5718-1975-0384673-1.pdf
W. J. Cody and Henry C. Thacher, Jr., "Rational Chebyshev Approximations for the Exponential Integral E_1(x)", Mathematics of Computation, v22, pp 641-649, 1968.
W. J. Cody and Henry C. Thacher, Jr., "Chebyshev approximations for the exponential integral Ei(x)", Mathematics of Computation, v23, pp 289-303, 1969. http://www.ams.org/journals/mcom/1969-23-106/S0025-5718-1969-0242349-2/
W. J. Cody, K. E. Hillstrom, and Henry C. Thacher Jr., "Chebyshev Approximations for the Riemann Zeta Function", Mathematics of Computation, v25, n115, pp 537-547, July 1971.
Henri Cohen, "A Course in Computational Algebraic Number Theory", Springer, 1996. Practical computational number theory from the team lead of Pari. Lots of explicit algorithms.
Marc Deléglise and Joël Rivat, "Computing the summation of the Möbius function", Experimental Mathematics, v5, n4, pp 291-295, 1996. Enhances the Möbius computation in Lioen/van de Lune, and gives a very efficient way to compute the Mertens function. http://projecteuclid.org/euclid.em/1047565447
Pierre Dusart, "Autour de la fonction qui compte le nombre de nombres premiers", PhD thesis, 1998. In French. The mathematics is readable and highly recommended reading if you're interested in prime number bounds. http://www.unilim.fr/laco/theses/1998/T1998_01.html
Pierre Dusart, "Estimates of Some Functions Over Primes without R.H.", preprint, 2010. Updates to the best non-RH bounds for prime count and nth prime. http://arxiv.org/abs/1002.0442/
Pierre-Alain Fouque and Mehdi Tibouchi, "Close to Uniform Prime Number Generation With Fewer Random Bits", pre-print, 2011. Describes random prime distributions, their algorithm for creating random primes using few random bits, and comparisons to other methods. Definitely worth reading for the discussions of uniformity. http://eprint.iacr.org/2011/481
Daan Leijen, "Division and Modulus for Computer Scientists", 2001. Paper discussing different div/mod methods. https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/divmodnote-letter.pdf
Walter M. Lioen and Jan van de Lune, "Systematic Computations on Mertens' Conjecture and Dirichlet's Divisor Problem by Vectorized Sieving", in From Universal Morphisms to Megabytes, Centrum voor Wiskunde en Informatica, pp. 421-432, 1994. Describes a nice way to compute a range of Möbius values. http://walter.lioen.com/papers/LL94.pdf
Ueli M. Maurer, "Fast Generation of Prime Numbers and Secure Public-Key Cryptographic Parameters", 1995. Generating random provable primes by building up the prime. http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.26.2151
Gabriel Mincu, "An Asymptotic Expansion", Journal of Inequalities in Pure and Applied Mathematics, v4, n2, 2003. A very readable account of Cipolla's 1902 nth prime approximation. http://www.emis.de/journals/JIPAM/images/153_02_JIPAM/153_02.pdf
OEIS: Primorial
Vincent Pegoraro and Philipp Slusallek, "On the Evaluation of the Complex-Valued Exponential Integral", Journal of Graphics, GPU, and Game Tools, v15, n3, pp 183-198, 2011. http://www.cs.utah.edu/~vpegorar/research/2011_JGT/paper.pdf
William H. Press et al., "Numerical Recipes", 3rd edition.
Hans Riesel, "Prime Numbers and Computer Methods for Factorization", Birkhäuser, 2nd edition, 1994. Lots of information, some code, easy to follow.
David M. Smith, "Multiple-Precision Exponential Integral and Related Functions", ACM Transactions on Mathematical Software, v37, n4, 2011. http://myweb.lmu.edu/dmsmith/toms2011.pdf
Douglas A. Stoll and Patrick Demichel, "The impact of ζ(s) complex zeros on π(x) for x < 10^{10^{13}}", Mathematics of Computation, v80, n276, pp 2381-2394, October 2011. http://www.ams.org/journals/mcom/2011-80-276/S0025-5718-2011-02477-4/home.html

COPYRIGHT

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
