Pretty much the most interesting blog on the Internet.— Prof. Steven Landsburg

Once you get past the title, and the subtitle, and the equations, and the foreign quotes, and the computer code, and the various hapax legomena, a solid 50% English content!—The Proprietor

Wednesday, August 26, 2015

Testing Periodicity in Haskell

If you are dealing with infinite, but potentially periodic, series, you may sometimes have wished there was a function, let's call it periods, that could be given such a series and tell you whether it is periodic and, if so, the length and starting point of the period.

Sadly, such a function is impossible. There is no way for the function to determine whether any apparent period is genuine or there is a break at the \(n\)th element without looking at every single element. And checking every single element of an infinite series obviously cannot be done in finite time.

Even if, somehow, the periods function was able to peer into generator of the elements, the problem would remain untractable. Surely, a sufficiently clever, introspective periods function could identify with certainty the periodicity of many infinite series. But in the general case, this introspection would be the equivalent of solving the halting problem which, by famous proof, is impossible.

Yet one need not give up, for it is possible to write a weaker periods function which still produces very useful results in finite time. This weaker function does not answer the periodicity problem definitely. Instead it produces a list of the first occurrence of a match of any given length.<\p>

Output of such a function would always start with the tuple \((0,1)\) for there is always a zero-length match between the first two elements of the input list. The next output would be of the form \((a,b)\) where \(a\) and \(b\) are the first and second occurrence of the first repeated element. Then would follow \((c,d)\) where \(c\) and \(d\) are the starting points of first and second occurrence of the first repeated element pair, and so on. If the input list really is periodic, the output of periods would eventually turn to an infinitely repeated \((x,y)\) where \(x\) is the starting point of the period and \(y-x\) is the length of the period.

Here is what I believe to be a fairly efficient, yet simple implementation of such a general periods function in Haskell:

data PeriodsTree a = PeriodsNode !Int [a] (PeriodsTree a) (PeriodsTree a) (PeriodsTree a) | PeriodsNull periods:: forall a. Ord a => [a] -> [(Int,Int)] periods = (go PeriodsNull 0) . (zip [0..]) . tails where go :: (PeriodsTree a) -> Int -> [(Int,[a])] -> [(Int,Int)] go t c ((i,l):r) = let (nt,a) = merge t c i l in a ++ go nt (c+length a) r go _ _ _ = [] merge :: (PeriodsTree a) -> Int -> Int -> [a] -> (PeriodsTree a,[(Int,Int)]) merge pt _ _ [] = (pt,[]) merge (PeriodsNull) c i l = (PeriodsNode i l PeriodsNull (fst (merge PeriodsNull c i (tail l))) PeriodsNull,[]) merge pt@(PeriodsNode j k lt et gt) c i l | c==0 = let (nt,a) = merge pt 1 i l in (nt,(j,i):a) | null k || head l > head k = let (nt,a) = merge gt c i l in (PeriodsNode j k lt et nt,a) | head k > head l = let (nt,a) = merge lt c i l in (PeriodsNode j k nt et gt,a) | otherwise = let (nt,a) = merge et (c-1) i (tail l) in (PeriodsNode j k lt nt gt,a)

The function starts by generating indexed tuples of all the tails of the input list. That is, if the input list is \([a_0, a_1,a_2,a_3\dots]\), it is first converted to this list \([(0,[a_0, a_1,a_2,a_3\dots]),(1,[a_1,a_2,a_3\dots]),(2,[a_2,a_3\dots])\dots]\). This cooked input is then sorted into a custom search tree which, as it is being built, also spits out our desired ultimate output.

Note that when the input list is finite, the periods function too will terminate and produce the appropriate output. When the input list is infinite, the periods function the search tree too becomes infinite, but thanks to laziness, we still produce useful output. And when the input list is infinite and periodic, the periods function will, as soon as the period is found, rapidly produce and infinite list of pairs indicating the beginning and length of the period.

Hence, looking at, for example, (periods input)!!1000 is a useful exercise. For a non-periodic, infinite list, evaluating this output element is unlikely ever to terminate, but for a periodic infinite list, the evaluation will relatively quickly result in the indication of the likely period parameters. How likely cannot be determined without some measure of the universe of infinite lists looked at, but for most reasonable sets of lists, it is a near certainty and there is a guarantee that there will be at least a 1,000-element-long match at the stated positions.

As a test, one can run periods on the output of this function:

decimalsRecip :: Int -> [Int] decimalsRecip n = go 10 where go 0 = [] go a = let (d,m)=divMod a n in d:(go (10*m))

This function produces the decimal digits of the inverse of the argument. It is elementary that this series is infinite (unless the argument is of the form \(2^a 5^b\)), but periodic with a length less than the argument. And indeed:

decimalsRecip 388 → [0,0]++cycle [2,5,7,7,3,1,9,5,8,7,6,2,8,8,6,5,9,7,9,3,8,1,4,4, 3,2,9,8,9,6,9,0,7,2,1,6,4,9,4,8,4,5,3,6,0,8,2,4, 7,4,2,2,6,8,0,4,1,2,3,7,1,1,3,4,0,2,0,6,1,8,5,5, 6,7,0,1,0,3,0,9,2,7,8,3,5,0,5,1,5,4,6,3,9,1,7,5] periods (decimalsRecip 388) → [(0,1),(0,1),(1,66)]++repeat (2,98)