Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Are recursive calls in my "permutations with repetition" code accumulated to clog the RAM?

Tags:

haskell

A bit of background:

I am an amateur programmer, having picked up Haskell a few months ago, on my spare time, after a period of Mathematica programmning (my first language). I am currently going through my second Haskell book, by Will Kurt, but I still have miles to go to call myself comfortable around Haskell code. Codeabbey has been my platform for experimentation and learning so far.

I have written a piece of code to generate permutations of a given number, that deals with possible duplicate numbers, so for 588 it will internally generate 588, 858 and 885.

However, because I want to scale to pretty big input numbers (think perhaps even a hundred digits long), I don't want to output the whole list and then perform calculations on it, instead every number that is generated is checked on the spot for a certain property and if it has it, well, we have a winner, the number is returned as output and there's no need to go through the rest of the humongous list. If sadly no desired number is found and we unsuccessfully go through all possible permutations, it outputs a "0".

I have also opted to make it a command line program to feed values to it via gnu parallel for faster work.

So here is the code

import System.Environment

import Data.List 

toDigits :: Integer -> [Integer]
toDigits n = map (\n -> read [n]) (show n)

fromDigits :: Integral a => [a] -> Integer
fromDigits list = fromDigitsHelperFunction list 0

fromDigitsHelperFunction :: Integral a => [a] -> Integer -> Integer
fromDigitsHelperFunction [] acc = acc
fromDigitsHelperFunction (x:[]) acc = (fromIntegral x) + acc
fromDigitsHelperFunction digits@(x:xs) acc = fromDigitsHelperFunction xs (acc + ((fromIntegral x) * 10 ^((length digits) - 1 )))

testPermutationsWithRepetition :: ([Integer],Int,[Int],[(Int,Integer)]) -> [Integer]
testPermutationsWithRepetition (digits, index, rotationMap, registeredPositions)
   | index == 0 && rotationMap !! index == 0                                    = [0,0,0] --finish state (no more recursion). Nothing more to do
   | index == digitsLength - 1 && beautyCheck (fromDigits digits)           = digits
   | index == digitsLength - 1                                                  = testPermutationsWithRepetition (digits, index-1, rotationMap, registeredPositions)
   | not ((index,digits!!index) `elem` registeredPositions)                     = testPermutationsWithRepetition (digits, index+1, rotationMap, (index,digits!!index):registeredPositions)
   | rotationMap!!index == 0                                                    = testPermutationsWithRepetition (digits, index-1, restoredRotMap, restoredRegPositions)
   | rotationMap!!index > 0 && (index,digits!!index) `elem` registeredPositions = testPermutationsWithRepetition (shiftLDigits, index, subtractRot, registeredPositions)
   where digitsLength = length digits
         shiftLDigits = (fst splitDigits) ++ (tail $ snd splitDigits) ++ [head $ snd splitDigits]
         splitDigits = splitAt index digits
         restoredRotMap = (fst splitRotMap) ++ [digitsLength - index] ++ (tail $ snd splitRotMap)
         splitRotMap = splitAt index rotationMap
         restoredRegPositions = filter (\pos -> fst pos < index) registeredPositions  --clear everything below the parent index
         subtractRot = (fst splitRotMap) ++ [(head $ snd splitRotMap) - 1] ++ (tail $ snd splitRotMap)


--Frontend function for testing permutations by inputting a single parameter (the number in digit form)
testPermsWithRep :: [Integer] -> [Integer]
testPermsWithRep digits = testPermutationsWithRepetition (digits, 0, [length $ digits, (length $ digits) -1 .. 1], [])

main :: IO ()
main = do
   args <- getArgs
   let number = read (head args) :: Integer
   let checkResult = fromDigits $ testPermsWithRep $ toDigits number
   print checkResult

It's really a sequential process with an index variable that points to a certain number on the digit list and performs a recursive call on that list based on my rules. The functions tracks its progress through the digit list for visited numbers in certain positions so far (to avoid repetition following already visited paths until it gets to the last digit (index == length -1). If the number that we get there passes the beauty check, it exits with the number produced.

Now, in a Mathematica (or I guess any imperative language) I would probably implement this with a While loop and Cases for its checks, and by the logic of the program, however long it took to compute (generate the permutations and check them for validity) it would take a moderate amount of memory, just enough to hold the list of "registeredPositions" really (you could call it the record of visited digits in specific positions, so it's a variable list as we go deeper in index but gets cleaned up as we move back up). However in this case, the recursive calls stack up as it seems and the whole thing acts as a fork bomb for sufficiently large numbers (e.g 27777772222222222222222223333) and eventually crashes. Is this behaviour something that can be handled differently in Haskell or is there no way to avoid the recursion and memory hogging?

I really like Haskell because the programs make logical sense, but I would like to use it also for cases like this where performance (and resources) matters.

As a side note, my brother pointed to this Algorithm to print all permutations with repetition of numbers in C that is reasonably fast (only generates a list though) and most importantly has minimal memory footprint, although I can tell there's also recursion used in it. Other that that I'm clueless when it comes to C and I would like to stick to Haskell, if it can do what I want at the end of the day, that is.

Any help is welcome. Have a good day!

Edit: Per Soleil's suggestion I update my post with additional info provided in the comments. Specifically:

  1. After compiling with "ghc checking_program.hs" I run the program with "./checking program 27777772222222222222222223333". On an i5 3470 with 4GB RAM it runs for about 10 minutes and exits with a segmentation fault. On my brothers 32GB machine he let it run until it took up 20GB of RAM. No need to go further I guess. My tests were on Ubuntu via Win10 WSL. His is bare Linux

  2. testPermsWithRep is just a front end for testPermutationsWithRepetition, so that I can only provide the number and testPermsWithRep creates the initial parameters and calls testPermutationsWithRepetition with those. It outputs exactly what testPermutationsWithRepetition outputs, either a number (in digit form) that passes the test, or [0,0,0]. Now the test, the beautyCheck function is simply a test for single digit divisors of that number, that returns True or False. I didn't include it because it really is inconsequential. It could even be just a "bigger than x number" test.

An an example, calling "testPermsWithRep [2,6,7,3]" will call "testPermutationsWithRepetition ([2,6,7,3], 0, [4,3,2,1],[])" and whatever comes out of that function, testPermsWithRep will return that as well.

like image 265
Κωστής Καρβουνιάρης Avatar asked Jan 25 '23 09:01

Κωστής Καρβουνιάρης


1 Answers

The performance issue with your program doesn't have anything to do with recursion. Rather, you seem to be running up against an accumulation of a partially evaluated, lazy data structure in your rotation map. Your program will run in constant memory if you use the deepseq package to fully force evaluation of the restoredRotMap:

-- Install the `deepseq` package and add this import
import Control.DeepSeq

-- And then change this one case
... | rotationMap!!index == 0 = restoredRotMap `deepseq`
      testPermutationsWithRepetition (digits, index-1, restoredRotMap, restoredRegPositions)

Compiled with ghc -O2 and using beautyMap _ = False, this runs with a fixed resident memory usage of about 6 megs.

Some other performance targets:

  • You might want to replace most of your Integer types with Int, as this will be faster. I think you only need Integer for the input to toDigits and the output of fromDigits, and everything else can be Int, since it's all indexes and digits.
  • An even bigger win will be to replace your rotation map and registered positions with better data structures. If you find yourself splicing up lists with lots of listpart1 ++ [x] ++ listpart2 calls, there are going to be enormous performance costs to that, and the linear lookups with (!!) aren't helping either.
like image 76
K. A. Buhr Avatar answered Jan 26 '23 21:01

K. A. Buhr