Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to generate a list of all possible strings from shortest to longest

I need to generate an infinite list of strings using numbers and letters. The first string should simply be "a", then "b" through to "z", then "0" through to "9", then "aa", "ab" etc.

I can easily generate the ones with one character, but then it gets more complicated.

like image 209
Nick Brunt Avatar asked Mar 03 '12 00:03

Nick Brunt


4 Answers

So, suppose we already had this list of all possible strings:

 allStrings :: [String]
 allStrings = ...

Given allStrings, how could we create another list of all possible strings?

 alsoAllStrings :: [String]

Let's look at each possible string as a single character prefix, and a string suffix

 alsoAllStrings = [ c : s 

The string suffix is either the empty string or a member of the list of all possible strings.

                  | s <- "" : allStrings

The single character prefix is in 'a' thru 'z' or '0' thru '9'.

                  , c <- ['a'..'z'] ++ ['0'..'9']
                  ]

(this is using list comprehensions - we could also do the same using concatMap:

alsoAllStrings = concatMap (\s -> map (:s) $ ['a'..'z'] ++ ['0'..'9']) $ "" : allStrings

)

Now let's go back to the original question. How do we find allStrings?

In most languages, we couldn't - it's an infinite list of strings, the program would never finish. But since Haskell is lazy, it's cool with generating only as much of allStrings as we actually use.

One surprising thing that this lets us do is we can define allStrings in terms of alsoAllStrings.

 allStrings = alsoAllStrings

Or, more likely, in terms of itself.

 allStrings = [ c : s | s <- "" : allStrings, c <- ['a'..'z'] ++ ['0'..'9'] ]

This is called corecursive data - data that is defined in terms of itself. Try it out in ghci:

 ghci> let allStrings = [ c : s | s <- "": allStrings, c <- ['a'..'z'] ++ ['0'..'9'] ]
 ghci> take 100 allStrings
 ["a","b","c","d","e","f","g","h","i","j","k","l","m","n","o","p","q","r","s","t","u","v","w","x","y","z","0","1","2","3","4","5","6","7","8","9","aa","ba","ca","da","ea","fa","ga","ha","ia","ja","ka","la","ma","na","oa","pa","qa","ra","sa","ta","ua","va","wa","xa","ya","za","0a","1a","2a","3a","4a","5a","6a","7a","8a","9a","ab","bb","cb","db","eb","fb","gb","hb","ib","jb","kb","lb","mb","nb","ob","pb","qb","rb","sb","tb","ub","vb","wb","xb","yb","zb","0b","1b"]

The reason it works (and doesn't just loop infinitely) is that it defines part of itself before it uses itself. The first elements we include in allStrings are the single character extensions to the empty string "". So by the time we start iterating through the elements of allStrings to use as suffixes, the first couple elements allStrings are already defined and available. And the more suffixes we process, the more elements of allStrings are defined and available to use as suffixes.

It's very similar to a common haskellism of defining the fibonacci numbers corecursively:

 fibs = 1 : 1 : zipWith (+) fibs (tail fibs)

Don't be surprised if corecursion takes a while to wrap your head around. It's worth the effort to understand, though.

like image 56
rampion Avatar answered Oct 02 '22 07:10

rampion


This can be done by getting all combinations for strings of length 1,2,3..., then concating them all together.

I started with a combos function, which returned all list of strings with the given characters and the given length:

combos :: [Char] -> Int -> [String]

The fist case is simple, just turn the input chars into lists:

combos chars 1 = map (:[]) chars

The next case we want 'aa', 'ab', 'ac'... to 'zx', 'zy', 'zz'. This is the same as map ("a" ++) ["a", "b", "c"...] ++ map ("b" ++) ["a","b",...] right up to map ("z" ++) ["a","b"...].

["a".."z"] is the same as combos chars 1. That function can be written as:

combos chars 2 = concatMap (\front -> map (front ++) (combos chars 1)) (combos chars 1)

The next case we want 'aaa', 'aab', ... 'zzy', 'zzz'. This is actually very similar to combos 2, except we are using as the front combos chars 2 instead of combos chars 1:

combos chars 3 = concatMap (\front -> map (front ++) (combos chars 1)) (combos chars 2)

(note the combos chars 1 doesn't change, that just gives us the list of one character strings to append to the end.)

See the pattern here? It can now be generalized for all n as:

combos :: [Char] -> Int -> [String]
combos chars 1 = map (:[]) chars
combos chars n = concatMap (\front -> map (front ++) (combos chars 1)) $ combos chars (n - 1)

Finally, to get the infinite list of all strings, just concat all the results of combos together with the required characters and all lengths:

allCombos :: [String]
allCombos = concatMap (combos (['a'..'z'] ++ ['0'..'9'])) [1..]

Example output:

> take 100 allCombos
["a","b","c","d","e","f","g","h","i","j","k","l","m","n","o","p","q","r","s","t","u","v","w","x","y","z","0","1","2","3","4","5","6","7","8","9","aa","ab","ac","ad","ae","af","ag","ah","ai","aj","ak","al","am","an","ao","ap","aq","ar","as","at","au","av","aw","ax","ay","az","a0","a1","a2","a3","a4","a5","a6","a7","a8","a9","ba","bb","bc","bd","be","bf","bg","bh","bi","bj","bk","bl","bm","bn","bo","bp","bq","br","bs","bt","bu","bv","bw","bx","by","bz","b0","b1"]
like image 23
David Miani Avatar answered Oct 04 '22 07:10

David Miani


I came up with the following solution:

-- Gives us the concatenated cartesian product of two lists of strings
strCartProd :: [String] -> [String] -> [String]
strCartProd xs ys = [x ++ y | x <- xs, y <- ys]

-- Create initial list of strings (map chars to 1-char strings)
s :: [String]
s = map (:[]) $ ['a'..'z'] ++ ['0'..'9']

-- Iterate and concatenate the resulting lists
result :: [String]
result = id =<< iterate (strCartProd s) [""]
like image 32
Niklas B. Avatar answered Oct 04 '22 07:10

Niklas B.


This can be done simply by a predefined function replicateM from (Control.Monad).

string = concatMap (\n -> replicateM n (['a'..'z']++['0'..'9'])) [1..]
like image 21
Shivansh Kumar Avatar answered Oct 02 '22 07:10

Shivansh Kumar