I am currently trying to refresh my Haskell knowledge by solving some Hackerrank problems.
For example:
https://www.hackerrank.com/challenges/maximum-palindromes/problem
I've already implemented an imperative solution in C++ which got accepted for all test cases. Now I am trying to come up with a pure functional solution in (reasonably idiomatic) Haskell.
My current code is
module Main where
import Control.Monad
import qualified Data.ByteString.Char8 as C
import Data.Bits
import Data.List
import qualified Data.Map.Strict as Map
import qualified Data.IntMap.Strict as IntMap
import Debug.Trace
-- precompute factorials
compFactorials :: Int -> Int -> IntMap.IntMap Int
compFactorials n m = go 0 1 IntMap.empty
where
go a acc map
| a < 0 = map
| a < n = go a' acc' map'
| otherwise = map'
where
map' = IntMap.insert a acc map
a' = a + 1
acc' = (acc * a') `mod` m
-- precompute invs
compInvs :: Int -> Int -> IntMap.IntMap Int -> IntMap.IntMap Int
compInvs n m facts = go 0 IntMap.empty
where
go a map
| a < 0 = map
| a < n = go a' map'
| otherwise = map'
where
map' = IntMap.insert a v map
a' = a + 1
v = (modExp b (m-2) m) `mod` m
b = (IntMap.!) facts a
modExp :: Int -> Int -> Int -> Int
modExp b e m = go b e 1
where
go b e r
| (.&.) e 1 == 1 = go b' e' r'
| e > 0 = go b' e' r
| otherwise = r
where
r' = (r * b) `mod` m
b' = (b * b) `mod` m
e' = shift e (-1)
-- precompute frequency table
initFreqMap :: C.ByteString -> Map.Map Char (IntMap.IntMap Int)
initFreqMap inp = go 1 map1 map2 inp
where
map1 = Map.fromList $ zip ['a'..'z'] $ repeat 0
map2 = Map.fromList $ zip ['a'..'z'] $ repeat IntMap.empty
go idx m1 m2 inp
| C.null inp = m2
| otherwise = go (idx+1) m1' m2' $ C.tail inp
where
m1' = Map.update (\v -> Just $ v+1) (C.head inp) m1
m2' = foldl' (\m w -> Map.update (\v -> liftM (\c -> IntMap.insert idx c v) $ Map.lookup w m1') w m)
m2 ['a'..'z']
query :: Int -> Int -> Int -> Map.Map Char (IntMap.IntMap Int)
-> IntMap.IntMap Int -> IntMap.IntMap Int -> Int
query l r m freqMap facts invs
| x > 1 = (x * y) `mod` m
| otherwise = y
where
calcCnt cs = cr - cl
where
cl = IntMap.findWithDefault 0 (l-1) cs
cr = IntMap.findWithDefault 0 r cs
f1 acc cs
| even cnt = acc
| otherwise = acc + 1
where
cnt = calcCnt cs
f2 (acc1,acc2) cs
| cnt < 2 = (acc1 ,acc2)
| otherwise = (acc1',acc2')
where
cnt = calcCnt cs
n = cnt `div` 2
acc1' = acc1 + n
r = choose acc1' n
acc2' = (acc2 * r) `mod` m
-- calc binomial coefficient using Fermat's little theorem
choose n k
| n < k = 0
| otherwise = (f1 * t) `mod` m
where
f1 = (IntMap.!) facts n
i1 = (IntMap.!) invs k
i2 = (IntMap.!) invs (n-k)
t = (i1 * i2) `mod` m
x = Map.foldl' f1 0 freqMap
y = snd $ Map.foldl' f2 (0,1) freqMap
main :: IO()
main = do
inp <- C.getLine
q <- readLn :: IO Int
let modulo = 1000000007
let facts = compFactorials (C.length inp) modulo
let invs = compInvs (C.length inp) modulo facts
let freqMap = initFreqMap inp
forM_ [1..q] $ \_ -> do
line <- getLine
let [s1, s2] = words line
let l = (read s1) :: Int
let r = (read s2) :: Int
let result = query l r modulo freqMap facts invs
putStrLn $ show result
It passes all small and medium test cases but I am getting timeout with large test cases. The key to solve this problem is to precompute some stuff once at the beginning and use them to answer the individual queries efficiently.
Now, my main problem where I need help is:
The initital profiling shows that the lookup
operation of the IntMap
seems to be the main bottleneck. Is there better alternative to IntMap
for memoization? Or should I look at Vector
or Array
, which I believe will lead to more "ugly" code.
Even in current state, the code doesn't look nice (by functional standards) and as verbose as my C++ solution. Any tips to make it more idiomatic? Other than IntMap
usage for memoization, do you spot any other obvious problems which can lead to performance problems?
And is there any good sources, where I can learn how to use Haskell more effectively for competitive programming?
A sample large testcase, where the current code gets timeout:
input.txt output.txt
For comparison my C++ solution:
#include <vector>
#include <iostream>
#define MOD 1000000007L
long mod_exp(long b, long e) {
long r = 1;
while (e > 0) {
if ((e & 1) == 1) {
r = (r * b) % MOD;
}
b = (b * b) % MOD;
e >>= 1;
}
return r;
}
long n_choose_k(int n, int k, const std::vector<long> &fact_map, const std::vector<long> &inv_map) {
if (n < k) {
return 0;
}
long l1 = fact_map[n];
long l2 = (inv_map[k] * inv_map[n-k]) % MOD;
return (l1 * l2) % MOD;
}
int main() {
std::string s;
int q;
std::cin >> s >> q;
std::vector<std::vector<long>> freq_map;
std::vector<long> fact_map(s.size()+1);
std::vector<long> inv_map(s.size()+1);
for (int i = 0; i < 26; i++) {
freq_map.emplace_back(std::vector<long>(s.size(), 0));
}
std::vector<long> acc_map(26, 0);
for (int i = 0; i < s.size(); i++) {
acc_map[s[i]-'a']++;
for (int j = 0; j < 26; j++) {
freq_map[j][i] = acc_map[j];
}
}
fact_map[0] = 1;
inv_map[0] = 1;
for (int i = 1; i <= s.size(); i++) {
fact_map[i] = (i * fact_map[i-1]) % MOD;
inv_map[i] = mod_exp(fact_map[i], MOD-2) % MOD;
}
while (q--) {
int l, r;
std::cin >> l >> r;
std::vector<long> x(26, 0);
long t = 0;
long acc = 0;
long result = 1;
for (int i = 0; i < 26; i++) {
auto cnt = freq_map[i][r-1] - (l > 1 ? freq_map[i][l-2] : 0);
if (cnt % 2 != 0) {
t++;
}
long n = cnt / 2;
if (n > 0) {
acc += n;
result *= n_choose_k(acc, n, fact_map, inv_map);
result = result % MOD;
}
}
if (t > 0) {
result *= t;
result = result % MOD;
}
std::cout << result << std::endl;
}
}
UPDATE:
DanielWagner's answer has confirmed my suspicion that the main problem in my code was the usage of IntMap
for memoization. Replacing IntMap
with Array
made my code perform similar to DanielWagner's solution.
module Main where
import Control.Monad
import Data.Array (Array)
import qualified Data.Array as A
import qualified Data.ByteString.Char8 as C
import Data.Bits
import Data.List
import Debug.Trace
-- precompute factorials
compFactorials :: Int -> Int -> Array Int Int
compFactorials n m = A.listArray (0,n) $ scanl' f 1 [1..n]
where
f acc a = (acc * a) `mod` m
-- precompute invs
compInvs :: Int -> Int -> Array Int Int -> Array Int Int
compInvs n m facts = A.listArray (0,n) $ map f [0..n]
where
f a = (modExp ((A.!) facts a) (m-2) m) `mod` m
modExp :: Int -> Int -> Int -> Int
modExp b e m = go b e 1
where
go b e r
| (.&.) e 1 == 1 = go b' e' r'
| e > 0 = go b' e' r
| otherwise = r
where
r' = (r * b) `mod` m
b' = (b * b) `mod` m
e' = shift e (-1)
-- precompute frequency table
initFreqMap :: C.ByteString -> Map.Map Char (Array Int Int)
initFreqMap inp = Map.fromList $ map f ['a'..'z']
where
n = C.length inp
f c = (c, A.listArray (0,n) $ scanl' g 0 [0..n-1])
where
g x j
| C.index inp j == c = x+1
| otherwise = x
query :: Int -> Int -> Int -> Map.Map Char (Array Int Int)
-> Array Int Int -> Array Int Int -> Int
query l r m freqMap facts invs
| x > 1 = (x * y) `mod` m
| otherwise = y
where
calcCnt freqMap = cr - cl
where
cl = (A.!) freqMap (l-1)
cr = (A.!) freqMap r
f1 acc cs
| even cnt = acc
| otherwise = acc + 1
where
cnt = calcCnt cs
f2 (acc1,acc2) cs
| cnt < 2 = (acc1 ,acc2)
| otherwise = (acc1',acc2')
where
cnt = calcCnt cs
n = cnt `div` 2
acc1' = acc1 + n
r = choose acc1' n
acc2' = (acc2 * r) `mod` m
-- calc binomial coefficient using Fermat's little theorem
choose n k
| n < k = 0
| otherwise = (f1 * t) `mod` m
where
f1 = (A.!) facts n
i1 = (A.!) invs k
i2 = (A.!) invs (n-k)
t = (i1 * i2) `mod` m
x = Map.foldl' f1 0 freqMap
y = snd $ Map.foldl' f2 (0,1) freqMap
main :: IO()
main = do
inp <- C.getLine
q <- readLn :: IO Int
let modulo = 1000000007
let facts = compFactorials (C.length inp) modulo
let invs = compInvs (C.length inp) modulo facts
let freqMap = initFreqMap inp
replicateM_ q $ do
line <- getLine
let [s1, s2] = words line
let l = (read s1) :: Int
let r = (read s2) :: Int
let result = query l r modulo freqMap facts invs
putStrLn $ show result
First of all Haskell still isn't the best language for number crunching and High Performance Computing. If the competition is about the fastest implementation of some well-known algorithm or about HPC-style parallelization and speadup of some numerical problem I'd go with C++.
The reason is quite obvious. The facilities and elegance of Haskell are very different from the needs of most mainstream programming. Haskell just isn't the right tool for these jobs. One of the most common patterns in popular programming is runtime polymorphism.
Haskell, because it's higher level, it can actually manage much more complex data structures and do so in a correct way. It gives you tools to write data structures that are known to be more efficient for certain access patterns.
Haskell is a perfect choice for high-load concurrent applications, such as web backends. Maintainability. Haskell encourages using the type system to model the business domain and making the assumptions explicit. As a result, refactoring the code and adapting it to changing requirements is much easier.
I think you've shot yourself in the foot by trying to be too clever. Below I'll show a straightforward implementation of a slightly different algorithm that is about 5x faster than your Haskell code.
Here's the core combinatoric computation. Given a character frequency count for a substring, we can compute the number of maximum-length palindromes this way:
For the overcounting step, it's not super obvious to me whether it would be faster to store precomputed inverses for factorials, and take their product, or whether it's faster to just take the product of all the factorials and do one inverse operation at the very end. I'll do the latter, because it just intuitively seems faster to do one inversion per query than one lookup per repeated letter, but what do I know? Should be easy to test if you want to try to adapt the code yourself.
There's only one other quick insight I had vs. your code, which is that we can cache the frequency counts for prefixes of the input; then computing the frequency count for a substring is just pointwise subtraction of two cached counts. Your precomputation on the input I find to be a bit excessive in comparison.
Without further ado, let's see some code. As usual there's some preamble.
module Main where
import Control.Monad
import Data.Array (Array)
import qualified Data.Array as A
import Data.Map.Strict (Map)
import qualified Data.Map.Strict as M
import Data.Monoid
Like you, I want to do all my computations on cheap Int
s and bake in the modular operations where possible. I'll make a newtype
to make sure this happens for me.
newtype Mod1000000007 = Mod Int deriving (Eq, Ord)
instance Num Mod1000000007 where
fromInteger = Mod . (`mod` 1000000007) . fromInteger
Mod l + Mod r = Mod ((l+r) `rem` 1000000007)
Mod l * Mod r = Mod ((l*r) `rem` 1000000007)
negate (Mod v) = Mod ((1000000007 - v) `rem` 1000000007)
abs = id
signum = id
instance Integral Mod1000000007 where
toInteger (Mod n) = toInteger n
quotRem a b = (a * b^1000000005, 0)
I baked in the base of 1000000007
in several places, but it's easy to generalize by giving Mod
a phantom parameter and making a HasBase
class to pick the base. Ask a fresh question if you're not sure how and are interested; I'll be happy to do a more thorough writeup. There's a few more instances for Mod
that are basically uninteresting and primarily needed because of Haskell's wacko numeric class hierarchy:
instance Show Mod1000000007 where show (Mod n) = show n
instance Real Mod1000000007 where toRational (Mod n) = toRational n
instance Enum Mod1000000007 where
toEnum = Mod . (`mod` 1000000007)
fromEnum (Mod n) = n
Here's the precomputation we want to do for factorials...
type FactMap = Array Int Mod1000000007
factMap :: Int -> FactMap
factMap n = A.listArray (0,n) (scanl (*) 1 [1..])
...and for precomputing frequency maps for each prefix, plus getting a frequency map given a start and end point.
type FreqMap = Map Char Int
freqMaps :: String -> Array Int FreqMap
freqMaps s = go where
go = A.listArray (0, length s)
(M.empty : [M.insertWith (+) c 1 (go A.! i) | (i, c) <- zip [0..] s])
substringFreqMap :: Array Int FreqMap -> Int -> Int -> FreqMap
substringFreqMap maps l r = M.unionWith (-) (maps A.! r) (maps A.! (l-1))
Implementing the core computation described above is just a few lines of code, now that we have suitable Num
and Integral
instances for Mod1000000007
:
palindromeCount :: FactMap -> FreqMap -> Mod1000000007
palindromeCount facts freqs
= toEnum (max 1 mod2Freqs)
* (facts A.! sum div2Freqs)
`div` product (map (facts A.!) div2Freqs)
where
(div2Freqs, Sum mod2Freqs) = foldMap (\n -> ([n `quot` 2], Sum (n `rem` 2))) freqs
Now we just need a short driver to read stuff and pass it around to the appropriate functions.
main :: IO ()
main = do
inp <- getLine
q <- readLn
let freqs = freqMaps inp
facts = factMap (length inp)
replicateM_ q $ do
[l,r] <- map read . words <$> getLine
print . palindromeCount facts $ substringFreqMap freqs l r
That's it. Notably I made no attempt to be fancy about bitwise operations and didn't do anything fancy with accumulators; everything is in what I would consider idiomatic purely-functional style. The final count is about half as much code that runs about 5x faster.
P.S. Just for fun, I replaced the last line with print (l+r :: Int)
... and discovered that about half the time is spent in read
. Ouch! Seems there's still plenty of low-hanging fruit if this isn't fast enough yet.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With