Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Haskell complexity of an algorithm

Tags:

haskell

I encounter a simple problem on Codeforces, the problem is this. The thing I wanted to discuss is not about the problem, it about the languages we used, in this case, Python3 and Haskell.

In details, I have 2 version of my algorithm, one in Haskell and the other in Python3. Both of them in Functional programming style. The code looks like this

Python3

from operator import add
from itertools import accumulate
from functools import reduce
 
 
def floss(l):
    def e(u):
        a, b = u
        return b if a % 2 == 1 else -b
 
    return map(e, enumerate(l))
 
 
def flock(l):
    return accumulate(l, add)
 
 
def search(l):
    b = zip(l, l[1:])
 
    def equal(u):
        x, y = u
        return x == y
 
    c = any(map(equal, b))
    return 'YES\n' if c else 'NO\n'
 
 
def main():
    t = int(input())
 
    def solution(x):
        return search(sorted(list(flock(floss(x)))))
 
    def get():
        _ = input()
        b = [0] + [int(x) for x in input().split()]
        return b
 
    all_data = [get() for _ in range(t)]
    all_solution = map(solution, all_data)
    print(reduce(add, all_solution))
 
 
main()

Haskell

module Main (main) where
import Data.List (sort)
 
main :: IO ()
main = do
  x <- des
  putStrLn x
 
readInts :: IO [Int]
readInts = fmap (map read.words) getLine
 
flock :: [Int] -> [Int]
flock l = scanr (+) 0 l 
  
floss :: [Int] -> [Int]
floss l = map (e :: (Int, Int) -> Int) $ zip [0..] l where {
  e (u, v) = if mod u 2 == 0 then v else -v
  }
  
search :: [Int] -> String 
search l = if c then "YES\n" else "NO\n" where {
  b = zip l $ tail l;
  c = any (\(x, y) -> x == y) b;
  }
  
solution :: [Int] -> String 
solution = search.sort.flock.floss
 
des :: IO String
des = do 
  io <- readInts
  let t = head io
  all_data <- sequence $ replicate t $ do
    _ <- readInts
    b <- readInts
    return b
  let all_solution = map solution all_data
  let output = foldr (++) "" all_solution
  return output

Both of them relatively the same in term of algorithm. In fact, the Python3 passed the testcases with high complexity while Haskell code cannot. I wonder why my code in Haskell run slower than Python3 and I wanted to know the operation of Haskell that make the fault. One thing I found suspicious is the memory usage of my Haskell code is incredibly higher (2-8 times) than Python3 code.

I have just started study about FP recently, so there maybe some mistake that I made in the post.

Update #1: One thing that I found might be useful for bug detecting is that in the Haskell code is let output = foldr (++) "" all_solution. In a much worse code, I used foldl rather than foldr, which made the code extremely slow. I think that might make bug detecting task a little easier.

like image 659
Khang Truong Avatar asked Jun 21 '26 14:06

Khang Truong


1 Answers

From profiling, it looks like most of the time is being spent in readInts. A really stupid reimplementation that doesn't invoke the full overhead of read already triples the speed of the program in my tests:

readIntDumb :: String -> IO Int
readIntDumb = go 0 1 where
    go n sgn [] = pure (sgn * n)
    go n sgn (c:cs) = case c of
        '-' -> go n (negate sgn) cs
        d | '0' <= d && d <= '9' -> go (10*n + fromEnum d - fromEnum '0') sgn cs
        _ -> fail "whoops"

Moving to ByteString-based IO gets another factor of two:

import Data.ByteString.Char8 (ByteString)
import qualified Data.ByteString.Char8 as BS8

readIntsBS :: ByteString -> [Int]
readIntsBS bs = case BS8.readInt bs of
    Nothing -> []
    Just (n, bs') -> n : readIntsBS (BS8.dropWhile isSpace bs')

readInts :: IO [Int]
readInts = readIntsBS <$> BS8.getLine

At this point, profiling reveals that about half the runtime is due to sort. Switching to nubInt speeds that step up a lot:

import Data.Containers.ListUtils
search l = if nubInt l /= l then "YES" else "NO"
solution = search.flock.floss

Or you could implement a custom uniqueness check, though this gets only a small savings over using nubInt:

import qualified Data.IntSet as IS

search :: [Int] -> String 
search l = if uniqInt l then "NO" else "YES"

uniqInt :: [Int] -> Bool
uniqInt = go IS.empty where
    go seen [] = True
    go seen (n:ns) = case IS.alterF (,True) n seen of
        (False, seen') -> go seen' ns
        _ -> False

You'll also want to switch over from scanr to scanl. (As a rule of thumb, for folds you generally are choosing from foldr and foldl' based on what operation you're folding, but for scans you almost always want scanl or a minor variant of it like scanl1.) This almost doubles the speed.

flock = scanl (+) 0

At this point, the cumulative savings have dropped the runtime on my machine+test file from 25s to 0.8s; perhaps this is going far enough. Here's the complete final result, with a few minor tweaks (for style, not performance) not explicitly discussed above.

import Control.Monad
import Data.Bool
import Data.ByteString.Char8 (ByteString)
import Data.Char
import Data.Containers.ListUtils
import qualified Data.ByteString.Char8 as BS8
import qualified Data.IntSet as IS

main :: IO ()
main = do
    t <- readLn
    replicateM_ t $ do
        BS8.getLine
        putStrLn . solution . readIntsBS =<< BS8.getLine

solution :: [Int] -> String
solution = bool "YES" "NO" . uniqInt . scanl (+) 0 . zipWith ($) (cycle [id, negate])

readIntsBS :: ByteString -> [Int]
readIntsBS bs = case BS8.readInt bs of
    Nothing -> []
    Just (n, bs') -> n : readIntsBS (BS8.dropWhile isSpace bs')

uniqInt :: [Int] -> Bool
uniqInt = go IS.empty where
    go seen [] = True
    go seen (n:ns) = case IS.alterF (,True) n seen of
        (False, seen') -> go seen' ns
        _ -> False
like image 53
Daniel Wagner Avatar answered Jun 23 '26 13:06

Daniel Wagner