I am reading a CSV file with the pipes-csv library. I want to read the first line now and read the rest later. Unfortunately, after Pipes.Prelude.head returns, the pipe seems to be closed somehow. Is there a way to read the header of the CSV first and the remaining rows later?
{-# LANGUAGE ScopedTypeVariables #-}  -- needed for the type annotation on x in showPipe
import qualified Data.Vector as V
import Pipes
import qualified Pipes.Prelude as P
import qualified System.IO as IO
import qualified Pipes.ByteString as PB
import qualified Data.Text as Text
import qualified Pipes.Csv as PCsv
import Control.Monad (forever)

showPipe :: Proxy () (Either String (V.Vector Text.Text)) () String IO b
showPipe = forever $ do
    x :: Either String (V.Vector Text.Text) <- await
    yield $ show x

main :: IO ()
main = do
    IO.withFile "./test.csv"
        IO.ReadMode
        (\handle -> do
            let producer = (PCsv.decode PCsv.NoHeader (PB.fromHandle handle))
            headers <- P.head producer
            putStrLn "Header"
            putStrLn $ show headers
            putStrLn $ "Rows"
            runEffect ( producer >->
                        (showPipe) >->
                        P.stdoutLn)
        )
If we do not read the header first, we can read the whole CSV without any problem:
main :: IO ()
main = do
    IO.withFile "./test.csv"
        IO.ReadMode
        (\handle -> do
            let producer = (PCsv.decode PCsv.NoHeader (PB.fromHandle handle))
            putStrLn $ "Rows"
            runEffect ( producer >->
                        (showPipe) >->
                        P.stdoutLn)
        )
Pipes.Csv has material for handling headers, but I think that this question is really looking for a more sophisticated use of Pipes.await or else Pipes.next. First next:
>>> :t Pipes.next
Pipes.next :: Monad m => Producer a m r -> m (Either r (a, Producer a m r))
next is the basic way of inspecting a producer. It is sort of like pattern matching on a list. With a list the two possibilities are [] and x:xs - here they are Left () and Right (headers, rows). The latter pair is what you are looking for. Of course an action (here in IO) is needed to get one's hands on it:
main :: IO ()
main = do
    handle <- IO.openFile "./test.csv" IO.ReadMode
    let producer :: Producer (V.Vector Text.Text) IO ()
        producer = PCsv.decode PCsv.NoHeader (PB.fromHandle handle) >-> P.concat
    e <- next producer
    case e of
        Left () -> putStrLn "No lines!"
        Right (headers, rows) -> do
            putStrLn "Header"
            print headers
            putStrLn $ "Rows"
            runEffect ( rows >-> P.print)
    IO.hClose handle
Since the Either values are a distraction here, I eliminate the Left values (the lines that don't parse) with P.concat.
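To try the next / P.concat combination without a CSV file, here is a minimal sketch that works on an in-memory producer. The names sampleRows and headerAndRest are made up for illustration, and the rows are plain lists of strings standing in for decoded records; it only assumes the pipes package.

```haskell
import Data.Functor.Identity (runIdentity)
import Pipes
import qualified Pipes.Prelude as P

-- Hypothetical stand-in for decoded CSV rows: Left marks a row that failed to parse.
sampleRows :: Monad m => Producer (Either String [String]) m ()
sampleRows = each
    [ Right ["name", "grade"]
    , Left "parse error"
    , Right ["ann", "90"]
    ]

-- Split off the first successfully parsed row; collect the remaining good rows.
headerAndRest :: Maybe ([String], [[String]])
headerAndRest = runIdentity $ do
    e <- next (sampleRows >-> P.concat)   -- P.concat drops the Left values
    case e of
        Left ()           -> pure Nothing
        Right (hdr, rest) -> pure (Just (hdr, P.toList rest))
```

The Right branch hands back the rest of the producer, so nothing is lost after the first row is taken; here it is simply drained into a list.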
next does not act inside a pipeline, but directly on the Producer, which it treats as a sort of "effectful list" with a final return value at the end. The particular effect we got above can of course be achieved with await, which acts inside a pipeline. I can use it to intercept the first item that comes along in a pipeline, do some IO based on it, and then forward the remaining elements:
main :: IO ()
main = do
    handle <- IO.openFile "./grades.csv" IO.ReadMode
    let producer :: Producer (V.Vector Text.Text) IO ()
        producer = PCsv.decode PCsv.NoHeader (PB.fromHandle handle) >-> P.concat
        handleHeader :: Pipe (V.Vector Text.Text) (V.Vector Text.Text) IO ()
        handleHeader = do
            headers <- await      -- intercept first value
            liftIO $ do           -- use it for IO
                putStrLn "Header"
                print headers
                putStrLn $ "Rows"
            cat                   -- pass along all later values
    runEffect (producer >-> handleHeader >-> P.print)
    IO.hClose handle
The difference is just that if producer is empty, this version cannot report it: await never receives a value, so the pipeline simply ends without printing anything, whereas the previous program prints No lines! in that case.
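The empty-input behaviour of the await approach can be checked on a pure producer. This is only a sketch: dropHeader is a hypothetical pipe that discards the first element instead of printing it, and the inputs are plain lists rather than CSV rows.

```haskell
import Pipes
import qualified Pipes.Prelude as P

-- Hypothetical pipe: swallow the first element, forward everything after it.
dropHeader :: Monad m => Pipe a a m r
dropHeader = await >> cat

-- With an empty producer the await is never satisfied; the pipeline just ends.
emptyCase :: [Int]
emptyCase = P.toList (each [] >-> dropHeader)

-- With a non-empty producer only the first element is removed.
nonEmptyCase :: [Int]
nonEmptyCase = P.toList (each [1, 2, 3] >-> dropHeader)
```

Here emptyCase evaluates to [] and nonEmptyCase to [2,3]; in neither case does the pipe get a chance to say that the input was empty.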
Note, by the way, that showPipe can be defined as P.map show, or simply as P.show (but with the specialized type you add).
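As a quick check that the two spellings agree, here is a small sketch on an in-memory list. The names samples, viaMap and viaShow are invented for the example, and Either String Int stands in for the Either String (V.Vector Text.Text) rows of the question.

```haskell
import Pipes
import qualified Pipes.Prelude as P

-- Hypothetical sample values standing in for parsed and unparsed CSV rows.
samples :: [Either String Int]
samples = [Right 1, Left "bad"]

-- Render each element with P.map show ...
viaMap :: [String]
viaMap = P.toList (each samples >-> P.map show)

-- ... or with the ready-made P.show pipe.
viaShow :: [String]
viaShow = P.toList (each samples >-> P.show)
```

Both lists contain the same strings, so either pipe can replace the hand-written forever loop.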