How can I parse the IO String in Haskell?

Tags:

string

io

parsing

haskell

monads

I've got a problem with Haskell. I have a text file that looks like this:

5.
7.
[(1,2,3),(4,5,6),(7,8,9),(10,11,12)].

I have no idea how to get the first two numbers (5 and 7 above) and the list from the last line. There is a dot at the end of each line.

I tried to build a parser, but the function called 'readFile' returns the monad called IO String. I don't know how to get information out of that type of string.

I would prefer to work on an array of chars. Maybe there is a function which can convert from 'IO String' to [Char]?

asked Jun 27 '12 15:06

Simon

1 Answer

I think you have a fundamental misunderstanding about IO in Haskell. Particularly, you say this:

Maybe there is a function which can convert from 'IO String' to [Char]?

No, there isn't¹, and the fact that there is no such function is one of the most important things about Haskell.

Haskell is a very principled language. It tries to maintain a distinction between "pure" functions (which don't have any side effects, and always return the same result when given the same input) and "impure" functions (which have side effects like reading from files, printing to the screen, writing to disk, etc.). The rules are:

You can use a pure function anywhere (in other pure functions, or in impure functions)
You can only use impure functions inside other impure functions.

The way that code is marked as pure or impure is using the type system. When you see a function signature like

digitToInt :: Char -> Int

you know that this function is pure. If you give it a Char it will return an Int, and moreover it will always return the same Int if you give it the same Char. On the other hand, a function signature like

getLine :: IO String

is impure, because the return type of String is marked with IO. Obviously getLine (which reads a line of user input) will not always return the same String, because it depends on what the user types in. You can't use this function in pure code, because adding even the smallest bit of impurity will pollute the pure code. Once you go IO you can never go back.
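
To make the distinction concrete, here is a minimal sketch. The names shout, countWords, and shoutedLine are made up for illustration; only getLine, toUpper, and the Prelude functions are real library names.

```haskell
import Data.Char (toUpper)

-- Pure: no IO in the type, so the same input always gives the same output.
shout :: String -> String
shout = map toUpper

-- Pure: counts whitespace-separated words.
countWords :: String -> Int
countWords = length . words

-- Impure: the result depends on what the user types, so the type says IO.
shoutedLine :: IO String
shoutedLine = do
  line <- getLine         -- inside the do block, line is a plain String
  return (shout line)     -- a pure function used inside an IO action

-- This would be rejected by the type checker, because getLine is an
-- IO String and shout expects a String:
-- broken :: String
-- broken = shout getLine
```

The pure functions can be called from anywhere, while shoutedLine can only be run from inside another IO action.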

You can think of IO as a wrapper. When you see a particular type, for example, x :: IO String, you should interpret that to mean "x is an action that, when performed, does some arbitrary I/O and then returns something of type String" (note that in Haskell, String and [Char] are exactly the same thing).

So how do you ever get access to the values from an IO action? Fortunately, the type of main is IO () (it's an action that does some I/O and returns (), which is the same as returning nothing). So you can always use your IO functions inside main. When you execute a Haskell program, what you are doing is running main, which causes all the I/O in the program definition to actually be executed. For example, you can read from and write to files, ask the user for input, write to stdout, and so on.

You can think of structuring a Haskell program like this:

All code that does I/O gets the IO tag (basically, you put it in a do block)
Code that doesn't need to perform I/O doesn't need to be in a do block - these are the "pure" functions.
Your main function sequences together the I/O actions you've defined in an order that makes the program do what you want it to do (interspersed with the pure functions wherever you like).
When you run main, you cause all of those I/O actions to be executed.
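
As a rough sketch of that structure (countLines, describe, and report are hypothetical names chosen for this example, and text.txt is assumed to exist in the working directory):

```haskell
-- Pure: no IO in the type, so no I/O can happen here.
countLines :: String -> Int
countLines = length . lines

-- Pure: builds the message to print.
describe :: FilePath -> Int -> String
describe path n = path ++ " has " ++ show n ++ " lines"

-- Impure: reads a file, then hands the contents to the pure functions.
report :: FilePath -> IO ()
report path = do
  contents <- readFile path
  putStrLn (describe path (countLines contents))

-- main sequences the I/O actions in the order they should run.
main :: IO ()
main = do
  putStrLn "Counting lines..."
  report "text.txt"
```

Only report and main carry the IO tag; countLines and describe can be tested without touching a file.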

So, given all that, how do you write your program? Well, the function

readFile :: FilePath -> IO String

reads a file as a String. So we can use that to get the contents of the file. The function

lines :: String -> [String]

splits a String on newlines, so now you have a list of Strings, each corresponding to one line of the file. The function

init :: [a] -> [a]

drops the last element from a list (this will get rid of the final . on each line). The function

read :: (Read a) => String -> a

takes a String and turns it into an arbitrary Haskell data type, such as Int or Bool. Combining these functions sensibly will give you your program.

Note that the only time you actually need to do any I/O is when you are reading the file. Therefore that is the only part of the program that needs to use the IO tag. The rest of the program can be written "purely".
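
The pure steps can be tried on their own, without any file. In the sketch below, sample is a stand-in for the String that readFile would hand you, and the other names are invented for this example.

```haskell
-- Stand-in for the contents of the file.
sample :: String
sample = "5.\n7.\n[(1,2,3),(4,5,6),(7,8,9),(10,11,12)].\n"

-- Step 1: split the contents into lines.
rows :: [String]
rows = lines sample

-- Step 2: drop the trailing period from every line.
trimmed :: [String]
trimmed = map init rows

-- Step 3: read each piece at the type we expect it to have.
firstNumber :: Int
firstNumber = read (trimmed !! 0)

secondNumber :: Int
secondNumber = read (trimmed !! 1)

triples :: [(Int, Int, Int)]
triples = read (trimmed !! 2)
```

The type annotations matter here: read needs to know which type to produce, and it takes that from the signature of the value being defined.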

It sounds like what you need is the article "The IO Monad For People Who Simply Don't Care", which should answer a lot of your questions. Don't be scared by the term "monad" - you don't need to understand what a monad is to write Haskell programs (notice that this paragraph is the only one in my answer that uses the word "monad", although admittedly I have used it four times now...)

Here's the program that (I think) you want to write

run :: IO (Int, Int, [(Int,Int,Int)])
run = do
  contents <- readFile "text.txt"   -- use '<-' here so that 'contents' is a String
  let [a,b,c] = lines contents      -- split on newlines
  let firstLine  = read (init a)    -- 'init' drops the trailing period
  let secondLine = read (init b)
  let thirdLine  = read (init c)    -- this reads a list of Int-tuples
  return (firstLine, secondLine, thirdLine)
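
This program fails with a runtime error if the file does not have exactly three lines or if a line does not parse. One possible way to make the parsing step report failure instead is sketched below. It assumes readMaybe from Text.Read (part of base); dropDot and parseInput are hypothetical names, not part of the program above.

```haskell
import Text.Read (readMaybe)

-- Remove a trailing period, or return Nothing if there is none.
dropDot :: String -> Maybe String
dropDot s = case reverse s of
  ('.' : rest) -> Just (reverse rest)
  _            -> Nothing

-- Pure parser for the whole file contents.
-- Returns Nothing unless there are exactly three well-formed lines.
parseInput :: String -> Maybe (Int, Int, [(Int, Int, Int)])
parseInput contents = case lines contents of
  [a, b, c] -> do
    x  <- dropDot a >>= readMaybe   -- first number
    y  <- dropDot b >>= readMaybe   -- second number
    zs <- dropDot c >>= readMaybe   -- list of Int-tuples
    return (x, y, zs)
  _ -> Nothing
```

Inside an IO action you would still bind the file contents with '<-' first and then apply parseInput to the resulting String.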

To answer npfedwards's comment about applying lines to the output of readFile "text.txt", you need to realize that readFile "text.txt" gives you an IO String, and it's only when you bind it to a variable (using contents <-) that you get access to the underlying String, so that you can apply lines to it.

Remember: once you go IO, you never go back.

¹ I am deliberately ignoring unsafePerformIO because, as implied by the name, it is very unsafe! Don't ever use it unless you really know what you are doing.

answered Sep 30 '22 11:09

Chris Taylor
