Right now I'm doing a split on a string and assuming that the newline from the user is \r\n, like so:
string.split(/\r\n/)
What I'd like to do is split on either \r\n or just \n.
So what would the regex be to split on either of those?
The “\n” character separates lines in Unix, Linux, and macOS. The “\r\n” sequence separates lines on Windows. Finally, the “\r” character separates lines in Mac OS 9 and earlier.
Did you try /\r?\n/? The ? makes the \r optional.
Example usage: http://rubular.com/r/1ZuihD0YfF
Ruby has the methods String#each_line and String#lines.
String#each_line returns an enum: http://www.ruby-doc.org/core-1.9.3/String.html#method-i-each_line
String#lines returns an array: http://www.ruby-doc.org/core-2.1.2/String.html#method-i-lines
I didn't test it against your scenario but I bet it will work better than manually choosing the newline chars.
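A minimal sketch of that approach, assuming the input arrives as a single string. The helper name split_lines and the sample text are made up for illustration. String#lines keeps each line's terminator, so the sketch strips it with String#chomp:

```ruby
# Hypothetical helper: break text into lines without naming the newline chars.
def split_lines(text)
  # String#lines keeps the trailing terminator on each element;
  # String#chomp with no argument strips a trailing "\r\n", "\n", or "\r".
  text.lines.map(&:chomp)
end

sample = "first\r\nsecond\nthird"   # made-up input mixing both endings
p sample.lines        # => ["first\r\n", "second\n", "third"]
p split_lines(sample) # => ["first", "second", "third"]
```

One caveat: String#lines splits on \n by default, so a lone \r in the middle of the text would not start a new line here.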
# Split on \r\n or just \n
string.split( /\r?\n/ )
Although it doesn't help with this question (where you do need a regex), note that String#split does not require a regex argument. Your original code could also have been string.split( "\r\n" ).
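To illustrate the difference, here is a small sketch with a made-up input string. It shows that a plain string delimiter works for \r\n but leaves a bare \n untouched, which is why the regex is needed for mixed input:

```ruby
text = "one\r\ntwo\nthree"   # made-up input: one CRLF, one bare LF

by_string = text.split("\r\n")   # literal delimiter, no regex involved
by_regex  = text.split(/\r?\n/)  # optional \r, then \n

p by_string # => ["one", "two\nthree"]
p by_regex  # => ["one", "two", "three"]
```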
\n is for Unix
\r is for classic Mac OS (9 and earlier)
\r\n is for Windows
To be safe across operating systems, I would use /\r?\n|\r\n?/
"1\r2\n3\r\n4\n\n5\r\r6\r\n\r\n7".split(/\r?\n|\r\n?/)
=> ["1", "2", "3", "4", "", "5", "", "6", "", "7"]
The alternation operator in Ruby Regexp is the same as in standard regular expressions: |
So, the obvious solution would be /\r\n|\n/, which is the same as /\r?\n/, i.e. an optional \r followed by a mandatory \n.
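As a quick sanity check, the sketch below splits a few made-up sample strings with both patterns and compares the results. It only shows agreement on these samples, not a proof of equivalence:

```ruby
# Made-up inputs covering CRLF, bare LF, consecutive newlines, and no newline.
samples = ["a\r\nb\nc", "a\n\r\nb", "no newline", "x\r\n\r\ny"]

# True when both patterns produce identical arrays for every sample.
same = samples.all? { |s| s.split(/\r\n|\n/) == s.split(/\r?\n/) }

p same                           # => true
p "x\r\n\r\ny".split(/\r?\n/)    # => ["x", "", "y"]
```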
Are you reading from a file, or from standard in?
If you're reading from a file, and the file is in text mode, rather than binary mode, or you're reading from standard in, you won't have to deal with \r\n - it'll just look like \n.
C:\Documents and Settings\username>irb
irb(main):001:0> gets
foo
=> "foo\n"
Perhaps do a split on only '\n' and remove the '\r' if it exists?
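One way that suggestion might look, as a rough sketch with a made-up input string. It splits on \n only and then drops a trailing \r from each piece with String#chomp:

```ruby
text = "alpha\r\nbeta\ngamma"   # made-up input mixing CRLF and LF

# Split on "\n" alone, then remove a trailing "\r" where one is left over.
lines = text.split("\n").map { |line| line.chomp("\r") }

p lines # => ["alpha", "beta", "gamma"]
```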