I need to count the number of lines in a string. Any kind of line break can be present in the string (CR, LF or CRLF).
So the possible newline sequences are:
* \n
* \r
* \r\n
For example, with the following input:
This is [\n]
an string that [\r]
has four [\r\n]
lines
The method should return 4. Is there a built-in function for this, or has someone already implemented it?
static int GetLineCount(string input)
{
    // could you provide a good implementation for this method?
    // I want to avoid string.Split since it performs really badly
}
NOTE: Performance is important for me, because I could read large strings.
int count = 0;
int len = input.Length;
for (int i = 0; i != len; ++i)
    switch (input[i])
    {
        case '\r':
            ++count;
            // Treat "\r\n" as one break: skip the '\n' that follows.
            if (i + 1 != len && input[i + 1] == '\n')
                ++i;
            break;
        case '\n':
        // Uncomment below to include all other line break sequences
        // case '\v':
        // case '\f':
        // case '\u0085':
        // case '\u2028':
        // case '\u2029':
            ++count;
            break;
    }
Simply scan through, counting the line breaks, and in the case of \r test whether the next character is \n and skip it if it is.
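Note that the loop above counts line breaks rather than lines. As a minimal sketch, assuming the loop is wrapped in a helper (the name `CountLineBreaks` is made up for this example), the sample input from the question has three breaks, so the line count is the break count plus one whenever the text does not end with a break:

```csharp
using System;

static class LineBreakDemo
{
    // Hypothetical helper: the scan loop from above wrapped in a method.
    public static int CountLineBreaks(string input)
    {
        int count = 0;
        int len = input.Length;
        for (int i = 0; i != len; ++i)
        {
            switch (input[i])
            {
                case '\r':
                    ++count;
                    // Treat "\r\n" as one break: skip the '\n' that follows.
                    if (i + 1 != len && input[i + 1] == '\n')
                        ++i;
                    break;
                case '\n':
                    ++count;
                    break;
            }
        }
        return count;
    }

    public static void Main()
    {
        // The sample input from the question, with the breaks written out.
        string sample = "This is \nan string that \rhas four \r\nlines";
        int breaks = CountLineBreaks(sample);
        Console.WriteLine(breaks);     // 3 line breaks
        Console.WriteLine(breaks + 1); // 4 lines
    }
}
```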
Regarding "Performance is important for me, because I could read large strings": if at all possible, avoid holding large strings in memory in the first place. For example, if the text comes from a stream, the count is easy to do directly on the stream, since no more than one character of read-ahead is ever needed.
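One way the streaming idea could look is sketched below. The helper name `CountLineBreaks` and the use of `StringReader` as a stand-in for a real file or network reader are assumptions made for this example. Instead of peeking one character ahead, this sketch remembers whether the previous character was \r, which gives the same result without relying on `Peek`:

```csharp
using System;
using System.IO;

static class StreamLineBreaks
{
    // Hypothetical helper: counts CR, LF and CRLF breaks from any TextReader
    // without loading the whole text into a string.
    public static int CountLineBreaks(TextReader reader)
    {
        int count = 0;
        bool lastWasCR = false;
        int c;
        while ((c = reader.Read()) != -1)
        {
            if (c == '\r')
            {
                ++count;
                lastWasCR = true;
            }
            else
            {
                // A '\n' directly after '\r' belongs to the same "\r\n" break.
                if (c == '\n' && !lastWasCR)
                    ++count;
                lastWasCR = false;
            }
        }
        return count;
    }

    public static void Main()
    {
        // StringReader stands in for a StreamReader over a file or socket.
        using (var reader = new StringReader("one\ntwo\rthree\r\nfour"))
        {
            Console.WriteLine(CountLineBreaks(reader)); // 3
        }
    }
}
```

Only one character and one flag are held at a time, so memory use stays constant no matter how large the input is.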
Here's another variant of the string scan that doesn't count newlines at the very end of a string:
int count = 1;
int len = input.Length - 1;
for (int i = 0; i < len; ++i)
    switch (input[i])
    {
        case '\r':
            if (input[i + 1] == '\n')
            {
                // "\r\n" at the very end of the string: don't count it.
                if (++i >= len)
                {
                    break;
                }
            }
            goto case '\n';
        case '\n':
        // Uncomment below to include all other line break sequences
        // case '\v':
        // case '\f':
        // case '\u0085':
        // case '\u2028':
        // case '\u2029':
            ++count;
            break;
    }
This therefore considers "", "a line", "a line\n" and "a line\r\n" to each be one line only, and so on.
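To check those cases, the variant above can be dropped into the `GetLineCount` method from the question. This is only a sketch: the class name `LineCountDemo` and the `Main` driver are made up for the example.

```csharp
using System;

static class LineCountDemo
{
    // The second variant wrapped in the method signature from the question.
    public static int GetLineCount(string input)
    {
        int count = 1;
        int len = input.Length - 1;
        for (int i = 0; i < len; ++i)
        {
            switch (input[i])
            {
                case '\r':
                    if (input[i + 1] == '\n')
                    {
                        // "\r\n" at the very end of the string: don't count it.
                        if (++i >= len)
                        {
                            break;
                        }
                    }
                    goto case '\n';
                case '\n':
                    ++count;
                    break;
            }
        }
        return count;
    }

    public static void Main()
    {
        Console.WriteLine(GetLineCount(""));           // 1
        Console.WriteLine(GetLineCount("a line"));     // 1
        Console.WriteLine(GetLineCount("a line\n"));   // 1
        Console.WriteLine(GetLineCount("a line\r\n")); // 1
        // The sample input from the question.
        Console.WriteLine(GetLineCount("This is \nan string that \rhas four \r\nlines")); // 4
    }
}
```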