
Detect line breaks in a `char[]`

I use the following method to detect whether a character is whitespace:

Character.isWhitespace(char character);

Now I need to detect all the variants of line breaks (\n, \r, etc.) for all platforms (Linux, Windows, Mac OSX, etc.). Is there any similar way to detect if a character is a line break? If there is not, how can I detect all the possible variants?


Edit (from the comments): I didn't know that a line break can be represented by several characters, so here is some more context.

I'm implementing the write(char[] buffer, int offset, int length) method in a Writer (see Javadoc). Among other operations, I need to detect line breaks inside the buffer. I'm trying to avoid creating a String from the buffer to save memory, as I've seen that the buffer is sometimes very large (several MB).

Is there any way to detect line breaks without creating a String?

jeojavi asked Sep 18 '14 14:09


2 Answers

Use regex to do the work for you:

boolean isLineBreak = !String.valueOf(character).matches(".");  // true for a line terminator

Without the DOTALL flag, the dot matches any character except line terminators, which according to the documentation for java.util.regex.Pattern are:

  • A newline (line feed) character ('\n'),
  • A carriage-return character followed immediately by a newline character ("\r\n"),
  • A standalone carriage-return character ('\r'),
  • A next-line character ('\u0085'),
  • A line-separator character ('\u2028'), or
  • A paragraph-separator character ('\u2029').

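Note that multi-character line break sequences exist, e.g. \r\n, but you asked about individual characters. The regex check also flags the two-character input "\r\n". Be aware that any other input longer than one character fails to match "." as well, so only pass it a single character or that pair.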
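
If allocating a String and running a regex for every character is too costly, the single-character terminators from the list above can also be compared directly. The following is a minimal sketch, assuming only those five characters matter; the class name LineTerminators and the method isLineTerminator are hypothetical names chosen for illustration, and the two-character sequence "\r\n" still has to be handled by the caller.

```java
// Hypothetical helper, not part of the JDK.
final class LineTerminators {
    private LineTerminators() {}

    /** Returns true if c is one of the single-character line terminators. */
    static boolean isLineTerminator(char c) {
        switch (c) {
            case '\n':     // line feed
            case '\r':     // carriage return
            case '\u0085': // next line
            case '\u2028': // line separator
            case '\u2029': // paragraph separator
                return true;
            default:
                return false;
        }
    }
}
```

Because this is a plain comparison, it allocates nothing and can be called for every char in a large buffer.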

Bohemian answered Oct 20 '22 01:10


As I posted in my comments, the line separator is not always a single character; depending on the platform it can be a sequence of characters. A platform-independent version would look like this:

public String[] splitLines(String input) {
    return input.split("(\r\n|\r|\n)");
}

Based on this answer:

Match linebreaks - \n or \r\n?

However, this means regex matching on a String, not matching on individual chars. Still, getting a String out of a buffer should be achievable.
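
For the write(char[] buffer, int offset, int length) case from the question, the buffer can also be scanned in place, without building a String. The following is a rough sketch under stated assumptions: it recognises only the three variants from the regex above (\r\n, \r and \n), counts each as one line break, and remembers a trailing \r between calls so that a \r\n pair split across two buffers is not counted twice. LineBreakCounter, scan and getCount are hypothetical names, not part of any library.

```java
// Hypothetical helper: counts line breaks across successive char[] chunks.
final class LineBreakCounter {
    // True if the previous character seen (possibly in an earlier call) was '\r'.
    private boolean lastWasCarriageReturn;
    private long count;

    /** Scans buffer[offset .. offset + length) and updates the running count. */
    void scan(char[] buffer, int offset, int length) {
        int end = offset + length;
        for (int i = offset; i < end; i++) {
            char c = buffer[i];
            if (c == '\r') {
                // Count the break now; a following '\n' belongs to the same break.
                count++;
                lastWasCarriageReturn = true;
            } else if (c == '\n') {
                // Skip the '\n' of a "\r\n" pair, it was already counted at the '\r'.
                if (!lastWasCarriageReturn) {
                    count++;
                }
                lastWasCarriageReturn = false;
            } else {
                lastWasCarriageReturn = false;
            }
        }
    }

    long getCount() {
        return count;
    }
}
```

Inside a Writer, one such counter could be kept as a field and scan called from write with the same buffer, offset and length arguments. The other Unicode terminators could be added as further branches if they matter for the input.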

Martin answered Oct 20 '22 00:10