Split Java String by New Line

I'm trying to split text in a JTextArea using a regex to split the String by \n. However, this does not work. I also tried \r\n|\r|\n and many other combinations of regexes. Code:

    public void insertUpdate(DocumentEvent e) {
        String split[], docStr = null;
        Document textAreaDoc = (Document)e.getDocument();

        try {
            docStr = textAreaDoc.getText(textAreaDoc.getStartPosition().getOffset(), textAreaDoc.getEndPosition().getOffset());
        } catch (BadLocationException e1) {
            // TODO Auto-generated catch block
            e1.printStackTrace();
        }

        split = docStr.split("\\n");
    }
dr.manhattan asked Jan 18 '09 10:01



2 Answers

This should cover you:

String lines[] = string.split("\\r?\\n"); 

There are really only two newline conventions (UNIX and Windows) that you need to worry about.
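As a rough illustration of the pattern above, here is a minimal sketch that applies it to a string mixing UNIX and Windows line endings. The class name, method name, and sample text are made up for this example and are not part of the original answer.

```java
import java.util.Arrays;

// Hypothetical demo class; the name is an assumption made for illustration.
final class SplitOnNewlineDemo {

    // Splits on "\n", optionally preceded by "\r" (UNIX and Windows endings).
    static String[] splitLines(String text) {
        return text.split("\\r?\\n");
    }

    public static void main(String[] args) {
        String mixed = "first\nsecond\r\nthird";
        System.out.println(Arrays.toString(splitLines(mixed)));
        // prints [first, second, third]
    }
}
```

Note that a lone \r (the old Mac OS convention) is not treated as a line break by this pattern.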

cletus answered Oct 13 '22 20:10


The String#split(String regex) method takes a regex (regular expression). Since Java 8, regex supports \R, which represents (from the documentation of the Pattern class):

Linebreak matcher
\R         Any Unicode linebreak sequence, is equivalent to \u000D\u000A|[\u000A\u000B\u000C\u000D\u0085\u2028\u2029]

So we can use it to match:

  • \u000D\u000A -> \r\n pair
  • \u000A -> line feed (\n)
  • \u000B -> line tabulation (DO NOT confuse with character tabulation \t which is \u0009)
  • \u000C -> form feed (\f)
  • \u000D -> carriage return (\r)
  • \u0085 -> next line (NEL)
  • \u2028 -> line separator
  • \u2029 -> paragraph separator

As you can see, \r\n is placed at the start of the regex, which ensures that the regex tries to match this pair first. Only if that match fails will it try to match the single-character line separators.
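To see why that ordering matters, the following sketch compares \R with a plain character class [\r\n] on a Windows-style line ending. The class and method names are hypothetical and exist only for this comparison.

```java
import java.util.Arrays;

// Hypothetical demo class; names are assumptions made for illustration.
final class LinebreakOrderDemo {

    // \R consumes "\r\n" as one delimiter because the pair is tried first.
    static String[] splitWithLinebreakMatcher(String text) {
        return text.split("\\R");
    }

    // A character class matches "\r" and "\n" separately, one char at a time.
    static String[] splitWithCharClass(String text) {
        return text.split("[\\r\\n]");
    }

    public static void main(String[] args) {
        String windows = "a\r\nb";
        System.out.println(splitWithLinebreakMatcher(windows).length); // prints 2
        System.out.println(splitWithCharClass(windows).length);        // prints 3
        System.out.println(Arrays.toString(splitWithCharClass(windows)));
        // prints [a, , b]
    }
}
```

The character class leaves an empty string between \r and \n, which is exactly what the pair-first ordering in \R avoids.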


So if you want to split on any line separator, use split("\\R").

If you want to keep trailing empty strings "" in the resulting array, use split(regex, limit) with a negative limit parameter, like split("\\R", -1).
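A small sketch of the difference the limit argument makes, assuming text that ends with two line breaks. The class name, method names, and input are illustrative only.

```java
// Hypothetical demo class; names are assumptions made for illustration.
final class TrailingEmptyDemo {

    // Default split: trailing empty strings are removed from the result.
    static String[] splitDroppingTrailing(String text) {
        return text.split("\\R");
    }

    // Negative limit: every piece is kept, including trailing empty strings.
    static String[] splitKeepingTrailing(String text) {
        return text.split("\\R", -1);
    }

    public static void main(String[] args) {
        String text = "a\nb\n\n";
        System.out.println(splitDroppingTrailing(text).length); // prints 2
        System.out.println(splitKeepingTrailing(text).length);  // prints 4
    }
}
```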

If you want to treat one or more consecutive line breaks (that is, any run of empty lines) as a single delimiter, use split("\\R+").
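The effect of the + quantifier can be sketched as follows, using an input with blank lines between the non-empty ones. The class and method names are hypothetical.

```java
import java.util.Arrays;

// Hypothetical demo class; names are assumptions made for illustration.
final class CollapseBlankLinesDemo {

    // Each line break is its own delimiter, so blank lines become "" entries.
    static String[] splitEachBreak(String text) {
        return text.split("\\R");
    }

    // A run of line breaks counts as one delimiter, so blank lines vanish.
    static String[] splitCollapsing(String text) {
        return text.split("\\R+");
    }

    public static void main(String[] args) {
        String text = "a\n\n\nb\r\n\r\nc";
        System.out.println(splitEachBreak(text).length);               // prints 6
        System.out.println(Arrays.toString(splitCollapsing(text)));    // prints [a, b, c]
    }
}
```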

Pshemo answered Oct 13 '22 19:10