I'm trying to split text in a JTextArea
using a regex to split the String by \n
However, this does not work and I also tried by \r\n|\r|n
and many other combination of regexes. Code:
public void insertUpdate(DocumentEvent e) { String split[], docStr = null; Document textAreaDoc = (Document)e.getDocument(); try { docStr = textAreaDoc.getText(textAreaDoc.getStartPosition().getOffset(), textAreaDoc.getEndPosition().getOffset()); } catch (BadLocationException e1) { // TODO Auto-generated catch block e1.printStackTrace(); } split = docStr.split("\\n"); }
2.2.The “\n” character separates lines in Unix, Linux, and macOS. On the other hand, the “\r\n” character separates lines in Windows Environment. Finally, the “\r” character separates lines in Mac OS 9 and earlier.
In Windows, a new line is denoted using “\r\n”, sometimes called a Carriage Return and Line Feed, or CRLF. Adding a new line in Java is as simple as including “\n” , “\r”, or “\r\n” at the end of our string.
Java split() function is used to splitting the string into the string array based on the regular expression or the given delimiter. The resultant object is an array contains the split strings.
The string split() method breaks a given string around matches of the given regular expression. After splitting against the given regular expression, this method returns a string array.
This should cover you:
String lines[] = string.split("\\r?\\n");
There's only really two newlines (UNIX and Windows) that you need to worry about.
String#split(String regex)
method is using regex (regular expressions). Since Java 8 regex supports \R
which represents (from documentation of Pattern class):
Linebreak matcher
\R Any Unicode linebreak sequence, is equivalent to\u000D\u000A|[\u000A\u000B\u000C\u000D\u0085\u2028\u2029]
So we can use it to match:
\u000D\000A
-> \r\n
pair \n
)\t
which is \u0009
)\f
)\r
)As you see \r\n
is placed at start of regex which ensures that regex will try to match this pair first, and only if that match fails it will try to match single character line separators.
So if you want to split on line separator use split("\\R")
.
If you don't want to remove from resulting array trailing empty strings ""
use split(regex, limit)
with negative limit
parameter like split("\\R", -1)
.
If you want to treat one or more continues empty lines as single delimiter use split("\\R+")
.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With