
Is there a standard API to check for line separators in Java?

Tags:

java

I'm using Java SE 6.

My program reads several kinds of files, ranging from DOS to Unix and from ASCII to Unicode, and I have to make sure that the line separators in the output file match those in the input file.

The way I do this is to read a sample of the input with BufferedReader's read() method, search for the first line separator, and save that separator in a String. That way I can reuse it later whenever I need to write a new line.
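A minimal sketch of that approach, for illustration only. The class and method names (`FirstLineSeparator`, `detectFirst`) are made up for this example, and it assumes the `Reader` was opened with the correct charset for the file. It avoids Java 7+ syntax so it should also compile on Java SE 6.

```java
import java.io.IOException;
import java.io.Reader;

// Hypothetical helper, not part of the standard API.
final class FirstLineSeparator {

    /**
     * Scans the reader character by character and returns the first line
     * separator it finds, or null if the input contains none.
     * Note: this consumes the reader, so reopen the file (or use
     * mark/reset on a BufferedReader) before reading the real content.
     */
    public static String detectFirst(Reader in) throws IOException {
        int c;
        while ((c = in.read()) != -1) {
            switch (c) {
                case '\r': {
                    // A CR followed directly by LF is the two-character DOS separator.
                    int next = in.read();
                    return next == '\n' ? "\r\n" : "\r";
                }
                case '\n':
                case '\u0085':
                case '\u2028':
                case '\u2029':
                    return String.valueOf((char) c);
                default:
                    break; // ordinary character, keep scanning
            }
        }
        return null;
    }
}
```

For a file, this could be called as `FirstLineSeparator.detectFirst(new BufferedReader(new InputStreamReader(new FileInputStream(file), charset)))`, where `file` and `charset` are whatever the program already knows about the input.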

I've inspected the Scanner class and noticed that possible line separators may include the following:

\r\n
\r
\n
\u2028
\u2029
\u0085

Is there a library function to check for these characters? Or even better, is there already a library function to check what the input's line separator looks like?

Are there any other ways around this?

EDIT: If possible I would like to use Java's standard API instead of 3rd party libraries, but all suggestions are most welcome.

EDIT: Just to clarify.
1) The input files do not depend on where this program is running. For example, if I'm running this program on DOS, I can still get a Unix input file.
2) My goal is not to read each line delimited by line separators; that's simple. What I really need is to write an output file with the same line separators as the input file. For example, if I'm running this program on DOS and I get a Unix input file, I want to write my output file with Unix line separators. This is why I'm asking whether there's a standard API to detect line separators based on the input file rather than on the OS the program is running on.

Thanks.

asked Oct 27 '10 20:10 by Russell


1 Answer

The previous three answers don't really address the question. The OP wants to determine from a given file: what is the line separator used in this file?

This question cannot be answered definitively for a given file, because a file may mix several line endings. This may seem contrived, but it is possible.

So the best approach, it seems to me, is to parse the input file yourself, count the occurrences of each possible line-ending sequence, and choose the one that appears most often as the line separator of this file.
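A rough sketch of that counting idea, for illustration only. The names `LineSeparatorCounter`, `count` and `mostFrequent` are invented for this example, the candidate set is the one listed in the question, and the `Reader` is assumed to decode the file with the right charset. The code sticks to Java SE 6 syntax.

```java
import java.io.IOException;
import java.io.Reader;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of the standard API.
final class LineSeparatorCounter {

    /** Counts every line separator in the input, treating CR LF as one unit. */
    public static Map<String, Integer> count(Reader in) throws IOException {
        Map<String, Integer> counts = new LinkedHashMap<String, Integer>();
        int c = in.read();
        while (c != -1) {
            int next = in.read(); // one character of lookahead
            String sep = null;
            if (c == '\r' && next == '\n') {
                sep = "\r\n";
                next = in.read(); // the LF belongs to this separator, skip it
            } else if (c == '\r' || c == '\n'
                    || c == '\u0085' || c == '\u2028' || c == '\u2029') {
                sep = String.valueOf((char) c);
            }
            if (sep != null) {
                Integer old = counts.get(sep);
                counts.put(sep, old == null ? 1 : old + 1);
            }
            c = next;
        }
        return counts;
    }

    /**
     * Returns the separator with the highest count, or null if there is none.
     * On a tie, the separator that appeared first in the input wins.
     */
    public static String mostFrequent(Map<String, Integer> counts) {
        String best = null;
        int bestCount = 0;
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            if (e.getValue() > bestCount) {
                best = e.getKey();
                bestCount = e.getValue();
            }
        }
        return best;
    }
}
```

When writing the output, the chosen string would then be written explicitly (for example `writer.write(separator)`) instead of calling `BufferedWriter.newLine()`, which uses the platform's `line.separator` property. If `mostFrequent` returns null, the input had no line breaks and the caller has to pick a fallback.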

I have not come across a library that would implement this functionality.

answered Oct 12 '22 22:10 by Philipp Jardas