How can I detect DOS line breaks in a file?

I have a bunch of files. Some have Unix line endings, many have DOS line endings. I'd like to test each file to see if it is DOS formatted before I switch the line endings.

How would I do this? Is there a flag I can test for? Something similar?

chiggsy asked May 09 '10 18:05


People also ask

How do I know if a file is LF or CRLF?

Use a text editor such as Notepad++, which shows the line-ending format in use, Unix (LF), Macintosh (CR), or Windows (CR LF), on its status bar. You can also go to View > Show Symbol > Show End of Line to display the line endings as LF, CR LF, or CR.

How do I view a CRLF in a text file?

In Notepad++, go to the View > Show Symbol menu and select Show End of Line. The CR and LF characters are then displayed at the end of each line.

How can I tell if a file has Windows line endings?

If a file has DOS/Windows-style CR-LF line endings, a Unix-based tool will show a CR ('\r') character at the end of each line. A file can also contain a mixture of Unix-style and Windows-style line endings.
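To make the "mixture" case concrete, here is a minimal sketch that counts each kind of line ending in raw bytes. The helper name count_line_endings is hypothetical and the sample data is made up for illustration.

```python
def count_line_endings(data: bytes) -> dict:
    """Count each line-ending style in raw bytes (hypothetical helper)."""
    crlf = data.count(b"\r\n")
    # Every CRLF pair also contains one CR and one LF, so subtract the pairs
    # to get the number of lone LF and lone CR endings.
    lf = data.count(b"\n") - crlf
    cr = data.count(b"\r") - crlf
    return {"crlf": crlf, "lf": lf, "cr": cr}


# Made-up sample: two DOS lines and one Unix line.
sample = b"dos line\r\nunix line\nanother dos line\r\n"
print(count_line_endings(sample))
```

For a real file, the bytes would come from open(path, 'rb').read(). Binary mode matters here, because text mode may translate the line endings before you can count them.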

How can I tell if a file is in DOS?

1) Open the file with vim. 2) Ask vim for the file format with :set ff? The command prints fileformat=dos, fileformat=unix, or fileformat=mac to indicate the current file format.


1 Answer

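Python can automatically detect which newline convention a file uses, thanks to "universal newlines" mode, and you can access what it found through the newlines attribute of file objects. In Python 3, text mode enables universal newlines by default; the 'U' mode flag from Python 2 is deprecated and was removed in Python 3.11.

with open('myfile.txt') as f:  # Text mode: universal newlines are on by default
    f.readline()  # Reads a line
    # f.newlines now holds the newline convention(s) seen so far:
    # "\r\n" (Windows), "\n" (Unix), "\r" (Mac OS pre-OS X),
    # or a tuple of these if more than one kind has been seen.
    # If no newline has been seen yet, it is None.
    print(repr(f.newlines))

After the first line is read, this reports the newline convention (Unix, DOS, etc.) of the data read so far, if any.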

As John M. pointed out, if you happen to have a pathological file that uses more than one newline convention, f.newlines is a tuple of all the newline conventions found so far, so read the whole file before relying on it.
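Since the question is about testing each file in a batch, here is a rough Python 3 sketch that reads a whole file and turns f.newlines into a label. The function name newline_style, the label strings, and the latin-1 encoding (chosen only so that any byte sequence decodes without error) are assumptions for illustration, not part of the original answer.

```python
import os
import tempfile

# Labels are hypothetical; pick whatever names suit your script.
NAMES = {"\r\n": "dos", "\n": "unix", "\r": "mac"}


def newline_style(path):
    """Return 'dos', 'unix', 'mac', 'mixed', or 'none' for a text file."""
    # Text mode with the default newline=None enables universal newlines.
    # latin-1 is an assumption: it maps every byte to a character, so
    # decoding never fails regardless of the file's real encoding.
    with open(path, encoding="latin-1") as f:
        for _ in f:  # read everything so f.newlines covers the whole file
            pass
        seen = f.newlines
    if seen is None:
        return "none"
    if isinstance(seen, tuple):
        return "mixed"
    return NAMES[seen]


# Demo on throwaway files written in binary mode, so the bytes are exact.
with tempfile.TemporaryDirectory() as tmp:
    samples = {
        "dos.txt": b"a\r\nb\r\n",
        "unix.txt": b"a\nb\n",
        "mixed.txt": b"a\r\nb\n",
    }
    for name, data in samples.items():
        path = os.path.join(tmp, name)
        with open(path, "wb") as out:
            out.write(data)
        print(name, newline_style(path))
```

A batch script could then call newline_style on each path and convert only the files reported as 'dos' or 'mixed'.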

Reference: http://docs.python.org/2/library/functions.html#open

If you just want to convert a file, you can simply do:

with open('myfile.txt') as infile:  # Python 3 text mode: universal newlines by default
    text = infile.read()  # Automatic ("universal") conversion of newlines to "\n"
with open('myfile.txt', 'w') as outfile:
    outfile.write(text)  # Writes newlines for the platform running the program
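Note that the text-mode rewrite above writes the newline convention of the platform running the program, so on Windows it would produce DOS line endings again. If the goal is specifically Unix line endings, one possible alternative is to work on raw bytes. This is only a sketch: the function name dos_to_unix is hypothetical, lone "\r" characters are deliberately left untouched, and the file is overwritten in place without a backup.

```python
import os
import tempfile


def dos_to_unix(path):
    """Rewrite path with LF line endings; return True if the file changed."""
    with open(path, "rb") as f:  # binary mode: no newline translation
        data = f.read()
    converted = data.replace(b"\r\n", b"\n")
    if converted == data:
        return False  # no CRLF found, leave the file alone
    with open(path, "wb") as f:
        f.write(converted)
    return True


# Demo on a throwaway file.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.txt")
    with open(path, "wb") as f:
        f.write(b"first\r\nsecond\r\n")
    print(dos_to_unix(path))  # the file contained CRLF, so it is rewritten
    print(dos_to_unix(path))  # a second call finds nothing left to convert
```

The return value doubles as a detection result, so a batch script can report which files were actually in DOS format.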
Eric O Lebigot answered Oct 04 '22 02:10