
Preserve end-of-line style when working with files in Python

I am looking for a way to ensure that the end-of-line (EOL) style of a file is preserved when a Python program reads, edits, and writes it.

Python has universal newlines support, which converts all line endings to \n when a file is read and then converts them to the system default when the file is written. In my case I would still like to do the initial conversion, but then write the file with its original EOL style rather than the system default.

Is there a standard way to do this kind of thing? If not, is there a standard way to detect the EOL style of a file?

Assuming that there is no standard way to do this, a possible workflow would be:

  1. Read in a file in binary mode.
  2. Decode into utf-8 (or whatever encoding is required).
  3. Detect EOL style.
  4. Convert all line endings to \n.
  5. Do stuff with the file.
  6. Convert all line endings to original style.
  7. Encode file.
  8. Write file in binary mode.

In this workflow, what is the best way to do step 3 (detecting the EOL style)?
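One way the workflow above could be sketched, assuming the dominant line ending should win in a file with mixed endings. The helper names detect_eol and edit_preserving_eol are hypothetical, chosen for illustration only:

```python
import re
from collections import Counter


def detect_eol(text, default="\n"):
    """Return the most common line ending in text, or default if there is none."""
    # The alternation tries '\r\n' first so CRLF is not counted as CR plus LF.
    counts = Counter(re.findall(r"\r\n|\r|\n", text))
    return counts.most_common(1)[0][0] if counts else default


def edit_preserving_eol(path, edit, encoding="utf-8"):
    """Apply edit(text) to the file at path and write it back with its original EOL."""
    with open(path, "rb") as f:                  # 1. read in binary mode
        raw = f.read()
    text = raw.decode(encoding)                  # 2. decode
    eol = detect_eol(text)                       # 3. detect EOL style
    text = text.replace("\r\n", "\n").replace("\r", "\n")  # 4. normalize to \n
    text = edit(text)                            # 5. do stuff with the text
    if eol != "\n":
        text = text.replace("\n", eol)           # 6. restore the original style
    with open(path, "wb") as f:                  # 7-8. encode and write in binary mode
        f.write(text.encode(encoding))
    return eol
```

Note that a file with mixed line endings comes out with a single, uniform style under this sketch, which may or may not be what is wanted.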

amicitas asked Feb 28 '11 16:02



2 Answers

To preserve original line endings, use newline='' to read or write line endings untranslated.

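# newline='' disables translation on read, so content keeps the file's own line endings
with open('test.txt', 'r', newline='') as rf:
    content = rf.read()
content = content.replace('old text', 'new text')
# newline='' disables translation on write, so the endings are written back unchanged
with open('testnew.txt', 'w', newline='') as wf:
    wf.write(content)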

Note that if the text manipulation itself deals with line endings, additional or alternative logic may be needed to detect and match original line endings.
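As an illustration of that caveat, here is a minimal sketch of an edit that adds a line and therefore has to match the file's existing endings itself. The helpers dominant_eol and append_line are hypothetical names, and picking the most frequent ending is an assumption about what "matching" should mean for mixed files:

```python
def dominant_eol(content, default="\n"):
    """Return the most frequent line ending in content, or default if there is none."""
    crlf = content.count("\r\n")
    counts = {
        "\r\n": crlf,
        "\n": content.count("\n") - crlf,   # bare LF only
        "\r": content.count("\r") - crlf,   # bare CR only
    }
    eol = max(counts, key=counts.get)
    return eol if counts[eol] else default


def append_line(path_in, path_out, line):
    """Copy path_in to path_out with one extra line, using the file's own EOL style."""
    with open(path_in, "r", newline="") as rf:   # endings arrive untranslated
        content = rf.read()
    eol = dominant_eol(content)
    if content and not content.endswith(("\r\n", "\n", "\r")):
        content += eol                           # terminate an unterminated last line
    content += line + eol
    with open(path_out, "w", newline="") as wf:  # endings are written untranslated
        wf.write(content)
```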

The 'U' mode also works in older Python versions, but it is deprecated and was removed in Python 3.11.

Python Documentation: open

newline controls how universal newlines mode works (it only applies to text mode). It can be None, '', '\n', '\r', and '\r\n'. It works as follows:

• When reading input from the stream, if newline is None, universal newlines mode is enabled. Lines in the input can end in '\n', '\r', or '\r\n', and these are translated into '\n' before being returned to the caller. If it is '', universal newlines mode is enabled, but line endings are returned to the caller untranslated. If it has any of the other legal values, input lines are only terminated by the given string, and the line ending is returned to the caller untranslated.

• When writing output to the stream, if newline is None, any '\n' characters written are translated to the system default line separator, os.linesep. If newline is '' or '\n', no translation takes place. If newline is any of the other legal values, any '\n' characters written are translated to the given string.
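The documented behaviour can be checked with a small experiment. This is only a sketch; read_with and write_with are hypothetical helpers, and a temporary file stands in for real data:

```python
import os
import tempfile


def read_with(path, newline):
    """Read the whole file in text mode with the given newline setting."""
    with open(path, "r", newline=newline) as f:
        return f.read()


def write_with(path, text, newline):
    """Write text with the given newline setting and return the raw bytes on disk."""
    with open(path, "w", newline=newline) as f:
        f.write(text)
    with open(path, "rb") as f:
        return f.read()


fd, path = tempfile.mkstemp()
os.close(fd)
with open(path, "wb") as f:
    f.write(b"one\r\ntwo\rthree\n")              # a file with mixed line endings

translated = read_with(path, None)               # 'one\ntwo\nthree\n'
untranslated = read_with(path, "")               # 'one\r\ntwo\rthree\n'
crlf_bytes = write_with(path, "a\nb\n", "\r\n")  # b'a\r\nb\r\n'
raw_bytes = write_with(path, "a\nb\n", "")       # b'a\nb\n'
os.remove(path)
```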

Steven Brown answered Sep 30 '22 19:09


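Use Python's universal newlines support:

# Python 3: universal newlines is the default in text mode
# (the 'rU' mode flag used in Python 2 was removed in Python 3.11).
with open('randomthing.py', 'r') as f:
    fdata = f.read()
    newlines = f.newlines  # line endings seen while reading
print(repr(newlines))

newlines contains the file's delimiter, a tuple of delimiters if the file uses a mix of delimiters, or None if no line ending has been read yet.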
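Building on this, the detected delimiter can be passed back as the newline argument when writing. The following is a rough sketch; rewrite_with_original_eol is a hypothetical helper, and raising on mixed endings is just one possible policy:

```python
def rewrite_with_original_eol(src, dst, edit):
    """Read src with universal newlines, apply edit(text), write dst in src's EOL style."""
    with open(src, "r") as f:          # newline=None: every ending becomes '\n'
        text = f.read()
        seen = f.newlines              # str, tuple of str, or None
    if isinstance(seen, tuple):
        # Assumption: refuse to guess when the file mixes several styles.
        raise ValueError("mixed line endings: %r" % (seen,))
    eol = seen or "\n"                 # no line ending seen: fall back to '\n'
    with open(dst, "w", newline=eol) as f:
        f.write(edit(text))            # each '\n' is written as eol
    return eol
```

For example, rewrite_with_original_eol('in.txt', 'out.txt', str.upper) would uppercase the text while keeping CRLF endings if in.txt used them (the file names here are placeholders).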

senderle answered Sep 30 '22 17:09