How can I change the default newline character when reading lines from a file in Python 3?

A recent question about splitting a binary file using null characters made me think of a similar text-oriented question.

Given the following file:

Parse me using spaces, please.

Using Raku, I can parse this file using space (or any chosen character) as the input newline character, thus:

my $fh = open('spaced.txt', nl-in => ' ');

while $fh.get -> $line {
    put $line;
}

Or more concisely:

.put for 'spaced.txt'.IO.lines(nl-in => ' ');

Either of which gives the following result:

Parse
me
using
spaces,
please.

Is there something equivalent in Python 3?

The closest I could find required reading an entire file into memory:

for line in f.read().split('\0'):
    print(line)

Update: I found several other older questions and answers that seemed to indicate that this isn't available, but I figured there may have been new developments in this area in the last several years:
Python restrict newline characters for readlines()
Change newline character .readline() seeks

Christopher Bottoms asked Aug 03 '17 16:08


1 Answer

There is no built-in support for reading a file split by a custom character.

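However, opening a file with the "U" flag enables universal newlines mode, in which \n, \r, and \r\n are all recognized as line endings; the endings actually encountered can be obtained from file.newlines. The newline mode applies to the whole file. Note that in Python 3 universal newlines mode is already the default for text files and the "U" flag is deprecated.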
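To illustrate why the newline handling above does not solve the problem: the newline argument accepted by open() (and by io.TextIOWrapper, which open() uses for text files) is limited to None, '', '\n', '\r', and '\r\n'. The following is a minimal sketch, using an in-memory io.BytesIO as a stand-in for a real file, showing that a space is rejected.

```python
import io

# In-memory stand-in for a real file; open() wraps real files the same way.
raw = io.BytesIO(b"Parse me using spaces, please.")

try:
    io.TextIOWrapper(raw, newline=" ")
    accepted = True
except ValueError:
    # Only None, '', '\n', '\r' and '\r\n' are legal newline values.
    accepted = False

print(accepted)
```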

Here is my generator, which reads a file without loading everything into memory:

def customReadlines(fileNextBuff, char):
    """
    Yield the pieces of a file that are separated by a custom string.

    fileNextBuff: a function returning the next buffer, or "" on EOF
    char: the string on which the lines are split; it is not included in the yielded elements
    """
    lastLine = ""
    lenChar = len(char)
    while True:
        thisLine = fileNextBuff()  # call the function to get the next buffer
        if not thisLine:
            break  # EOF
        # prepend the leftover so a separator split across two buffers is still found
        thisLine = lastLine + thisLine
        fnd = thisLine.find(char)
        while fnd != -1:
            yield thisLine[:fnd]
            thisLine = thisLine[fnd + lenChar:]
            fnd = thisLine.find(char)
        lastLine = thisLine
    yield lastLine


### EXAMPLES ###

# open file.bin and print each part of the file that ends with a null terminator, reading 256 characters at a time
with open("file.bin", "r") as f:
    for l in customReadlines((lambda: f.read(0x100)), "\0"):
        print(l)

# open errors.log and split it on a special string, reading one whole line at a time
with open("errors.log", "r") as f:
    for l in customReadlines(f.readline, "ERROR:"):
        print(l)
        print(" " + '-' * 78) # some separator
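
Applied to the space-separated file from the question, the same chunked approach could look roughly like the sketch below. The name iter_records and its chunk_size parameter are hypothetical, introduced only for this illustration, and io.StringIO stands in for open('spaced.txt') so the snippet runs on its own.

```python
import io
from functools import partial

def iter_records(stream, sep, chunk_size=256):
    """Yield the pieces of `stream` separated by `sep`, reading in chunks."""
    pending = ""
    # iter(callable, sentinel) keeps calling stream.read until it returns "" (EOF)
    for chunk in iter(partial(stream.read, chunk_size), ""):
        pending += chunk
        # everything except the last piece is complete; the last may continue
        *complete, pending = pending.split(sep)
        yield from complete
    yield pending

# io.StringIO stands in for open('spaced.txt'); the tiny chunk size
# forces several buffer boundaries to fall inside words.
sample = io.StringIO("Parse me using spaces, please.")
words = list(iter_records(sample, " ", chunk_size=4))
for word in words:
    print(word)
```

Because the unfinished tail of each chunk is carried over before splitting, a separator longer than one character is still found when it straddles two chunks.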
cmdLP answered Oct 11 '22 12:10