
How to join all the lines in a text file together in Python?

Tags:

python

I have a file, and when I open it and print its contents I get several paragraphs. I need to join these paragraphs together with a space to form one big body of text.

For example, this code:

for data in open('file.txt'):
    print data

has an output like this:

Hello my name is blah. What is your name?
Hello your name is blah. What is my name?

How can I get output like this instead?

Hello my name is blah. What is your name? Hello your name is blah. What is my name?

I've tried replacing the newlines with a space like so:

for data in open('file.txt'):
    updatedData = data.replace('\n', ' ')

but that only gets rid of the empty lines; it doesn't join the paragraphs.

I also tried joining like so:

for data in open('file.txt'):
    joinedData = " ".join(data)

but that separates each character with a space, and it doesn't get rid of the paragraph format either.

user2353608 asked May 06 '13 06:05



2 Answers

You could use str.join:

with open('file.txt') as f:
    print " ".join(line.strip() for line in f)  

line.strip() removes all leading and trailing whitespace from the line. You can use line.rstrip("\n") to remove only the trailing "\n".
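
To make the difference between the two calls concrete, here is a small sketch. It is written for Python 3 (where print is a function), and the sample line is made up for illustration:

```python
# Hypothetical sample line: leading indentation, trailing spaces, and a newline.
line = "    Hello my name is blah.  \n"

stripped = line.strip()          # removes whitespace from both ends
no_newline = line.rstrip("\n")   # removes only trailing newline characters

# repr() makes the remaining whitespace visible.
print(repr(stripped))
print(repr(no_newline))
```

If the indentation or trailing spaces in your file are meaningful, rstrip("\n") is probably the safer choice; otherwise strip() gives cleaner joined output.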

If file.txt contains:

Hello my name is blah. What is your name?
Hello your name is blah. What is my name?

Then the output would be:

Hello my name is blah. What is your name? Hello your name is blah. What is my name?
Ashwini Chaudhary answered Sep 18 '22 14:09



You are looping over individual lines and it is the print statement that is adding newlines. The following would work:

for data in open('file.txt'):
    print data.rstrip('\n'),

With the trailing comma, the Python 2 print statement doesn't add a newline (it separates the next printed item with a space instead), and the .rstrip('\n') call removes just the trailing newline from the line.
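
The trailing-comma form only exists in Python 2. As a rough Python 3 equivalent, print() accepts an end argument. In the sketch below, print_joined is a hypothetical helper name and io.StringIO stands in for the opened file so the example runs on its own:

```python
import io
import sys

def print_joined(lines, out=sys.stdout):
    """Print each line followed by a space instead of a newline."""
    for data in lines:
        # end=' ' replaces the newline that print() would normally append.
        print(data.rstrip('\n'), end=' ', file=out)
    # Finish the output with a single newline.
    print(file=out)

# Stand-in for open('file.txt'); the contents are the question's sample text.
sample = io.StringIO(
    "Hello my name is blah. What is your name?\n"
    "Hello your name is blah. What is my name?\n"
)
print_joined(sample)
```

Note that this approach leaves a trailing space after the last line, which is one reason the join-based approach tends to be tidier.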

Alternatively, you need to pass all the read and stripped lines to ' '.join() at once, not each line by itself. Strings in Python are sequences too, so the string contained in data is interpreted as separate characters when it is passed on its own to ' '.join().
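
A short sketch of that difference, in Python 3 syntax with made-up sample values:

```python
# A single string: join() iterates over its characters.
data = "Hello"
# A list of strings: join() iterates over the list items.
lines = ["Hello my name is blah.", "What is your name?"]

spaced_chars = " ".join(data)
joined_lines = " ".join(lines)

print(spaced_chars)   # H e l l o
print(joined_lines)   # Hello my name is blah. What is your name?
```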

The following code uses two new tricks, a context manager and a list comprehension:

with open('file.txt') as inputfile:
    print ' '.join([line.rstrip('\n') for line in inputfile])

The with statement uses the file object as a context manager, meaning the file will be closed automatically when we are done with the block indented below the with statement. The [.. for .. in ..] syntax builds a list from the inputfile object, turning each line into a version without the newline at the end.
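
For anyone on Python 3, the same idea could be wrapped in a function along these lines. This is only a sketch: join_lines is a hypothetical name, and the temporary file exists just so the example is self-contained:

```python
import os
import tempfile

def join_lines(path):
    """Return the lines of the file at `path` joined by single spaces."""
    with open(path) as inputfile:
        # The file is closed automatically when this block ends.
        return ' '.join([line.rstrip('\n') for line in inputfile])

# Write a throwaway sample file containing the question's text.
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write("Hello my name is blah. What is your name?\n"
              "Hello your name is blah. What is my name?\n")

result = join_lines(tmp.name)
print(result)

# Clean up the sample file.
os.remove(tmp.name)
```

Returning the joined string rather than printing it inside the function keeps the result available for further processing.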

Martijn Pieters answered Sep 19 '22 14:09
