How to parse a TSV file with Python?

Tags:

python

csv

I have a TSV file in which some values contain newline characters.

111 222 333 "aaa"
444 555 666 "bb
b"

Here the b on the third line continues the bb on the second line after a newline character, so together they form a single value:

The fourth value of first line:

aaa

The fourth value of second line:

bb
b

If I copy and paste the data into an Excel file with Ctrl+C and Ctrl+V, it works well. But how do I parse the file if I want to import it using Python?

I have tried:

import re

lines = [line.rstrip() for line in open("file.tsv")]
for line in lines:
    value = re.split(r'\t', line)

But the result was not good: the quoted value containing the newline is split across two rows. I want bb and b kept together as one value in the fourth column of the second row.


asked Feb 21 '17 03:02

s_zhang

1 Answer

Just use the csv module. It handles the usual corner cases in CSV files, such as newlines inside quoted fields, and it can split on tabs.

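import csv

# newline="" leaves line endings untouched so the csv module can handle quoted newlines
with open("file.tsv", newline="") as fd:
    rd = csv.reader(fd, delimiter="\t", quotechar='"')
    for row in rd:
        print(row)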

will correctly output:

['111', '222', '333', 'aaa']
['444', '555', '666', 'bb\nb']
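To try this without a file on disk, here is a minimal self-contained sketch. It assumes the columns in the question are separated by real tab characters; the SAMPLE string and the parse_tsv helper are hypothetical names introduced only for illustration. It also shows why splitting line by line, as in the question, produces one row too many.

```python
import csv
import io

# Assumed sample data mirroring the question: tab-separated columns,
# with a newline inside the quoted fourth field of the second record.
SAMPLE = '111\t222\t333\t"aaa"\n444\t555\t666\t"bb\nb"\n'


def parse_tsv(text):
    """Return every record of a tab-separated string as a list of strings."""
    # newline="" keeps line endings untouched so csv sees the embedded newline
    with io.StringIO(text, newline="") as fd:
        return list(csv.reader(fd, delimiter="\t", quotechar='"'))


rows = parse_tsv(SAMPLE)

# Naive approach from the question: split on physical lines first, then on tabs.
naive = [line.split("\t") for line in SAMPLE.splitlines()]

print(len(rows))   # 2 records: the quoted newline stays inside one field
print(len(naive))  # 3 rows: the quoted field is broken in two
```

The csv reader tracks whether it is inside a quoted field, which a per-line split cannot do.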


answered Sep 24 '22 04:09

Serge Ballesta
