
Reading tab-delimited file with Pandas - works on Windows, but not on Mac


The biggest clue is that all the rows are being returned on one line. This indicates that line terminators are being ignored or are not present.

You can specify the line terminator for read_csv. Files written with the classic Mac convention (still produced by some Mac applications) end lines with \r rather than the Linux standard \n, or the belt-and-suspenders approach of Windows, \r\n.

pandas.read_csv(filename, sep='\t', lineterminator='\r')
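Before hard-coding lineterminator='\r', it may help to check which terminator the file actually uses. The following is a minimal sketch using only the standard library; the helper names `read_sample` and `detect_line_terminator` are hypothetical and not part of pandas, and the default encoding is an assumption you should adjust to your file.

```python
def read_sample(path, encoding="utf-8", size=64 * 1024):
    """Read the start of a text file without translating line endings."""
    # newline="" tells open() to leave \r, \n and \r\n untouched.
    with open(path, "r", encoding=encoding, newline="") as fh:
        return fh.read(size)


def detect_line_terminator(text):
    """Return the most common line terminator in text, or "" if none is found."""
    crlf = text.count("\r\n")
    # Subtract the \r\n pairs so they are not counted again as a lone \r or \n.
    cr = text.count("\r") - crlf
    lf = text.count("\n") - crlf
    counts = {"\r\n": crlf, "\r": cr, "\n": lf}
    best = max(counts, key=counts.get)
    return best if counts[best] > 0 else ""
```

If `detect_line_terminator(read_sample(filename))` returns '\r', the file uses bare carriage returns and the lineterminator='\r' call above is a reasonable thing to try.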

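You could also open the file yourself in text mode with universal newlines and pass the open handle to pandas. This may increase robustness at the expense of document loading speed.

import pandas

# newline=None (the default) turns on universal newlines, so \r, \n and \r\n all end a line.
# 'UTF-16' is the encoding used in this example; change it to match your file.
with open('document', 'r', encoding='UTF-16', newline=None) as doc:
    df = pandas.read_csv(doc, sep='\t')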
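To see what universal-newline mode does independently of pandas, here is a small standard-library sketch. The function name `read_tsv_rows` is hypothetical, and the UTF-16 default simply mirrors the encoding used above; it is an assumption about your file, not a requirement.

```python
def read_tsv_rows(path, encoding="utf-16"):
    """Read a tab-delimited text file into a list of rows (lists of strings)."""
    # newline=None enables universal newlines: \r, \n and \r\n are all
    # translated to \n while reading, so iterating yields one row per line.
    with open(path, "r", encoding=encoding, newline=None) as fh:
        return [line.rstrip("\n").split("\t") for line in fh]
```

If this returns one list per row for your file, the line endings are being recognized, and passing a handle opened the same way to pandas should split the rows too.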

Another option is to add engine='python' to the command: pandas.read_csv(filename, sep='\t', engine='python')