When I use pandas to process my data, I get the error in the title. The rows in my data do not all have the same number of columns, so I sorted them in descending order of length: the first line is the longest, the next line is shorter, and so on. When the file is small, pandas processes it successfully. But after I write all my data to the file, pandas cannot process it and shows this error.
Here is my code:
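import time

import pandas as pd


def sequencein(filepath):
    print(filepath)
    print("time", time.time())  # timestamp before reading
    data = pd.read_table(filepath, header=None)
    print("time", time.time())  # timestamp after reading
    matr = data.values  # convert the DataFrame to a NumPy array
    print("sequence shape:", matr.shape)
    return matr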
The line endings in the file are shown below: [image: screenshot of the file's line endings]
The documentation says there are two engines:
engine : {‘c’, ‘python’}, optional
Parser engine to use. The C engine is faster while the python engine is currently more feature-complete.
The problem seems to appear only with the 'c' engine, which pandas uses by default.
So you could try:
data = pd.read_table(filepath, header=None, engine='python')
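As a minimal sketch of what this call does with ragged input, the snippet below reads a small in-memory sample instead of the real file. The sample data is hypothetical and only mimics the layout described in the question (longest row first); it does not reproduce the original error, it only shows that the python engine pads shorter rows with NaN.

```python
import io

import pandas as pd

# Hypothetical sample: tab-separated rows sorted by descending length.
sample = "1\t2\t3\n4\t5\n6\n"

# Same call as above, reading from an in-memory buffer instead of a file path.
data = pd.read_table(io.StringIO(sample), header=None, engine="python")
matr = data.values
print("sequence shape:", matr.shape)
```

The python engine is slower than the C engine, so reading a large file this way may take noticeably longer.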
I have solved this problem myself. I changed data = pd.read_table(filepath, header=None) to data = pd.read_table(filepath), then added a header line to my data file, and it worked.
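A related approach, not the one used above, is to leave the file unchanged and supply the column names through the names parameter, so pandas knows the column count up front. This is only a sketch: the sample data and the value of n_cols are assumptions for illustration, and n_cols would have to be the length of the longest row in the real file.

```python
import io

import pandas as pd

# Hypothetical sample: tab-separated rows of unequal length.
sample = "1\t2\n3\t4\t5\n6\n"

# Assumption: the maximum number of fields per row is known in advance.
n_cols = 3

# Explicit column names fix the column count; shorter rows are padded with NaN.
data = pd.read_table(io.StringIO(sample), header=None, names=list(range(n_cols)))
print("sequence shape:", data.values.shape)
```

With explicit names, the rows do not need to be sorted by length first.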