I could not find such an option in the documentation. A measuring device spits out everything in Excel:
<> A B C
1
2
3
When I delete the "<>" characters manually everything works fine. Is there a way to circumvent that (without conversion to csv)?
I do:
import pandas as pd
df = pd.read_excel(filename,sheetname,skiprows=0,header=0,index_col=0)
skiprows=1 does not do the trick, since pandas then uses the first remaining row as column names. If I supply names=list(range(1, 4)), the first data row is lost.
Expanding on Peruz's answer:
For your case, you can use a regex as the separator:
df = pd.read_csv(filename, sep=r"(?<!<>)\s+", engine='python')
This should read in the columns properly, except that the first column would be named "<> A".
To change this, simply strip the marker from the column names:
df.columns = df.columns.str.replace(r"<>\s", "", regex=True)
In the regex, \s+ matches one or more whitespace characters, except where the match would start immediately after the text named in the negative lookbehind, written (?<!characters_to_ignore).
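To see what the lookbehind does without involving pandas, here is a minimal sketch using only Python's re module. The header string is the one from the question; the variable names are just for illustration.

```python
import re

# Header line as it appears in the question.
header = "<> A B C"

# Split on whitespace, but not on whitespace that directly follows "<>".
fields = re.split(r"(?<!<>)\s+", header)

# Remove the leading "<>" marker and the whitespace after it.
cleaned = [re.sub(r"^<>\s*", "", name) for name in fields]

print(fields)
print(cleaned)
```

Note that the lookbehind only protects the first whitespace character after "<>". If the marker is followed by two or more spaces, the split can still happen at the second one, so this relies on a single space after the marker.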
Another option would be:
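with open(fname, 'r') as f:
    line1 = f.readline()  # consume the header line before pandas reads the rest
    # strip the marker from the header (adjust ' #' to the marker in your file)
    data1 = pd.read_csv(f, sep=r'\s+', names=line1.replace(' #', '').split(), dtype=float)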
You might have a different separator though.
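A self-contained sketch of the same idea, adapted to the "<>" marker from the question. The data values, the in-memory buffer standing in for the file, and the "idx" name for the row-label column are all assumptions made up for the example, not something from the original file.

```python
import io

import pandas as pd

# Hypothetical whitespace-separated text export; only the header is from the question.
raw = "<> A B C\n1 0.1 0.2 0.3\n2 0.4 0.5 0.6\n3 0.7 0.8 0.9\n"
buf = io.StringIO(raw)

# Read the header line ourselves and drop the "<>" marker.
header = buf.readline().replace("<>", "").split()

# pandas continues from the current position, so only data rows remain.
# "idx" is a made-up name for the leading row-label column.
df = pd.read_csv(buf, sep=r"\s+", names=["idx"] + header, index_col="idx")

print(df)
```

Naming the label column explicitly avoids relying on pandas to guess that the extra leading column is the index.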