Ignore character while importing with pandas

Question

I could not find such an option in the documentation. A measuring device spits out everything in Excel:

    <>    A    B    C
 1
 2
 3

When I delete the "<>" characters manually everything works fine. Is there a way to circumvent that (without conversion to csv)?

I do:

import pandas as pd
df = pd.read_excel(filename, sheetname, skiprows=0, header=0, index_col=0)

Setting skiprows=1 does not do the trick, since pandas still uses the first row it reads as the column names. If I supply names=list(range(1, 4)), the first data row is lost.

Aritra · Accepted Answer

Expanding on Peruz's answer:

For your case, you can use a regex as the separator:

df = pd.read_csv(filename, sep=r"(?<!<>)\s+", engine='python')

This should read the columns in properly, except that the first column will be named "<> A".

To change this, simply rename the first column:

df.columns = df.columns.str.replace(r"<>\s", "", regex=True)

In the regex, \s+ matches one or more whitespace characters, except where the match would start immediately after whatever is given in the negative lookbehind, written as (?<!characters_to_ignore).
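To see what the lookbehind does, here is a minimal sketch using only the standard re module. The header and data strings are made-up stand-ins for the device output, and the single-space layout is an assumption. The last case suggests the trick depends on exactly one space following "<>": with a longer run of spaces, the match simply starts one character later.

```python
import re

# Same pattern as the separator above: a whitespace run that does not
# start directly after "<>".
SEP = re.compile(r"(?<!<>)\s+")

header = "<> A B C"        # assumed layout: single spaces between fields
row = "1 0.1 0.2 0.3"      # made-up data row

header_fields = SEP.split(header)   # "<>" stays glued to "A"
row_fields = SEP.split(row)         # ordinary whitespace split

# Clean up the first header name afterwards, as in the answer.
clean_header = [re.sub(r"<>\s", "", name) for name in header_fields]

# Caveat: with several spaces after "<>", only the first space is protected,
# so "<> " becomes a field of its own.
wide_fields = SEP.split("<>    A    B")

print(header_fields, row_fields, clean_header, wide_fields)
```

With one space after "<>", the header has one field fewer than each data row, which is why pandas treats the first data column as the index.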

wander95 · Answer

Another option would be:

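with open(fname, 'r') as f:
    # Read the header line ourselves, then let pandas parse the remaining lines
    line1 = f.readline()
    # Strip the unwanted marker (' #' here) and use the rest as column names
    data1 = pd.read_csv(f, sep=r'\s+', names=line1.replace(' #', '').split(), dtype=float)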

You might have a different separator though.
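The same idea applied to the "<>" header from the question might look like the sketch below. It is only an illustration: the sample text is made up, io.StringIO stands in for the real file, and the index column name "row" is a hypothetical placeholder.

```python
import io
import pandas as pd

# Made-up sample standing in for the device output (assumed whitespace-separated).
raw = (
    "<> A B C\n"
    "1 0.1 0.2 0.3\n"
    "2 0.4 0.5 0.6\n"
    "3 0.7 0.8 0.9\n"
)

buf = io.StringIO(raw)

# Consume the header line ourselves and drop the unwanted "<>" token.
# "row" is a hypothetical name for the unlabeled first column.
names = ["row"] + buf.readline().replace("<>", "").split()

# pandas continues reading from the current position of the handle,
# so only the data lines are parsed here.
df = pd.read_csv(buf, sep=r"\s+", names=names, index_col="row")

print(df)
```

Because the header is handled before pandas sees the data, no data row is lost and no regex lookbehind is needed.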

Tags:

python

pandas

csv

Moritz
