Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

why pandas dataframe intoroduces additional digits to columns names after reading them from excel?

I have an excel sheet which contains data generated by running a program with different configurations. Two sample outputs are as follows. My first problem is pd.read_excel() introduces incremental digit suffices to the column names after the first group is read. For example the first dataframe has the correct column names "A,B,C,D" while the second dataframe has "A.1,B.1,C.1,D.1" and the third "A.2,B.2,C.2,D.2" and so on. How do I make sure that all have the same column names?

My data on excel

My data on excel

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
file_locn = ''r'C:\Users\me\Desktop\output.xlsx'''
df = pd.read_excel(file_locn, sheet_name='1', skiprows=1)
print(df)
df_256 = df[df.columns[0:5]]
df_128 = df[df.columns[6:11]]
df_64 = df[df.columns[12:17]]

print(df_256)
print(df_128)
print(df_64)

Output:

print(df)

print(df)

print(df_256)

print(df_256)

print(df_128)

print(df_128)

like image 537
hope Avatar asked Oct 27 '25 08:10

hope


1 Answers

If duplicated columns names pandas add .Number for create unique columns names.

Here is possible create MultiIndex in columns like:

df = pd.read_excel(file_locn, sheet_name='1', header=[0,1])

And then use DataFrame.xs, if first level is parsed like strings use:

df_256 = df.xs('256', axis=1, level=0)
df_128 = df.xs('128', axis=1, level=0)
df_64 = df.xs('64', axis=1, level=0)

If parsed like integers:

df_256 = df.xs(256, axis=1, level=0)
df_128 = df.xs(128, axis=1, level=0)
df_64 = df.xs(64, axis=1, level=0)
like image 145
jezrael Avatar answered Oct 29 '25 23:10

jezrael



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!