Python Pandas, two rows as column headers?

I have seen how to work with a double index, but I have not seen how to work with two-row column headers. Is this possible?

For example, row 1 is a repetitive series of dates: 2016, 2016, 2015, 2015

Row 2 is a repeating series of measure names: Dollar Sales, Unit Sales, Dollar Sales, Unit Sales.

So each "Dollar Sales" heading is actually tied to the date in the row above.

Subsequent rows are individual items with data.

Is there a way to use a groupby, or some other approach, so that I can have two column headers? Ultimately, I want to line up the "Dollar Sales" values as a series by date so that I can make a nice graph. Unfortunately, there are multiple columns (more than just the one "Unit Sales" column) before the next "Dollar Sales" value. Also, if I delete the date row above, nothing links each "Dollar Sales" column to its date.

Stephen asked Dec 06 '16 21:12





1 Answer

If you are using pandas.read_csv() or pandas.read_table(), you can pass a list of row indices as the header argument to specify which rows to use as column headers. pandas will then build the pandas.MultiIndex for you in df.columns:

import pandas

df = pandas.read_csv('DollarUnitSales.csv', header=[0, 1])
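
As a rough illustration of what the resulting MultiIndex looks like, and how it helps with the graphing goal in the question, here is a minimal sketch. The inline CSV text, the item names (Widget, Gadget) and all the numbers are made up for the example. The blank first header cell and index_col=0 are assumptions about how the file is laid out, so the real DollarUnitSales.csv may need different arguments.

```python
import io

import pandas as pd

# Hypothetical stand-in for the file: two header rows (year, then measure),
# followed by one row per item. All names and numbers are invented.
csv_text = """,2016,2016,2015,2015
,Dollar Sales,Unit Sales,Dollar Sales,Unit Sales
Widget,100.0,10,80.0,8
Gadget,250.0,5,200.0,4
"""

# header=[0, 1] turns the first two rows into a two-level column MultiIndex;
# index_col=0 uses the first column (item names) as the row index.
df = pd.read_csv(io.StringIO(csv_text), header=[0, 1], index_col=0)

# Take a cross-section on the second header level: every "Dollar Sales"
# column, still labelled by the year from the first header row.
dollar_sales = df.xs("Dollar Sales", axis=1, level=1)

# Transpose so that years become the rows (one column per item) and sort
# them chronologically. The year labels are strings here.
dollar_by_year = dollar_sales.T.sort_index()

print(dollar_by_year)
```

From there, dollar_by_year.plot() should draw one line per item across the years, assuming matplotlib is installed.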

You can also use more than two rows, or non-consecutive rows, to specify the column headers:

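df = pandas.read_table('DataSheet1.csv', sep=',', header=[0, 2, 3])  # read_table defaults to tab-separated input, so pass sep=',' for a comma-separated file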
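
To show what happens to a row that sits between the chosen header rows, here is a small sketch with made-up data. The "note" row and all values are hypothetical, and it uses two header rows rather than the three in the call above, purely to keep the example short.

```python
import io

import pandas as pd

# Hypothetical file contents: row 0 holds years, row 1 is a notes row we
# do not want in the header, row 2 holds the measure names.
csv_text = """2016,2016,2015,2015
note,note,note,note
Dollar Sales,Unit Sales,Dollar Sales,Unit Sales
100.0,10,80.0,8
250.0,5,200.0,4
"""

# Only rows 0 and 2 become header levels; row 1 lies between them and is
# skipped, so it shows up neither in the header nor in the data.
df_skip = pd.read_csv(io.StringIO(csv_text), header=[0, 2])

print(df_skip.columns.tolist())
print(len(df_skip))
```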
Kevin answered Nov 21 '22 13:11
