set multi index of an existing data frame in pandas

Tags: python, pandas

I have a DataFrame that looks like

        Emp1  Empl2        date  Company
    0      0      0  2012-05-01    apple
    1      0      1  2012-05-29    apple
    2      0      1  2013-05-02    apple
    3      0      1  2013-11-22    apple
    18     1      0  2011-09-09   google
    19     1      0  2012-02-02   google
    20     1      0  2012-11-26   google
    21     1      0  2013-05-11   google

I want to set a MultiIndex on this DataFrame from the Company and date columns. It currently has the default index. I am using df.set_index(['Company', 'date'], inplace=True):

    df = pd.DataFrame()
    for c in company_list:
        row = pd.DataFrame([dict(company = '%s' %s, date = datetime.date(2012, 05, 01))])
        df = df.append(row, ignore_index = True)
        for e in emp_list:
            dataset = pd.read_sql("select company, emp_name, date(date), count(*) from company_table where  = '"+s+"' and emp_name = '"+b+"' group by company, date, name LIMIT 5 ", con)
            if len(dataset) == 0:
                row = pd.DataFrame([dict(sitename='%s' %s, name = '%s' %b, date = datetime.date(2012, 05, 01), count = np.nan)])
                dataset = dataset.append(row, ignore_index=True)
            dataset = dataset.rename(columns = {'count': '%s' %b})
            dataset = dataset.groupby(['company', 'date', 'emp_name'], as_index = False).sum()
            dataset = dataset.drop('emp_name', 1)
            df = pd.merge(df, dataset, how = '')
            df = df.sort('date', ascending = True)
            df.fillna(0, inplace = True)

    df.set_index(['Company', 'date'], inplace=True)
    print df

But when I print this DataFrame, it prints None. I found this solution on Stack Overflow itself. Is this not the correct way of doing it? Also, I want to order the index levels so that Company becomes the first level and date becomes the second in the hierarchy. Any ideas on this?

user3527975 asked Jun 04 '14 15:06



1 Answer

When you pass inplace=True, the method makes the changes on the original variable and returns None; it does not return the modified DataFrame.

    is_none = df.set_index(['Company', 'date'], inplace=True)
    df  # the dataframe you want
    is_none  # has the value None

so when you have a line like:

    df = df.set_index(['Company', 'date'], inplace=True)

it first modifies df... but then it sets df to None!
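
A minimal sketch of that pitfall, assuming a small made-up frame that only borrows the question's column names (the values are illustrative, not the asker's data):

```python
import pandas as pd

# Illustrative data only; column names follow the question.
df = pd.DataFrame({
    "Company": ["apple", "apple", "google"],
    "date": ["2012-05-01", "2012-05-29", "2011-09-09"],
    "Emp1": [0, 0, 1],
})

# The in-place call mutates df and hands back None.
result = df.set_index(["Company", "date"], inplace=True)

returned_none = result is None        # the return value is None
index_names = list(df.index.names)    # df itself now carries the MultiIndex

# Re-binding df to that return value would discard the frame:
#   df = df.set_index(["Company", "date"], inplace=True)   # df is now None
```

After this runs, df holds the MultiIndex and result is None, which is why assigning the return value back to df leaves nothing to print.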

That is, you should just use the line:

    df.set_index(['Company', 'date'], inplace=True)
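
As an alternative sketch, you can drop inplace and assign the returned frame instead. The example below also touches the level-order part of the question: the order of the list passed to set_index decides which level is outermost, and swaplevel exchanges two levels if you need the opposite order. The data and the names indexed and swapped are made up for illustration.

```python
import pandas as pd

# Illustrative data only; column names follow the question.
df = pd.DataFrame({
    "Emp1": [0, 0, 1],
    "Empl2": [0, 1, 0],
    "date": pd.to_datetime(["2012-05-01", "2012-05-29", "2011-09-09"]),
    "Company": ["apple", "apple", "google"],
})

# Without inplace, set_index returns a new frame, so assignment is safe.
indexed = df.set_index(["Company", "date"])

# Level order follows the list order: Company is the outer level.
level_names = list(indexed.index.names)

# swaplevel returns a new frame with the two index levels exchanged.
swapped = indexed.swaplevel("Company", "date")
swapped_names = list(swapped.index.names)
```

Here df keeps its default index, indexed has Company as the first level and date as the second, and swapped has them the other way round.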
Andy Hayden answered Sep 20 '22 15:09