set multi index of an existing data frame in pandas

Tags:

pandas

I have a DataFrame that looks like

        Emp1  Empl2        date  Company
    0      0      0  2012-05-01    apple
    1      0      1  2012-05-29    apple
    2      0      1  2013-05-02    apple
    3      0      1  2013-11-22    apple
    18     1      0  2011-09-09   google
    19     1      0  2012-02-02   google
    20     1      0  2012-11-26   google
    21     1      0  2013-05-11   google

I want to use the Company and date columns as a MultiIndex for this DataFrame, which currently has a default index. I am calling df.set_index(['Company', 'date'], inplace=True) at the end of the following code:


    df = pd.DataFrame()
    for c in company_list:
        row = pd.DataFrame([dict(company = '%s' %s, date = datetime.date(2012, 05, 01))])
        df = df.append(row, ignore_index = True)
        for e in emp_list:
            dataset = pd.read_sql("select company, emp_name, date(date), count(*) from company_table where  = '"+s+"' and emp_name = '"+b+"' group by company, date, name LIMIT 5 ", con)
            if len(dataset) == 0:
                row = pd.DataFrame([dict(sitename='%s' %s, name = '%s' %b, date = datetime.date(2012, 05, 01), count = np.nan)])
                dataset = dataset.append(row, ignore_index=True)
            dataset = dataset.rename(columns = {'count': '%s' %b})
            dataset = dataset.groupby(['company', 'date', 'emp_name'], as_index = False).sum()
            dataset = dataset.drop('emp_name', 1)
            df = pd.merge(df, dataset, how = '')
            df = df.sort('date', ascending = True)
            df.fillna(0, inplace = True)

    df.set_index(['Company', 'date'], inplace=True)
    print df

But when I print this DataFrame, it prints None. I found this approach on Stack Overflow itself. Is it not the correct way of doing it? I also want to control the order of the index levels so that company becomes the first level and date the second in the hierarchy. Any ideas on this?

241

asked Jun 04 '14 15:06

user3527975

1 Answer

When you pass inplace=True, set_index makes the changes on the original variable. It does not return the modified DataFrame; it returns None.


    is_none = df.set_index(['Company', 'date'], inplace=True)
    df       # the dataframe you want
    is_none  # has the value None

So when you have a line like:


df = df.set_index(['Company', 'date'], inplace=True)

it first modifies df... but then it sets df to None!

That is, you should just use the line:


df.set_index(['Company', 'date'], inplace=True)
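
For anyone who wants to try this out, here is a minimal sketch, assuming a small frame with the same column names as in the question. The variable names (in_place, indexed, date_first, company_first) are only illustrative. It shows the None return value, the alternative of dropping inplace and reassigning, and how the level order asked about in the question follows the order of the list passed to set_index.

```python
import pandas as pd

# A few rows modelled on the question's data (same column names).
df = pd.DataFrame({
    "Emp1": [0, 0, 1, 1],
    "Empl2": [0, 1, 0, 0],
    "date": pd.to_datetime(
        ["2012-05-01", "2012-05-29", "2011-09-09", "2012-02-02"]
    ),
    "Company": ["apple", "apple", "google", "google"],
})

# With inplace=True the call mutates the frame and returns None.
in_place = df.copy()
result = in_place.set_index(["Company", "date"], inplace=True)
print(result)                      # None
print(list(in_place.index.names))  # ['Company', 'date']

# Without inplace, a new DataFrame is returned, so reassigning is safe.
indexed = df.set_index(["Company", "date"])

# The level order follows the list order: Company first, then date.
# If the levels ended up the other way round, swaplevel exchanges them.
date_first = df.set_index(["date", "Company"])
company_first = date_first.swaplevel("date", "Company")
print(list(company_first.index.names))  # ['Company', 'date']
```

Either style works; the only thing to avoid is combining inplace=True with an assignment back to df.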

135

answered Sep 20 '22 15:09

Andy Hayden
