When creating a blank DataFrame in pandas, there appear to be at least two ways to set the index name.
df = pd.DataFrame(columns=['col1', 'col2'])
df.index.name = 'index name'
df = pd.DataFrame(columns=['index name', 'col1', 'col2'])
df.set_index('index name', inplace=True)
Is one preferred over the other? Is there a third way to do it in one line of code instead of two?
I think method chaining is the best approach here:
The pandas core team now encourages the use of method chaining. This is a style of programming in which you chain together multiple method calls into a single statement. This allows you to pass intermediate results from one method to the next rather than storing the intermediate results using variables.
Another solution is DataFrame.rename_axis:
df = pd.DataFrame(columns=['col1', 'col2']).rename_axis('index name')
Or change your second solution:
df = pd.DataFrame(columns=['index name', 'col1', 'col2']).set_index('index name')
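As a quick check, here is a minimal sketch that builds the empty frame with both one-liners and compares the results. The variable names df_a and df_b are only illustrative, and the index dtype may differ between the two frames depending on the pandas version, so only the index name and columns are compared.

```python
import pandas as pd

# One-liner 1: name the index of an empty frame with rename_axis
df_a = pd.DataFrame(columns=['col1', 'col2']).rename_axis('index name')

# One-liner 2: create the extra column, then promote it to the index
df_b = pd.DataFrame(columns=['index name', 'col1', 'col2']).set_index('index name')

# Both frames are empty, share the same columns, and have a named index
print(df_a.index.name, df_b.index.name)
print(list(df_a.columns), list(df_b.columns))
```

Both versions return a new DataFrame, so either can be extended with further chained calls.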
Using inplace is not recommended; to quote the linked source:
The pandas core team discourages the use of the inplace parameter, and eventually it will be deprecated (which means "scheduled for removal from the library"). Here's why:
inplace won't work within a method chain.
The use of inplace often doesn't prevent copies from being created, contrary to what the name implies.
Removing the inplace option would reduce the complexity of the pandas codebase.
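The first point can be illustrated with a small sketch, assuming a pandas version in which set_index still accepts inplace. With inplace=True the method returns None, so nothing can be chained after it. The names result and chained, and the index name 'idx', are only illustrative.

```python
import pandas as pd

df = pd.DataFrame(columns=['index name', 'col1', 'col2'])

# With inplace=True the frame is modified and the call returns None,
# so a further method call cannot be chained onto it
result = df.set_index('index name', inplace=True)
print(result)

# Without inplace each call returns a DataFrame, so calls can be chained
chained = (
    pd.DataFrame(columns=['index name', 'col1', 'col2'])
    .set_index('index name')
    .rename_axis('idx')
)
print(chained.index.name)
```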