
Pandas DataFrame column naming conventions

Tags:

python

pandas

Is there a commonly used naming convention for Pandas DataFrame columns? Is PEP 8 recommended here (e.g. the convention for instance variables)?

I'm conscious that lots of data is loaded from external sources that already have headers, but what is the correct approach when I have to name or rename the columns myself?

wmatt asked Dec 24 '17 22:12


People also ask

How do you name columns in pandas DataFrame?

One way of renaming the columns in a Pandas DataFrame is to use the rename() method with a mapping from old names to new names.
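As a minimal sketch of the rename() approach, assuming a small made-up DataFrame (the column names and values here are illustrative, not from the question):

```python
import pandas as pd

# Illustrative data; the names and values are assumptions for this example.
df = pd.DataFrame({"First Name": ["Ada", "Alan"], "Age": [36, 41]})

# rename() returns a new DataFrame; columns missing from the mapping keep their names.
renamed = df.rename(columns={"First Name": "first_name", "Age": "age"})

print(list(renamed.columns))
```

Because rename() returns a new object by default, the original df keeps its old headers unless you assign the result back.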

How do I fix column names in pandas?

One way to rename columns in Pandas is to assign new names directly to df.columns. For example, if you have the column names in a list, you can assign that list to df.columns, and the names in the list become the column names of the data frame (called “gapminder” in the source example). The list must have exactly one name per column.
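A short sketch of direct assignment follows. The snippet above does not show the contents of “gapminder”, so the rows and the new names below are placeholders chosen for illustration:

```python
import pandas as pd

# Placeholder rows; the real "gapminder" data is not shown in the source.
gapminder = pd.DataFrame([["A", 2000], ["B", 2001]])  # default column labels 0 and 1

# The list needs exactly one name per column, otherwise pandas raises ValueError.
new_names = ["country", "year"]
gapminder.columns = new_names

print(list(gapminder.columns))
```

Unlike rename(), this modifies the DataFrame in place and replaces every column name at once.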

How do you capitalize column names in pandas?

To capitalize the column names, call upper() through the .str accessor of the Index object in which the column names are stored, that is df.columns.str.upper(), and assign the result back to df.columns.
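For instance, assuming a DataFrame with two lowercase headers (made up for this sketch):

```python
import pandas as pd

# Illustrative DataFrame; the column names are assumptions.
df = pd.DataFrame({"country": ["A"], "year": [2000]})

# df.columns is an Index; its .str accessor applies string methods to every label.
df.columns = df.columns.str.upper()

print(list(df.columns))
```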


1 Answer

Some people use snake_case (lowercase with underscores) so that they can access a column with attribute (dot) syntax, like this: df.my_column
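If you do want snake_case names, one possible approach is to normalize incoming headers with a small helper. The function to_snake_case below is hypothetical (it is not part of pandas), and the sample headers are made up:

```python
import re

import pandas as pd


def to_snake_case(name):
    """Hypothetical helper: lower-case a header and turn runs of
    non-alphanumeric characters into single underscores."""
    cleaned = re.sub(r"[^0-9a-zA-Z]+", "_", str(name).strip())
    return cleaned.strip("_").lower()


# Illustrative headers as they might arrive from an external source.
df = pd.DataFrame({"Unit Price": [2.5, 4.0], "Qty-Sold": [10, 3]})

# rename() accepts a function and applies it to every column label.
df = df.rename(columns=to_snake_case)

# The snake_case names are valid identifiers, so dot access now works.
total = (df.unit_price * df.qty_sold).sum()
print(list(df.columns), total)
```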

I always access columns using the df['my_column'] syntax because it avoids confusion with DataFrame methods and properties, and it is easier to extend to slices and fancy indexing, so snake_case is not necessary.
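A small sketch of the confusion that bracket syntax avoids, using made-up column names. One of them, "count", happens to match a DataFrame method:

```python
import pandas as pd

# Illustrative data; the column names are chosen to show the pitfalls.
df = pd.DataFrame(
    {"count": [3, 5], "unit price": [2.0, 4.0], "region": ["north", "south"]}
)

# Attribute access finds the DataFrame.count method first, not the column.
attr = df.count
print(callable(attr))

# Bracket access always returns the column, whatever its name.
col = df["count"]          # a Series
spaced = df["unit price"]  # names with spaces work too

# A list of names selects several columns at once and returns a DataFrame.
subset = df[["count", "region"]]
print(list(subset.columns))
```

Bracket access works for any label, so the choice of naming style can be driven by readability rather than by what is a valid Python identifier.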

In short, I think you should use whatever is clearest to a potential reader.

blokeley answered Sep 19 '22 00:09