Is there a commonly used naming convention for Pandas DataFrame columns? Is PEP8 recommended here (e.g. the convention for instance variables)?
I'm conscious that lots of data is loaded from external sources that already have headers, but I'm curious what the correct approach is when I have to name or rename the columns on my own.
One way of renaming the columns in a Pandas DataFrame is by using the rename() method.
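As a minimal sketch of the rename() approach, assuming a small made-up DataFrame (the column names and values here are hypothetical, not from any real dataset):

```python
import pandas as pd

# Hypothetical data with headers as they might arrive from an external source.
df = pd.DataFrame({"Country Name": ["Norway", "Chile"], "Life Exp": [82.3, 80.2]})

# rename() takes a mapping of old name -> new name and returns a new DataFrame;
# columns not mentioned in the mapping keep their names.
renamed = df.rename(columns={"Country Name": "country_name", "Life Exp": "life_exp"})

print(list(renamed.columns))
```

The original df is left unchanged here, which makes rename() convenient when only a few columns need new names.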
Another way to rename columns in Pandas is to assign new names directly to df.columns. For example, if you have the column names in a list, you can assign that list to df.columns. This sets the names in the list as the column names of the data frame (here called “gapminder”). The list must contain exactly one name per column.
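A short sketch of direct assignment, assuming a tiny stand-in for the “gapminder” data frame (the values and the new names are made up for illustration):

```python
import pandas as pd

# Hypothetical stand-in for the "gapminder" data frame.
gapminder = pd.DataFrame(
    [["Norway", 2007, 80.2], ["Chile", 2007, 78.6]],
    columns=["Country", "Year", "LifeExp"],
)

# Assigning a list replaces all column names at once, in positional order.
new_names = ["country", "year", "life_exp"]
gapminder.columns = new_names

# A list of the wrong length is rejected rather than silently truncated.
try:
    gapminder.columns = ["country", "year"]
    length_checked = False
except ValueError:
    length_checked = True
```

Because the names are matched by position, this works best when you are replacing every column name and know their order.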
To capitalize the column names, we can call the upper() method through the .str accessor of the Index object in which the column names are stored, i.e. df.columns.str.upper(). This returns a new Index, so the result has to be assigned back to df.columns.
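For illustration, a minimal sketch of upper-casing every column name (the DataFrame contents are hypothetical):

```python
import pandas as pd

# Hypothetical data; only the column names matter here.
df = pd.DataFrame({"country": ["Norway"], "life_exp": [82.3]})

# df.columns is an Index; its .str accessor applies string methods to each name.
# The call returns a new Index, so assign it back to take effect.
df.columns = df.columns.str.upper()

print(list(df.columns))
```

The same pattern works with other string methods on the accessor, such as lower() or replace().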
Some people tend to use snake_case (lowercase with underscores) so that they can access a column with attribute (dot) syntax, like this: df.my_column
I tend to always access columns using the df['my_column'] syntax because it avoids confusion with DataFrame methods and properties, and it is easier to extend to slices and fancy indexing, so snake case is not necessary.
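To make the difference concrete, here is a small sketch with hypothetical column names, one of which ("count") deliberately collides with a DataFrame method and one of which contains a space:

```python
import pandas as pd

# Hypothetical columns chosen to show where dot access breaks down.
df = pd.DataFrame({
    "my_column": [1, 2, 3],
    "count": [10, 20, 30],
    "unit price": [1.5, 2.0, 2.5],
})

# For a valid identifier with no name clash, both syntaxes return the same data.
same = df.my_column.equals(df["my_column"])

# "count" is also a DataFrame method, so df.count is the method, not the column.
attr_is_method = callable(df.count)
count_values = df["count"].tolist()

# Bracket syntax handles names with spaces and extends to several columns
# at once, which dot syntax cannot express.
subset = df[["my_column", "unit price"]]
first_two = df["unit price"].iloc[:2].tolist()
```

With bracket access, the column name can be any string, so the choice of naming convention becomes purely a readability question.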
In short, I think you should use whatever is clearest to a potential reader.