
Python Pandas: How to set Dataframe Column value as X-axis labels

Say I have data in the following format:

Region   Men   Women
City1    10   5
City2    50   89

When I load it into a DataFrame and plot a graph, it shows the index as X-axis labels instead of the Region names. How do I get the names on the X-axis?

So far I tried:

import pandas as pd
import matplotlib.pyplot as plt

plt.style.use('ggplot')
# df is the DataFrame loaded from the data above
ax = df[['Men', 'Women']].plot(kind='bar', title="Population", figsize=(15, 10), legend=True, fontsize=12)
ax.set_xlabel("Areas", fontsize=12)
ax.set_ylabel("Population", fontsize=12)
plt.show()

Currently it shows the x ticks as 0, 1, 2, ...

Volatil3 asked Jul 31 '16 11:07



3 Answers

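Since you're using pandas, you can pass tick positions right to the DataFrame's plot() method (e.g. df.plot(..., xticks=<your positions>)). Note that xticks controls where the ticks are placed rather than their text, so the label text itself is set through matplotlib.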

Additionally, since pandas uses matplotlib, you can control the labels that way.

For example, use plt.xticks() or ax.set_xticklabels().

Regarding the rotation, the last two methods allow you to pass a rotation argument along with the labels. So something like:

ax.set_xticklabels(<your labels>, rotation=0)

should force them to lie horizontally.
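As a minimal sketch of the two matplotlib routes mentioned above, assuming the sample data from the question is built by hand (the variable names and the Agg backend line are my own choices, not part of the original answer):

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so this runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({"Region": ["City1", "City2"], "Men": [10, 50], "Women": [5, 89]})
regions = df["Region"].tolist()

# Option 1: relabel the existing bar ticks through the Axes object
ax = df[["Men", "Women"]].plot(kind="bar", title="Population", fontsize=12)
ax.set_xticklabels(regions, rotation=0)
labels_ax = [t.get_text() for t in ax.get_xticklabels()]

# Option 2: same idea through pyplot, passing positions and labels together
ax2 = df[["Men", "Women"]].plot(kind="bar", title="Population", fontsize=12)
plt.xticks(range(len(df)), regions, rotation=0)
labels_plt = [t.get_text() for t in ax2.get_xticklabels()]
```

Both options rely on the bar plot placing one tick per row at positions 0, 1, 2, ..., so the list of labels has to be in row order. Remove the backend line and call plt.show() to view the figures interactively.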

jedwards answered Oct 09 '22 01:10


The plot.bar() method inherits its arguments from plot(), which has a rot argument.

From the docs:

rot : int, default None

Rotation for ticks (xticks for vertical, yticks for horizontal plots)

By default, it also uses the index as ticks for the x axis:

use_index : boolean, default True

Use index as ticks for x axis

In [34]: df.plot.bar(x='Region', rot=0, title='Population', figsize=(15,10), fontsize=12)
Out[34]: <matplotlib.axes._subplots.AxesSubplot at 0xd09ff28>

Alternatively, you can set the index explicitly, which might be useful for multi-level indexes (axes):

df.set_index('Region').plot.bar(rot=0, title='Population', figsize=(15,10), fontsize=12)
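A small self-contained sketch of both variants, assuming the question's sample data is typed in by hand (the variable names and the Agg backend line are illustrative additions):

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so this runs without a display
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({"Region": ["City1", "City2"], "Men": [10, 50], "Women": [5, 89]})

# Variant 1: name the column that supplies the x-axis labels
ax_x = df.plot.bar(x="Region", rot=0, title="Population")

# Variant 2: move the column into the index, which plot.bar() uses for ticks by default
ax_idx = df.set_index("Region").plot.bar(rot=0, title="Population")

labels_x = [t.get_text() for t in ax_x.get_xticklabels()]
labels_idx = [t.get_text() for t in ax_idx.get_xticklabels()]
```

In both cases the tick labels should be the Region values rather than 0, 1, 2, and rot=0 keeps them horizontal.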

MaxU - stop WAR against UA answered Oct 09 '22 02:10


I had a lot of trouble finding an answer I really liked for this. The function below achieves it quite well and is very adaptable:

import random

import matplotlib.pyplot as plt


def plot_vals_above_titles(data_frame, columns):
    """Scatter each row's values above the matching column name on the x-axis."""
    plt.figure()
    plt.grid(True)

    for _, row in data_frame.iterrows():
        # each column gets an integer x position: 0, 1, 2, ...
        for x_coord, col in enumerate(columns):
            # add some jitter to move points off the vertical line
            jitter = random.uniform(-0.1, 0.1)
            plt.scatter(x=x_coord + jitter, y=row[col])

    # rename the xticks with column names
    plt.xticks(range(len(columns)), columns)

Below is an example of my result, though I set a new color for each value in a separate column in the dataframe.

My columns were titled ['A','B','C','D','E']

nbenz answered Oct 09 '22 00:10