pandas plotting - x axis gets transformed to floats

I am trying to plot my data grouped by year, and for each year I want to count the number of users. Below, I just transformed the date column from float to integer.

This is my plot: [image]

If you look at the x-axis, my year ticks seem to have become floats, and the ticks are 0.5 apart.

How do I make the axis show integers only?


Changing the groupby has the same result: [image]


The ticks are still 2 spaces apart after converting the year column to a string format:

df['year'] = df['year'].astype(str)

[image]

jxn asked Feb 09 '18 22:02


1 Answer

The expectation that using integer data will lead a matplotlib axis to show only integers is not justified. In the end, every axis is a numeric float axis.

The ticks and labels are determined by locators and formatters, and matplotlib does not know that you want to show only integers.
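To make the role of locators and formatters concrete, here is a minimal sketch in plain matplotlib. The sample values and variable names are made up for illustration. It inspects which locator and formatter a default axis uses, even though the x data are integers:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
from matplotlib.ticker import AutoLocator, ScalarFormatter

years = [2010, 2011, 2012, 2013, 2014]   # integer x data (illustrative)
counts = [1000, 2200, 3890, 5600, 8000]

fig, ax = plt.subplots()
ax.plot(years, counts)

locator = ax.xaxis.get_major_locator()      # decides where the ticks go
formatter = ax.xaxis.get_major_formatter()  # decides how the ticks are labelled

print(type(locator).__name__)    # AutoLocator on a default linear axis
print(type(formatter).__name__)  # ScalarFormatter on a default linear axis
print(ax.get_xticks())           # tick positions chosen by the locator
```

The dtype of the plotted data plays no part here. The locator only sees the axis limits and picks "nice" positions within them, which may well be fractional.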

Some possible solutions:

Tell the default locator to use integers

The default locator is an AutoLocator, which accepts an integer parameter. So you may set this parameter to True:

ax.locator_params(integer=True)

Example:

import pandas as pd
import matplotlib.pyplot as plt

data = pd.DataFrame({"year" : [2010,2011,2012,2013,2014],
                     "count" :[1000,2200,3890,5600,8000] })

ax = data.plot(x="year",y="count")
ax.locator_params(integer=True)

plt.show()
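If you prefer to be explicit about which locator is in use, an alternative is to install a MaxNLocator yourself. This is a sketch under the assumption that the x-axis is a plain linear axis; it is a variation on the approach above, not something the question requires:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line to view the plot interactively
import matplotlib.pyplot as plt
import pandas as pd
from matplotlib.ticker import MaxNLocator

data = pd.DataFrame({"year": [2010, 2011, 2012, 2013, 2014],
                     "count": [1000, 2200, 3890, 5600, 8000]})

ax = data.plot(x="year", y="count")

# Replace the default locator with one restricted to integer positions.
ax.xaxis.set_major_locator(MaxNLocator(integer=True))

xticks = ax.get_xticks()
print(xticks)  # every position is a whole number
```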

Using a fixed locator

You may tick only the years present in the dataframe by using ax.set_xticks().

import pandas as pd
import matplotlib.pyplot as plt

data = pd.DataFrame({"year" : [2010,2011,2012,2013,2014],
                     "count" :[1000,2200,3890,5600,8000] })

data.plot(x="year",y="count")
plt.gca().set_xticks(data["year"].unique())
plt.show()
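As a small variation on the fixed-locator approach, the sketch below also sets the tick labels explicitly as strings, so the label text is pinned to the year values. The explicit set_xticklabels call is an optional extra and an assumption about what you want, not part of the original approach:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line to view the plot interactively
import matplotlib.pyplot as plt
import pandas as pd
from matplotlib.ticker import FixedLocator

data = pd.DataFrame({"year": [2010, 2011, 2012, 2013, 2014],
                     "count": [1000, 2200, 3890, 5600, 8000]})

ax = data.plot(x="year", y="count")

# Plain Python ints, sorted, one per distinct year in the data.
years = sorted(int(y) for y in data["year"].unique())

ax.set_xticks(years)                          # installs a FixedLocator
ax.set_xticklabels([str(y) for y in years])   # pin the label text

print(type(ax.xaxis.get_major_locator()).__name__)
```

Note that with many distinct years, ticking every one of them can crowd the axis; in that case a subset such as every second year may be the better choice.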

Convert year to date

You may convert the year column to a date. For dates, much nicer tick labeling takes place automatically.

import pandas as pd
import matplotlib.pyplot as plt

data = pd.DataFrame({"year" : [2010,2011,2012,2013,2014],
                     "count" :[1000,2200,3890,5600,8000] })

data["year"] = pd.to_datetime(data["year"].astype(str), format="%Y")
ax = data.plot(x="year",y="count")

plt.show()
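If the automatic date labels are not what you want, the date axis can also be controlled by hand. The following is a sketch that plots with plain matplotlib rather than DataFrame.plot (an assumption made here so that matplotlib's own date locators apply directly) and ticks once per year with a four-digit label:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line to view the plot interactively
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import pandas as pd

data = pd.DataFrame({"year": [2010, 2011, 2012, 2013, 2014],
                     "count": [1000, 2200, 3890, 5600, 8000]})

# Same conversion as above: integer year -> datetime at January 1st of that year.
dates = pd.to_datetime(data["year"].astype(str), format="%Y")

fig, ax = plt.subplots()
ax.plot(dates, data["count"])

ax.xaxis.set_major_locator(mdates.YearLocator())          # one tick per year
ax.xaxis.set_major_formatter(mdates.DateFormatter("%Y"))  # label as e.g. 2010
```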

In all cases you would get a plot whose x-axis is labelled with whole years instead of fractional values:

[image]

ImportanceOfBeingErnest answered Oct 19 '22 18:10