
pandas.DataFrame: .hist() vs .plot.hist() methods

Tags:

python

pandas

On pandas.DataFrame in 0.19 there are two hist methods:

DataFrame.hist

DataFrame.plot.hist

At first I thought they were the same, but actually they take different arguments. Is one going to be deprecated in a future release, is there a different use case for each, or what's the story?

Kiv asked Nov 24 '16 00:11



2 Answers

I don't have a definitive answer for you. One thing I noticed is that DataFrame.hist returns an array of axes objects, one per numeric column, whereas DataFrame.plot.hist returns a single axes object with all columns drawn on it. For example:

import numpy as np
import pandas as pd

# Making up data
df = pd.DataFrame({'value1': np.random.normal(1, 1, 99),
                   'value2': [-1]*33 + [0]*33 + [1]*33})

df.hist()


df.plot.hist()

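Since the plots themselves are not reproduced here, the difference in return values can be checked directly. This is a minimal sketch using the same made-up data; the `Agg` backend and the variable names `axes_grid` and `single_ax` are my own choices for illustration, not part of the original answer.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs without a display

import numpy as np
import pandas as pd
from matplotlib.axes import Axes

# Same made-up data as above
df = pd.DataFrame({'value1': np.random.normal(1, 1, 99),
                   'value2': [-1]*33 + [0]*33 + [1]*33})

axes_grid = df.hist()       # one subplot per numeric column
single_ax = df.plot.hist()  # all columns overlaid on one subplot

# DataFrame.hist gives back a NumPy array holding the subplot axes
print(type(axes_grid), axes_grid.shape)

# DataFrame.plot.hist gives back a single matplotlib Axes
print(isinstance(single_ax, Axes))
```

In practice this means you index into the result of `df.hist()` to customize one subplot, while the result of `df.plot.hist()` can be used as an axes object directly.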

AlexG answered Oct 29 '22 04:10


Looking at the docs, http://pandas.pydata.org/pandas-docs/version/0.17.0/generated/pandas.DataFrame.plot.hist.html and http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.hist.html, it looks like plot.hist takes a few histogram-specific options and passes all other keyword args on to plot(), while hist takes a large number of keyword args directly. I would guess that this is primarily to create a simpler, more consistent API: rather than having 15 different functions that each take a large number of kwargs, each plot.* method focuses on its specialized args while the rest stay consistent with plot().

cf:

New in version 0.17.0: Each plot kind has a corresponding method on the DataFrame.plot accessor: df.plot(kind='line') is equivalent to df.plot.line()
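To make the quoted equivalence concrete, here is a small sketch that calls both spellings with the same arguments and then calls DataFrame.hist with keywords of its own. The data is the made-up frame from the other answer, and the specific argument values (`bins=20`, `alpha=0.5`, the title text, `layout=(2, 1)`) are arbitrary choices for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs without a display

import numpy as np
import pandas as pd

df = pd.DataFrame({'value1': np.random.normal(1, 1, 99),
                   'value2': [-1]*33 + [0]*33 + [1]*33})

# Accessor method: a histogram-specific option (bins) plus options
# that are forwarded to the general plotting machinery (alpha, title)
ax_method = df.plot.hist(bins=20, alpha=0.5, title="Overlaid histograms")

# The same request written with the kind= keyword
ax_kind = df.plot(kind="hist", bins=20, alpha=0.5, title="Overlaid histograms")

# DataFrame.hist has its own keywords, such as column and layout,
# and draws each selected column in a separate subplot
grid = df.hist(column=["value1", "value2"], bins=20, layout=(2, 1))

print(ax_method.get_title() == ax_kind.get_title())
print(len(ax_method.patches) == len(ax_kind.patches))
```

Both `df.plot` calls should draw the same bars on a single axes object, while `df.hist` arranges one subplot per column according to `layout`.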

In addition, the plot.* methods return a matplotlib axes object, which can be useful for chaining further customization and other things.
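As a rough illustration of what working with the returned axes looks like, the sketch below keeps the result of `df.plot.hist()` and customizes it with ordinary matplotlib calls. The label text, the dashed mean line, and the made-up data are assumptions added for the example.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs without a display

import numpy as np
import pandas as pd

df = pd.DataFrame({'value1': np.random.normal(1, 1, 99),
                   'value2': [-1]*33 + [0]*33 + [1]*33})

# Keep the returned Axes and continue customizing it
ax = df.plot.hist(bins=15, alpha=0.6)
ax.set_xlabel("value")

# Mark the mean of one column with a dashed vertical line
mean_line = ax.axvline(df['value1'].mean(), color='k', linestyle='--')

# The parent figure is reachable from the axes, e.g. for saving
fig = ax.get_figure()
print(ax.get_xlabel())
```

With `df.hist()` the same kind of customization is possible, but you would first pick the relevant axes out of the returned array.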

OldGeeksGuide answered Oct 29 '22 06:10