I want to draw a histogram of some data. Sorry that I cannot attach a sample histogram; I don't have enough reputation, so I hope my description of the problem is clear. I am using pandas in Python, and as far as I can tell any NaN value is treated as 0 by pandas. Is there a method I can use to include the count of NaN values in the histogram? What I mean is that the x-axis should have an entry for NaN as well. Thank you very much.
This is what the pandas documentation gives for the na_values parameter: "na_values : scalar, str, list-like, or dict, optional. Additional strings to recognize as NA/NaN. If dict passed, specific per-column NA values. By default the following values are interpreted as NaN: '', '#N/A', '#N/A N/A', '#NA', '-1.
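To make the quoted parameter concrete, here is a minimal sketch that assumes na_values is being passed to pandas.read_csv. The CSV text, the column names, and the extra marker "missing" are all made up for illustration.

```python
import io

import pandas as pd

# Hypothetical CSV content: one custom marker ("missing"), one empty field,
# and one default marker ("#N/A").
csv_text = "id,value\n1,10\n2,missing\n3,\n4,#N/A\n"

# "missing" is added to the strings that pandas already treats as NaN.
df = pd.read_csv(io.StringIO(csv_text), na_values=["missing"])

# Count how many entries in the column ended up as NaN.
n_missing = int(df["value"].isna().sum())
print(df)
print("missing values:", n_missing)
```

The empty field and '#N/A' become NaN through the defaults quoted above; only "missing" needs to be listed explicitly.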
You can filter out rows with NaN values from a pandas DataFrame column (string, float, datetime, etc.) by using the DataFrame.dropna() and DataFrame.notnull() methods. Python has no Null value, so missing data is represented as None or NaN.
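A small sketch of those calls, using a made-up DataFrame (the column names "score" and "name" are assumptions for illustration). notna() is used here; it is the same check as notnull().

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with missing entries in both columns.
df = pd.DataFrame({
    "score": [1.0, np.nan, 3.0, np.nan],
    "name": ["a", "b", None, "d"],
})

# Number of missing entries in each column (None and NaN both count).
missing_per_column = df.isna().sum()

# Keep only the rows that have no missing value in any column.
complete_rows = df.dropna()

# Keep only the rows where one particular column is present.
rows_with_score = df[df["score"].notna()]

print(missing_per_column)
print(complete_rows)
print(rows_with_score)
```

Note that this removes the missing rows; it does not by itself show them in a plot.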
I was looking for the same thing. I ended up with the following solution:
import pandas as pd
import matplotlib.pyplot as plt

# `data` is assumed to be a pandas Series of numeric values
figure = plt.figure(figsize=(6, 9), dpi=100)
graph = figure.add_subplot(111)
freq = data.value_counts()  # counts per distinct value; NaN is excluded by default
graph.bar(freq.index, freq.values)  # gives the graph without NaN
missing = data.isna().sum()  # number of missing values
graph.bar([0], [missing])  # gives a bar for the number of missing values at x=0
plt.show()
This gave me a histogram with a column at 0 showing the number of missing values in the data.
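If 0 is itself a valid value in the data, a bar at x=0 can be confused with real counts. One possible variation, sketched here with hypothetical helper names (counts_with_nan, plot_counts) and made-up data, is to give NaN its own labelled category on the x-axis. It counts distinct values rather than binning them, so it suits discrete data.

```python
import numpy as np
import pandas as pd


def counts_with_nan(series, nan_label="NaN"):
    """Count each distinct value, keeping missing values as their own entry."""
    # dropna=False keeps NaN in the result; sort_index places it last.
    freq = series.value_counts(dropna=False).sort_index()
    # Turn the index into text labels so NaN can appear on a categorical axis.
    labels = [nan_label if pd.isna(v) else str(v) for v in freq.index]
    return pd.Series(freq.to_numpy(), index=labels)


def plot_counts(counts):
    """Draw one bar per label; matplotlib is imported only when plotting."""
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.bar(counts.index, counts.values)
    ax.set_xlabel("value")
    ax.set_ylabel("count")
    plt.show()


# Hypothetical sample data containing three missing values.
data = pd.Series([1.0, 2.0, 2.0, np.nan, 3.0, np.nan, np.nan])
counts = counts_with_nan(data)
print(counts)
# To draw the chart: plot_counts(counts)
```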
Did you try replacing NaN with some other unique value and then plotting the histogram?
x = -1  # example only: pick any value that does not occur in your data
plt.hist(df.replace(np.nan, x))