I am trying to use ggplot for Python. I have the following data:
power_data = [[ 4.13877565e+04, 2.34652000e-01],
[ 4.13877565e+04, 2.36125000e-01],
[ 4.13877565e+04, 2.34772000e-01],
...
[ 4.13882896e+04, 2.29006000e-01],
[ 4.13882896e+04, 2.29019000e-01],
[ 4.13882896e+04, 2.28404000e-01]]
And I want to represent it in ggplot with this:
print ggplot(aes(x='TIME', y='Watts'), data=power_data) + \
geom_point(color='lightblue') + \
geom_line(alpha=0.25) + \
stat_smooth(span=.05, color='black') + \
ggtitle("Power consumption over 13 hours") + \
xlab("Time") + \
ylab("Watts")
but I get the error:
File "C:\PYTHON27\lib\site-packages\ggplot\ggplot.py", line 59, in __init__
for ae, name in self.aesthetics.iteritems():
AttributeError: 'list' object has no attribute 'iteritems'
>>>
I don't know what the line aes(x='TIME', y='Watts') should be doing.
How can I format the power_data list so that I can use it with ggplot? I want the first column represented on a time x axis and the second column on a power y axis.
If I try the meat example, it doesn't show anything; it only prints:
>>> print (ggplot(aes(x='date', y='beef'), data=meat) + \
... geom_line())
<ggplot: (20096197)>
>>>
What else do I need to do to actually display the plot?
There were 3 important steps that I missed:
1) First the data needs to be in a format like this:
[{'TIME': 41387.756495162001, 'Watts': 0.234652},
{'TIME': 41387.756500821, 'Watts': 0.236125},
{'TIME': 41387.756506480997, 'Watts': 0.23477200000000001},
{'TIME': 41387.756512141001, 'Watts': 0.23453099999999999},
...
{'TIME': 41387.756574386003, 'Watts': 0.23558699999999999},
{'TIME': 41387.756580046, 'Watts': 0.23508899999999999},
{'TIME': 41387.756585706004, 'Watts': 0.235041},
{'TIME': 41387.756591365003, 'Watts': 0.23541200000000001},
{'TIME': 41387.756597013002, 'Watts': 0.23461699999999999},
{'TIME': 41387.756602672998, 'Watts': 0.23483899999999999}]
2) Then the data needs to be wrapped in a pandas DataFrame:
powd = DataFrame(data2)
3) Without plt.show(1) (plt being matplotlib.pyplot) the plot will not be displayed.
Here is the code that solves the problem:
from ggplot import *
from pandas import DataFrame
import matplotlib.pyplot as plt

# Step 1: turn each [time, watts] row into a dict with named keys
data2 = []
for time_value, watts in power_data:
    data2.append({'TIME': time_value, 'Watts': watts})
# Step 2: wrap the rows in a DataFrame so aes() can refer to columns by name
powd = DataFrame(data2)
print powd
# the loop above can be replaced with this line
# (see the suggestion in the comments):
# powd = DataFrame(power_data, columns=['TIME', 'Watts'])
print ggplot(aes(x='TIME', y='Watts'), data=powd) + \
    geom_point(color='lightblue') + \
    geom_line(alpha=0.25) + \
    stat_smooth(span=.05, color='black') + \
    ggtitle("Power consumption over 13 hours") + \
    xlab("Time") + \
    ylab("Watts")
# Step 3: open the plot window
plt.show(1)
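As a quick sanity check that the loop and the one-line DataFrame(power_data, columns=[...]) form give the same result, here is a minimal sketch using pandas alone. The three rows are copied from the question. The names from_dicts, from_columns and same are mine, for illustration only.

```python
from pandas import DataFrame

# A few rows in the same shape as power_data (values copied from the question).
power_data = [[4.13877565e+04, 2.34652000e-01],
              [4.13877565e+04, 2.36125000e-01],
              [4.13882896e+04, 2.28404000e-01]]

# Route 1: one dict per row, keyed by column name.
from_dicts = DataFrame([{'TIME': t, 'Watts': w} for t, w in power_data])

# Route 2: pass the nested list directly and name the columns.
from_columns = DataFrame(power_data, columns=['TIME', 'Watts'])

# Select the columns explicitly so the comparison does not depend on
# how a given pandas version orders dict keys.
same = from_dicts[['TIME', 'Watts']].equals(from_columns)
print(same)
```

Either frame can then be passed as data= to ggplot, because both expose the TIME and Watts column names that aes() refers to.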
Or, alternatively, in one pass without the for loop, as suggested in the comments:
powd = DataFrame(power_data, columns=['TIME', 'Watts'])
print ggplot(aes(x='TIME', y='Watts'), data=powd) + \
    geom_point(color='lightblue') + \
    geom_line(alpha=0.25) + \
    stat_smooth(span=.05, color='black') + \
    ggtitle("Power consumption over 13 hours") + \
    xlab("Time") + \
    ylab("Watts")
# as in step 3, the plot window only opens after this call
plt.show(1)
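On the time axis: the raw TIME values (around 41387.76 to 41388.29) are not very readable as axis labels. The sketch below assumes that TIME is measured in days. That is my reading, not something stated in the question, though the span of the data then comes out at roughly 12.8 hours, which matches the "13 hours" in the title. The Hours column and the total_hours name are mine, and only the first and last rows of the question's data are used.

```python
from pandas import DataFrame

# First and last rows from the question's data.
power_data = [[4.13877565e+04, 2.34652000e-01],
              [4.13882896e+04, 2.28404000e-01]]

powd = DataFrame(power_data, columns=['TIME', 'Watts'])

# Assumption: TIME is in days, so the offset from the first sample
# multiplied by 24 gives elapsed hours.
powd['Hours'] = (powd['TIME'] - powd['TIME'].min()) * 24

total_hours = powd['Hours'].max()
print(round(total_hours, 2))
```

If the assumption holds, using aes(x='Hours', y='Watts') in the plot above, together with xlab("Hours"), should give an x axis that runs from 0 to about 13.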