
"No data available" in Zeppelin charts

I'm having problems creating visualizations with Zeppelin. I have a dataset of about 600 million records stored in an HDFS cluster, and I'm able to load it as a Spark DataFrame:

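%spark.pyspark
# Load every CDR parquet file from HDFS into a single DataFrame
input_hdfs_path = u'hdfs://cluster-master:9000/data/CDR_*.parquet'
df = spark.read.format('parquet').load(input_hdfs_path)
# Expose the DataFrame to %sql paragraphs (registerTempTable is deprecated since Spark 2.0)
df.createOrReplaceTempView("df")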

I want to build a histogram of the CDR length (field CDR_LENGTH), with values bucketed to the nearest hundred:

%sql
select ROUND(CDR_LENGTH, -2) as duration, count(*) as count
from df
group by 1
order by 1
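
To make explicit what that query computes, here is a minimal plain-Python sketch of the same bucketing on a handful of made-up values. The names `bucket`, `histogram`, and `sample` are hypothetical and only for illustration. It assumes Spark SQL's ROUND rounds halves away from zero (HALF_UP), which is why it uses `decimal` rather than Python's built-in `round`.

```python
from collections import Counter
from decimal import Decimal, ROUND_HALF_UP


def bucket(length, width=100):
    """Round `length` to the nearest multiple of `width`, ties away from zero.

    Mirrors ROUND(CDR_LENGTH, -2) when width is 100 (assumed HALF_UP rounding).
    """
    scaled = Decimal(str(length)) / Decimal(width)
    return int(scaled.quantize(Decimal("1"), rounding=ROUND_HALF_UP)) * width


def histogram(lengths, width=100):
    """Return sorted (duration, count) pairs, like GROUP BY 1 ORDER BY 1."""
    counts = Counter(bucket(x, width) for x in lengths)
    return sorted(counts.items())


# Made-up sample values, not taken from the real dataset
sample = [12, 49, 50, 149, 150, 151, 260]
rows = histogram(sample)
print(rows)
```

Each pair corresponds to one row of the Table tab: the first element is duration and the second is count.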

I do get the expected results in the Table tab (two columns, duration and count), but when I switch to the bar chart tab (or any other chart tab), it simply says "No data available". Can you figure out what I'm doing wrong? Thanks

mmonjas asked Dec 10 '22 09:12


1 Answer

You can find "settings" to the right of the chart buttons. Open it, then assign your columns to Keys, Groups, and Values as you like (for example, duration as the key and count as the value).

Daepa answered Jan 13 '23 11:01