Visualizing large data sets with Hadoop [closed]

I'm looking for a framework, a combination of frameworks, best-practices, or a tutorial about visualizing large data sets with Hadoop.

I am not looking for a framework to visualize the mechanics of running Hadoop jobs or managing disk space on Hadoop. I am looking for an approach or a guideline for visualizing the data contained within HDFS using graphs and charts, etc.

For example, let's say I have a set of data points stored in multiple files in HDFS, and I would like to show a histogram of the data. Is my only option to write a custom map/reduce job that would try and figure out which points fall into which bucket, write the totals to a file, and then use a plotting library to visualize that?
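As a rough illustration of the custom approach described above, here is a minimal sketch in plain Python that imitates the two phases locally. It does not use any Hadoop API. The names `map_point`, `reduce_counts`, and `histogram`, and the fixed `bucket_width` parameter, are hypothetical and chosen only for this example.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def map_point(value: float, bucket_width: float) -> Tuple[int, int]:
    """Map step: emit (bucket index, 1) for a single data point."""
    # Floor division assigns each value to the bucket
    # [index * bucket_width, (index + 1) * bucket_width).
    return (int(value // bucket_width), 1)


def reduce_counts(pairs: Iterable[Tuple[int, int]]) -> Dict[int, int]:
    """Reduce step: sum the counts emitted for each bucket index."""
    totals: Counter = Counter()
    for bucket, count in pairs:
        totals[bucket] += count
    return dict(totals)


def histogram(values: Iterable[float], bucket_width: float) -> Dict[int, int]:
    """Run the map and reduce steps locally over an iterable of values."""
    return reduce_counts(map_point(v, bucket_width) for v in values)
```

In a real job the map and reduce steps would run as separate tasks over the HDFS files, and the resulting per-bucket totals (a small table) could then be handed to any plotting library as a bar chart.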

Do I need to roll out a custom solution, or is anyone else out there doing this sort of thing? I've tried looking online, but I haven't been able to find anything that directly relates to this.

Thank you for your help

oneself asked Nov 13 '22 20:11


1 Answer

We do something like this at Datameer. The files would need a few more processing steps to get to our visualizations, but we run natively on Hadoop, so the files would not be far away.

Superboggly answered Dec 18 '22 14:12