
Big data visualization using "search, show context, and expand on demand" concept [closed]

I'm trying to visualize a really huge network (3M nodes and 13M edges) stored in a database. For real-time interactivity, I plan to show only a portion of the graph based on user queries and expand it on demand. For instance, when a user clicks a node, I expand its neighborhood. (This is called "Search, Show Context, Expand on Demand" in this paper.)

I have looked into several visualization tools, including Gephi and D3. They take a text file as input, but I have no idea how they can connect to a database and update the graph based on users' interactions.

The linked paper implemented a system like that, but they didn't describe the tools they were using.

How can I visualize such data with the above criteria?

Yang asked Feb 19 '14 21:02



1 Answer

There are several solutions out there, but basically all of them follow the same approach:

  1. create a layer on top of your data source that lets you query it at a high level
  2. create a front-end layer that talks to the layer described above
  3. use the visualization tool you want
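
The three steps above can be sketched roughly as follows. This is only a minimal illustration, not what any of the tools below actually do: `GraphStore` and `VisibleGraph` are hypothetical names, and an in-memory dictionary stands in for the database. The point is that the front end only ever holds the part of the graph the user has searched for or expanded.

```python
from dataclasses import dataclass, field


class GraphStore:
    """Layer 1: stand-in for the database; a real system would query it instead."""

    def __init__(self, edges):
        self.adjacency = {}
        for source, target in edges:
            self.adjacency.setdefault(source, set()).add(target)
            self.adjacency.setdefault(target, set()).add(source)

    def search(self, text):
        """Return the node ids that contain the search text."""
        return sorted(n for n in self.adjacency if text in n)

    def neighbors(self, node, limit=50):
        """Return at most `limit` neighbours of `node`, in a stable order."""
        return sorted(self.adjacency.get(node, ()))[:limit]


@dataclass
class VisibleGraph:
    """Layer 2: the subgraph the front end holds and hands to the viz tool."""

    store: GraphStore
    nodes: set = field(default_factory=set)
    edges: set = field(default_factory=set)

    def show(self, text):
        """Search step: seed the view with the matching nodes."""
        self.nodes.update(self.store.search(text))

    def expand(self, node, limit=50):
        """Expand-on-demand step: add a clicked node's neighbourhood.

        Returns the nodes that were not visible before.
        """
        added = []
        for neighbor in self.store.neighbors(node, limit):
            if neighbor not in self.nodes:
                self.nodes.add(neighbor)
                added.append(neighbor)
            self.edges.add(tuple(sorted((node, neighbor))))
        return added


store = GraphStore([("alice", "bob"), ("bob", "carol"), ("carol", "dave")])
view = VisibleGraph(store)
view.show("ali")      # the user searches
view.expand("alice")  # the user clicks "alice"
view.expand("bob")    # the user clicks "bob"
```

Step 3 is then just a matter of passing `view.nodes` and `view.edges` to whichever visualization tool you pick, and redrawing after each `expand` call.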

As miro marchi pointed out, there are several ways to achieve this goal: some are locked to particular data sources, while others give you much more freedom but require some coding skills.

Datasource

I would start with the choice of the source type: given the kind of data, I would probably choose Neo4J, Titan or OrientDB (if you fancy something more exotic with some sort of flexibility). All of them offer a JSON REST API, the first with a proprietary system and language (Cypher) and the other two using the Blueprints / Rexster stack. Neo4J supports the Blueprints stack as well, if you prefer Gremlin over Cypher.

For other solutions, such as other NoSQL or SQL databases, you will probably have to code a layer on top that exposes the relevant REST API. It will work as well, but I wouldn't recommend it for the kind of data you have.
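
To give an idea of what such a hand-written layer could look like on a plain SQL database, here is a small sketch using Python's built-in `sqlite3` module. The `edges` table, its columns and the `expand_node` function are assumptions made for illustration only; the returned dictionary is what a REST endpoint would serialize and send to the front end.

```python
import json
import sqlite3


def build_demo_db():
    """Create an in-memory edge table; a real deployment would already have one."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE edges (source TEXT NOT NULL, target TEXT NOT NULL)")
    # Index both columns so neighbourhood lookups do not scan millions of rows.
    conn.execute("CREATE INDEX idx_edges_source ON edges (source)")
    conn.execute("CREATE INDEX idx_edges_target ON edges (target)")
    conn.executemany(
        "INSERT INTO edges VALUES (?, ?)",
        [("a", "b"), ("a", "c"), ("d", "a"), ("b", "c")],
    )
    return conn


def expand_node(conn, node_id, limit=100):
    """Return the neighbourhood of `node_id` as a JSON-ready dict.

    Edges are matched in both directions, and `limit` caps how many
    edges a single click can pull in.
    """
    rows = conn.execute(
        """
        SELECT source, target FROM edges
        WHERE source = ? OR target = ?
        ORDER BY source, target
        LIMIT ?
        """,
        (node_id, node_id, limit),
    ).fetchall()
    nodes = {node_id}
    for source, target in rows:
        nodes.update((source, target))
    return {
        "nodes": [{"id": n} for n in sorted(nodes)],
        "edges": [{"source": s, "target": t} for s, t in rows],
    }


conn = build_demo_db()
payload = json.dumps(expand_node(conn, "a"))  # what the endpoint would send back
```

The `limit` parameter matters with 13M edges: a single highly connected node could otherwise return far more than the front end can draw.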

Now, only the third point is left and here you have several choices.

Generic Viz tools

  • Sigma.js is a free and open source graph visualization tool, and quite a nice one. As far as I know, Linkurious uses a forked version of it in their product.

  • Keylines is a commercial graph visualization tool with advanced styling, analytics and layouts, and it provides copy/paste demos if you are using Neo4J or Titan. It is not free, but it supports even older browsers, from IE7 onwards.

  • VivaGraph is another free and open source graph visualization tool, but it has a smaller community than Sigma.js.

  • D3.js is the factotum of data visualization: you can build basically any kind of visualization with it, but the learning curve is quite steep.

  • Gephi is another free and open source solution, this time for the desktop. You will probably need an external plugin, but it supports most of the formats out there: GraphML, CSV, Neo4J, etc.

Vendor specific

  • Linkurious is a commercial, Neo4J-specific, complete tool for searching and investigating data.

  • Neo4J web-admin console: even if it's basic, it has improved a lot in the newer 2.x.x versions, which are based on D3.js.

There are also other solutions that I probably forgot to mention, but the ones above should offer a good variety.

Other notes

The JS tools above will render well up to about 1,500-2,000 nodes at once, due to JS limits.
If you want to visualize bigger graphs while expanding, I would recommend a desktop solution such as Gephi.
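
If you do stay in the browser, one possible way to respect that ceiling while the user keeps expanding is to put a budget on the visible set and drop the nodes that were touched longest ago. The sketch below is an assumption on my side, not a feature of any tool listed above: `BoundedView` is a hypothetical name, and the least-recently-used eviction policy is just one option among many.

```python
from collections import OrderedDict


class BoundedView:
    """Keeps at most `max_nodes` visible, evicting the least recently touched."""

    def __init__(self, max_nodes=2000):
        self.max_nodes = max_nodes
        self.visible = OrderedDict()  # node id -> None, oldest first

    def touch(self, node):
        """Mark a node as visible and recently used."""
        self.visible[node] = None
        self.visible.move_to_end(node)

    def expand(self, node, neighbors):
        """Show `node` and its neighbours; return the nodes evicted to stay in budget."""
        for neighbor in neighbors:
            self.touch(neighbor)
        self.touch(node)  # touched last, so the clicked node is the last to go
        evicted = []
        while len(self.visible) > self.max_nodes:
            oldest, _ = self.visible.popitem(last=False)
            evicted.append(oldest)
        return evicted


view = BoundedView(max_nodes=3)
view.expand("a", ["b", "c"])
dropped = view.expand("c", ["d", "e"])
```

The front end would remove the returned `dropped` nodes (and their edges) from the drawing before adding the new ones.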

Disclaimer

I'm part of the Keylines dev team.

MarcoL answered Sep 21 '22 19:09