
How does Neo4j store data internally?

My question is from a developer's point of view (not a user's) and may be a bit messy. I want to know how the structure of nodes and relationships is logically stored in the database. For example, if I say I have some information and you ask where, the answer might be "in a book", laid out either as a grid or as lines on a page. In an RDBMS, data is stored in a grid/tabular format. But I cannot work out how a graph is stored in Neo4j or any other graph database. I am using Neo4j client 2.1.2.

Vincenzo asked Jun 23 '14 12:06



1 Answer

http://www.slideshare.net/thobe/an-overview-of-neo4j-internals is very outdated, but it still gives a good overview of Neo4j's logical representation.

A node references:

  • its first label (my guess is that labels are stored as a singly linked list)
  • its first property (properties are organized as a singly linked list)
  • its first relationship (the head of its relationship chain, whether the node is that relationship's start node or end node)

Relationships are organized as doubly linked lists, one chain per node. A relationship points to:

  • its start node and its end node
  • its first property (same as nodes)
  • the predecessor and successor relationship of its start node
  • the predecessor and successor relationship of its end node
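As a rough illustration of the property chain described above, here is a minimal Python sketch. The names (`PropertyRecord`, `NodeRecord`, `prop_store`, `NO_ID`) are made up for this example and a dict stands in for the store file; it is not Neo4j's actual record layout, only the "first pointer plus singly linked list" idea.

```python
from dataclasses import dataclass
from typing import Dict

NO_ID = -1  # hypothetical sentinel meaning "no further record"


@dataclass
class PropertyRecord:
    key: str
    value: object
    next_prop: int = NO_ID  # id of the next property record in the chain


@dataclass
class NodeRecord:
    first_prop: int = NO_ID  # head of the property chain
    first_rel: int = NO_ID   # head of the relationship chain


def properties(first_prop: int, prop_store: Dict[int, PropertyRecord]) -> dict:
    """Walk the singly linked property chain and collect key/value pairs."""
    result = {}
    prop_id = first_prop
    while prop_id != NO_ID:
        record = prop_store[prop_id]
        result[record.key] = record.value
        prop_id = record.next_prop
    return result


# A dict keyed by record id stands in for the property store.
prop_store = {
    0: PropertyRecord("title", "Dune", next_prop=1),
    1: PropertyRecord("year", 1965),
}
node = NodeRecord(first_prop=0)
book = properties(node.first_prop, prop_store)
```

The node itself holds no property data here, only the id of the first property record; everything else is reached by following `next_prop`.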

This chaining structure is what makes traversal (i.e. THE way of querying data) emerge so naturally: moving from a node to its neighbours means following record pointers rather than searching an index. That's why a graph database like Neo4j excels at traversing graph-structured data.
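To make the traversal idea concrete, here is a small Python sketch of walking one node's relationship chain. The record fields (`start_next`, `end_next`, and so on) and the dict-based `rel_store` are assumptions made for illustration, not Neo4j's real on-disk format, and self-loops are ignored for brevity.

```python
from dataclasses import dataclass
from typing import Dict, Iterator, List

NO_ID = -1  # hypothetical sentinel meaning "end of chain"


@dataclass
class RelRecord:
    start: int                # id of the start node
    end: int                  # id of the end node
    rel_type: str
    start_prev: int = NO_ID   # previous relationship in the start node's chain
    start_next: int = NO_ID   # next relationship in the start node's chain
    end_prev: int = NO_ID     # previous relationship in the end node's chain
    end_next: int = NO_ID     # next relationship in the end node's chain


def relationships_of(node_id: int, first_rel: int,
                     rel_store: Dict[int, RelRecord]) -> Iterator[int]:
    """Yield the ids of every relationship in node_id's chain."""
    rel_id = first_rel
    while rel_id != NO_ID:
        rel = rel_store[rel_id]
        yield rel_id
        # Each record sits in two chains; follow the one owned by node_id.
        rel_id = rel.start_next if rel.start == node_id else rel.end_next


def neighbours(node_id: int, first_rel: int,
               rel_store: Dict[int, RelRecord]) -> List[int]:
    """Return the node at the other end of each relationship in the chain."""
    result = []
    for rel_id in relationships_of(node_id, first_rel, rel_store):
        rel = rel_store[rel_id]
        result.append(rel.end if rel.start == node_id else rel.start)
    return result


# Three nodes (0, 1, 2) and three relationships: 0->1, 2->0, 1->2.
rel_store = {
    0: RelRecord(start=0, end=1, rel_type="KNOWS", start_next=1, end_next=2),
    1: RelRecord(start=2, end=0, rel_type="KNOWS", start_next=2, end_prev=0),
    2: RelRecord(start=1, end=2, rel_type="KNOWS", start_prev=0, end_prev=1),
}
first_rel_of = {0: 0, 1: 0, 2: 1}  # per-node pointer to the head of its chain

node0_neighbours = neighbours(0, first_rel_of[0], rel_store)
```

Note that the cost of listing a node's neighbours depends only on how many relationships that node has, not on the total size of the graph.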

My rough guess is also that, since Neo4j 2.1 (and its newly introduced dense node management), a node's relationships are segregated by type. That way, if a node N is, for example, the start node of 5 relationships of type A and of 5 million relationships of type B, traversing N's type-A relationships still costs O(n) with n = 5, instead of scanning the whole chain.
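The following Python sketch shows why grouping by type helps. It is a simplified model under my own assumptions: `Group`, `Rel`, and `rels_of_type` are hypothetical names, each type gets a single chain (direction is ignored), and 1,000 type-B relationships stand in for the 5 million.

```python
from dataclasses import dataclass
from typing import Dict, List

NO_ID = -1  # hypothetical sentinel meaning "end of chain"


@dataclass
class Rel:
    rel_type: str
    next_rel: int = NO_ID    # next relationship of the same type for this node


@dataclass
class Group:
    rel_type: str
    first_rel: int           # head of this type's relationship chain
    next_group: int = NO_ID  # next group (another type) for the same node


def rels_of_type(first_group: int, wanted: str,
                 groups: Dict[int, Group], rels: Dict[int, Rel]) -> List[int]:
    """Find the group for `wanted`, then walk only that group's chain."""
    group_id = first_group
    while group_id != NO_ID and groups[group_id].rel_type != wanted:
        group_id = groups[group_id].next_group
    if group_id == NO_ID:
        return []
    result = []
    rel_id = groups[group_id].first_rel
    while rel_id != NO_ID:
        result.append(rel_id)
        rel_id = rels[rel_id].next_rel
    return result


# Node N: 5 relationships of type A (ids 0..4), 1,000 of type B (ids 5..1004).
rels: Dict[int, Rel] = {}
for i in range(5):
    rels[i] = Rel("A", next_rel=i + 1 if i < 4 else NO_ID)
for i in range(5, 1005):
    rels[i] = Rel("B", next_rel=i + 1 if i < 1004 else NO_ID)

groups = {
    0: Group("B", first_rel=5, next_group=1),
    1: Group("A", first_rel=0),
}

a_rels = rels_of_type(0, "A", groups, rels)
```

Fetching the type-A relationships touches two group records and five relationship records; none of the type-B records are read.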

fbiville answered Oct 07 '22 23:10