I have a question about graph databases; can someone help me, please? I'm handling quite a lot of data in MySQL: about 5M records a day, sent by router-like devices, access points, and wireless bridges. The data is usually health data, GPS, etc.; these are devices on vehicles. How do you handle time-based data in graph databases? Has anyone used Neo4j for time-based data? It would be great to know how you query intervals and how you'd go about modelling.
I guess I could create a node every time I receive data, with properties set each time for whatever changed (GPS, health)? It would be a time-based graph. Does that sound right? With 5M rows MySQL isn't performing badly, but as the router gets new functionality, new data comes through and I need to create new models again, which isn't bad but isn't great either. I want something that is semi-structured and makes it easy to relate different things, for example that a user got kicked out because an access point associated with the router is down. My usual queries would raise alerts to say that one of the devices is down, or that throughput is reduced, etc. Would Neo4j help me marry up these relationships better than MySQL?
Would love to know what you guys think; any comments and thoughts appreciated.
We don't really care about dates for our query so we'll just use the current time to work around this issue. We can get the current time by calling the datetime() function.
Neo4j's connected data technology allows data to be represented as a graph. Combining it with the event streaming platform from Confluent enables real-time data relationships, resulting in powerful new capabilities for enterprise customers.
Neo4j has an upper bound on graph size and can support tens of billions of nodes, properties, and relationships in a single graph. No security is provided at the data level and there is no data encryption. Security auditing is not available in Neo4j.
Neo4j uses a property graph database model. A graph data structure consists of nodes (discrete objects) that can be connected by relationships.
Please refer to the following GraphGist for a tutorial on how to do time-based graph storage using time scales.
http://gist.neo4j.org/?github-kbastani%2Fgists%2F%2Fmeta%2FTimeScaleEventMetaModel.adoc
In the time scale graph modeled above, a shortest-path traversal from a blue-colored node to the transparent node constitutes a unique time identity in bits.
The identity traced by the red path is 0→1→0→1→0→0. The reverse path is 0→0→1→0→1→0 or simply 001010, a unique identity in bits.
MATCH p = shortestPath((n1:d)-[:child_of*]->(n2:y))
WHERE n1.key = 'd10'
RETURN DISTINCT reduce(s = '', n IN nodes(p) | n.tempo + s) AS TimeIdentity
ORDER BY TimeIdentity
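The Cypher query above performs a shortest-path traversal from the blue-colored node to the transparent node. Because reduce prepends each node's tempo value as it walks the path, the result is a bit string read from the top of the time scale down to the event. That bit string is a time identity, and events can be ordered by it according to their position on the time scale event subgraph.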
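To make the reduce step concrete, here is a minimal Python sketch of the same idea outside the database. It is not Neo4j code: the TimeNode class, the parent field (standing in for the child_of relationship), and the chain_from_leaf_tempos helper are hypothetical names made up for illustration. Only the tempo property, the 'd10' key, and the 0→1→0→1→0→0 path come from the example above.

```python
from dataclasses import dataclass
from functools import reduce
from typing import List, Optional


@dataclass
class TimeNode:
    """Hypothetical stand-in for a node on the time scale."""
    key: str
    tempo: str                            # '0' or '1', as in n.tempo
    parent: Optional["TimeNode"] = None   # plays the role of child_of


def path_to_root(node: Optional[TimeNode]) -> List[TimeNode]:
    """Follow parent links from an event-level node up to the top of the scale."""
    path = []
    while node is not None:
        path.append(node)
        node = node.parent
    return path


def time_identity(leaf: TimeNode) -> str:
    """Mirror reduce(s = '', n IN nodes(p) | n.tempo + s): prepend each tempo."""
    return reduce(lambda s, n: n.tempo + s, path_to_root(leaf), "")


def chain_from_leaf_tempos(leaf_key: str, tempos_leaf_to_root: str) -> TimeNode:
    """Build a single leaf-to-root chain (illustration only); returns the leaf."""
    nodes = [TimeNode(key=f"{leaf_key}-level{i}", tempo=t)
             for i, t in enumerate(tempos_leaf_to_root)]
    for child, parent in zip(nodes, nodes[1:]):
        child.parent = parent
    return nodes[0]


# The red path from the example, read from the blue node upwards: 0,1,0,1,0,0
leaf = chain_from_leaf_tempos("d10", "010100")
print(time_identity(leaf))  # 001010
```

Prepending rather than appending is what turns the leaf-to-root walk into a root-first bit string, which is why the result matches the reversed path 001010.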
Please see the time scale event subgraph below:
The image above represents a time scale connected to a series of events (met). Events, represented as triangular nodes in the image, are also connected to a hierarchy of features (John, Sally, Pam, Anne) which are then further generalized into classes (Person).
Now you can run a Cypher query like the one I listed earlier, which will order the events by time of occurrence as a bit string. Note that you should also store a timestamp on the node if you need to retrieve the actual time. Each blue node represents a time-separated event but not necessarily the actual time, just a representation of events that happened in an order.
MATCH p = (p0:person)-[:event]->(ev)-[:event]->(p1:person)
WITH p, ev
MATCH (d0:d)<-[:event]-(ev)
WITH d0, p
MATCH time_path = (d0)-[:child_of*]->(y0:y)
// A list comprehension replaces extract(), which newer Neo4j versions no longer support
RETURN [x IN nodes(p) | coalesce(x.name, x.future)] AS Interaction,
       reduce(s = '', n IN nodes(time_path) | n.tempo + s) AS TimeIdentity
ORDER BY TimeIdentity
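Since the original question asked about querying intervals, here is a rough Python sketch of what ordering by TimeIdentity buys you once the rows are back in application code. It assumes every identity has the same number of bits, so plain string comparison agrees with time order. The Interaction tuple, the in_interval helper, and the sample identities are hypothetical; only the person names and the "met" event come from the example.

```python
from typing import List, NamedTuple, Tuple


class Interaction(NamedTuple):
    """One result row: who interacted, and the event's time identity."""
    participants: Tuple[str, ...]
    time_identity: str  # fixed-length bit string (assumption)


def order_by_time(events: List[Interaction]) -> List[Interaction]:
    """Same effect as ORDER BY TimeIdentity on equal-length bit strings."""
    return sorted(events, key=lambda e: e.time_identity)


def in_interval(events: List[Interaction], start: str, end: str) -> List[Interaction]:
    """Keep events whose identity lies in [start, end], in time order."""
    return [e for e in order_by_time(events)
            if start <= e.time_identity <= end]


# Sample rows with made-up identities
events = [
    Interaction(("John", "met", "Sally"), "001010"),
    Interaction(("Pam", "met", "Anne"), "001000"),
    Interaction(("Sally", "met", "Pam"), "001100"),
]

for e in in_interval(events, "001001", "001100"):
    print(e.time_identity, e.participants)
```

If the identities could differ in length, string comparison would no longer match time order, and you would need to pad them or compare them as integers instead.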
The hierarchies in the time scale allow you to group events and to see representations at higher levels. For example, selecting all green nodes below an orange node selects 4 possible events (represented by blue nodes).
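One way to picture that grouping, again as a plain Python sketch rather than Neo4j code: if each level of the time scale contributes one bit, then events under the same ancestor share a prefix of their time identity. The group_by_ancestor function and the sample identities are hypothetical, and the sketch assumes fixed-length identities with two children per node.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def group_by_ancestor(identities: Iterable[str], levels_up: int) -> Dict[str, List[str]]:
    """Group time identities by the ancestor `levels_up` levels above the event.

    Dropping the last `levels_up` bits leaves the ancestor's own prefix.
    """
    groups: Dict[str, List[str]] = defaultdict(list)
    for ident in identities:
        prefix = ident[:len(ident) - levels_up]
        groups[prefix].append(ident)
    return {prefix: sorted(members) for prefix, members in groups.items()}


# Made-up identities: four events under one ancestor, one under its sibling
identities = ["001010", "001011", "001000", "001001", "001100"]

# Two levels up corresponds to orange -> green -> blue in the picture
for prefix, members in group_by_ancestor(identities, 2).items():
    print(prefix, members)
```

With two levels of binary branching below it, one ancestor covers at most 2 × 2 = 4 events, which is the "4 possible events" mentioned above.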
Let me know if you have any questions, and be sure to visit the GraphGist to see more details and actual live examples of the time scale event subgraph.