
Does it make sense to use Neo4j to index a file system?

I am working on a Java-based backup client that scans the file system and populates an SQLite database with the directories and file names it finds to back up. Would it make sense to use Neo4j instead of SQLite? Would it be more performant and easier to use for this application? I was thinking that because a file system is a tree (or a graph, if you consider symbolic links), a graph database might be suitable. The SQLite schema defines only two tables: one for directories (full path and other info) and one for files (name only, with a foreign key to the containing directory in the directory table), so it's relatively simple.

The application needs to index many millions of files so the solution needs to be fast.

Hannes de Jager asked Jun 21 '11 08:06



1 Answer

As long as you can perform the DB operations essentially by string matching on the stored file system paths, using a relational database makes sense. The moment the data model gets more complex and you can no longer express your queries as string matching but need to traverse a graph, a graph database will make this much easier.
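
To make the distinction concrete, here is a minimal Java sketch of the two kinds of query. Everything in it is hypothetical: the class name, the method names, the sample paths, and the in-memory list and map that stand in for whatever store the backup client actually uses. The first method is the kind of lookup that plain string matching on stored paths can answer (roughly what a SQL `LIKE '/home/a/%'` would do); the second needs a real traversal, because it follows symbolic-link edges and has to guard against cycles.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration; none of these names come from the backup client.
class PathQueryDemo {

    // String-matching query: every stored path below a given directory.
    // The trailing slash keeps "/home/ab/..." from matching "/home/a".
    static List<String> underDirectory(List<String> paths, String dir) {
        String prefix = dir.endsWith("/") ? dir : dir + "/";
        List<String> result = new ArrayList<>();
        for (String p : paths) {
            if (p.startsWith(prefix)) {
                result.add(p);
            }
        }
        return result;
    }

    // Traversal query: everything reachable from a start node by following
    // edges (child entries and symbolic links), breadth-first.
    // The visited set stops the walk from looping on link cycles.
    static Set<String> reachable(Map<String, List<String>> edges, String start) {
        Set<String> visited = new LinkedHashSet<>();
        Deque<String> queue = new ArrayDeque<>();
        queue.add(start);
        while (!queue.isEmpty()) {
            String node = queue.poll();
            if (!visited.add(node)) {
                continue; // already seen
            }
            for (String next : edges.getOrDefault(node, List.of())) {
                queue.add(next);
            }
        }
        return visited;
    }

    public static void main(String[] args) {
        // Made-up sample data for the prefix query.
        List<String> paths = List.of(
                "/home/a/x.txt", "/home/a/sub/y.txt", "/home/ab/z.txt");
        System.out.println(underDirectory(paths, "/home/a"));

        // Made-up sample graph: a symlink into /data, and a link back.
        Map<String, List<String>> edges = Map.of(
                "/home/a", List.of("/home/a/link"),
                "/home/a/link", List.of("/data"),
                "/data", List.of("/data/big.bin", "/home/a"));
        System.out.println(reachable(edges, "/home/a"));
    }
}
```

If the client only ever needs queries of the first kind, the two-table relational schema is likely enough; queries of the second kind are where a graph store starts to pay off.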

nawroth answered Nov 06 '22 12:11