
Neo4j 2.0 unique constraint error "node already exists" when it does not

I am having a bit of trouble with neo4j unique constraints, where a CREATE cypher statement is failing to execute due to the node already existing. Problem is, it doesn't (exist). Also, this exact process with this exact data worked yesterday.

My neo4j version is community 2.0.0 (release) on ubuntu 12.04.3. This is my current situation:

My constraints:

tas@vtas:~$ neo4j-shell
neo4j-sh (?)$ schema
Indexes
  ON :ConsumerUser(tokens) ONLINE
  ON :Id(uuid)             ONLINE (for uniqueness constraint) #relevant
  ON :User(email)          ONLINE (for uniqueness constraint)
  ON :User(username)       ONLINE (for uniqueness constraint) 

Constraints
  ON (user:User) ASSERT user.email IS UNIQUE
  ON (user:User) ASSERT user.username IS UNIQUE
  ON (id:Id) ASSERT id.uuid IS UNIQUE                         #relevant

:Id.uuid should be unique.

I don't have any data:

neo4j-sh (?)$ dump
begin
create index on :`ConsumerUser`(`tokens`);
create index on :`Id`(`uuid`);
create index on :`User`(`email`);
create index on :`User`(`username`);
;

(also verified with cypher MATCH (n) return n;)

The problem query:

neo4j-sh (?)$ cypher 2.0 CREATE (i:Id{uuid:2});
CypherExecutionException: Node 82 already exists with label Id and property "uuid"=[2]

Things I have tried

  • tail -f /var/lib/neo4j/data/log/*.log /var/lib/neo4j/data/graph.db/*.log /var/log/neo4j/*.log for errors: nothing logged at all
  • Restarting neo4j (service neo4j-service restart)
  • The above tail while restarting (only remotely interesting line: [main] INFO org.neo4j.kernel.AutoConfigurator - WARNING! Physical memory(1017MB) is less than assigned JVM memory(4185MB). Continuing but with available JVM memory set to available physical memory)
  • deleting the indexes (/var/lib/neo4j/data/graph.db/index/ and /var/lib/neo4j/data/graph.db/index.db) and restarting
  • restoring the above, restarting
  • Search SO
  • Search neo4j's github issues

Nothing has helped so far.

Things I will not try

  • The solution in "neo4j constraint deleted node", because dropping the constraint in production isn't an option. Also, that was a few versions ago and a different use case:
    1. constraint on String[] instead of Int
    2. something was logged
  • Upgrading to 2.0.1 in a fingers-crossed-this-may-fix-it fashion without knowing that this has been addressed explicitly (I need to know why this is happening)

Additional Information

  • I have ulimit -n and ulimit -Hn set to 40K

  • neo4j-sh (?)$ dbinfo -g Kernel

    {
      "KernelStartTime": "Fri Feb 21 13:53:57 GMT 2014",
      "KernelVersion": "Neo4j - Graph Database Kernel (neo4j-kernel), version: 2.0.0",
      "MBeanQuery": "org.neo4j:instance=kernel#0,name=*",
      "ReadOnly": false,
      "StoreCreationDate": "Fri Feb 14 18:43:27 GMT 2014",
      "StoreDirectory": "/var/lib/neo4j/data/graph.db",
      "StoreId": "a3351846c194229c",
      "StoreLogVersion": 21
    }
    
  • I've seen this: https://github.com/neo4j/neo4j/issues/1069 but it seems resolved.

  • This is on a VirtualBox VM on a MacOSX 10.6 host

I'm at a loss, time for my first SO question.

The easy answer is "just wipe everything and start again" (or just re-do the constraint), but that isn't really acceptable (what if this happens in production?).

Any ideas?

Tasos Bitsios asked Feb 21 '14 14:02


1 Answer

Your DB is corrupt. Internally, Neo4j has a reference to this node, but you deleted the node, so this reference points to nothing. You can't delete it, because it doesn't exist, and you can't create it, because it THINKS it exists. (This was most likely caused by an improper/unexpected shutdown of the database. Remember to make sure this machine has a battery backup in production.)
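
To make the "dangling reference" idea concrete, here is a toy model in Python. It is only an illustration of the failure mode described above, not Neo4j's actual storage layout: the class `ToyGraph`, its fields, and the `crash_before_index_update` flag are all made up for this sketch. It assumes the uniqueness check consults an index that is kept separately from the node records, so an interrupted delete can leave an index entry behind.

```python
class ToyGraph:
    """Toy model only: a node store plus a separate uniqueness index."""

    def __init__(self):
        self.nodes = {}         # node id -> properties
        self.unique_index = {}  # uuid value -> node id
        self.next_id = 0

    def create(self, uuid):
        # The uniqueness check looks at the index, not at the node store.
        if uuid in self.unique_index:
            raise ValueError(
                f"Node {self.unique_index[uuid]} already exists with uuid={uuid}"
            )
        node_id = self.next_id
        self.next_id += 1
        self.nodes[node_id] = {"uuid": uuid}
        self.unique_index[uuid] = node_id
        return node_id

    def delete(self, node_id, crash_before_index_update=False):
        props = self.nodes.pop(node_id)
        if crash_before_index_update:
            return  # simulated unclean shutdown: the index entry is left behind
        del self.unique_index[props["uuid"]]

    def match_all(self):
        return list(self.nodes.values())


g = ToyGraph()
node_id = g.create(2)
g.delete(node_id, crash_before_index_update=True)

print(g.match_all())  # the node store is empty, like MATCH (n) RETURN n
try:
    g.create(2)       # but the stale index entry still blocks the create
except ValueError as err:
    print(err)
```

In this toy version the store reports no nodes, yet the create is rejected, which is the same pair of symptoms as in the question.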

This is why you ALWAYS BACKUP DATA IN PRODUCTION! If a shard becomes corrupt, then you can just purge it and reload the data. Minimal downtime, and no need to understand how it's corrupt, just that it is. If you don't have backups (and you should have offsite backups), then you will need to export your data to CSV, purge the db, and load the CSV data back in. By purge, I mean completely erase the old db directory, and let Neo4j create a new one.

(Do not try to salvage the database without doing a clean purge, as once a DB becomes corrupt, you have no way of knowing what or how it has been compromised.)
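
A minimal sketch of the purge step in Python, under some assumptions: the helper name `move_store_aside` is hypothetical, and it renames the store directory instead of erasing it (my choice, so the old files can still be inspected or deleted later). Neo4j never sees the old directory again, so from its side this is still a clean purge. Stop the Neo4j service before running anything like this, and start it again afterwards so it creates a fresh store.

```python
import shutil
import time
from pathlib import Path


def move_store_aside(store_dir):
    """Rename the store directory so a fresh one is created on next start."""
    store = Path(store_dir)
    if not store.is_dir():
        raise FileNotFoundError(f"no store directory at {store}")
    # Timestamped suffix so repeated purges never overwrite an older copy.
    stamp = time.strftime("%Y%m%d-%H%M%S")
    backup = store.with_name(f"{store.name}.corrupt-{stamp}")
    shutil.move(str(store), str(backup))
    return backup


# Example call, using the StoreDirectory reported by dbinfo in the question:
# move_store_aside("/var/lib/neo4j/data/graph.db")
```

Once the reloaded data has been verified, the renamed directory can be deleted.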

Tezra answered Oct 28 '22 00:10