
How to fix a MarkLogic "File too large" forest merge error?

I'm running MarkLogic version 8.0-6.1.

The host OS is Red Hat Enterprise Linux Server release 6.8 (Santiago).

The data is stored on a local disk that has 90% free space.

The server runs fairly well but it throws the following error sporadically.

SVC-FILWRT: File write error: write '/var/opt/MarkLogic/Forests/clickstream-1/0000008a/ListData': File too large

Any thoughts on the root cause and possible fix?

Gary Russo asked Feb 01 '17 18:02


1 Answer

Stands should not normally get that big. I can imagine two ways this could happen, though I'm not 100% certain either applies here:

  • You have upgraded a large database with a low number of forests from a version before merge max size was introduced, preventing MarkLogic from purging the deleted fragments straight away

  • You have run some large transactions, causing in-memory stands to exceed the merge max size, resulting in a similar situation once persisted to disk

This doesn't have to be a bad thing, unless you hit a file write error, of course. Deleted fragments in such large stands may linger longer than usual, but if sufficient fragments get deleted, MarkLogic will eventually merge them out anyway.
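To check whether a forest really does contain unusually large stands, a rough sketch like the one below can sum file sizes per stand directory. It assumes the on-disk layout implied by the path in the error message (a forest directory with one subdirectory per stand, such as `0000008a`), and `stand_sizes` is a hypothetical helper name, not part of MarkLogic. Every subdirectory is treated as a stand, so any other subdirectories in the forest directory will show up in the listing too.

```python
import os
import sys


def stand_sizes(forest_dir):
    """Return (subdir, total_bytes, largest_file, largest_bytes) per
    subdirectory of forest_dir, sorted by total size, biggest first."""
    results = []
    for entry in os.scandir(forest_dir):
        if not entry.is_dir():
            continue  # skip plain files that sit next to the stand directories
        total = 0
        largest_name, largest_bytes = None, 0
        for root, _dirs, files in os.walk(entry.path):
            for name in files:
                try:
                    size = os.path.getsize(os.path.join(root, name))
                except OSError:
                    continue  # file vanished (e.g. a merge finished) or is unreadable
                total += size
                if size > largest_bytes:
                    largest_name, largest_bytes = name, size
        results.append((entry.name, total, largest_name, largest_bytes))
    results.sort(key=lambda row: row[1], reverse=True)
    return results


# Only runs when a forest directory is passed on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    for stand, total, fname, fbytes in stand_sizes(sys.argv[1]):
        print(f"{stand}: {total / 2**30:.2f} GiB total, "
              f"largest file {fname} ({fbytes / 2**30:.2f} GiB)")
```

Saved under a name of your choosing and run as a user that can read the forest directory, with `/var/opt/MarkLogic/Forests/clickstream-1` as the argument, it lists the stands from largest to smallest. Since the error is raised for a single file (`ListData`), the largest-file column is probably the more interesting one.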

If you would like to get rid of the large stands sooner, you could try putting the old forest into delete-only mode, forcing new updates to go elsewhere, and then 'touching' all documents inside that forest to get them migrated to one of the other forests. Once that forest only contains deleted fragments, you simply take it out (unassign it from the database) and delete it. After that you could potentially recreate it and assign the empty forest to the database again. That might trigger a rebalance, but it should settle down eventually, leaving more evenly balanced stands across all forests of your database.

Anyway, it is probably wise to use more than one forest from the start if you anticipate growth or large transactions.

For those who would like to dive deeper into the technical side, I'd recommend reading the Inside MarkLogic paper:

https://developer.marklogic.com/inside-marklogic

The Data Management section in particular is relevant to databases, forests, and stands.

HTH!

grtjn answered Nov 15 '22 12:11