
Zookeeper auto-purge does not work

After about 6 months of using ZooKeeper during development, it works fine, but its data directory has grown to 6 GB and is still growing. Some of the system specifications are listed below:

zookeeper version: 3.4.6
number of clients: < 10
number of znodes: < 400
also ...
there are 90 log.* files in dataDir/version-2
there is no snapshot.* file in dataDir/version-2 !
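
The two file counts above can be reproduced with a small shell helper. This is only a sketch: the function name `count_zk_files` is made up for illustration, and it assumes the path passed in is the `version-2` directory under `dataDir`.

```shell
#!/bin/sh
# Count transaction logs (log.*) and snapshots (snapshot.*) in one directory.
# count_zk_files is a hypothetical helper name, not part of ZooKeeper.
count_zk_files() {
    dir=$1
    logs=$(find "$dir" -maxdepth 1 -type f -name 'log.*' | wc -l)
    snaps=$(find "$dir" -maxdepth 1 -type f -name 'snapshot.*' | wc -l)
    # $((...)) strips any padding that wc may add around the number
    echo "logs=$((logs)) snapshots=$((snaps))"
}

# Run only when a directory is given on the command line
if [ "$#" -gt 0 ]; then
    count_zk_files "$1"
fi
```

For example, `sh count_zk_files.sh /home/faghani/software/zookeeper/zkdata/version-2` would print one line with both counts.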

Searching Google for this problem, I found the auto-purge option in the Advanced Configuration section of the ZooKeeper Administrator's Guide. I then rolled ZooKeeper out with the following configuration (zoo.cfg):

tickTime=2000
dataDir=/home/faghani/software/zookeeper/zkdata
clientPort=2181
authProvider.1=org.apache.zookeeper.server.auth.SASLAuthenticationProvider
requireClientAuthScheme=sasl
autopurge.snapRetainCount=3
autopurge.purgeInterval=1

But nothing changed, even after purgeInterval had elapsed many times: the ZooKeeper data directory is still 6G and no file was deleted. Here is the output of ls -laht on ${dataDir}/version-2. One strange point: Nautilus reports the size of the data directory as 6G, but ls -laht reports just 3.4G!

faghani@node255:~/software/zookeeper/zkdata/version-2$ ls -laht  
total 3.4G  
-rw-rw-r-- 1 faghani faghani  65M Dec 20 10:09 log.1061d  
drwx------ 2 faghani faghani 4.0K Dec 20 10:09 .  
-rw-rw-r-- 1 faghani faghani  65M Dec 19 17:28 log.105f2  
-rw-rw-r-- 1 faghani faghani  65M Dec 15 18:37 log.105c1  
-rw-rw-r-- 1 faghani faghani  65M Dec 14 16:17 log.105bc  
-rw-rw-r-- 1 faghani faghani  65M Dec  9 18:08 log.10576  
drwx------ 3 faghani faghani 4.0K Dec  9 16:57 ..    
-rw-rw-r-- 1 faghani faghani  65M Dec  9 16:56 log.10565
-rw-rw-r-- 1 faghani faghani  65M Dec  8 18:31 log.1048c
and many more until ...  
-rw------- 1 faghani faghani  65M Sep  2 16:41 log.1d03  
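
One possible explanation for the 6G versus 3.4G gap (an assumption, not something confirmed in the question) is that the preallocated log files are sparse. The `total` line of `ls -l` counts allocated disk blocks, while a file manager may sum the apparent file sizes; 90 files of 65M each come to a little under 6G. The sketch below prints both figures with GNU `du`; the function name `zk_dir_sizes` is hypothetical.

```shell
#!/bin/sh
# Print the apparent size and the allocated size of a directory, in bytes.
# Requires GNU du. zk_dir_sizes is a hypothetical helper name.
zk_dir_sizes() {
    dir=$1
    # -b is shorthand for --apparent-size --block-size=1
    apparent=$(du -sb "$dir" | cut -f1)
    # without --apparent-size, du reports the disk space actually allocated
    allocated=$(du -s --block-size=1 "$dir" | cut -f1)
    echo "apparent=$apparent allocated=$allocated"
}

# Run only when a directory is given on the command line
if [ "$#" -gt 0 ]; then
    zk_dir_sizes "$1"
fi
```

If the two numbers differ by a wide margin, the files are sparse and the smaller figure is what the disk is really holding.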

Also, the following command (as suggested in the Maintenance section) had no effect on the files in the data directory.

java -cp zookeeper.jar:lib/slf4j-api-1.7.5.jar:lib/slf4j-log4j12-1.7.5.jar:lib/log4j-1.2.16.jar:conf org.apache.zookeeper.server.PurgeTxnLog <dataDir> <snapDir> -n <count>

By the way, I found this question but unfortunately there is no solution for it in that page.

Questions:

1- Where are the snapshot.* files?
2- Can the SASL settings hinder auto-purging? (I think not)
3- Has something gone wrong in the configuration?

EDIT: The solution seems to be related to the snapCount property. Its default value is 100000; decrease it to a very small number, e.g. 10, and test the system.
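
As a sketch of what that test could look like, here is the zoo.cfg from the question with snapCount added. It assumes the 3.4.x server picks up snapCount from zoo.cfg; the Administrator's Guide lists it as the Java system property zookeeper.snapCount, so setting that property on the server JVM is the alternative if the file entry is ignored. The value 10 is only for testing.

```
# zoo.cfg (test sketch): original settings plus a lowered snapCount
tickTime=2000
dataDir=/home/faghani/software/zookeeper/zkdata
clientPort=2181
authProvider.1=org.apache.zookeeper.server.auth.SASLAuthenticationProvider
requireClientAuthScheme=sasl
# Assumption: a snapshot is written after roughly this many transactions
snapCount=10
autopurge.snapRetainCount=3
autopurge.purgeInterval=1
```

With so few znodes and clients, the server may never have reached the default of 100000 transactions, which would explain the missing snapshot.* files; once snapshots exist, the purge task has something to retain and can delete older logs.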

faghani asked Dec 21 '15 07:12


1 Answer

You can use the zkCleanup.sh script, which is inside the source folder (./bin sub-folder). If you can't find it there, you can find it here. The usage of this script is:

zkCleanup.sh <snapshotDir> -n <count>
for example:
# ./zkCleanup.sh /tmp/zookeeper -n 6 

<snapshotDir> is the location of the ZooKeeper snapshot files; in my case, the snapshot files are in the folder /tmp/zookeeper/version-2/

<count> is the number of log and snapshot files to retain. The value of <count> should typically be greater than 3.

For more details, refer to this document: Ongoing Data Directory Cleanup.

This can be run as a cron job on the ZooKeeper server machines to clean up the logs daily. In my case, I clean up once a week:

0 7 * * 0 ( cd /root/otter/zookeeper/zookeeper-3.4.10/bin && ./zkCleanup.sh /tmp/zookeeper -n 5 ) >> /tmp/zookeeper/cron.log 2>&1

You can add this with crontab -e, but remember to change the frequency according to your requirements.

gary answered Sep 25 '22 05:09