After around six months of running ZooKeeper in our development environment, it still works fine, but the size of its data directory has grown to 6 GB and is still increasing. Some system specifications:
zookeeper version: 3.4.6
number of clients: < 10
number of znodes: < 400
Also:
there are 90 log.* files in dataDir/version-2
there is no snapshot.* file in dataDir/version-2!
Searching Google for this problem, I found the auto-purge option in the Advanced Configuration section of the ZooKeeper Administrator's Guide. I then rolled ZooKeeper out with the following configuration (zoo.cfg):
tickTime=2000
dataDir=/home/faghani/software/zookeeper/zkdata
clientPort=2181
authProvider.1=org.apache.zookeeper.server.auth.SASLAuthenticationProvider requireClientAuthScheme=sasl
autopurge.snapRetainCount=3
autopurge.purgeInterval=1
But nothing changed, even after purgeInterval had expired many times: the data directory is still 6 GB and no file has been deleted. Here is an ls -laht on ${dataDir}/version-2. One strange point: Nautilus says the size of the data directory is 6 GB, but ls -laht says it is only 3.4 GB!
faghani@node255:~/software/zookeeper/zkdata/version-2$ ls -laht
total 3.4G
-rw-rw-r-- 1 faghani faghani 65M Dec 20 10:09 log.1061d
drwx------ 2 faghani faghani 4.0K Dec 20 10:09 .
-rw-rw-r-- 1 faghani faghani 65M Dec 19 17:28 log.105f2
-rw-rw-r-- 1 faghani faghani 65M Dec 15 18:37 log.105c1
-rw-rw-r-- 1 faghani faghani 65M Dec 14 16:17 log.105bc
-rw-rw-r-- 1 faghani faghani 65M Dec 9 18:08 log.10576
drwx------ 3 faghani faghani 4.0K Dec 9 16:57 ..
-rw-rw-r-- 1 faghani faghani 65M Dec 9 16:56 log.10565
-rw-rw-r-- 1 faghani faghani 65M Dec 8 18:31 log.1048c
and many more until ...
-rw------- 1 faghani faghani 65M Sep 2 16:41 log.1d03
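A possible explanation for the Nautilus vs. ls discrepancy (this is my assumption, not something confirmed above): ZooKeeper preallocates its 64 MB transaction log files, which can make them sparse, so the per-file sizes that ls and Nautilus report (apparent size) add up to more than the disk blocks actually used (the "total" line from ls, or du). A minimal sketch reproducing the gap with a sparse file (the /tmp/sparse-demo path is just for illustration):

```shell
# Create a sparse file: 64 MB apparent size, almost no disk blocks used.
truncate -s 64M /tmp/sparse-demo
ls -lh /tmp/sparse-demo   # reports the apparent size (64M), like Nautilus
du -h /tmp/sparse-demo    # reports blocks actually used, like ls's "total" line
rm /tmp/sparse-demo
```

If the logs really are sparse, 90 files x 65 MB apparent size would come to roughly the 6 GB Nautilus reports, while the blocks in use match the 3.4 GB total.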
Also, the following command (as suggested in the Maintenance section) had no effect on the files in the data directory:
java -cp zookeeper.jar:lib/slf4j-api-1.7.5.jar:lib/slf4j-log4j12-1.7.5.jar:lib/log4j-1.2.16.jar:conf org.apache.zookeeper.server.PurgeTxnLog <dataDir> <snapDir> -n <count>
By the way, I found this question, but unfortunately no solution is given on that page.
Questions:
1- Where are the snapshot.* files?
2- Can the SASL settings hinder auto-purging? (I think not.)
3- Is something wrong in my configuration?
EDIT: It seems the solution involves the snapCount property. Its default value is 100000; decrease it to a very small number, e.g. 10, and test the system.
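The idea behind the edit above: autopurge only deletes transaction logs that are older than a retained snapshot, and the server only snapshots after snapCount transactions, so with few clients and few znodes the default of 100000 may never be reached. A sketch of a zoo.cfg combining both settings (the values here are illustrative, not a recommendation):

```
# zoo.cfg (sketch) - snapCount defaults to 100000; a lower value makes the
# server take snapshots (and roll the transaction log) more often, giving
# autopurge something to purge against.
snapCount=10000
autopurge.snapRetainCount=3
autopurge.purgeInterval=1
```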
You can use the zkCleanup.sh script that ships in the ./bin sub-folder of the source distribution. In case you can't find it, you can find it here. Usage:
zkCleanup.sh <snapshotDir> -n <count>
for example:
# ./zkCleanup.sh /tmp/zookeeper -n 6
<snapshotDir> is the ZooKeeper snapshot files location; in my case, my snapshot files are in the folder /tmp/zookeeper/version-2/.
<count> is the number of log and snapshot files to retain; the value of <count> should typically be greater than 3.
For more details, refer to this document: Ongoing Data Directory Cleanup.
This can be run as a cron job on the ZooKeeper server machines to clean up the logs daily. In my case, I clean up once a week:
0 7 * * 0 ( cd /root/otter/zookeeper/zookeeper-3.4.10/bin && ./zkCleanup.sh /tmp/zookeeper -n 5 ) >> /tmp/zookeeper/cron.log 2>&1
You can add this with crontab -e, but remember to change the frequency according to your requirements.