
Disk space filling up due to jobcache in the tmp directory of a Nutch Linux instance

I am a newbie. We have set up a Solr environment and are facing an issue with Nutch: disk space is 100% utilized. When we debugged it, we found that the jobcache in the location below accounts for most of the usage (approx. 70%).

"/tmp/hadoop-root/mapred/local/taskTracker/root/jobcache/".

I have searched many forums to understand what exactly this jobcache folder contains.

Can anyone help me understand what this jobcache folder contains and how I can keep this tmp folder from using so much space?

What effect will it have if I remove the jobcache folder and recreate it with the mkdir command?

Thanks in advance.

user2197873 Avatar asked Nov 21 '25 00:11



1 Answer

The directory you mentioned, /tmp/hadoop-root/mapred/local/taskTracker/root/jobcache/, is used by the TaskTracker (slave) daemons to localize job files when tasks run on the slaves. When a job completes, its directory under jobcache should be cleaned up automatically. This mailing-list thread discussed a similar problem: http://mail-archives.apache.org/mod_mbox/hadoop-user/201301.mbox/%3C26850_1357828735_0MGE0023YZCTOO30_99DD75DC8938B743BBBC2CA54F7224A706D2E1AF@NYSGMBXB06.a.wcmc-ad.net%3E
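If the automatic cleanup is not happening, it may help to first see which job directories are left over and how large they are. Below is a minimal Python sketch for that, not an official Hadoop tool. The function names (`dir_size_bytes`, `stale_job_dirs`) and the 2-day threshold are my own assumptions, and it uses the directory's modification time only as a rough proxy for "the job finished a while ago". It only reports and does not delete anything.

```python
import os
import time
from pathlib import Path


def dir_size_bytes(path):
    """Sum the sizes of all files under path, skipping unreadable entries."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            try:
                total += os.lstat(os.path.join(root, name)).st_size
            except OSError:
                pass  # file vanished or is unreadable; ignore it
    return total


def stale_job_dirs(jobcache, min_age_days):
    """Return (path, size_bytes, age_days) for each subdirectory of jobcache
    whose mtime is at least min_age_days old, largest first.

    Hypothetical helper: the mtime check is an assumption, not how Hadoop
    itself decides that a job has finished.
    """
    base = Path(jobcache)
    if not base.is_dir():
        return []
    now = time.time()
    results = []
    for entry in base.iterdir():
        if not entry.is_dir():
            continue
        age_days = (now - entry.stat().st_mtime) / 86400
        if age_days >= min_age_days:
            results.append((str(entry), dir_size_bytes(entry), age_days))
    results.sort(key=lambda r: r[1], reverse=True)
    return results


if __name__ == "__main__":
    # Path taken from the question; the 2-day threshold is an assumption.
    JOBCACHE = "/tmp/hadoop-root/mapred/local/taskTracker/root/jobcache"
    for path, size, age in stale_job_dirs(JOBCACHE, min_age_days=2):
        print(f"{size / 1e6:10.1f} MB  {age:5.1f} days  {path}")
```

Directories that show up here while no job is running are the ones that were probably not cleaned up. Check them against your running jobs before removing anything by hand.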

iqstatic Avatar answered Nov 23 '25 23:11
