 

WARN snappy.LoadSnappy: Snappy native library not loaded

Whatever I do, I can't get rid of this warning. I know Snappy is fast and therefore preferable to the other compression/decompression libraries, and I would like to use it for my processing. As far as I know, Google uses it internally for BigTable, MapReduce, and basically all of their killer applications. I did some research on my own; people suggest not using it, or using java-snappy instead, but I want to stick with the Hadoop Snappy. I have the corresponding library in my setup (I mean under lib).

Could someone help me fix this error? I see that jobs finish successfully regardless of it.

hdfs://localhost:54310/user/hduser/gutenberg
12/06/01 18:18:54 INFO input.FileInputFormat: Total input paths to process : 3
12/06/01 18:18:54 INFO util.NativeCodeLoader: Loaded the native-hadoop library
12/06/01 18:18:54 WARN snappy.LoadSnappy: Snappy native library not loaded
12/06/01 18:18:54 INFO mapred.JobClient: Running job: job_201206011229_0008
12/06/01 18:18:55 INFO mapred.JobClient:  map 0% reduce 0%
12/06/01 18:19:08 INFO mapred.JobClient:  map 66% reduce 0%
12/06/01 18:19:14 INFO mapred.JobClient:  map 100% reduce 0%
12/06/01 18:19:17 INFO mapred.JobClient:  map 100% reduce 22%
12/06/01 18:19:23 INFO mapred.JobClient:  map 100% reduce 100%
12/06/01 18:19:28 INFO mapred.JobClient: Job complete: job_201206011229_0008
12/06/01 18:19:28 INFO mapred.JobClient: Counters: 29
12/06/01 18:19:28 INFO mapred.JobClient:   Job Counters 
12/06/01 18:19:28 INFO mapred.JobClient:     Launched reduce tasks=1
12/06/01 18:19:28 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=22810
12/06/01 18:19:28 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
12/06/01 18:19:28 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
12/06/01 18:19:28 INFO mapred.JobClient:     Launched map tasks=3
12/06/01 18:19:28 INFO mapred.JobClient:     Data-local map tasks=3
12/06/01 18:19:28 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=14345
12/06/01 18:19:28 INFO mapred.JobClient:   File Output Format Counters 
12/06/01 18:19:28 INFO mapred.JobClient:     Bytes Written=880838
12/06/01 18:19:28 INFO mapred.JobClient:   FileSystemCounters
12/06/01 18:19:28 INFO mapred.JobClient:     FILE_BYTES_READ=2214849
12/06/01 18:19:28 INFO mapred.JobClient:     HDFS_BYTES_READ=3671878
12/06/01 18:19:28 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=3775339
12/06/01 18:19:28 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=880838
12/06/01 18:19:28 INFO mapred.JobClient:   File Input Format Counters 
12/06/01 18:19:28 INFO mapred.JobClient:     Bytes Read=3671517
12/06/01 18:19:28 INFO mapred.JobClient:   Map-Reduce Framework
12/06/01 18:19:28 INFO mapred.JobClient:     Map output materialized bytes=1474341
12/06/01 18:19:28 INFO mapred.JobClient:     Map input records=77932
12/06/01 18:19:28 INFO mapred.JobClient:     Reduce shuffle bytes=1207328
12/06/01 18:19:28 INFO mapred.JobClient:     Spilled Records=255962
12/06/01 18:19:28 INFO mapred.JobClient:     Map output bytes=6076095
12/06/01 18:19:28 INFO mapred.JobClient:     CPU time spent (ms)=12100
12/06/01 18:19:28 INFO mapred.JobClient:     Total committed heap usage (bytes)=516882432
12/06/01 18:19:28 INFO mapred.JobClient:     Combine input records=629172
12/06/01 18:19:28 INFO mapred.JobClient:     SPLIT_RAW_BYTES=361
12/06/01 18:19:28 INFO mapred.JobClient:     Reduce input records=102322
12/06/01 18:19:28 INFO mapred.JobClient:     Reduce input groups=82335
12/06/01 18:19:28 INFO mapred.JobClient:     Combine output records=102322
12/06/01 18:19:28 INFO mapred.JobClient:     Physical memory (bytes) snapshot=605229056
12/06/01 18:19:28 INFO mapred.JobClient:     Reduce output records=82335
12/06/01 18:19:28 INFO mapred.JobClient:     Virtual memory (bytes) snapshot=2276663296
12/06/01 18:19:28 INFO mapred.JobClient:     Map output records=629172

P.S.: Currently, I am working with a small dataset where fast compression and decompression does not really matter. But once I have a working workflow, I will load it with large datasets.

Asked by Bob on Jun 04 '12


1 Answer

You'll see this error message if the shared library (.so) for Snappy is not located on the LD_LIBRARY_PATH / java.library.path. If you have the libraries installed in the correct location, then you shouldn't see the above error messages.
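
As a quick check, you can print the library path the client JVM is actually searching and ask Hadoop's native code loader whether it found its libraries. This is only an illustrative sketch (the SnappyCheck class name is made up, not part of Hadoop), assuming a Hadoop 1.x-era classpath:

```java
// Illustrative only: SnappyCheck is a hypothetical helper class, not part of Hadoop.
// It prints the JVM's native library search path and whether the Hadoop
// native code loader (libhadoop.so) was found on it.
import org.apache.hadoop.util.NativeCodeLoader;

public class SnappyCheck {
    public static void main(String[] args) {
        System.out.println("java.library.path = "
                + System.getProperty("java.library.path"));
        System.out.println("native-hadoop loaded: "
                + NativeCodeLoader.isNativeCodeLoaded());
        // If libsnappy.so is not in one of the directories printed above,
        // the "Snappy native library not loaded" warning is expected.
    }
}
```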

If you do have the .so installed in the same folder as the Hadoop native library (libhadoop.so), then the above 'error' could be related to the node you are submitting your jobs from (as you say, your job doesn't fail, and this looks like a message on the client side).

Can you share some details of your job configuration (where you configure your output format and the associated compression options)?
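
For reference, a job that actually requests Snappy compression with the Hadoop 1.x-era API usually looks roughly like the sketch below. The property names and codec class are standard; the class name, job name, and the rest of the driver are placeholders, not your actual code:

```java
// Sketch only: shows where Snappy compression is typically requested in a
// Hadoop 1.x-era job driver. The class and job names are placeholders.
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.compress.CompressionCodec;
import org.apache.hadoop.io.compress.SnappyCodec;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class SnappyJobConfigSketch {
    public static Job configure() throws IOException {
        Configuration conf = new Configuration();
        // Compress intermediate map output with Snappy.
        conf.setBoolean("mapred.compress.map.output", true);
        conf.setClass("mapred.map.output.compression.codec",
                SnappyCodec.class, CompressionCodec.class);

        Job job = new Job(conf, "wordcount");
        // Compress the final job output with Snappy as well.
        FileOutputFormat.setCompressOutput(job, true);
        FileOutputFormat.setOutputCompressorClass(job, SnappyCodec.class);
        return job;
    }
}
```

If the native library really were missing and the job were configured like this, it would typically fail at runtime rather than just warn, which is why knowing how (or whether) compression is configured matters here.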

Answered by Chris White