If you try this:
spark-submit \
  --packages "org.apache.hadoop:hadoop-aws:2.7.4" \
  pyspark-example.py
You will get a wall of noisy output as spark-submit resolves all the dependencies of the hadoop-aws package and downloads them. You get slightly less output if the packages are already in the local Ivy cache, but it's still a lot:
org.apache.hadoop:hadoop-aws:2.7.4 pyspark-example.py
Ivy Default Cache set to: /home/ec2-user/.ivy2/cache
The jars for the packages stored in: /home/ec2-user/.ivy2/jars
:: loading settings :: url = jar:file:/hadoop/spark/spark-2.2.1-bin-hadoop2.7/jars/ivy-2.4.0.jar!/org/apache/ivy/core/settings/ivysettings.xml
org.apache.hadoop#hadoop-aws added as a dependency
:: resolving dependencies :: org.apache.spark#spark-submit-parent;1.0
    confs: [default]
    found org.apache.hadoop#hadoop-aws;2.7.4 in central
    found org.apache.hadoop#hadoop-common;2.7.4 in central
    found org.apache.hadoop#hadoop-annotations;2.7.4 in central
    found com.google.guava#guava;11.0.2 in central
    found com.google.code.findbugs#jsr305;3.0.0 in central
    found commons-cli#commons-cli;1.2 in central
    found org.apache.commons#commons-math3;3.1.1 in central
    found xmlenc#xmlenc;0.52 in central
    found commons-httpclient#commons-httpclient;3.1 in central
    found commons-logging#commons-logging;1.1.3 in central
    found commons-codec#commons-codec;1.4 in central
    found commons-io#commons-io;2.4 in central
    found commons-net#commons-net;3.1 in central
    found commons-collections#commons-collections;3.2.2 in central
    found javax.servlet#servlet-api;2.5 in central
    found org.mortbay.jetty#jetty;6.1.26 in central
    found org.mortbay.jetty#jetty-util;6.1.26 in central
    found org.mortbay.jetty#jetty-sslengine;6.1.26 in central
    found com.sun.jersey#jersey-core;1.9 in central
    found com.sun.jersey#jersey-json;1.9 in central
    found org.codehaus.jettison#jettison;1.1 in central
    found com.sun.xml.bind#jaxb-impl;2.2.3-1 in central
    found javax.xml.bind#jaxb-api;2.2.2 in central
    found javax.xml.stream#stax-api;1.0-2 in central
    found javax.activation#activation;1.1 in central
    found org.codehaus.jackson#jackson-core-asl;1.9.13 in central
    found org.codehaus.jackson#jackson-mapper-asl;1.9.13 in central
    found org.codehaus.jackson#jackson-jaxrs;1.9.13 in central
    found org.codehaus.jackson#jackson-xc;1.9.13 in central
    found com.sun.jersey#jersey-server;1.9 in central
    found asm#asm;3.2 in central
    found log4j#log4j;1.2.17 in central
    found net.java.dev.jets3t#jets3t;0.9.0 in central
    found org.apache.httpcomponents#httpclient;4.2.5 in central
    found org.apache.httpcomponents#httpcore;4.2.5 in central
    found com.jamesmurty.utils#java-xmlbuilder;0.4 in central
    found commons-lang#commons-lang;2.6 in central
    found commons-configuration#commons-configuration;1.6 in central
    found commons-digester#commons-digester;1.8 in central
    found commons-beanutils#commons-beanutils;1.7.0 in central
    found commons-beanutils#commons-beanutils-core;1.8.0 in central
    found org.slf4j#slf4j-api;1.7.10 in central
    found org.apache.avro#avro;1.7.4 in central
    found com.thoughtworks.paranamer#paranamer;2.3 in central
    found org.xerial.snappy#snappy-java;1.0.4.1 in central
    found org.apache.commons#commons-compress;1.4.1 in central
    found org.tukaani#xz;1.0 in central
    found com.google.protobuf#protobuf-java;2.5.0 in central
    found com.google.code.gson#gson;2.2.4 in central
    found org.apache.hadoop#hadoop-auth;2.7.4 in central
    found org.apache.directory.server#apacheds-kerberos-codec;2.0.0-M15 in central
    found org.apache.directory.server#apacheds-i18n;2.0.0-M15 in central
    found org.apache.directory.api#api-asn1-api;1.0.0-M20 in central
    found org.apache.directory.api#api-util;1.0.0-M20 in central
    found org.apache.zookeeper#zookeeper;3.4.6 in central
    found org.slf4j#slf4j-log4j12;1.7.10 in central
    found io.netty#netty;3.6.2.Final in central
    found org.apache.curator#curator-framework;2.7.1 in central
    found org.apache.curator#curator-client;2.7.1 in central
    found com.jcraft#jsch;0.1.54 in central
    found org.apache.curator#curator-recipes;2.7.1 in central
    found org.apache.htrace#htrace-core;3.1.0-incubating in central
    found org.mortbay.jetty#servlet-api;2.5-20081211 in central
    found javax.servlet.jsp#jsp-api;2.1 in central
    found jline#jline;0.9.94 in central
    found junit#junit;4.11 in central
    found org.hamcrest#hamcrest-core;1.3 in central
    found com.fasterxml.jackson.core#jackson-databind;2.2.3 in central
    found com.fasterxml.jackson.core#jackson-annotations;2.2.3 in central
    found com.fasterxml.jackson.core#jackson-core;2.2.3 in central
    found com.amazonaws#aws-java-sdk;1.7.4 in central
    found joda-time#joda-time;2.9.9 in central
    [2.9.9] joda-time#joda-time;[2.2,)
:: resolution report :: resolve 2170ms :: artifacts dl 65ms
    :: modules in use:
    asm#asm;3.2 from central in [default]
    com.amazonaws#aws-java-sdk;1.7.4 from central in [default]
    com.fasterxml.jackson.core#jackson-annotations;2.2.3 from central in [default]
    com.fasterxml.jackson.core#jackson-core;2.2.3 from central in [default]
    com.fasterxml.jackson.core#jackson-databind;2.2.3 from central in [default]
    com.google.code.findbugs#jsr305;3.0.0 from central in [default]
    com.google.code.gson#gson;2.2.4 from central in [default]
    com.google.guava#guava;11.0.2 from central in [default]
    com.google.protobuf#protobuf-java;2.5.0 from central in [default]
    com.jamesmurty.utils#java-xmlbuilder;0.4 from central in [default]
    com.jcraft#jsch;0.1.54 from central in [default]
    com.sun.jersey#jersey-core;1.9 from central in [default]
    com.sun.jersey#jersey-json;1.9 from central in [default]
    com.sun.jersey#jersey-server;1.9 from central in [default]
    com.sun.xml.bind#jaxb-impl;2.2.3-1 from central in [default]
    com.thoughtworks.paranamer#paranamer;2.3 from central in [default]
    commons-beanutils#commons-beanutils;1.7.0 from central in [default]
    commons-beanutils#commons-beanutils-core;1.8.0 from central in [default]
    commons-cli#commons-cli;1.2 from central in [default]
    commons-codec#commons-codec;1.4 from central in [default]
    commons-collections#commons-collections;3.2.2 from central in [default]
    commons-configuration#commons-configuration;1.6 from central in [default]
    commons-digester#commons-digester;1.8 from central in [default]
    commons-httpclient#commons-httpclient;3.1 from central in [default]
    commons-io#commons-io;2.4 from central in [default]
    commons-lang#commons-lang;2.6 from central in [default]
    commons-logging#commons-logging;1.1.3 from central in [default]
    commons-net#commons-net;3.1 from central in [default]
    io.netty#netty;3.6.2.Final from central in [default]
    javax.activation#activation;1.1 from central in [default]
    javax.servlet#servlet-api;2.5 from central in [default]
    javax.servlet.jsp#jsp-api;2.1 from central in [default]
    javax.xml.bind#jaxb-api;2.2.2 from central in [default]
    javax.xml.stream#stax-api;1.0-2 from central in [default]
    jline#jline;0.9.94 from central in [default]
    joda-time#joda-time;2.9.9 from central in [default]
    junit#junit;4.11 from central in [default]
    log4j#log4j;1.2.17 from central in [default]
    net.java.dev.jets3t#jets3t;0.9.0 from central in [default]
    org.apache.avro#avro;1.7.4 from central in [default]
    org.apache.commons#commons-compress;1.4.1 from central in [default]
    org.apache.commons#commons-math3;3.1.1 from central in [default]
    org.apache.curator#curator-client;2.7.1 from central in [default]
    org.apache.curator#curator-framework;2.7.1 from central in [default]
    org.apache.curator#curator-recipes;2.7.1 from central in [default]
    org.apache.directory.api#api-asn1-api;1.0.0-M20 from central in [default]
    org.apache.directory.api#api-util;1.0.0-M20 from central in [default]
    org.apache.directory.server#apacheds-i18n;2.0.0-M15 from central in [default]
    org.apache.directory.server#apacheds-kerberos-codec;2.0.0-M15 from central in [default]
    org.apache.hadoop#hadoop-annotations;2.7.4 from central in [default]
    org.apache.hadoop#hadoop-auth;2.7.4 from central in [default]
    org.apache.hadoop#hadoop-aws;2.7.4 from central in [default]
    org.apache.hadoop#hadoop-common;2.7.4 from central in [default]
    org.apache.htrace#htrace-core;3.1.0-incubating from central in [default]
    org.apache.httpcomponents#httpclient;4.2.5 from central in [default]
    org.apache.httpcomponents#httpcore;4.2.5 from central in [default]
    org.apache.zookeeper#zookeeper;3.4.6 from central in [default]
    org.codehaus.jackson#jackson-core-asl;1.9.13 from central in [default]
    org.codehaus.jackson#jackson-jaxrs;1.9.13 from central in [default]
    org.codehaus.jackson#jackson-mapper-asl;1.9.13 from central in [default]
    org.codehaus.jackson#jackson-xc;1.9.13 from central in [default]
    org.codehaus.jettison#jettison;1.1 from central in [default]
    org.hamcrest#hamcrest-core;1.3 from central in [default]
    org.mortbay.jetty#jetty;6.1.26 from central in [default]
    org.mortbay.jetty#jetty-sslengine;6.1.26 from central in [default]
    org.mortbay.jetty#jetty-util;6.1.26 from central in [default]
    org.mortbay.jetty#servlet-api;2.5-20081211 from central in [default]
    org.slf4j#slf4j-api;1.7.10 from central in [default]
    org.slf4j#slf4j-log4j12;1.7.10 from central in [default]
    org.tukaani#xz;1.0 from central in [default]
    org.xerial.snappy#snappy-java;1.0.4.1 from central in [default]
    xmlenc#xmlenc;0.52 from central in [default]
    ---------------------------------------------------------------------
    |                  |            modules            ||   artifacts   |
    |       conf       | number| search|dwnlded|evicted|| number|dwnlded|
    ---------------------------------------------------------------------
    |      default     |   72  |   1   |   0   |   0   ||   72  |   0   |
    ---------------------------------------------------------------------
:: retrieving :: org.apache.spark#spark-submit-parent
    confs: [default]
    0 artifacts copied, 72 already retrieved (0kB/17ms)
hadoop-aws is a relatively common package that enables Spark to interact with S3 via S3A. Every time someone runs spark-submit with that package, they are greeted with the above wall of text.
Is there a way to quiet all this output unless there is a problem? The solutions discussed here, like setting log4j.rootCategory=ERROR, don't seem to affect the above output.
Extracting from comments:
Since Spark uses the Ivy API, it should be possible to change the default logger by calling the following before Ivy is instantiated:
org.apache.ivy.util.Message.setDefaultLogger(new org.apache.ivy.util.DefaultMessageLogger(org.apache.ivy.util.Message.MSG_WARN));
I used MSG_WARN here, but it can be any of Ivy's message levels (MSG_ERR, MSG_WARN, MSG_INFO, MSG_VERBOSE, MSG_DEBUG).
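The Ivy call above has to run inside the JVM before spark-submit resolves packages, which is awkward to arrange from the command line. A different workaround, not the Ivy approach and only a minimal sketch, is to wrap the command and replay its stderr only when it fails. This assumes the Ivy resolution report is written to stderr (worth checking on your Spark version), and it also hides Spark's own log output on successful runs. The names run_quietly and sink are hypothetical helpers, not part of Spark or Ivy.

```python
import subprocess
import sys


def run_quietly(cmd, sink=sys.stderr):
    """Run cmd, discarding its stderr unless it exits with a non-zero code.

    cmd  -- the command as a list of arguments (hypothetical helper input)
    sink -- file-like object that receives the captured stderr on failure
    Returns the command's exit code.
    """
    # stdout is inherited, so anything the job prints still shows up live.
    # stderr is buffered in memory until the command finishes.
    result = subprocess.run(
        cmd,
        stderr=subprocess.PIPE,
        text=True,
        errors="replace",  # do not crash on undecodable bytes in the log
    )
    if result.returncode != 0:
        # Something went wrong: replay everything that was hidden.
        sink.write(result.stderr)
    return result.returncode
```

It could be called as run_quietly(["spark-submit", "--packages", "org.apache.hadoop:hadoop-aws:2.7.4", "pyspark-example.py"]). Because stderr is buffered until the process exits, nothing is shown while a long job is running, so this fits short batch jobs better than interactive sessions.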