By default, AWS Glue jobs log output and errors to two different CloudWatch logs, /aws-glue/jobs/error and /aws-glue/jobs/output. When I include print() statements in my scripts for debugging, they get written to the error log (/aws-glue/jobs/error).
I have tried using:
    log4jLogger = sparkContext._jvm.org.apache.log4j
    log = log4jLogger.LogManager.getLogger(__name__)
    log.warn("Hello World!")
but "Hello World!" doesn't show up in either of the logs for the test job I ran.
Does anyone know how to go about writing debug log statements to the output log (/aws-glue/jobs/output)?
TIA!
EDIT:
It turns out the above actually does work. I was running the job in the AWS Glue script editor window, which captures the Command-F key combination and searches only in the current script. So when I tried to search within the page for the logging output, it seemed as if nothing had been logged.
NOTE: I did discover through testing the first responder's suggestion that AWS Glue scripts don't seem to output any log message with a level less than WARN!
AWS Glue now provides continuous logging to track the real-time progress of Apache Spark stages as they execute in ETL jobs. You can access separate log streams for the Apache Spark driver and executors in Amazon CloudWatch, and filter out highly verbose Apache Spark log messages, which makes it easier to monitor and debug your ETL jobs.
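Continuous logging is switched on through job parameters. The sketch below builds the two arguments as I recall them from the Glue documentation (`--enable-continuous-cloudwatch-log` and `--enable-continuous-log-filter`); the helper name `continuous_logging_arguments` is hypothetical, and the parameter names should be checked against the current docs before relying on them.

```python
def continuous_logging_arguments(filter_verbose=True):
    """Build Glue job arguments that enable continuous logging.

    Assumption: the parameter names below match the current Glue docs.
    filter_verbose=True asks Glue to drop the noisy Spark/Hadoop messages.
    """
    return {
        "--enable-continuous-cloudwatch-log": "true",
        "--enable-continuous-log-filter": "true" if filter_verbose else "false",
    }


# Glue expects the values as strings, not booleans.
default_arguments = continuous_logging_arguments()
```

A dictionary like this would typically be supplied as the job's default arguments, for example in the job parameters section of the console or as `DefaultArguments` when creating the job through an SDK.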
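Try the built-in Python logger from the logging module. Note that logging.basicConfig() attaches a handler that writes to standard error by default, so pass stream=sys.stdout if you want the messages on the standard output stream.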
    import logging
    import sys

    MSG_FORMAT = '%(asctime)s %(levelname)s %(name)s: %(message)s'
    DATETIME_FORMAT = '%Y-%m-%d %H:%M:%S'
    # basicConfig() logs to stderr unless a stream is given explicitly
    logging.basicConfig(format=MSG_FORMAT, datefmt=DATETIME_FORMAT, stream=sys.stdout)
    logger = logging.getLogger(<logger-name-here>)
    logger.setLevel(logging.INFO)
    ...
    logger.info("Test log message")
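If you would rather not touch the root logger, here is a minimal sketch of the same idea with a dedicated handler. The helper name `make_stdout_logger` is made up for illustration; it uses only the standard library and makes no Glue-specific calls.

```python
import logging
import sys


def make_stdout_logger(name, stream=None, level=logging.INFO):
    """Return a logger whose records go to `stream` (sys.stdout by default)."""
    logger = logging.getLogger(name)
    logger.setLevel(level)
    logger.propagate = False  # keep records away from the root logger's handlers
    if not logger.handlers:  # avoid stacking handlers on repeated calls
        target = stream if stream is not None else sys.stdout
        handler = logging.StreamHandler(target)
        handler.setFormatter(logging.Formatter(
            '%(asctime)s %(levelname)s %(name)s: %(message)s',
            '%Y-%m-%d %H:%M:%S',
        ))
        logger.addHandler(handler)
    return logger
```

Calling `make_stdout_logger("my_job").info("Test log message")` writes one formatted line to standard output; in a Glue job that should end up in the output log rather than the error log, though I have only reasoned about that from how print() and stderr are routed.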
I know this question is not new, but maybe this will be helpful for someone. For me, logging in Glue works with the following lines of code:
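    from pyspark.context import SparkContext
    from awsglue.context import GlueContext

    # create glue context
    sc = SparkContext.getOrCreate()
    glueContext = GlueContext(sc)
    # set custom logging on
    logger = glueContext.get_logger()
    ...
    # write into the log file with:
    logger.info("s3_key:" + your_value)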