I am trying to create a runtime layer that lets me use R as the runtime for AWS Lambda functions via the new Runtime API.
To do this, I have created a layer that contains all the dependencies needed for R, and then a second layer containing R itself. I built these layers using the same Amazon AMI that lambda runs on. I tested my build by zipping up my layers, creating a fresh instance, and then downloading and unzipping the layers into that new instance (putting everything in /opt, which also happens to be where I installed R and its dependencies when I built them). I used an instance type with minimal resources (2 cpu, 4GB RAM). As I understand it, this should then very closely approximate the lambda environment.
I have a little test script (test.r) that simply prints a message to stdout. This runs fine in the test environment. Here is the script:
cat("hello from planet lambdar")
And here is how it is invoked in the bootstrap script in my layer:
SCRIPT=$LAMBDA_TASK_ROOT/$(echo "$_HANDLER" | cut -d. -f1).r
echo "About to run $SCRIPT"
/opt/R/bin/Rscript $SCRIPT
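To make the path construction above concrete, here is a minimal sketch of the same `cut`-based parsing wrapped in a function. The function name `script_path` and the handler value `test.handler` are assumptions for illustration; in Lambda, `LAMBDA_TASK_ROOT` and `_HANDLER` are set by the service.

```shell
# Sketch: how a handler string maps to a script path.
# script_path is a hypothetical helper; the arguments stand in for
# $LAMBDA_TASK_ROOT and $_HANDLER.
script_path() {
    local task_root="$1" handler="$2"
    # Keep everything before the first dot of the handler, then append ".r".
    echo "$task_root/$(echo "$handler" | cut -d. -f1).r"
}

script_path /var/task test.handler   # prints /var/task/test.r
```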
From the logging below, it is apparent that the name of the script gets sent and parsed correctly. I have previously confirmed that the script test.r lands in /var/task as expected. But running this script via lambda results in a segmentation fault:
START RequestId: 2c1b8801-f903-11e8-a32d-796c039278f1 Version: $LATEST
About to run /var/task/test.r
/opt/bootstrap: line 18: 18 Segmentation fault (core dumped) /opt/R/bin/Rscript $SCRIPT
How do I debug this segmentation fault, given that the process runs fine on a minimal EC2 instance running the same Amazon AMI used by Lambda, loaded with the same set of tools and dependencies I created for the layers I added to my Lambda function?
In this case, it turned out that I was overly aggressive in copying shared libraries linked to the R executable into my layer. I took everything listed by
ldd /opt/R/lib/libR.so
and copied it to /opt/lib
The problem is that many of those libraries were already in the AMI, and their presence in a different location caused problems (perhaps related to the library cache?).
Once I copied only the two libraries that were not in the AMI (they were added when I installed the build tools, which of course are not in the Lambda environment), the segfault went away. These two libraries are:
/usr/lib64/libgfortran.so.3
/usr/lib64/libquadmath.so.0
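One way to arrive at such a short list without guessing is to compare the `ldd` output against a listing of libraries taken on a clean instance. The following is only a sketch: the helper name `missing_libs` and the file names `ldd.txt` and `base_libs.txt` are made up for illustration, and it assumes the clean instance is a fair stand-in for what Lambda provides.

```shell
#!/bin/bash
# Hypothetical helper: print libraries a binary links against that are
# absent from a listing taken on a clean (stock) instance.
#   $1 = file containing `ldd` output for the binary
#   $2 = file listing library paths on the clean instance, one per line
missing_libs() {
    local ldd_output="$1" base_listing="$2"
    # ldd lines look like: "libfoo.so.1 => /usr/lib64/libfoo.so.1 (0x...)".
    # Keep the resolved path (third field) when the second field is "=>",
    # then drop every path that appears verbatim in the base listing.
    awk '$2 == "=>" && $3 ~ /^\// { print $3 }' "$ldd_output" | sort -u |
        grep -v -x -F -f "$base_listing"
}

# Example usage (file names are illustrative):
#   ldd /opt/R/lib/libR.so > ldd.txt                        # on the build instance
#   find /lib64 /usr/lib64 -name '*.so*' > base_libs.txt    # on a clean instance
#   missing_libs ldd.txt base_libs.txt
```

Only the paths this prints would be candidates for copying into /opt/lib.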
To answer the deeper question here, namely how to debug segfaults in the Lambda environment, I found inspiration here, and included something like this in my bootstrap script to print the backtrace from the core dump:
gdb -q -n -ex bt -batch /opt/R/bin/Rscript /tmp/core.N.XXXX
where core.N.XXXX is the name of the core dump file (which can be discovered by putting echo $(ls /tmp) in your bootstrap script). The CloudWatch logs will then contain at least some hints from the backtrace.
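The two steps above (finding the core file, then running gdb on it) can be combined so the core file name does not have to be looked up by hand. This is a rough sketch rather than the exact code I used: `run_with_backtrace` is a hypothetical helper name, and it assumes gdb is available to the bootstrap (for example, bundled in a layer) and that core files land in the given directory with names starting with `core.`.

```shell
#!/bin/bash
# Sketch: run a command and, if it fails, print a gdb backtrace for each
# core file found. Assumes gdb is on PATH.
#   $1     = directory to search for core files (e.g. /tmp)
#   $2 ... = the command to run (e.g. /opt/R/bin/Rscript "$SCRIPT")
run_with_backtrace() {
    local core_dir="$1" status core
    shift
    "$@"
    status=$?
    if [ "$status" -ne 0 ]; then
        echo "Command exited with status $status; checking $core_dir for core files"
        for core in "$core_dir"/core.*; do
            [ -e "$core" ] || continue   # the glob matched nothing
            echo "Backtrace for $core:"
            # "$1" is now the executable that was run (after the shift above).
            gdb -q -n -ex bt -batch "$1" "$core"
        done
    fi
    return "$status"
}
```

In the bootstrap script this might be called as `run_with_backtrace /tmp /opt/R/bin/Rscript "$SCRIPT"`, so that anything gdb prints ends up in the CloudWatch logs.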