I need to run pdf2image in my Python AWS Lambda function, but it requires poppler and poppler-utils to be installed on the machine.
I have searched in many places for how to do this, but could not find anything, or anyone who has done it with Lambda functions.
Does anyone know how to generate the poppler binaries, put them in my Lambda package, and tell Lambda to use them?
Thank you all.
AWS Lambda runs in an execution environment that includes a fixed set of software and libraries. If something you need is not there, you have to build it and bundle it yourself. See the link below for more information: https://docs.aws.amazon.com/lambda/latest/dg/current-supported-versions.html
For poppler, follow these steps to build your own binaries: https://github.com/skylander86/lambda-text-extractor/blob/master/BuildingBinaries.md
My approach was to use the Amazon Linux 2 image as a base to ensure maximum compatibility with the Lambda environment, compile openjpeg and poppler in the container build, and build a zip containing the binaries and libraries needed, which can then be used as a layer.
This lets you write your code in its own Lambda function, which pulls in the poppler dependencies as a layer, simplifying build and deployment.
The contents of the layer will be unpacked into /opt/. This means the contents will automatically be available because, by default in the Lambda environment:
$PATH is /usr/local/bin:/usr/bin/:/bin:/opt/bin
$LD_LIBRARY_PATH is /lib64:/usr/lib64:$LAMBDA_RUNTIME_DIR:$LAMBDA_RUNTIME_DIR/lib:$LAMBDA_TASK_ROOT:$LAMBDA_TASK_ROOT/lib:/opt/lib
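To illustrate how the function code might pick up the layer's binaries, here is a minimal sketch. It assumes the layer ships pdftoppm in /opt/bin and that pdf2image is bundled with the function itself. The names find_poppler_dir and lambda_handler, and the event field pdf_path, are hypothetical and only used for illustration.

```python
import os
import shutil


def find_poppler_dir(binary="pdftoppm", extra_dirs=("/opt/bin",), env_path=None):
    """Return the directory containing `binary`, or None if it is not found.

    Searches PATH first, then `extra_dirs` (where a Lambda layer's bin/ lands).
    """
    path = os.environ.get("PATH", "") if env_path is None else env_path
    search = os.pathsep.join([p for p in (path, *extra_dirs) if p])
    found = shutil.which(binary, path=search)
    return os.path.dirname(found) if found else None


def lambda_handler(event, context):
    # Imported lazily so this module still loads where pdf2image is absent.
    from pdf2image import convert_from_path

    poppler_dir = find_poppler_dir()
    if poppler_dir is None:
        raise RuntimeError("pdftoppm not found; is the poppler layer attached?")

    # "pdf_path" is an assumed event field pointing at a PDF file, e.g. under /tmp.
    pages = convert_from_path(event["pdf_path"], poppler_path=poppler_dir)
    return {"pages": len(pages)}
```

Since /opt/bin is already on $PATH, pdf2image should normally find poppler without poppler_path; passing it explicitly mainly gives a clearer error when the layer is missing.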
Dockerfile:
# https://www.petewilcock.com/using-poppler-pdftotext-and-other-custom-binaries-on-aws-lambda/
ARG POPPLER_VERSION="21.10.0"
ARG POPPLER_DATA_VERSION="0.4.11"
ARG OPENJPEG_VERSION="2.4.0"
FROM amazonlinux:2
ARG POPPLER_VERSION
ARG POPPLER_DATA_VERSION
ARG OPENJPEG_VERSION
WORKDIR /root
RUN yum update -y
RUN yum install -y \
    cmake \
    cmake3 \
    fontconfig-devel \
    gcc \
    gcc-c++ \
    gzip \
    libjpeg-devel \
    libpng-devel \
    libtiff-devel \
    make \
    tar \
    xz \
    zip
RUN curl -o poppler.tar.xz https://poppler.freedesktop.org/poppler-${POPPLER_VERSION}.tar.xz
RUN tar xf poppler.tar.xz
RUN curl -o poppler-data.tar.gz https://poppler.freedesktop.org/poppler-data-${POPPLER_DATA_VERSION}.tar.gz
RUN tar xf poppler-data.tar.gz
RUN curl -o openjpeg.tar.gz https://codeload.github.com/uclouvain/openjpeg/tar.gz/refs/tags/v${OPENJPEG_VERSION}
RUN tar xf openjpeg.tar.gz
WORKDIR poppler-data-${POPPLER_DATA_VERSION}
RUN make install
WORKDIR /root
RUN mkdir openjpeg-${OPENJPEG_VERSION}/build
WORKDIR openjpeg-${OPENJPEG_VERSION}/build
RUN cmake .. -DCMAKE_BUILD_TYPE=Release
RUN make
RUN make install
WORKDIR /root
RUN mkdir poppler-${POPPLER_VERSION}/build
WORKDIR poppler-${POPPLER_VERSION}/build
RUN cmake3 .. -DCMAKE_BUILD_TYPE=release -DBUILD_GTK_TESTS=OFF -DBUILD_QT5_TESTS=OFF -DBUILD_QT6_TESTS=OFF \
    -DBUILD_CPP_TESTS=OFF -DBUILD_MANUAL_TESTS=OFF -DENABLE_BOOST=OFF -DENABLE_CPP=OFF -DENABLE_GLIB=OFF \
    -DENABLE_GOBJECT_INTROSPECTION=OFF -DENABLE_GTK_DOC=OFF -DENABLE_QT5=OFF -DENABLE_QT6=OFF \
    -DENABLE_LIBOPENJPEG=openjpeg2 -DENABLE_CMS=none -DBUILD_SHARED_LIBS=OFF
RUN make
RUN make install
WORKDIR /root
RUN mkdir -p package/{lib,bin,share}
RUN cp -d /usr/lib64/libexpat* package/lib
RUN cp -d /usr/lib64/libfontconfig* package/lib
RUN cp -d /usr/lib64/libfreetype* package/lib
RUN cp -d /usr/lib64/libjbig* package/lib
RUN cp -d /usr/lib64/libjpeg* package/lib
RUN cp -d /usr/lib64/libpng* package/lib
RUN cp -d /usr/lib64/libtiff* package/lib
RUN cp -d /usr/lib64/libuuid* package/lib
RUN cp -d /usr/lib64/libz* package/lib
RUN cp -rd /usr/local/lib/* package/lib
RUN cp -rd /usr/local/lib64/* package/lib
RUN cp -d /usr/local/bin/* package/bin
RUN cp -rd /usr/local/share/poppler package/share
WORKDIR package
RUN zip -r9 ../package.zip *
To build the image and copy the zip out of the container:
docker build -t poppler .
docker run --name poppler -d -t poppler cat
docker cp poppler:/root/package.zip .
Then upload package.zip as a layer using the console or the AWS CLI.
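As an alternative to the console or CLI, the upload could also be scripted from Python. The sketch below first checks that bin/ and lib/ sit at the top level of the zip (so they land in /opt/bin and /opt/lib), then publishes the layer with boto3. The helper names check_layer_zip and publish_layer and the layer name "poppler" are assumptions for illustration, and boto3 must be installed and configured with credentials.

```python
import zipfile


def check_layer_zip(zip_path, required=("bin/pdftoppm",), required_dirs=("lib/",)):
    """Return a list of problems found in the layer zip (empty list means OK).

    Lambda unpacks the zip into /opt, so bin/ and lib/ must sit at the top level.
    """
    with zipfile.ZipFile(zip_path) as zf:
        names = zf.namelist()
    problems = [f"missing file: {name}" for name in required if name not in names]
    problems += [
        f"no entries under: {prefix}"
        for prefix in required_dirs
        if not any(n.startswith(prefix) for n in names)
    ]
    return problems


def publish_layer(zip_path, layer_name="poppler"):
    # boto3 is imported lazily so the layout check works without it installed.
    import boto3

    problems = check_layer_zip(zip_path)
    if problems:
        raise ValueError("; ".join(problems))
    with open(zip_path, "rb") as f:
        response = boto3.client("lambda").publish_layer_version(
            LayerName=layer_name,
            Content={"ZipFile": f.read()},
        )
    return response["LayerVersionArn"]
```

Calling publish_layer("package.zip") would return the ARN of the new layer version, which can then be attached to the function. For large archives, uploading through S3 instead of sending the bytes directly may be required.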