Hey everyone! I'm running a Node.js app deployed as an AWS Lambda function that generates PDFs using Pug and Puppeteer, then sends them out via email. However, I'm facing issues with high memory usage because Puppeteer loads Chromium, which is quite resource-heavy.
I'm considering creating a separate service for PDF generation that the original Lambda function can call to get the generated PDFs. Since I don't generate PDFs too frequently, I want this service to operate 'on demand', similar to a Lambda. I'm not very experienced with serverless architectures or AWS yet, so I could use some advice on how to set this up. I've heard about using layers or Docker, but I'm unsure if that's the right direction. Any suggestions would be greatly appreciated!
5 Answers
I totally understand your struggle with Puppeteer in Lambda. It’s definitely memory-intensive! One workaround is to skip running Chromium in the Lambda altogether and just use an API service for PDF generation, like YakPDF. It lets you POST HTML and receive a PDF back, keeping your Lambda slim and only charging you for the PDFs generated.
I've built similar functions multiple times, but we often have to avoid third-party PDF APIs for privacy reasons. In my experience, Puppeteer in Lambda is the most reliable method, even if it uses more resources. To handle spikes in PDF requests, consider pre-generating and storing PDFs in S3. This way, users can download them instantly when needed, which is great for things like billing statements.
Puppeteer can be really heavy for Lambda due to Chromium usage. A solid alternative is to use a third-party PDF generation API like PDFBolt, which lets you send HTML or JSON data and get a PDF in return. This way, your Lambda remains lightweight since it only needs to call the API instead of running Chrome itself.
One quick fix is to increase the memory allocated to your Lambda function, which can now go up to 10 GB (10,240 MB). Lambda bills by memory and duration, so each invocation costs more, but for infrequent PDF jobs it's often worth it to get past this problem and focus on other tasks.
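To get a feel for what "costs more" means, here is a rough back-of-the-envelope sketch. It assumes compute is billed as memory (GB) times duration (seconds) times a per-GB-second price. The default price and the workload numbers are assumptions for illustration only, so check the current pricing for your region and architecture. Request charges and the free tier are ignored.

```javascript
// Rough Lambda compute cost: memory (GB) x duration (s) x invocations x price per GB-second.
// ASSUMPTION: this default is a commonly listed x86 rate; verify it against current pricing.
const ASSUMED_PRICE_PER_GB_SECOND = 0.0000166667;

function estimateComputeCost(
  memoryMb,
  durationMs,
  invocations,
  pricePerGbSecond = ASSUMED_PRICE_PER_GB_SECOND
) {
  const gbSeconds = (memoryMb / 1024) * (durationMs / 1000) * invocations;
  return gbSeconds * pricePerGbSecond;
}

// Hypothetical workload: 1,000 PDFs per month at 5 seconds each.
for (const memoryMb of [1024, 2048, 4096]) {
  const cost = estimateComputeCost(memoryMb, 5000, 1000);
  console.log(`${memoryMb} MB: ~$${cost.toFixed(2)} per month`);
}
```

With those assumed numbers, every memory size comes out under a dollar a month. More memory also means more CPU, so the run time may drop and offset part of the increase.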
If you want to go deeper, consider a library like PDFKit that doesn't rely on Chromium. It's more work, since you build the document in code instead of rendering your Pug/HTML templates, but it should use much less memory. You can also package the dependency in a Lambda layer if you deploy with tools like SAM or CDK.
Another route is to use Docker to split the PDF component out into its own container image, which Lambda can run directly, or even to rewrite it in a lighter language like Go or Rust.
You could create another Lambda function just for PDF generation and expose it through API Gateway. Your original Lambda then calls the new function over HTTP, and you can size the memory of each one independently. Have the new function accept a POST request with the data to render and return the PDF in the response.
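A minimal sketch of what that dedicated function could look like, assuming an API Gateway Lambda proxy integration and a JSON request body of the form {"html": "..."}. The names parseHtml, toPdfResponse, makeHandler, and renderPdf are hypothetical. renderPdf stands in for whatever you use to produce the PDF (for example your existing Puppeteer code) and is assumed to take an HTML string and resolve to a Buffer.

```javascript
// Sketch of a dedicated PDF Lambda behind API Gateway (proxy integration).
// ASSUMPTION: renderPdf is supplied by you and resolves to a Buffer of PDF bytes.

// Read the HTML from the POST body, which API Gateway may deliver base64-encoded.
function parseHtml(event) {
  const raw = event.isBase64Encoded
    ? Buffer.from(event.body || '', 'base64').toString('utf8')
    : event.body || '';
  const { html } = JSON.parse(raw);
  if (typeof html !== 'string' || html.length === 0) {
    throw new Error('Request body must be JSON with a non-empty "html" string');
  }
  return html;
}

// Wrap the PDF bytes in the response shape the proxy integration expects.
function toPdfResponse(pdfBuffer) {
  return {
    statusCode: 200,
    headers: { 'Content-Type': 'application/pdf' },
    isBase64Encoded: true,
    body: pdfBuffer.toString('base64'),
  };
}

// Build the Lambda handler around an injected renderer so it is easy to test.
function makeHandler(renderPdf) {
  return async (event) => {
    let html;
    try {
      html = parseHtml(event);
    } catch (err) {
      return {
        statusCode: 400,
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ error: err.message }),
      };
    }
    const pdf = await renderPdf(html);
    return toPdfResponse(pdf);
  };
}

// In the deployed function (hypothetical renderer name):
// exports.handler = makeHandler(renderWithPuppeteer);
```

The original Lambda would POST its rendered Pug output to the endpoint and attach the result to the email. Depending on whether binary media types are configured on the API, the caller receives either raw PDF bytes or a base64 string it has to decode. Synchronous Lambda responses are also capped at a few megabytes, so for large PDFs it may be safer to write the file to S3 and return a link instead.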
