Over the past two years, I have found myself installing and configuring Puppeteer to capture webpage screenshots on many occasions.

So I thought it would be great to have a generic API that captures webpage screenshots, one that I can reuse across multiple projects. The existence of serverless platforms like Vercel made this all the easier to do, even for personal projects.

# The need for screenshotting web pages

I regularly need to take screenshots of web pages programmatically.

In each of these cases I installed Puppeteer and wrote similar code: go to the web page, take a screenshot, output the image. But Puppeteer is quite a heavy dependency, weighing over 100 MB in node_modules.

When the code was used in a dynamic site, I had to set up a Google Cloud Function or an AWS Lambda function for that use case.

Wouldn’t it be great if, instead of setting up Puppeteer every time, there were an API that I could use immediately? Well, there are plenty of existing offerings, but most of them are paid and the free ones went down. Even some of the paid ones went down (the links that are struck out are now dead).

Since I want to reuse it across multiple projects for years to come, I want some degree of control. The service should:

  • Run on my own domain name. I don’t want to have to go through all my projects and migrate to a new service just because the old one has been shut down.
  • Be affordable or free. It’s for personal use and is non-commercial. I don’t want to spend too much money on it.
  • Be personal yet multi-tenant. I may use it on a project that is shared with others. In the rare case that I need to revoke access for one project, it should not disrupt other projects. Therefore, it should support multiple keys.
  • Be secure. Others should not be able to use the service to screenshot arbitrary web pages without my permission.
  • Offer a single-shot API. Service consumers should be able to construct the image URL without having to make any extra API requests.

# Introducing personal-puppeteer

So this is what I created. Here’s how it works:

First, let’s say I want to generate a social card image for the URL at https://capture.the.spacet.me/. As an API consumer, I would:

  1. Generate a request.
  2. Cryptographically sign the request into a JWT and construct an image URL.
  3. Send the image URL to the client (in a <meta property="og:image"> tag).

Diagram for steps 1-3
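
Steps 1 to 3 can be sketched roughly as follows. This is a minimal illustration, not the service's actual code: it assumes an HS256 (HMAC-SHA256) JWT, and the claim names (`url`, `width`, `height`, `iss`), the function names, and the `/image/<token>.png` URL layout are all hypothetical.

```typescript
import { createHmac } from "node:crypto";

// Hypothetical request shape; the real service's claims may differ.
interface ScreenshotRequest {
  url: string;
  width: number;
  height: number;
}

// Encode text as unpadded base64url, the encoding JWT segments use.
const b64url = (text: string): string =>
  Buffer.from(text, "utf8").toString("base64url");

// Step 2a: sign the request as a compact JWT (header.payload.signature).
// Assumes HS256; the real service may use another algorithm.
function signRequest(request: ScreenshotRequest, tenantId: string, secret: string): string {
  const header = b64url(JSON.stringify({ alg: "HS256", typ: "JWT" }));
  const payload = b64url(JSON.stringify({ iss: tenantId, ...request }));
  const signature = createHmac("sha256", secret)
    .update(`${header}.${payload}`)
    .digest("base64url");
  return `${header}.${payload}.${signature}`;
}

// Step 2b: build the image URL. Hypothetical layout: the token is the file name.
function buildImageUrl(serviceOrigin: string, token: string): string {
  return `${serviceOrigin}/image/${token}.png`;
}

// Step 1: generate a request, then sign it and construct the URL.
const exampleToken = signRequest(
  { url: "https://capture.the.spacet.me/", width: 1200, height: 630 },
  "blog", // hypothetical tenant name
  "replace-with-a-real-secret",
);
const exampleImageUrl = buildImageUrl("https://screenshots.example.com", exampleToken);
```

Because everything the service needs is inside the signed token, the resulting URL can be placed directly in the og:image tag without any prior API call.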

  4. The browser would then make a request to the service, which would take a screenshot on demand
  5. …and return the image back to the browser.

Diagram for steps 4-5

The service maintains a list of tenants that are allowed to use it. This lets the service be reused in multiple projects without them having to share the same secret key.

A list of tenants
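
One way to picture per-tenant keys is the sketch below. It assumes HS256 JWTs whose `iss` claim names the tenant. The tenant names, secrets, and function names are hypothetical, and the real service may store and check keys differently.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Hypothetical tenant registry: one secret per project, so revoking one key
// leaves every other project untouched.
const tenants = new Map<string, string>([
  ["blog", "secret-for-blog"],
  ["chat-bot", "secret-for-chat-bot"],
]);

const encode = (value: unknown): string =>
  Buffer.from(JSON.stringify(value), "utf8").toString("base64url");

// Compute the raw HMAC-SHA256 signature over "header.payload".
function hmacSha256(data: string, secret: string): Buffer {
  return createHmac("sha256", secret).update(data).digest();
}

// Consumer side: sign claims on behalf of a tenant.
function signToken(claims: Record<string, unknown>, tenantId: string, secret: string): string {
  const signingInput = `${encode({ alg: "HS256", typ: "JWT" })}.${encode({ ...claims, iss: tenantId })}`;
  return `${signingInput}.${hmacSha256(signingInput, secret).toString("base64url")}`;
}

// Service side: return the claims if a registered tenant signed them, else null.
function verifyToken(token: string, registry: Map<string, string>): Record<string, unknown> | null {
  const parts = token.split(".");
  if (parts.length !== 3) return null;
  const [header, payload, signature] = parts as [string, string, string];

  let claims: unknown;
  try {
    claims = JSON.parse(Buffer.from(payload, "base64url").toString("utf8"));
  } catch {
    return null; // payload is not valid JSON
  }
  if (typeof claims !== "object" || claims === null) return null;

  const issuer = (claims as Record<string, unknown>).iss;
  const secret = typeof issuer === "string" ? registry.get(issuer) : undefined;
  if (secret === undefined) return null; // unknown or revoked tenant

  // Always verify as HS256, ignoring the header's "alg", and compare in constant time.
  const expected = hmacSha256(`${header}.${payload}`, secret);
  const actual = Buffer.from(signature, "base64url");
  if (actual.length !== expected.length || !timingSafeEqual(actual, expected)) return null;
  return claims as Record<string, unknown>;
}
```

Revoking a project then amounts to deleting its entry from the registry: its tokens stop verifying, while tokens signed by other tenants are unaffected.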

# Under the hood

I was able to quickly build the first version of this service thanks to Vercel’s Edge Network and Serverless Functions.

Diagram for the components inside the service

The first time a request is received:

  1. It would enter Vercel’s network.
  2. Since this was the first time the request was processed, it would be a cache MISS.
  3. Vercel would then call the underlying serverless function.
  4. Which in turn validates the request and captures a screenshot of the webpage.

Diagram for cache miss scenario (1)

  5. The image is returned from the serverless function with a very aggressive caching header.
  6. Vercel would return the response and put it into its cache.

Diagram for cache miss scenario (2)
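
As a rough illustration of what such a caching header could look like, here is a small sketch. The exact directives and durations the service sends are not stated here, so the one-day value and the function name are assumptions. `max-age` targets browsers and `s-maxage` targets shared caches such as a CDN edge.

```typescript
// Build a Cache-Control value that lets browsers and shared caches keep the image.
function buildCacheControl(maxAgeSeconds: number): string {
  return [
    "public",
    `max-age=${maxAgeSeconds}`, // how long browsers may reuse the image
    `s-maxage=${maxAgeSeconds}`, // how long shared caches (CDN edge) may reuse it
  ].join(", ");
}

// Assumed duration for illustration only: one day.
const ONE_DAY_SECONDS = 24 * 60 * 60;

// Headers a screenshot response might carry.
const responseHeaders: Record<string, string> = {
  "Content-Type": "image/png",
  "Cache-Control": buildCacheControl(ONE_DAY_SECONDS),
};
```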

The next time the request is received, it would be served by Vercel’s cache.

Diagram for cache miss scenario (3)

# Use cases unlocked

# Automatic social image generation for all my web projects

If I create a web project or a blog post, but don’t want to spend the time crafting an og:image for each page, I can just use personal-puppeteer to generate a default og:image from the webpage’s screenshot.

Now when I share my article on Facebook, people see the webpage’s screenshot.

A Facebook post where I posted an article. The preview image is the screenshot of the article’s contents.

Although it may not look as good as a handcrafted image, I think this is still way better than using a generic image or a profile picture as a default preview image.

So far, I am doing this for:

# Sending procedurally generated images in chat rooms

By using data: URLs, an HTML page can be embedded in the JWT. This can be useful when I want to create a chat bot that can generate infographics.

GIF demo
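
Turning an HTML snippet into a data: URL takes only a few lines. The sketch below uses base64 encoding. The function name and the sample markup are made up for illustration, and the resulting URL would go wherever the request carries the page address.

```typescript
// Wrap an HTML document in a base64-encoded data: URL.
function htmlToDataUrl(html: string): string {
  const base64 = Buffer.from(html, "utf8").toString("base64");
  return `data:text/html;charset=utf-8;base64,${base64}`;
}

// Hypothetical infographic markup that a chat bot might generate.
const infographicHtml = "<h1>Build status</h1><p>All checks passed</p>";
const infographicUrl = htmlToDataUrl(infographicHtml);
```

Keep in mind that the whole page travels inside the token, so large pages produce long image URLs.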

# Adding a web page screenshot to README.md

The URL can be embedded into other websites to provide an auto-updating screenshot.

An image of the documentation website embedded in README

# Open source

This project is open source, so you can run your own instance too!