Building a Screen Capture Pipeline
Written: 18 Feb, 2026
Recently, I needed to document six iterations of a UI animation with identical framing and timing. Manual screen recording with macOS’s native Screenshot app simply didn’t cut it! The window shifts between takes, scroll speed varies, and trimming dead time at the start and end is a chore.
I asked Claude to automate it for me: Puppeteer + CDP + ffmpeg.
The pieces
Puppeteer launches headless Chrome with a locked viewport. Dimensions, device scale, and interactions are all programmable.
CDP (Chrome DevTools Protocol) has Page.startScreencast, which streams every frame as a base64 PNG. Not periodic screenshots; every frame the browser actually paints, synced to the render cycle.
ffmpeg stitches the PNGs into h264 mp4.
Browser setup
import puppeteer from "puppeteer";

const browser = await puppeteer.launch({
  headless: true,
  args: ["--force-device-scale-factor=2"],
});
const page = await browser.newPage();
await page.setViewport({
  width: 800,
  height: 200,
  deviceScaleFactor: 2,
});
deviceScaleFactor: 2 produces retina-quality frames at 2x the viewport dimensions (something that I couldn’t manage to do with the native Screenshot app). The viewport itself is your crop region. Capture only what matters.
Use headless. On my first attempt I ran headed mode and accidentally resized the window mid-capture. The viewport shrank and the page’s backdrop layer bled through. Headless removes that variable.
Smooth scrolling
window.scrollTo jumps instantly. Scroll-triggered animations need interpolated movement over time, frame by frame. This requestAnimationFrame loop eases the scroll position along a cubic curve:
async function smoothScroll(page, targetY, durationMs) {
  await page.evaluate(
    (target, dur) => {
      return new Promise((resolve) => {
        const start = window.scrollY;
        const distance = target - start;
        const startTime = performance.now();
        function easeInOutCubic(t) {
          return t < 0.5
            ? 4 * t * t * t
            : 1 - Math.pow(-2 * t + 2, 3) / 2;
        }
        function step(now) {
          const progress = Math.min((now - startTime) / dur, 1);
          window.scrollTo(0, start + distance * easeInOutCubic(progress));
          if (progress < 1) requestAnimationFrame(step);
          else resolve();
        }
        requestAnimationFrame(step);
      });
    },
    targetY,
    durationMs,
  );
}
The easing curve accelerates and decelerates like a trackpad swipe. Swap in any easing function. Longer durations give animations more time to settle on camera.
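To make "swap in any easing function" concrete, here is a minimal sketch of a few alternative curves. The `easings` table and the `scrollPositionAt` helper are names I made up for illustration; they are not part of the original script. Each function maps progress from 0 to 1 onto an eased value from 0 to 1, the same contract as easeInOutCubic above.

```javascript
// Hypothetical table of alternative easing curves.
// Each maps progress t in [0, 1] to an eased value in [0, 1].
const easings = {
  // Constant speed, no acceleration.
  linear: (t) => t,
  // Starts fast and decelerates toward the target.
  easeOutCubic: (t) => 1 - Math.pow(1 - t, 3),
  // Gentler acceleration and deceleration than the cubic version.
  easeInOutQuad: (t) =>
    t < 0.5 ? 2 * t * t : 1 - Math.pow(-2 * t + 2, 2) / 2,
};

// Computes the scroll position for a given progress value,
// using the same formula as step() in smoothScroll.
function scrollPositionAt(start, target, progress, ease) {
  return start + (target - start) * ease(progress);
}
```

Because page.evaluate serializes its callback and runs it in the browser, a function defined in Node cannot be passed in directly. One option, assuming you want to keep several curves around, is to define the chosen curve inside the evaluated callback in place of easeInOutCubic.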
For hover or click interactions, Puppeteer’s page.hover and page.click work.
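As a rough sketch of how hover and click steps could be sequenced with pauses, the helper below walks a list of actions. The `runInteractions` name and the `{ hover }`, `{ click }`, `{ wait }` action shapes are assumptions of mine, not something the original script defines; only page.hover and page.click are Puppeteer calls.

```javascript
// Hypothetical action shapes: { hover: selector }, { click: selector }, { wait: ms }.
async function runInteractions(page, actions) {
  for (const action of actions) {
    // Move the mouse over the element matching the selector.
    if (action.hover) await page.hover(action.hover);
    // Click the element matching the selector.
    if (action.click) await page.click(action.click);
    // Pause so the triggered animation has time to settle on camera.
    if (action.wait) {
      await new Promise((resolve) => setTimeout(resolve, action.wait));
    }
  }
}
```

With a real page this might be called as `await runInteractions(page, [{ hover: ".dock-item" }, { wait: 600 }, { click: ".dock-item" }, { wait: 1000 }])`, where `.dock-item` is a placeholder selector.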
Capturing frames
Start the screencast, write each frame to disk, acknowledge it so Chrome sends the next:
import fs from "fs";

const client = await page.createCDPSession();
let frameIndex = 0;
const framesDir = "/tmp/capture-frames";
// writeFileSync fails if the target directory does not exist yet.
fs.mkdirSync(framesDir, { recursive: true });

// Register the handler before starting the screencast so early frames are not missed.
client.on("Page.screencastFrame", async (event) => {
  const filename = `${framesDir}/frame-${String(frameIndex).padStart(5, "0")}.png`;
  fs.writeFileSync(filename, Buffer.from(event.data, "base64"));
  frameIndex++;
  // Acknowledge the frame so Chrome sends the next one.
  await client.send("Page.screencastFrameAck", {
    sessionId: event.sessionId,
  });
});

await client.send("Page.startScreencast", {
  format: "png",
  quality: 100,
  maxWidth: 1600, // 2x your viewport width
  maxHeight: 400, // 2x your viewport height
  everyNthFrame: 1,
});
everyNthFrame: 1 captures every composited frame. Set it higher to skip frames for smaller output.
Without screencastFrameAck, the stream stalls. Chrome waits for confirmation before sending the next frame.
Stop the screencast when the interaction is done:
await client.send("Page.stopScreencast");
Encoding
Sequentially-numbered PNGs go into ffmpeg:
ffmpeg -framerate 30 \
  -i frame-%05d.png \
  -c:v libx264 \
  -pix_fmt yuv420p \
  -crf 18 \
  -preset slow \
  -vf "scale=800:200:flags=lanczos" \
  -movflags +faststart \
  output.mp4
Flag by flag:
-crf 18: visually lossless. For small viewports, file size stays tiny regardless of quality setting.
lanczos: sharpest downscale filter. Keeps text and thin UI elements crisp when going from 2x capture to 1x output.
-movflags +faststart: moves mp4 metadata to the front. Browsers start playback before the download finishes.
-pix_fmt yuv420p: Safari won’t play the file without this.
-preset slow: better compression, slower encode. Negligible for clips under a megabyte.
Full script
Everything wired together as a single function:
import puppeteer from "puppeteer";
import { spawn } from "child_process";
import fs from "fs";

// Relies on the smoothScroll helper from the Smooth scrolling section.
async function capture({ url, width, height, scrollSequence, output }) {
  const framesDir = ".capture-frames";
  fs.mkdirSync(framesDir, { recursive: true });

  const browser = await puppeteer.launch({
    headless: true,
    args: ["--force-device-scale-factor=2"],
  });
  const page = await browser.newPage();
  await page.setViewport({ width, height, deviceScaleFactor: 2 });
  const client = await page.createCDPSession();
  await page.goto(url, { waitUntil: "networkidle0" });

  // Register the frame handler first so no early frames are missed
  let frameIndex = 0;
  client.on("Page.screencastFrame", async (event) => {
    const path = `${framesDir}/frame-${String(frameIndex++).padStart(5, "0")}.png`;
    fs.writeFileSync(path, Buffer.from(event.data, "base64"));
    await client.send("Page.screencastFrameAck", {
      sessionId: event.sessionId,
    });
  });

  // Start screencast
  await client.send("Page.startScreencast", {
    format: "png",
    quality: 100,
    maxWidth: width * 2,
    maxHeight: height * 2,
    everyNthFrame: 1,
  });

  // Run the interaction sequence
  for (const action of scrollSequence) {
    if (action.scroll != null) {
      await smoothScroll(page, action.scroll, action.duration ?? 1000);
    }
    if (action.wait) {
      await new Promise((r) => setTimeout(r, action.wait));
    }
  }

  await client.send("Page.stopScreencast");
  await browser.close();

  // Encode
  await new Promise((resolve, reject) => {
    const proc = spawn("ffmpeg", [
      "-y", "-framerate", "30",
      "-i", `${framesDir}/frame-%05d.png`,
      "-c:v", "libx264", "-pix_fmt", "yuv420p",
      "-crf", "18", "-preset", "slow",
      "-vf", `scale=${width}:${height}:flags=lanczos`,
      "-movflags", "+faststart",
      output,
    ]);
    // Fires when ffmpeg cannot be spawned at all, e.g. it is not installed
    proc.on("error", reject);
    proc.on("close", (code) =>
      code === 0 ? resolve() : reject(new Error(`ffmpeg exit ${code}`))
    );
  });

  fs.rmSync(framesDir, { recursive: true });
}
Usage:
await capture({
  url: "http://localhost:4321",
  width: 700,
  height: 100,
  output: "header-dock.mp4",
  scrollSequence: [
    { wait: 500 },
    { scroll: 400, duration: 800 },
    { wait: 1200 },
    { scroll: 0, duration: 800 },
    { wait: 1000 },
  ],
});
I used this to record six iterations of a magnetic header animation. The script patched spring values in the component source, waited for HMR, captured each config. Same viewport, same timing, directly comparable.
Things that tripped me up
If you’re modifying source files between captures and relying on a dev server, the HMR reload will fight with Puppeteer’s navigation. I kept getting ERR_ABORTED until I added a 3-second delay after each file write and wrapped page.goto in a retry.
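The retry wrapper I mean looks roughly like the sketch below. The `gotoWithRetry` name, the attempt count, and the default delay are assumptions for illustration; tune them to how long your dev server takes to settle after a file write.

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Hypothetical helper: retries page.goto a few times, pausing between attempts
// so the dev server can finish its HMR reload.
async function gotoWithRetry(page, url, { attempts = 3, delayMs = 3000 } = {}) {
  let lastError;
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      return await page.goto(url, { waitUntil: "networkidle0" });
    } catch (error) {
      // Navigation failed (for example with ERR_ABORTED); remember why and retry.
      lastError = error;
      if (attempt < attempts) await sleep(delayMs);
    }
  }
  // Every attempt failed: surface the most recent error.
  throw lastError;
}
```

In the capture function this would stand in for the plain page.goto call, for example `await gotoWithRetry(page, url)`.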
CDP screencast doesn’t give you a fixed number of frames. It captures whatever the browser composites, so if an animation is still settling when you stop the screencast, you’ll clip the tail end. I learned to add generous pauses after each interaction.
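A related consequence: because frames arrive whenever the browser composites, encoding them at a fixed 30 fps can stretch or compress time. One possible workaround, sketched here under assumptions, is to record each frame's timestamp and hand ffmpeg a concat list with per-frame durations. The `buildConcatList` helper is hypothetical. It assumes you collected `{ file, timestamp }` entries in the frame handler, with the timestamp taken from `event.metadata.timestamp` (in seconds), a field the protocol marks as optional, so check that it is present in your Chrome version.

```javascript
// Hypothetical helper: turns [{ file, timestamp }] (timestamps in seconds)
// into the text of an ffmpeg concat-demuxer list with per-frame durations.
function buildConcatList(frames, lastFrameDuration = 1 / 30) {
  const lines = ["ffconcat version 1.0"];
  frames.forEach((frame, i) => {
    const next = frames[i + 1];
    // Each frame stays on screen until the next one was painted.
    const duration = next
      ? Math.max(next.timestamp - frame.timestamp, 0)
      : lastFrameDuration;
    lines.push(`file '${frame.file}'`);
    lines.push(`duration ${duration.toFixed(6)}`);
  });
  // The concat demuxer tends to ignore the final duration unless the
  // last file is listed once more, so repeat it.
  if (frames.length > 0) {
    lines.push(`file '${frames[frames.length - 1].file}'`);
  }
  return lines.join("\n") + "\n";
}
```

Written to a file next to the frames, the list would replace the `-framerate 30 -i frame-%05d.png` input with something like `-f concat -i frames.txt`. I have not benchmarked this against the fixed-framerate version, so treat it as a starting point.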
One more: a 1200×800 viewport at 2x means each PNG is 2400×1600. A few hundred of those eat disk space quickly. The script cleans up after encoding, but I’d still keep an eye on temp storage for longer recordings.
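For a quick sanity check before a long recording, the arithmetic can be sketched as below. The `estimateCapture` helper is hypothetical, and the byte figures describe raw RGBA pixel data, not PNG file size. PNG compression usually lands well below that for flat UI, so read the totals as a rough ceiling.

```javascript
// Hypothetical helper: pixel dimensions and raw (uncompressed) size of a capture.
function estimateCapture({ width, height, scale = 2, frames }) {
  const pixelWidth = width * scale;
  const pixelHeight = height * scale;
  // 4 bytes per pixel: red, green, blue, alpha at 8 bits each.
  const rawBytesPerFrame = pixelWidth * pixelHeight * 4;
  return {
    pixelWidth,
    pixelHeight,
    rawBytesPerFrame,
    rawBytesTotal: rawBytesPerFrame * frames,
  };
}

// A 1200x800 viewport at 2x, captured for 300 frames.
const estimate = estimateCapture({ width: 1200, height: 800, frames: 300 });
console.log(`${estimate.pixelWidth}x${estimate.pixelHeight}`); // prints "2400x1600"
```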