Turning new product features into short-form videos automatically using AI

Short-form video is some of the highest-performing marketing content right now. Reels, Shorts, LinkedIn clips. Every platform rewards it. Every algorithm promotes it.

The problem is making it. For a software product, every new feature ideally gets its own clip: a tight 15 to 30 second demo that shows the thing working. Multiply that across features, updates, and campaigns, and you’re looking at a serious production bottleneck. Screen-record, trim, add branding, write a caption, export, upload, repeat. Even a simple reel takes a surprising amount of time to produce well.

I wanted to collapse that into an on-demand step. Describe what the demo should show, and get back a branded, captioned MP4 ready for review.

Here’s how I built that.

The core idea

The system has three layers, each doing one job:

Claude is the director. It reads a plain-language description of the demo (“log in, create a new entry, show the confirmation screen”) and translates it into precise browser actions.
Playwright is the camera operator. It’s browser automation software that opens a real browser, logs in, clicks, types, scrolls, and moves an on-screen cursor like a human would, while recording the entire session.
ffmpeg is the editor. The raw screen recording gets composited into a fixed, on-brand vertical template: logo up top, a headline hook, the live product footage in the middle, a caption and call-to-action at the bottom.

What comes out the other end is a finished 1080×1920 MP4, plus a contact sheet (a grid of key stills) so I can approve the whole thing at a glance.
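
To make the editor layer concrete, here is a minimal sketch of how the compositing step could be expressed as an ffmpeg command built in Python. It is not the pipeline’s actual code. It assumes a hypothetical 1080×1920 template image that already carries the logo, headline, caption, and call-to-action, with an empty band where the footage goes; the function name, file names, and the `footage_top` offset are illustrative.

```python
from pathlib import Path

# Output canvas from the article: a 1080x1920 vertical frame.
CANVAS_W = 1080


def composite_command(template_png: Path, recording: Path, output_mp4: Path,
                      footage_top: int = 480) -> list[str]:
    """Build an ffmpeg command that lays a recording over a branded template.

    Assumption: template_png is a 1080x1920 still that already contains the
    branding, with free space starting footage_top pixels from the top.
    """
    filter_graph = (
        # Scale the recording to the canvas width; -2 keeps the aspect
        # ratio and rounds the height to an even number.
        f"[1:v]scale={CANVAS_W}:-2[footage];"
        # Draw the footage on the template and stop when the footage ends.
        f"[0:v][footage]overlay=0:{footage_top}:shortest=1[out]"
    )
    return [
        "ffmpeg", "-y",
        "-loop", "1", "-i", str(template_png),  # input 0: still image, looped
        "-i", str(recording),                   # input 1: the screen recording
        "-filter_complex", filter_graph,
        "-map", "[out]",
        "-c:v", "libx264", "-pix_fmt", "yuv420p", "-r", "30",
        str(output_mp4),
    ]
```

The returned list can be passed to `subprocess.run(cmd, check=True)`. Building the command as a list rather than a shell string avoids quoting problems with file names.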

Why a real browser matters

This isn’t a mockup generator or a prototype tool. Playwright drives an actual browser session against the live product. Real data loads. Real animations play. Hover states, transitions, loading spinners: they all show up because they’re actually happening.

That matters for trust. Viewers can tell the difference between a polished screen recording of real software and a stitched-together set of screenshots. The uncanny valley of fake product demos is a real thing, and this sidesteps it entirely.

It also means the recordings stay accurate automatically. When the UI changes, the next video captures the current version. No reshooting, no “this screenshot is from three releases ago” problem.
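
For readers who haven’t used Playwright’s recording feature, the sketch below shows roughly how a recorded session against a live page can be set up with Playwright’s Python API. The function names, default sizes, and output directory are assumptions for illustration, and the demo steps themselves are left out.

```python
from pathlib import Path


def context_options(video_dir: Path, width: int = 1280, height: int = 720) -> dict:
    """Options for browser.new_context() that switch on video recording."""
    size = {"width": width, "height": height}
    return {
        "viewport": size,
        "record_video_dir": str(video_dir),
        "record_video_size": size,  # match the viewport so nothing is rescaled
    }


def record_session(url: str, video_dir: Path, headless: bool = True) -> Path:
    """Open the live page in Chromium, record it, and return the video path."""
    # Imported here so context_options() works without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=headless)
        context = browser.new_context(**context_options(video_dir))
        page = context.new_page()
        page.goto(url)
        # ... the demo steps (clicks, typing, scrolling) would run here ...
        video_path = Path(page.video.path())
        context.close()  # the video file is fully written once the context closes
        browser.close()
        return video_path
```

Because the recording is taken from whatever the page renders at capture time, rerunning the same function after a UI change yields footage of the current interface.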

Reusable recipes

The system isn’t one monolithic script. It’s a small library of capture patterns, each one designed for a different kind of demo:

Scroll through a page (showcase a long form, a directory, a report)
Zoom into a single control (highlight a specific feature or setting)
Walk through a multi-step workflow (create, edit, save)
Perform a front-end interaction (search, filter, bulk actions)
Drag to reorder (show drag-and-drop working in real time)
Open a settings dialog (reveal configuration options)
Fly through a 3D map (the fun one, more on this below)

Each new video is mostly a config change, not new code. You specify the URL, the steps, the headline text, and which recipe to use. A new feature can become a reel in minutes rather than hours.
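
One way to picture “a config change, not new code” is a small registry that maps recipe names to capture functions. The sketch below is hypothetical: the `ReelConfig` fields mirror the four things listed above (URL, steps, headline, recipe), but the class, the recipe names, and the placeholder bodies are assumptions rather than the real library.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class ReelConfig:
    url: str
    recipe: str
    headline: str
    steps: list[str] = field(default_factory=list)


# Maps a recipe name to the function that knows how to capture it.
RECIPES: dict[str, Callable[[ReelConfig], list[str]]] = {}


def recipe(name: str):
    """Decorator that registers a capture recipe under a name."""
    def register(fn):
        RECIPES[name] = fn
        return fn
    return register


@recipe("scroll_page")
def scroll_page(cfg: ReelConfig) -> list[str]:
    # Placeholder: a real recipe would drive the browser, not return a plan.
    return [f"open {cfg.url}", "scroll to bottom"]


@recipe("workflow")
def workflow(cfg: ReelConfig) -> list[str]:
    return [f"open {cfg.url}", *cfg.steps]


def plan(cfg: ReelConfig) -> list[str]:
    """Look up the configured recipe and let it produce the capture plan."""
    try:
        capture = RECIPES[cfg.recipe]
    except KeyError:
        raise ValueError(f"unknown recipe: {cfg.recipe!r}") from None
    return capture(cfg)
```

With this shape, a new reel is a new `ReelConfig` instance, and a new kind of demo is one more decorated function.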

The 3D map flythrough

The most cinematic recipe so far. I have a GravityView map showing store locations built from form entries. The reel opens on a flat 2D satellite view, then the camera lifts and dives down into a photorealistic 3D Manhattan skyline, weaving between skyscrapers.

The smooth camera move isn’t hand-animated. I’m using Google Maps’ own 3D camera controls, and the AI simply tells it where to fly: starting altitude, target coordinates, pitch angle, bearing. Google’s renderer does the heavy lifting of making it look cinematic.
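
The four camera parameters above lend themselves to a simple keyframe description. The sketch below interpolates between a start and end pose with easing. It only produces the numbers: how each pose is handed to the map is not shown, and the `CameraPose` fields, the pitch convention, and the function names are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CameraPose:
    lat: float
    lng: float
    altitude_m: float
    pitch_deg: float    # assumed convention: 0 = looking straight down
    bearing_deg: float  # compass heading, 0 to 360


def ease_in_out(t: float) -> float:
    """Smoothstep easing: slow start, slow finish."""
    return t * t * (3.0 - 2.0 * t)


def lerp(a: float, b: float, t: float) -> float:
    return a + (b - a) * t


def lerp_bearing(a: float, b: float, t: float) -> float:
    """Interpolate the shorter way around the compass (350 -> 20 passes 0)."""
    delta = (b - a + 180.0) % 360.0 - 180.0
    return (a + delta * t) % 360.0


def flight_path(start: CameraPose, end: CameraPose, frames: int) -> list[CameraPose]:
    """Return `frames` eased poses from start to end, inclusive."""
    if frames < 2:
        raise ValueError("need at least two frames")
    poses = []
    for i in range(frames):
        t = ease_in_out(i / (frames - 1))
        poses.append(CameraPose(
            lat=lerp(start.lat, end.lat, t),
            lng=lerp(start.lng, end.lng, t),
            altitude_m=lerp(start.altitude_m, end.altitude_m, t),
            pitch_deg=lerp(start.pitch_deg, end.pitch_deg, t),
            bearing_deg=lerp_bearing(start.bearing_deg, end.bearing_deg, t),
        ))
    return poses
```

Handling the bearing separately matters: a naive interpolation from 350° to 20° would spin the camera almost a full turn the wrong way.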

This is the “work smarter” beat of the whole project. Rather than building a custom 3D animation pipeline, I found that the mapping platform already had beautiful camera movement built in. I just needed something to drive it programmatically.

One wrinkle: photorealistic 3D tiles only render with real graphics hardware. This recipe runs a visible (headed) browser instead of an invisible (headless) one, so the GPU can do its thing. A small constraint, but worth knowing.

Honest lessons from behind the scenes

A real browser means real-world quirks. Pop-ups appear. Cookie banners load. Elements take longer to render than expected. The automation needs to handle all of that gracefully, or the recording captures a half-loaded screen.
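
As a rough illustration of handling those quirks, the sketch below waits for a known element before recording continues and clicks away any banner that happens to be visible. It uses Playwright calls that exist (`page.locator`, `Locator.first`, `Locator.is_visible`, `Locator.click`, `Locator.wait_for`), but the function names, timeouts, and selectors are hypothetical.

```python
def wait_until_ready(page, ready_selector: str, timeout_ms: int = 15000) -> None:
    """Block until the element that signals a fully loaded screen is visible."""
    page.locator(ready_selector).first.wait_for(state="visible", timeout=timeout_ms)


def dismiss_overlays(page, selectors: list[str], timeout_ms: int = 1000) -> list[str]:
    """Click away every known banner that is currently visible.

    `page` is expected to be a Playwright Page. Returns the selectors that
    were actually dismissed, which is handy for logging.
    """
    dismissed = []
    for selector in selectors:
        button = page.locator(selector).first
        if button.is_visible():  # returns immediately, does not wait
            button.click(timeout=timeout_ms)
            dismissed.append(selector)
    return dismissed
```

A call such as `dismiss_overlays(page, ["#cookie-accept", ".newsletter-close"])` would run right after navigation, with selectors adjusted to whatever the product actually shows.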

Animations require special treatment. If you want a smooth scroll or a drag interaction to look fluid in the final video, you can’t just jump between states. The browser actions need to be broken into small incremental steps, frame by frame, so the screen recorder captures clean motion rather than a blur or a jump cut.
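
Here is a small sketch of the incremental approach for scrolling, assuming a Playwright page object. The total distance is split into many small wheel events along an easing curve, with a short pause between them so each intermediate position lands in the recording. The function names, default duration, and frame rate are illustrative choices.

```python
def eased_positions(total_px: int, steps: int) -> list[int]:
    """Cumulative scroll offsets along a smoothstep curve, ending at total_px."""
    if steps < 1:
        raise ValueError("steps must be at least 1")
    positions = []
    for i in range(1, steps + 1):
        t = i / steps
        positions.append(round(total_px * t * t * (3 - 2 * t)))
    return positions


def smooth_scroll(page, total_px: int, duration_ms: int = 2000, fps: int = 30) -> None:
    """Scroll by total_px in small eased increments instead of one jump."""
    steps = max(1, duration_ms * fps // 1000)
    frame_ms = 1000 / fps
    previous = 0
    for position in eased_positions(total_px, steps):
        page.mouse.wheel(0, position - previous)  # scroll by the difference only
        page.wait_for_timeout(frame_ms)           # give the recorder time to capture it
        previous = position
```

For pointer movement and drags, Playwright’s `page.mouse.move(x, y, steps=n)` already interpolates the path into `n` intermediate moves, so the same idea applies with less code.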

Timing is everything. The difference between a professional-looking demo and an amateur one is often just pacing: how long the camera lingers on a result, how fast the cursor moves, whether there’s a beat after a click before the next action. Getting this right took iteration.
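
Pacing can also be checked before anything is recorded. The sketch below treats a demo as a list of beats, each with an action time and a hold afterwards, and flags plans that fall outside the 15 to 30 second target or that leave almost no pause after an action. The `Beat` structure and the 0.4 second minimum hold are assumptions, not measured values.

```python
from dataclasses import dataclass

# Target length from the brief: a 15 to 30 second demo.
MIN_SECONDS, MAX_SECONDS = 15.0, 30.0
MIN_HOLD_S = 0.4  # assumed minimum pause after an action


@dataclass
class Beat:
    label: str
    action_s: float  # how long the action itself takes on screen
    hold_s: float    # pause afterwards so the result can register


def total_runtime(beats: list[Beat]) -> float:
    return sum(b.action_s + b.hold_s for b in beats)


def check_pacing(beats: list[Beat]) -> list[str]:
    """Return warnings; an empty list means the plan passes these checks."""
    warnings = []
    runtime = total_runtime(beats)
    if runtime < MIN_SECONDS:
        warnings.append(f"too short: {runtime:.1f}s")
    if runtime > MAX_SECONDS:
        warnings.append(f"too long: {runtime:.1f}s")
    for beat in beats:
        if beat.hold_s < MIN_HOLD_S:
            warnings.append(f"no breathing room after: {beat.label}")
    return warnings
```

A check like this doesn’t replace watching the result, but it catches plans that are obviously rushed or overlong before a browser is ever opened.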

Human in the loop

The system never auto-publishes. It produces a reviewable file and I check brand voice, technical accuracy, and whether the demo actually tells a coherent story. Then I post it.

This isn’t a limitation. It’s a design choice. AI handles the production grunt work (driving the browser, recording, compositing, branding). I keep the judgment calls: is this the right feature to highlight? Does the pacing work? Is the headline compelling?

The contact sheet makes this fast. Instead of watching every video end to end, I scan a grid of stills, confirm nothing looks broken, and approve. The review step takes seconds rather than minutes.
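
A contact sheet like this can be produced with ffmpeg’s `fps`, `scale`, and `tile` filters. The sketch below builds such a command. The grid size, sampling interval, thumbnail width, and function name are assumptions, not the settings used in the actual pipeline.

```python
from pathlib import Path


def contact_sheet_command(video: Path, sheet_png: Path, columns: int = 4,
                          rows: int = 3, every_s: int = 2,
                          thumb_width: int = 270) -> list[str]:
    """Build an ffmpeg command that tiles sampled frames into one image."""
    filters = (
        f"fps=1/{every_s},"         # keep one frame every `every_s` seconds
        f"scale={thumb_width}:-2,"  # shrink each frame, keeping aspect ratio
        f"tile={columns}x{rows}"    # arrange the frames in a grid
    )
    return [
        "ffmpeg", "-y",
        "-i", str(video),
        "-vf", filters,
        "-frames:v", "1",           # write a single image: the first full grid
        str(sheet_png),
    ]
```

With the defaults, a 1080×1920 reel yields 270×480 thumbnails in a 4×3 grid covering roughly the first 24 seconds of the video.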

From pipeline to autopilot

The final step was packaging the whole thing into a Claude Code skill. Now when a new product release is published, Claude can check the changelog, identify video-worthy features, and invoke the skill to produce the short-form video automatically.

I don’t even need to write the brief anymore. The AI reads what shipped, decides which features would make compelling demos, picks the right capture recipe, and generates the videos. I review the output and post. Production isn’t just faster; for routine feature releases, the bottleneck is essentially gone.

What this changes for a marketing team

Before this pipeline, making a product demo reel took hours. Now it’s an on-demand step that takes minutes of active time: reviewing the output and posting.

That changes the math. Instead of picking three or four features per quarter that “deserve” a video, you can cover every notable update. The backlog of “I should make a video for that” shrinks toward zero because the marginal cost of each video dropped from hours to minutes. This is systems-level thinking applied to content production.

The product itself becomes the star, on brand every time, and video output scales without scaling effort.

If you’re running a software product and struggling to keep up with short-form video demand, the combination of an AI coding assistant and browser automation is worth exploring. The tools are mature enough that this isn’t theoretical. I’m shipping videos from this pipeline today.


Casey Burridge
