Screen Studio-style effects for screen recordings, from the command line — auto-zoom on clicks, automatic speed-up of idle time, keystroke overlays, cursor smoothing, and a 9:16 vertical export that follows the action. Headless, scriptable: Python + ffmpeg + a single-file Swift event logger. No app, no subscription.
macOS-only, end to end. This is not just the logger:
render.pyencodes with Apple'sh264_videotoolbox,polish.pyloads/System/Library/Fonts/Helvetica.ttc, and the capture path relies onscreencaptureand the Swiftevents-loglogger — all macOS. It is not portable to Linux/Windows as written.
Screen Studio's signature effects need to know where you clicked and what you typed. Pixels alone can't tell you that — so this tool has two halves:
events-log(Swift) runs while you record and logs cursor positions (60 Hz), clicks, and keystrokes to a JSONL file. It drops all key events while macOS secure input is active (password fields, sudo) and is meant to run only for the duration of a recording.polish.py(Python) post-processes the recording using that log:
| Flag | Effect | Needs events? |
|---|---|---|
--speedup |
Compress idle spans (no input AND frozen pixels — a playing animation is never compressed) | no (pixel-only fallback) |
--zoom |
Eased auto-zoom onto click clusters | yes |
--keys |
Accumulating keystroke chips, rendered like real typing | yes |
--smooth-cursor |
Replace the jittery real cursor with an eased synthetic one (record cursor-free for best results) | yes |
--vertical |
Additional 1080×1920 output whose crop window follows the action | best with |
pip install -r requirements.txt # numpy + Pillow (both required)
brew install ffmpeg # H.264 encode/decode (system dependency, not pip)
swiftc -O events-log.swift -o events-log # build the Swift event logger (once)numpy is required: render.py — the renderer every export shells out to — uses it for
per-frame camera math, backdrop gradients, and click-sound mixing. ffmpeg is a system
dependency (install via Homebrew), not a pip package.
# build the logger (once)
swiftc -O events-log.swift -o events-log
# record anything (screencapture, OBS, ScreenCaptureKit...) while logging:
./events-log demo.events.jsonl &
LOGGER=$!
screencapture -v -V 30 demo.mov # or any recorder
kill $LOGGER
# polish:
python3 polish.py demo.mov --events demo.events.jsonl \
--speedup --zoom --keys --vertical
# -> demo.polished.mov + demo.polished.vertical.mov (1080x1920)--speedup alone works on any existing screen recording — no event log needed.
Requirements: macOS (see the macOS-only note above), Python 3 with numpy + Pillow
(pip install -r requirements.txt), and ffmpeg. The --speedup-only polish.py path
itself uses no numpy, but the headline renderer (render.py) does, so numpy is required
for any export.
Pairs with macos-screen-recorder-system-audio (sck-record --no-cursor) for system-audio capture and cursor-free footage for --smooth-cursor.
render.py is a non-destructive, single-pass camera renderer — the Screen Studio
approach. The recording is never modified; a render spec (zoom regions + ease ramp +
fps + aspect) drives a virtual camera that zooms/pans over the original high-res
frames, sampled with LANCZOS, exported to a smaller target — so a zoom still
reads ≥1:1 source pixels and stays crisp (measured ~1.3× sharper than cropping a finished
video, more with native-retina capture). Easing is cosine ease-in/out per region
(smooth, predictable, no overshoot), tuned by --ramp (ease-in/out duration in seconds),
output at 60fps, and it does horizontal (16:9/1:1) and 9:16 vertical
(--aspect 9:16, zoom + follow) from the same code.
python3 render.py SRC.mp4 --regions regions.json --out out.mp4 \
--aspect 16:9 --fps 60 --ramp 0.5 --cursor --speedup
# vertical: same command with --aspect 9:16regions.json = [{"t0","t1","z","cx","cy"}]. (polish.py is the older ffmpeg-filter
path — it does the --speedup/idle-detection work and needs no numpy itself, but it is
not a substitute for render.py: every Studio export and every high-quality zoom/pan
render goes through render.py, which requires numpy.)
python3 studio.py [recording.mp4] opens a local web UI: an NLE-style timeline with a
fixed ruler (bar width = source duration; edits never rescale it, so upstream content is
always planted). Auto-detected zoom regions are draggable blocks — drag to move, drag
edges to retime, scroll over one to set its zoom level, double-click an empty track to add,
double-click a block to delete (or select + Delete key). Idle spans are auto-detected as the
intersection of input-gaps and frozen pixels (no keyboard/mouse input and no pixel
change — a playing animation is never flagged idle) and become
speed blocks: their source range is locked (which footage is sped never changes) but
their rate is editable — select one for an inspector speed slider, or drag its right
edge = rate-stretch (FCP retime handle / Premiere Rate Stretch). Changing a rate ripples
everything downstream along the planted ruler; empty track grows at the right as the cut
gets shorter. Configurable ease transition, default zoom, and frame styling (backdrop
gradient + padding + rounded corners + drop shadow). Preview is live canvas compositing;
export uses render.py and outputs 60fps at chosen aspect. Synthetic cursor is always
smooth; click ripple + a real recorded click sound (CC0, freesound #735771) are included.
OS-chosen free port, all local.
Three additions modelled on competitor apps: GIF export (a GIF button — two-pass
palette transcode of the render, for README/social embeds), style presets (save the
current look — aspect/fit/background/padding/radius/shadow/ramp/zoom/toggles — to a named
preset and reapply it across recordings; persisted server-side in presets.json), and
activity-aware auto-zoom — auto-zoom now fires on standalone typing bursts (centred
on the cursor during the burst), not only click clusters, so tabbing into a field and typing
still zooms (scroll/drag aren't in the event log, so they're out of scope).
studio.py edits one recording (zoom/speed/cursor effects). sequence.py is the other
half: a rudimentary multi-clip NLE — arrange several clips end-to-end on one track, trim
each, reorder by drag, export the concatenation. Hard cuts only; no per-clip effects (that's
studio.py's job).
python3 sequence.py clip1.mp4 clip2.mov clip3.mp4 # seed the timeline from the CLI
python3 sequence.py /path/to/folder # or a folder of videos
python3 sequence.py # or start empty, add clips in-browser- Add clips — CLI args / a folder seed the track; + Add clips uploads more from the browser at any time (drag the file picker). Each clip becomes a thumbnailed block.
- Trim — drag a clip's left/right edge to set its in/out point (source range). A
trimflag marks shortened clips. The timeline uses a fixed px-per-second scale that's frozen during a drag, so the dragged edge tracks the cursor 1:1 (no rescale-feedback jump); neighbours slide via a CSS transition. - Reorder — drag a clip body; it follows the cursor (transition off) while the others slide to their new slots (transition on), then snaps into place on drop (Movie-Maker style).
- Scrub / play — click empty track to scrub; ▶ / spacebar plays straight through the
sequence. Playback is double-buffered: two
<video>elements, one active and one pre-seeking + buffering the upcoming clip, so crossing a cut is an instant element swap (no src-reload hiccup). The buffer is refreshed after trim/reorder/delete and on add. - Delete — select a clip + Delete key, or the Delete-clip button.
- Export video — concatenates the trimmed clips with ffmpeg into one 1080p 60fps mp4
(real progress ring, auto-download; lower-fps sources are frame-duped to 60). Output frame
defaults to 1080p, orientation picked from the first clip and overridable with the
16:9 / 9:16 / 1:1 selector (1920×1080 · 1080×1920 · 1080×1080). Clips of differing
resolution/aspect are scaled-to-fit and pillar/letterboxed into that frame; clips with
no audio get synthesized silence so mixed-audio sequences concat cleanly. Encoder is
h264_videotoolbox(HW) with alibx264fallback, same asrender.py(also 60fps) — every export in this tool is 60fps. - Export GIF — a
GIFbutton transcodes the concatenated mp4 to an animated GIF (two-pass palette, 18fps, ≤720px) for README/social embeds. - Export FCPXML — like
studio.py, also emits an editable FCPXML 1.9 handoff (fcpxml.to_fcpxml_sequence): the trimmed clips laid on one spine asasset-clips (oneasset+formatper source), opening non-destructively in Resolve / Final Cut / Premiere. Every editor window in this tool offers both video and FCPXML export.
OS-chosen free port, all local. Reuses polish.probe (durations/audio detection) and
render.has_videotoolbox / _ENC_FLAGS (encoder selection).
The logger needs Accessibility / Input Monitoring for clicks + keys (System Settings → Privacy & Security); without the grant it degrades to cursor-move sampling. The recorder needs Screen Recording. This is a demo-production tool: run the logger only while recording, and treat the events file like the recording itself.
make-fixture.py synthesizes a fake screen recording with scripted cursor travel, clicks, typing, and idle spans, plus a ground-truth events.jsonl — every feature is validated against known truth:
python3 make-fixture.py fixture.mp4 fixture.events.jsonl
python3 polish.py fixture.mp4 --events fixture.events.jsonl --speedup --zoom --keys --verticalMIT