Skip to content

connerkward/screen-studio-alternative

Repository files navigation

screen-studio-alternative

Screen Studio-style effects for screen recordings, from the command line — auto-zoom on clicks, automatic speed-up of idle time, keystroke overlays, cursor smoothing, and a 9:16 vertical export that follows the action. Headless, scriptable: Python + ffmpeg + a single-file Swift event logger. No app, no subscription.

macOS-only, end to end. This is not just the logger: render.py encodes with Apple's h264_videotoolbox, polish.py loads /System/Library/Fonts/Helvetica.ttc, and the capture path relies on screencapture and the Swift events-log logger — all macOS. It is not portable to Linux/Windows as written.

How it works

Screen Studio's signature effects need to know where you clicked and what you typed. Pixels alone can't tell you that — so this tool has two halves:

  1. events-log (Swift) runs while you record and logs cursor positions (60 Hz), clicks, and keystrokes to a JSONL file. It drops all key events while macOS secure input is active (password fields, sudo) and is meant to run only for the duration of a recording.
  2. polish.py (Python) post-processes the recording using that log:
Flag Effect Needs events?
--speedup Compress idle spans (no input AND frozen pixels — a playing animation is never compressed) no (pixel-only fallback)
--zoom Eased auto-zoom onto click clusters yes
--keys Accumulating keystroke chips, rendered like real typing yes
--smooth-cursor Replace the jittery real cursor with an eased synthetic one (record cursor-free for best results) yes
--vertical Additional 1080×1920 output whose crop window follows the action best with

Install

pip install -r requirements.txt    # numpy + Pillow (both required)
brew install ffmpeg                # H.264 encode/decode (system dependency, not pip)
swiftc -O events-log.swift -o events-log   # build the Swift event logger (once)

numpy is required: render.py — the renderer every export shells out to — uses it for per-frame camera math, backdrop gradients, and click-sound mixing. ffmpeg is a system dependency (install via Homebrew), not a pip package.

Quick start

# build the logger (once)
swiftc -O events-log.swift -o events-log

# record anything (screencapture, OBS, ScreenCaptureKit...) while logging:
./events-log demo.events.jsonl &
LOGGER=$!
screencapture -v -V 30 demo.mov     # or any recorder
kill $LOGGER

# polish:
python3 polish.py demo.mov --events demo.events.jsonl \
  --speedup --zoom --keys --vertical
# -> demo.polished.mov + demo.polished.vertical.mov (1080x1920)

--speedup alone works on any existing screen recording — no event log needed.

Requirements: macOS (see the macOS-only note above), Python 3 with numpy + Pillow (pip install -r requirements.txt), and ffmpeg. The --speedup-only polish.py path itself uses no numpy, but the headline renderer (render.py) does, so numpy is required for any export.

Pairs with macos-screen-recorder-system-audio (sck-record --no-cursor) for system-audio capture and cursor-free footage for --smooth-cursor.

High-quality renderer (render.py)

render.py is a non-destructive, single-pass camera renderer — the Screen Studio approach. The recording is never modified; a render spec (zoom regions + ease ramp + fps + aspect) drives a virtual camera that zooms/pans over the original high-res frames, sampled with LANCZOS, exported to a smaller target — so a zoom still reads ≥1:1 source pixels and stays crisp (measured ~1.3× sharper than cropping a finished video, more with native-retina capture). Easing is cosine ease-in/out per region (smooth, predictable, no overshoot), tuned by --ramp (ease-in/out duration in seconds), output at 60fps, and it does horizontal (16:9/1:1) and 9:16 vertical (--aspect 9:16, zoom + follow) from the same code.

python3 render.py SRC.mp4 --regions regions.json --out out.mp4 \
  --aspect 16:9 --fps 60 --ramp 0.5 --cursor --speedup
# vertical: same command with --aspect 9:16

regions.json = [{"t0","t1","z","cx","cy"}]. (polish.py is the older ffmpeg-filter path — it does the --speedup/idle-detection work and needs no numpy itself, but it is not a substitute for render.py: every Studio export and every high-quality zoom/pan render goes through render.py, which requires numpy.)

Studio UI (local web app)

python3 studio.py [recording.mp4] opens a local web UI: an NLE-style timeline with a fixed ruler (bar width = source duration; edits never rescale it, so upstream content is always planted). Auto-detected zoom regions are draggable blocks — drag to move, drag edges to retime, scroll over one to set its zoom level, double-click an empty track to add, double-click a block to delete (or select + Delete key). Idle spans are auto-detected as the intersection of input-gaps and frozen pixels (no keyboard/mouse input and no pixel change — a playing animation is never flagged idle) and become speed blocks: their source range is locked (which footage is sped never changes) but their rate is editable — select one for an inspector speed slider, or drag its right edge = rate-stretch (FCP retime handle / Premiere Rate Stretch). Changing a rate ripples everything downstream along the planted ruler; empty track grows at the right as the cut gets shorter. Configurable ease transition, default zoom, and frame styling (backdrop gradient + padding + rounded corners + drop shadow). Preview is live canvas compositing; export uses render.py and outputs 60fps at chosen aspect. Synthetic cursor is always smooth; click ripple + a real recorded click sound (CC0, freesound #735771) are included. OS-chosen free port, all local.

Three additions modelled on competitor apps: GIF export (a GIF button — two-pass palette transcode of the render, for README/social embeds), style presets (save the current look — aspect/fit/background/padding/radius/shadow/ramp/zoom/toggles — to a named preset and reapply it across recordings; persisted server-side in presets.json), and activity-aware auto-zoom — auto-zoom now fires on standalone typing bursts (centred on the cursor during the burst), not only click clusters, so tabbing into a field and typing still zooms (scroll/drag aren't in the event log, so they're out of scope).

Sequence UI — multi-clip editor (sequence.py)

studio.py edits one recording (zoom/speed/cursor effects). sequence.py is the other half: a rudimentary multi-clip NLE — arrange several clips end-to-end on one track, trim each, reorder by drag, export the concatenation. Hard cuts only; no per-clip effects (that's studio.py's job).

python3 sequence.py clip1.mp4 clip2.mov clip3.mp4   # seed the timeline from the CLI
python3 sequence.py /path/to/folder                  # or a folder of videos
python3 sequence.py                                  # or start empty, add clips in-browser
  • Add clips — CLI args / a folder seed the track; + Add clips uploads more from the browser at any time (drag the file picker). Each clip becomes a thumbnailed block.
  • Trim — drag a clip's left/right edge to set its in/out point (source range). A trim flag marks shortened clips. The timeline uses a fixed px-per-second scale that's frozen during a drag, so the dragged edge tracks the cursor 1:1 (no rescale-feedback jump); neighbours slide via a CSS transition.
  • Reorder — drag a clip body; it follows the cursor (transition off) while the others slide to their new slots (transition on), then snaps into place on drop (Movie-Maker style).
  • Scrub / play — click empty track to scrub; ▶ / spacebar plays straight through the sequence. Playback is double-buffered: two <video> elements, one active and one pre-seeking + buffering the upcoming clip, so crossing a cut is an instant element swap (no src-reload hiccup). The buffer is refreshed after trim/reorder/delete and on add.
  • Delete — select a clip + Delete key, or the Delete-clip button.
  • Export video — concatenates the trimmed clips with ffmpeg into one 1080p 60fps mp4 (real progress ring, auto-download; lower-fps sources are frame-duped to 60). Output frame defaults to 1080p, orientation picked from the first clip and overridable with the 16:9 / 9:16 / 1:1 selector (1920×1080 · 1080×1920 · 1080×1080). Clips of differing resolution/aspect are scaled-to-fit and pillar/letterboxed into that frame; clips with no audio get synthesized silence so mixed-audio sequences concat cleanly. Encoder is h264_videotoolbox (HW) with a libx264 fallback, same as render.py (also 60fps) — every export in this tool is 60fps.
  • Export GIF — a GIF button transcodes the concatenated mp4 to an animated GIF (two-pass palette, 18fps, ≤720px) for README/social embeds.
  • Export FCPXML — like studio.py, also emits an editable FCPXML 1.9 handoff (fcpxml.to_fcpxml_sequence): the trimmed clips laid on one spine as asset-clips (one asset+format per source), opening non-destructively in Resolve / Final Cut / Premiere. Every editor window in this tool offers both video and FCPXML export.

OS-chosen free port, all local. Reuses polish.probe (durations/audio detection) and render.has_videotoolbox / _ENC_FLAGS (encoder selection).

Permissions

The logger needs Accessibility / Input Monitoring for clicks + keys (System Settings → Privacy & Security); without the grant it degrades to cursor-move sampling. The recorder needs Screen Recording. This is a demo-production tool: run the logger only while recording, and treat the events file like the recording itself.

Testing

make-fixture.py synthesizes a fake screen recording with scripted cursor travel, clicks, typing, and idle spans, plus a ground-truth events.jsonl — every feature is validated against known truth:

python3 make-fixture.py fixture.mp4 fixture.events.jsonl
python3 polish.py fixture.mp4 --events fixture.events.jsonl --speedup --zoom --keys --vertical

License

MIT

About

Screen Studio-style screen-recording polish from the CLI — auto-zoom on clicks, idle speed-up, keystroke overlays, cursor smoothing, and 9:16 vertical that follows the action. Headless: Python + ffmpeg + a Swift input logger. No app, no subscription.

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors