WaxTap downloads and processes YouTube audio. It is available as a Go library
and as the waxtap command-line tool. Both use the same processing core.
WaxTap can download the best available YouTube audio or process an existing local file. Processing stages are opt-in: transcode, cut explicit time ranges, remove SponsorBlock segments, measure loudness, and normalize loudness. A plain download keeps the selected source stream and does not re-encode.
WaxTap targets public videos. Private, age-restricted, and login-gated videos are expected failures, not bypass targets. YouTube changes its player and anti-bot behavior without notice; see MAINTENANCE.md for the recovery runbook and runtime knobs.
- Library and CLI over one core. github.com/colespringer/waxtap is the stable facade; cmd/waxtap is a real CLI built on the same packages.
- Pure-Go extraction (InnerTube + goja for the cipher). No yt-dlp. The default ANDROID_VR and iOS clients return playable audio for public videos with no PO token. Full WEB audio over SABR/UMP is opt-in: it needs a GVS potoken.Provider plus an attested /player-context handoff (WaxTap's own WEB /player only earns a ~1-minute preview); see PO tokens & WEB.
- YouTube-specific code is isolated behind small interfaces (youtube, youtube/internal/resolver) so most upstream changes stay in a few files.
- Operational behavior: concurrency-safe, context-cancelable, bounded memory, per-operation timeouts (never a single global cap), atomic temp-file output.
- Encoding behavior: YouTube audio is lossy; FLAC/ALAC/WAV are lossless re-encodes of a lossy source. Only copy/remux avoids re-encoding.
| Package | Role |
|---|---|
| waxtap (root) | Stable facade: Client, Request/Result, Options. |
| cmd/waxtap | The CLI (cobra): download, info, cut, normalize, doctor, and other commands. |
| format | Audio-first Format model and selectors. |
| download | Resilient ranged/streaming download (parallel chunks, expiry refresh). |
| transcode | ffmpeg/ffprobe execution home (codecs, probing). |
| cut | Time-range cut + SponsorBlock bridge (composes transcode). |
| normalize | Loudness measure/normalize (EBU R128; track and album). |
| sponsorblock | SponsorBlock client + category vocabulary. |
| potoken | PO-token provider contract (caller-supplied). |
| waxerr | The domain error taxonomy (one errors.Is source of truth). |
| youtube | YouTube extraction (volatile; exported for the facade, may churn). |
| youtube/internal/resolver | Cipher / base.js / stream-URL resolution. |
| youtube/internal/sabr | SABR/UMP streaming for URL-less WEB-family audio. |
| internal/pipeline | Fused probe, cut, loudness, and encode pipeline. |
| internal/httpx | HTTP client: retry, backoff, Retry-After, per-host limiter. |
| internal/cache | In-memory LRU+TTL+singleflight cache. |
| internal/diskcache | On-disk, size-capped, schema-versioned player-JS cache. |
| internal/tempfile | Atomic temp-output staging + cleanup contract. |
- Go 1.26+
- ffmpeg/ffprobe on PATH (for transcode/cut/normalize; not needed for plain metadata or best-source downloads).
Install either the CLI or the Go package. The CLI is meant to run from a shell; the release archives do not install a desktop app. See Using the prebuilt binaries for platform notes.
With Go:
go install github.com/colespringer/waxtap/cmd/waxtap@latest

This installs waxtap into $(go env GOBIN) (or $(go env GOPATH)/bin); make
sure that directory is on your PATH.
Prebuilt binaries: tagged releases include Linux, macOS, and Windows archives (amd64/arm64) on the GitHub Releases page.
Library:
go get github.com/colespringer/waxtap

Each release archive contains the waxtap executable and documentation. Extract
the archive for your platform, put the executable somewhere on your PATH, then
run it from a terminal like any other CLI.
Linux / macOS:
# 1. Extract the archive for your platform
tar -xzf waxtap_*_linux_amd64.tar.gz # or _darwin_arm64, etc.
# 2. Move it to a directory on your PATH
sudo mv waxtap /usr/local/bin/ # or ~/.local/bin, ~/bin, etc.
# 3. Run it from a terminal
waxtap --help

The archive preserves the executable bit; a standalone downloaded binary may need
chmod +x waxtap first.
macOS Gatekeeper: the binaries are not code-signed, so macOS may block the first launch ("cannot be opened because Apple cannot check it for malware" / "developer cannot be verified"). Clear the quarantine flag for the installed binary:
xattr -d com.apple.quarantine /usr/local/bin/waxtap

You can also right-click the binary in Finder and choose Open the first time.
Windows:
- Unzip waxtap_*_windows_amd64.zip.
- Move waxtap.exe into a folder on your PATH (or add its folder to PATH).
- Open PowerShell or Command Prompt and run waxtap --help.

SmartScreen: because the .exe is not signed, Windows may show "Windows protected your PC". Choose More info > Run anyway on first launch.
Media commands accept a YouTube URL or bare video/playlist ID and support
--json for a stable scriptable contract (schemaVersion 3; result objects now
carry the YouTube client that was used). cut, transcode, and normalize
also accept a local file, so no network is needed for local processing. Every
command has --help.
# Inspect audio formats (no download)
waxtap info <video-url>
waxtap formats <video-url>
# Download the best audio with no re-encode (the default, keep-source)
waxtap download <video-url> -o track
# Download and transcode to FLAC in a single ffmpeg pass
waxtap download <video-url> --transcode flac -o track.flac
# Prefer a native stereo track (the default); fold 5.1 to stereo only if needed
waxtap download <video-url> --channels stereo --downmix --transcode flac -o track.flac
# Remove SponsorBlock non-music segments and normalize loudness in one pass
waxtap download <video-url> --cut-sponsorblock --transcode mp3 --normalize --loudness-target -14 -o track.mp3
# Process a LOCAL file (no network)
waxtap transcode song.flac song.mp3
waxtap normalize song.wav --apply --target -14 --transcode flac -o song.flac
# Download a whole playlist into a directory, skipping already-fetched IDs
waxtap download <playlist-url> -d ./music --download-archive seen.txt
# Download serially, waiting 5 seconds between downloads, up to 10 attempts
waxtap download <playlist-url> -d ./music --concurrency 1 --sleep-interval 5s --max-downloads 10
# Check extraction health
waxtap doctor

The CLI maps each failure class to a stable exit code so scripts can branch
without parsing messages (--json carries the same class in error.code):
| Code | Meaning |
|---|---|
| 0 | success |
| 1 | unclassified error |
| 2 | invalid request: usage error, invalid ID, playlist URL passed to a video command, incompatible spec, unsupported local input, unknown --client, or invalid config |
| 3 | video unavailable, restricted, login required, live, or no audio formats |
| 4 | extraction, cipher, or playlist parsing failure (often indicates WaxTap needs an update) |
| 5 | rate limited |
| 6 | ffmpeg/ffprobe not found |
| 7 | incomplete stream (delivery ended early; another client may work) |
| 8 | PO token required (none configured, mint failed, or YouTube rejected it) |
| 130 | canceled (SIGINT) |
Scripts may rely on these codes.
package main

import (
	"context"
	"fmt"
	"log"

	"github.com/colespringer/waxtap"
	"github.com/colespringer/waxtap/sponsorblock"
)

func main() {
	client, err := waxtap.New(waxtap.Options{})
	if err != nil {
		log.Fatal(err)
	}

	// Download the best audio, remove SponsorBlock "music_offtopic" segments, and
	// transcode to FLAC in one ffmpeg pass.
	res, err := client.Download(context.Background(), waxtap.Request{
		URL: "https://youtu.be/VIDEO_ID_01",
		ProcessSpec: waxtap.ProcessSpec{
			Transcode: &waxtap.TranscodeSpec{Format: waxtap.FormatFLAC},
			Cut: &waxtap.CutSpec{
				SponsorBlock: []sponsorblock.Category{sponsorblock.CategoryMusicOffTopic},
				OnError:      waxtap.ProceedUncut,
			},
			Output: waxtap.ToFile("track.flac"),
		},
	})
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("%s -> %s (%d bytes)\n", res.VideoID, res.OutputPath, res.OutputBytes)
}

The default Download does no re-encode; all processing is opt-in. See
example_test.go for Stream, Process (local files),
Enumerate and DownloadPlaylist (playlists), MeasureAlbum, and Info.
DownloadPlaylist downloads playlist entries with bounded concurrency, optional
pacing, and an optional limit on download attempts.
CLI configuration is resolved in this order: explicit flag, WAXTAP_*
environment variable, JSON config file, built-in default. The default config file
is config.json under os.UserConfigDir()/waxtap; override it with --config
or WAXTAP_CONFIG.
Useful operational knobs:
- Cache: waxtap cache dir, waxtap cache clean, --cache-dir, WAXTAP_CACHE_DIR, --no-cache, WAXTAP_NO_CACHE.
- Runtime client refresh: --profile-override, WAXTAP_PROFILE_OVERRIDE, or profileOverridePath in config JSON.
- Built-in Chrome identity: use --chrome-major, WAXTAP_CHROME_MAJOR, or chromeMajor in config JSON to override the emulated Chrome major without a rebuild. This cannot be combined with --profile-override, which supplies its own user agents.
- Single client / session adoption: --client web|ios|android_vr|web_embedded forces one built-in client as the whole chain (conflicts with --profile-override). --visitor-data (+ optional --cookies) or --session-url adopt an external guest session for byte-exact coherence with a token minter; see PO tokens & WEB.
- Network posture: --proxy, --qps, --cooldown, --hl, --gl, and their documented WAXTAP_* equivalents. --cooldown (or WAXTAP_COOLDOWN, seconds) pauses requests to a host after HTTP 429, or after HTTP 503/403 with a Retry-After header. A longer Retry-After value takes precedence, up to the retry-wait limit.
- Playlist pacing (download command): --sleep-interval sets the minimum delay before each download after the first. --max-sleep-interval adds a randomized upper bound, and --max-downloads limits download attempts; skipped entries and resolution failures do not count. With --concurrency 1, the interval falls between completed downloads.
- Channel layout (download, transcode, cut): --channels mono|stereo|surround|any prefers the best native track of that layout. The default is stereo, so a native stereo mix beats a 387 kbps 5.1 track (itag 258) instead of ranking by raw bitrate; any restores the bitrate-only ranking. iOS exposes only ~128 kbps stereo, while android_vr and WEB also expose 5.1, so --channels surround needs one of those clients. Selection prefers a native match first. Without a native mono or stereo match, it prefers a source that can be downmixed to the requested layout over one that would require upmixing. Among downmixable sources, codec preference and non-DRC audio take precedence over channel count. If those are equal, the source with fewer channels wins. For example, a stereo request chooses 5.1 over mono, while a mono request chooses stereo over a comparable 5.1 source. --downmix applies the selected downmix. It never upmixes and requires --channels mono or stereo. The channels and downmix config keys set the defaults. Library callers opt in with AudioSelector.WithChannels and ProcessSpec.Channels.
- Extraction control: --no-fallback (download and process commands) prevents fallback from a WEB player context to the configured client chain, disables watch-page extraction, and prevents retrying another client after an incomplete download. The configured extraction chain can still select a working client. Use --client to force a single client. If a forced non-WEB client fails, WaxTap may still use the WEB watch page. It reports this with a fallback-profile warning and a matching stderr line. --no-fallback disables the watch-page fallback. Results report the client used as Client: (and client in --json).
- Diagnostics: set WAXTAP_DUMP_DIR to write unusable YouTube responses on extraction failures, and WAXTAP_SABR_DUMP_DIR to write each raw SABR round for offline inspection.
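The channel-layout ordering for mono and stereo requests can be read as a tiered comparison. The sketch below is one interpretation of that description, not WaxTap's selector: Source, tier, better, and pick are hypothetical names, CodecRank stands in for whatever codec preference the real selector uses, and the ordering among sources that would need upmixing is an assumption.

```go
package main

import "fmt"

// Source is a hypothetical stand-in for one candidate audio stream.
type Source struct {
	Name      string
	Channels  int  // 1 = mono, 2 = stereo, 6 = 5.1
	CodecRank int  // lower is preferred (assumed ordering)
	DRC       bool // dynamic-range-compressed rendition
}

// tier classifies a source against the requested channel count:
// 0 = native match, 1 = can be downmixed, 2 = would need upmixing.
func tier(s Source, want int) int {
	switch {
	case s.Channels == want:
		return 0
	case s.Channels > want:
		return 1
	default:
		return 2
	}
}

// better reports whether a should be chosen over b for the requested layout.
func better(a, b Source, want int) bool {
	if ta, tb := tier(a, want), tier(b, want); ta != tb {
		return ta < tb // native, then downmixable, then upmix-only
	}
	if a.CodecRank != b.CodecRank {
		return a.CodecRank < b.CodecRank // codec preference before channel count
	}
	if a.DRC != b.DRC {
		return !a.DRC // non-DRC before channel count
	}
	return a.Channels < b.Channels // otherwise fewer channels wins
}

// pick returns the preferred source from a non-empty slice.
func pick(srcs []Source, want int) Source {
	best := srcs[0]
	for _, s := range srcs[1:] {
		if better(s, best, want) {
			best = s
		}
	}
	return best
}

func main() {
	mono := Source{Name: "mono", Channels: 1}
	stereo := Source{Name: "stereo", Channels: 2}
	surround := Source{Name: "5.1", Channels: 6}

	fmt.Println("stereo request:", pick([]Source{mono, surround}, 2).Name)
	fmt.Println("mono request:", pick([]Source{surround, stereo}, 1).Name)
}
```

The two calls in main correspond to the two examples in the channel-layout item: a stereo request with only mono and 5.1 available, and a mono request with comparable stereo and 5.1 sources.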
ffmpeg and ffprobe are required only for processing or probing. Plain
metadata, stream resolution, and keep-source downloads do not need them.
ANDROID_VR and iOS return playable audio for public videos with no PO token, and they are WaxTap's zero-dependency default. Everything below is opt-in, for callers who specifically want the WEB path.
The WEB-family clients serve URL-less audio over SABR/UMP and need a GVS-scope
potoken.Provider; WaxTap ships no token generator (supply one via
Options.POTokenProvider, or the CLI's --potoken-url for a bgutil server).
A WEB /player call WaxTap makes itself only earns a ~1-minute preview
(YouTube's anti-automation grade: STREAM_PROTECTION_STATUS=2). Full delivery
(status 1) is baked into a serverAbrStreamingUrl minted by an attested
browser that has actually begun playback. So for complete WEB audio, WaxTap
consumes a streaming context from an external attesting browser (e.g. a
WaxSeal /player-context endpoint) instead of building its own preview-grade URL:
waxtap download <url> \
  --player-context-url http://127.0.0.1:4416 \
  --potoken-url http://127.0.0.1:4416

The provider returns snake_case JSON with player_url,
server_abr_streaming_url (scrambled n),
video_playback_ustreamer_config, visitor_data, client_version,
audio_formats, and video metadata. Each audio_formats entry includes itag,
lmt, xtags, and mime_type, plus optional is_drc and audio_track_id
fields for DRC and multi-audio renditions. WaxTap descrambles n with the
context's player_url, mints a GVS token bound to its visitor_data, picks a
format, and streams the file through its existing SABR loop. Wire the provider
as Options.PlayerContextProvider.
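For readers writing their own provider, the payload described above might decode into Go structs along these lines. The JSON keys come from the description; everything else is an assumption, including the struct names, the Go field types (for example, itag as a number and lmt as a string), the validation rule, and the sample values. The unnamed video metadata fields are left out.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// AudioFormat mirrors one audio_formats entry. Field types are assumptions.
type AudioFormat struct {
	Itag         int    `json:"itag"`
	LMT          string `json:"lmt"`
	XTags        string `json:"xtags"`
	MimeType     string `json:"mime_type"`
	IsDRC        bool   `json:"is_drc,omitempty"`         // optional
	AudioTrackID string `json:"audio_track_id,omitempty"` // optional
}

// PlayerContext mirrors the named top-level keys of the provider response.
type PlayerContext struct {
	PlayerURL                    string        `json:"player_url"`
	ServerABRStreamingURL        string        `json:"server_abr_streaming_url"`
	VideoPlaybackUstreamerConfig string        `json:"video_playback_ustreamer_config"`
	VisitorData                  string        `json:"visitor_data"`
	ClientVersion                string        `json:"client_version"`
	AudioFormats                 []AudioFormat `json:"audio_formats"`
}

// decodeContext parses a response and rejects one with no streaming URL or
// no audio formats (an assumed minimal sanity check).
func decodeContext(raw []byte) (PlayerContext, error) {
	var pc PlayerContext
	if err := json.Unmarshal(raw, &pc); err != nil {
		return PlayerContext{}, err
	}
	if pc.ServerABRStreamingURL == "" || len(pc.AudioFormats) == 0 {
		return PlayerContext{}, errors.New("player context is missing required fields")
	}
	return pc, nil
}

func main() {
	// Placeholder values only; not a real response.
	raw := []byte(`{
		"player_url": "PLAYER_URL",
		"server_abr_streaming_url": "STREAMING_URL",
		"video_playback_ustreamer_config": "CONFIG",
		"visitor_data": "VISITOR_DATA",
		"client_version": "CLIENT_VERSION",
		"audio_formats": [{"itag": 251, "lmt": "0", "xtags": "", "mime_type": "audio/webm"}]
	}`)
	pc, err := decodeContext(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(pc.AudioFormats), "audio format(s), first itag:", pc.AudioFormats[0].Itag)
}
```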
--player-context-url requires --potoken-url (the stream binds a GVS token to
the context's visitor_data), and the context mint and the download must share
an egress IP (the signed URL is IP-bound). When the WEB context is unavailable,
WaxTap logs a web-context-fallback warning and falls back to the configured
client chain. After a provider failure, a short cooldown prevents the unavailable
sidecar from being queried for every video in a batch. Each context fetch is
bounded by webContextTimeoutSeconds / WAXTAP_WEB_CONTEXT_TIMEOUT (default 20s).
Fallback normally moves through the default multi-client chain. A forced
non-WEB client can still fall back to the WEB watch page. When that happens,
WaxTap reports that WEB delivered the result instead of the requested client.
Forcing --client web does not count as a substitution because the watch page
also uses WEB. Pass --no-fallback to return the forced client's error without
trying another path. For example, a capped WEB context returns
ErrIncompleteStream (exit 7) instead of falling back to android_vr. Every
result reports the client used as Client: (and client in --json).
For byte-exact session coherence with a minter, WaxTap can also adopt an external guest session instead of bootstrapping its own, so it streams under the exact identity a real browser attested. Adoption requires a uniform client chain:
# Force WEB and adopt a session from a /session endpoint (e.g. a token minter):
waxtap download <url> --client web \
--session-url http://127.0.0.1:4417/session \
--potoken-url http://127.0.0.1:4417
# Or a static session: the browser's exact X-Goog-Visitor-Id literal (+ cookies):
waxtap download <url> --client web --visitor-data 'Cgt...%3D%3D' --cookies ./cookies.txt

Notes: the adopted visitorData must be the browser's exact X-Goog-Visitor-Id
literal (sent verbatim); the session must be a logged-out guest (login cookies
are dropped); the minter host and downloads must share an egress IP (use --proxy
to align them). The session resolves once per client, so long-running services
should construct a fresh client per task.
YouTube's player and anti-bot surfaces change without notice. WaxTap isolates that
volatility behind a few marked files and ships operational controls:
client-profile override files, env-gated artifact dumps, a persistent player
cache, and a doctor health check suitable for cron or container health probes.
See MAINTENANCE.md for the full breakage-response runbook.
WaxTap was heavily influenced by kkdai/youtube and yt-dlp.
WaxTap is an independent implementation: it ships no code from either project and does not invoke yt-dlp at runtime.
WaxTap is for personal and otherwise authorized use only. You are responsible for complying with YouTube's Terms of Service and applicable law. WaxTap is public-video only: private, age-restricted, and login-gated videos are expected failures, not something WaxTap promises to bypass.
MIT. Copyright Cole Springer.