From 658b44d943e162c2ef74a66f11ea56af4f4904ae Mon Sep 17 00:00:00 2001
From: Karol Binkowski
Date: Tue, 31 Mar 2026 13:03:50 +0200
Subject: [PATCH 1/5] add note on Polish language token overhead and AI misinformation

---
 .../docs/expanding-horizons/model-pricing.mdx | 21 +++++++++++++++++++
 src/data/links.csv                            |  2 ++
 2 files changed, 23 insertions(+)

diff --git a/src/content/docs/expanding-horizons/model-pricing.mdx b/src/content/docs/expanding-horizons/model-pricing.mdx
index 23ae3f6..d6d71e3 100644
--- a/src/content/docs/expanding-horizons/model-pricing.mdx
+++ b/src/content/docs/expanding-horizons/model-pricing.mdx
@@ -38,6 +38,27 @@ Subscriptions are different from API billing. You pay a monthly fee for usage in
 
 For most people, this is the cheapest way to get heavy day-to-day usage. The effective subscription-vs-API ratio can swing a lot as vendors change limits, model mixes, and pricing.
 
+## Language choice and token cost
+
+Token counts vary significantly by language.
+Polish, for example, requires roughly 40% more tokens than English for equivalent content — you can verify this yourself with Tiktokenizer.
+
+| Text | Tokens |
+|------|--------|
+| *"Jestem Adam i sprawdzam, ile tokenów potrzeba do zapisania tego tekstu. Czy język polski faktycznie wymaga ich więcej niż angielski?"* | 38 |
+| *"My name is Adam, and I'm checking how many tokens are needed to write this text. Does Polish actually require more tokens than English?"* | 27 |
+
+This is worth keeping in mind if you work in a non-English language — more tokens means higher cost and slower inference at scale.
+
+It's also a good occasion to debunk a myth that spread in Polish media in early 2025.
+Several articles claimed that Polish is the "best language for AI", citing a real paper: One ruler to measure them all: Benchmarking multilingual long-context language models.
+Marzena Karpińska from Microsoft, a co-author of the paper, addressed this directly:
+
+> No. We didn't study that at all.
+> We created a tool for diagnosing language models, checking how well they are able to extract information from very long texts.
+
+A neat example of a fake news cycle born from real research — likely fueled by misunderstood patriotism.
+
 Common subscription options:
 
 -
diff --git a/src/data/links.csv b/src/data/links.csv
index ca4f650..8800a78 100644
--- a/src/data/links.csv
+++ b/src/data/links.csv
@@ -12,6 +12,7 @@ https://ampcode.com/threads/T-019cafee-1be2-72ab-bcec-28c6d41b753b,Reproduce fuz
 https://ampcode.com/threads/T-d0d0c3c4-4994-4574-9e56-e7f97e88bc33,Implement file mentions in command palette,Nicolay Gerold (via Amp),2025-10-31,2026-03-04
 https://ampcode.com/threads/T-f02e59f8-e474-493d-9558-11fddf823672,Tmux control mode protocol documentation,Mitchell Hashimoto (via Amp),2025-12-01,2026-03-04
 https://app.devin.ai/review,Devin Review,,,2026-03-05
+https://arxiv.org/pdf/2503.01996,One ruler to measure them all: Benchmarking multilingual long-context language models,Alveera Ahsan et al.,,2026-03-31
 https://bits.logic.inc/p/ai-is-forcing-us-to-write-good-code,AI Is Forcing Us To Write Good Code,Steve Krenzel,2025-12-29,2026-03-04
 https://claude.com/blog/category/claude-code,Claude Code Blog,,,2026-03-04
 https://claude.com/chrome,Claude in Chrome,,,2026-03-04
@@ -98,6 +99,7 @@ https://support.apple.com/guide/mac-help/mh40584/mac,Dictate messages and docume
 https://support.microsoft.com/en-us/windows/use-voice-typing-to-talk-instead-of-type-on-your-pc-fec94565-c4bd-329d-e59a-af033fa5689f,Use voice typing to talk instead of type on your PC - Microsoft Support,,,2026-03-10
 https://swmansion.com/,Software Mansion,,,2026-03-04
 https://tidewave.ai/,Tidewave,,,2026-03-04
+https://tiktokenizer.vercel.app/,Tiktokenizer,,,2026-03-31
 https://warden.sentry.dev/,Warden,,,2026-03-05
 https://www.catonetworks.com/blog/cato-ctrl-weaponizing-claude-skills-with-medusalocker/,Weaponizing Claude Skills with MedusaLocker,Inga Cherny,2025-12-02,2026-03-18
 https://www.figma.com/blog/introducing-claude-code-to-figma/,From Claude Code to Figma: Turning Production Code into Editable Figma Designs,Figma,2026-02-17,2026-03-04

From ce4fb7b568bf25ef76edcb72a53d057a69299d63 Mon Sep 17 00:00:00 2001
From: Karol Binkowski
Date: Thu, 2 Apr 2026 13:49:17 +0200
Subject: [PATCH 2/5] Apply suggestion from @mkaput

Co-authored-by: Marek Kaput
Signed-off-by: Karol Binkowski
---
 src/content/docs/expanding-horizons/model-pricing.mdx | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/content/docs/expanding-horizons/model-pricing.mdx b/src/content/docs/expanding-horizons/model-pricing.mdx
index d6d71e3..a382df7 100644
--- a/src/content/docs/expanding-horizons/model-pricing.mdx
+++ b/src/content/docs/expanding-horizons/model-pricing.mdx
@@ -41,7 +41,7 @@ For most people, this is the cheapest way to get heavy day-to-day usage. The eff
 ## Language choice and token cost
 
 Token counts vary significantly by language.
-Polish, for example, requires roughly 40% more tokens than English for equivalent content — you can verify this yourself with Tiktokenizer.
+Polish, for example, requires roughly 40% more tokens than English for equivalent content — you can verify this yourself with .
 
 | Text | Tokens |
 |------|--------|

From 93c9b639fcf78407197a56cb07cbd285ee686564 Mon Sep 17 00:00:00 2001
From: Karol Binkowski
Date: Thu, 2 Apr 2026 13:49:33 +0200
Subject: [PATCH 3/5] Apply suggestion from @mkaput

Co-authored-by: Marek Kaput
Signed-off-by: Karol Binkowski
---
 src/content/docs/expanding-horizons/model-pricing.mdx | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/content/docs/expanding-horizons/model-pricing.mdx b/src/content/docs/expanding-horizons/model-pricing.mdx
index a382df7..0427197 100644
--- a/src/content/docs/expanding-horizons/model-pricing.mdx
+++ b/src/content/docs/expanding-horizons/model-pricing.mdx
@@ -51,7 +51,7 @@ Polish, for example, requires roughly 40% more tokens than English for equivalen
 This is worth keeping in mind if you work in a non-English language — more tokens means higher cost and slower inference at scale.
 
 It's also a good occasion to debunk a myth that spread in Polish media in early 2025.
-Several articles claimed that Polish is the "best language for AI", citing a real paper: One ruler to measure them all: Benchmarking multilingual long-context language models.
+Several articles claimed that Polish is the "best language for AI", citing a real paper: .
 Marzena Karpińska from Microsoft, a co-author of the paper, addressed this directly:
 
 > No. We didn't study that at all.
From 28f678c17975a8afe3be535637b84074b6ffa4e9 Mon Sep 17 00:00:00 2001
From: Karol Binkowski
Date: Thu, 2 Apr 2026 13:50:02 +0200
Subject: [PATCH 4/5] Apply suggestion from @mkaput

Co-authored-by: Marek Kaput
Signed-off-by: Karol Binkowski
---
 src/data/links.csv | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/data/links.csv b/src/data/links.csv
index 8800a78..8d292d6 100644
--- a/src/data/links.csv
+++ b/src/data/links.csv
@@ -12,7 +12,7 @@ https://ampcode.com/threads/T-019cafee-1be2-72ab-bcec-28c6d41b753b,Reproduce fuz
 https://ampcode.com/threads/T-d0d0c3c4-4994-4574-9e56-e7f97e88bc33,Implement file mentions in command palette,Nicolay Gerold (via Amp),2025-10-31,2026-03-04
 https://ampcode.com/threads/T-f02e59f8-e474-493d-9558-11fddf823672,Tmux control mode protocol documentation,Mitchell Hashimoto (via Amp),2025-12-01,2026-03-04
 https://app.devin.ai/review,Devin Review,,,2026-03-05
-https://arxiv.org/pdf/2503.01996,One ruler to measure them all: Benchmarking multilingual long-context language models,Alveera Ahsan et al.,,2026-03-31
+https://arxiv.org/pdf/2503.01996,One ruler to measure them all: Benchmarking multilingual long-context language models,Alveera Ahsan et al.,2025-03-03,2026-03-31
 https://bits.logic.inc/p/ai-is-forcing-us-to-write-good-code,AI Is Forcing Us To Write Good Code,Steve Krenzel,2025-12-29,2026-03-04
 https://claude.com/blog/category/claude-code,Claude Code Blog,,,2026-03-04
 https://claude.com/chrome,Claude in Chrome,,,2026-03-04

From 8febcdde00f0fd70e89955e5b347cfb8dcf3c675 Mon Sep 17 00:00:00 2001
From: Karol Binkowski
Date: Thu, 2 Apr 2026 11:57:01 +0000
Subject: [PATCH 5/5] fix: move `Common subscription options` to the `Subscriptions` section

---
 .../docs/expanding-horizons/model-pricing.mdx | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/src/content/docs/expanding-horizons/model-pricing.mdx b/src/content/docs/expanding-horizons/model-pricing.mdx
index 0427197..4effa86 100644
--- a/src/content/docs/expanding-horizons/model-pricing.mdx
+++ b/src/content/docs/expanding-horizons/model-pricing.mdx
@@ -38,6 +38,14 @@ Subscriptions are different from API billing. You pay a monthly fee for usage in
 
 For most people, this is the cheapest way to get heavy day-to-day usage. The effective subscription-vs-API ratio can swing a lot as vendors change limits, model mixes, and pricing.
 
+Common subscription options:
+
+-
+-
+-
+-
+-
+
 ## Language choice and token cost
 
 Token counts vary significantly by language.
@@ -58,11 +66,3 @@ Marzena Karpińska from Microsoft, a co-author of the paper, addressed this dire
 > We created a tool for diagnosing language models, checking how well they are able to extract information from very long texts.
 
 A neat example of a fake news cycle born from real research — likely fueled by misunderstood patriotism.
-
-Common subscription options:
-
--
--
--
--
--
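
Note on the "roughly 40%" figure in the added section: it can be checked against the two token counts in the patch's own table (38 for the Polish sentence, 27 for the English one). The sketch below is only that arithmetic, plus an illustrative cost comparison. The function names, the price, and the request count are assumptions made up for this example; they are not taken from the patch or from any vendor's price list.

```python
# Token counts copied from the table added in PATCH 1/5.
POLISH_TOKENS = 38
ENGLISH_TOKENS = 27


def token_overhead(tokens: int, baseline_tokens: int) -> float:
    """Return the relative overhead of `tokens` over `baseline_tokens`."""
    if baseline_tokens <= 0:
        raise ValueError("baseline_tokens must be positive")
    return tokens / baseline_tokens - 1.0


def input_cost(tokens: int, price_per_million: float) -> float:
    """Return the cost of `tokens` input tokens at a price per million tokens."""
    return tokens * price_per_million / 1_000_000


overhead = token_overhead(POLISH_TOKENS, ENGLISH_TOKENS)
print(f"Polish overhead vs English: {overhead:.1%}")

# ASSUMPTION: placeholder price in USD per million input tokens,
# chosen only to make the arithmetic concrete.
HYPOTHETICAL_PRICE = 3.00
# ASSUMPTION: repeat the sample sentence one million times to make the gap visible.
REQUESTS = 1_000_000

polish_cost = input_cost(POLISH_TOKENS * REQUESTS, HYPOTHETICAL_PRICE)
english_cost = input_cost(ENGLISH_TOKENS * REQUESTS, HYPOTHETICAL_PRICE)
print(f"Polish: ${polish_cost:.2f}, English: ${english_cost:.2f}")
```

For this single sentence pair the overhead comes out at about 41%, which is consistent with the "roughly 40%" wording. One pair is a small sample, so the ratio for other texts and other tokenizers may differ.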