Wiki Term Base is a tool designed to standardise terminology used on Arabic Wikipedia and accelerate vocabulary translation.
ℹ For functional documentation, please check the dedicated Wikipedia page مسرد الويكي (in Arabic).
🌐 The website is available at: https://wikitermbase.toolforge.org
It is hosted on Toolforge, as a Python ASGI application built with the FastAPI framework (served by gunicorn with uvicorn workers via the Toolforge Build Service), using a MariaDB relational database.
The website's frontend is built with the React framework.
The Wikipedia gadget frontend is built with OOUI. It can be activated on Arabic Wikipedia in user preferences -> "مسرد الويكي".
The deployed version in Arabic Wikipedia:
- Gadget definition: gadget-WikiTerm
- Gadget JavaScript code: Gadget-WikiTerm.js
- Gadget CSS code: Gadget-WikiTerm.css
On Wikipedia, gadgets are production-ready features, while user scripts serve as a flexible environment for development and experimentation.
The user script, available at gadget/SearchTerm.js, differs from gadget code in that it consolidates all imports, JavaScript code, and CSS styles into a single file.
Please note that the database content is managed in the project arabterm.
Clone the arabterm repository, and start the MariaDB database in a Docker container:
make init
make init_mariadb # start or create container
make delete_mariadb # delete database if exists
make migrate_to_mariadb # migrate the SQLite content to MariaDB
Then, from the wikitermbase repository, install the Python dependencies (requires uv):
make init
Create a file at ./var/local.cnf with the following content (adapt the values):
[client]
user = MyUserName
password = MyTestPassword
Start the application:
make run
You can then open the web application at http://127.0.0.1:5001/
Python version: 3.13
Interactive OpenAPI docs (Swagger UI) are available at /docs — and at /redoc for the ReDoc rendering. These are auto-generated from the FastAPI route signatures and let you try every endpoint from the browser.
- Aggregated search (results are grouped by the Arabic term):
GET /api/v1/search/aggregated?q=magnetoscope
GET /api/v1/search/aggregated?q=اشتقاق
The response is a JSON document. An example can be found at gadget/response.json
- Raw search (without grouping):
GET /api/v1/search?q=magnetoscope
GET /api/v1/search?q=اشتقاق
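As an illustration of calling the two endpoints above from a script, here is a minimal sketch using only the Python standard library. The helper names `search_url` and `search` are hypothetical and not part of the project; the base URL and paths are the ones given in this README. The shape of the JSON response is not assumed here (see gadget/response.json for an example).

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "https://wikitermbase.toolforge.org"  # public instance named in this README


def search_url(query: str, aggregated: bool = True) -> str:
    """Build the search URL (hypothetical helper).

    urlencode percent-encodes non-ASCII queries (e.g. Arabic) as UTF-8.
    """
    path = "/api/v1/search/aggregated" if aggregated else "/api/v1/search"
    return f"{BASE_URL}{path}?{urlencode({'q': query})}"


def search(query: str, aggregated: bool = True, timeout: float = 10.0):
    """Fetch and decode the JSON response. Performs a network request."""
    with urlopen(search_url(query, aggregated), timeout=timeout) as response:
        return json.load(response)
```

For example, `search("magnetoscope")` queries the aggregated endpoint, and `search("اشتقاق", aggregated=False)` queries the raw one.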
ASGI applications cannot run on Toolforge's legacy python3.13 uWSGI webservice — they require the Build Service backend, which uses Cloud Native Buildpacks to build a container image directly from the public GitHub repo and runs it according to the Procfile. Frontend assets (backend/frontend/dist/) are committed to git so the Python buildpack alone is sufficient — no Node.js step in the build pipeline.
Refs:
- https://wikitech.wikimedia.org/wiki/Help:Toolforge/My_first_Python_ASGI_tool
- https://wikitech.wikimedia.org/wiki/Help:Toolforge/Build_Service
DB credentials don't need to be configured: Toolforge auto-injects TOOL_REPLICA_USER and TOOL_REPLICA_PASSWORD into Build Service containers (same as for the legacy uWSGI webservice). The app reads them directly from os.environ.
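The credential lookup described above could look roughly like the sketch below. This is not the application's actual code: the function name `load_db_credentials` is hypothetical, and the fallback to ./var/local.cnf for local development is an assumption based on the local setup steps in this README.

```python
import configparser
import os


def load_db_credentials(environ=None, cnf_path="./var/local.cnf"):
    """Return {"user": ..., "password": ...} for the MariaDB connection.

    Hypothetical helper: prefers the variables Toolforge injects, and
    otherwise (assumption) reads the [client] section of a local cnf file.
    """
    environ = os.environ if environ is None else environ
    user = environ.get("TOOL_REPLICA_USER")
    password = environ.get("TOOL_REPLICA_PASSWORD")
    if user and password:
        return {"user": user, "password": password}

    # interpolation=None so that a '%' in the password is read literally
    parser = configparser.ConfigParser(interpolation=None)
    if not parser.read(cnf_path):
        raise FileNotFoundError(f"no credentials in environment or {cnf_path}")
    client = parser["client"]
    return {"user": client["user"], "password": client["password"]}
```

Passing `environ` explicitly keeps the function easy to test without touching the real process environment.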
ssh toolforge
become wikitermbase
# Stop the legacy webservice if it was previously running on python3.13
toolforge webservice --backend=kubernetes python3.13 stop || true
# Build the image from the public GitHub repo
toolforge build start https://github.com/forzagreen/wikitermbase
toolforge build show # wait until status is ok(Succeeded)
# Start the Build Service webservice
toolforge webservice buildservice start --mount=none
Test: https://wikitermbase.toolforge.org/api/v1/stats
Logs: toolforge webservice buildservice logs -f
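The manual check of the stats endpoint can also be scripted. The following is a minimal sketch, not part of the repository: `stats_endpoint_ok` is a hypothetical helper, and it assumes only that the endpoint answers HTTP 200 with a JSON body, without relying on any particular field in the response.

```python
import json
from urllib.request import urlopen

STATS_URL = "https://wikitermbase.toolforge.org/api/v1/stats"


def stats_endpoint_ok(url: str = STATS_URL, opener=urlopen) -> bool:
    """Return True if the endpoint answers HTTP 200 with a parseable JSON body.

    `opener` is injectable so the check can be exercised without a network.
    """
    try:
        with opener(url, timeout=10) as response:
            if response.status != 200:
                return False
            json.loads(response.read())
            return True
    except (OSError, ValueError):
        # URLError/HTTPError are OSError subclasses; JSON errors are ValueError
        return False
```

Calling `stats_endpoint_ok()` after a deploy returns `False` on connection errors, non-200 responses, or a non-JSON body.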
Code deploys are automated. On push to main, the deploy-code job in .github/workflows/ci.yml SSHs into the bastion and runs toolforge build start + toolforge webservice buildservice restart. Markdown-only and data-only changes skip the rebuild. Manual re-deploy: Actions tab → "CI" → "Run workflow" on main.
Include any frontend rebuild in the commit (make build_frontend && git add backend/frontend/dist && git commit). The Python buildpack auto-detects uv.lock and installs deps with uv sync, so committing changes to pyproject.toml + uv.lock is all that's needed when adding dependencies.
Verify the gadget on Arabic Wikipedia still works after each deploy.
Manual fallback (if GitHub Actions is down):
ssh toolforge
become wikitermbase
toolforge build start https://github.com/forzagreen/wikitermbase
toolforge build show # wait until status is ok(Succeeded)
toolforge webservice buildservice restart
Data lives in forzagreen/arabterm, which is the source of truth and where dictionary edits happen. When a PR touching db/mariadb/arabterm.sql.gz is merged to arabterm's main, the cross-repo CI flow auto-opens a PR here with the regenerated db/arabterm.sql; merging that PR triggers the production DB import (see "Updating the Database" below). For the upstream dump-generation workflow (make init_mariadb, make migrate_to_mariadb, make dump), see arabterm's README.
Ref: https://wikitech.wikimedia.org/wiki/Help:Toolforge/Database#User_databases
- ssh toolforge and become wikitermbase
- Find out your user in $HOME/replica.my.cnf
- Create the database:
  - Open the SQL console:
    sql tools
  - Create the database:
    MariaDB [(none)]> CREATE DATABASE s55953__arabterm;
DB imports are automated. The flow is:
- Update data in forzagreen/arabterm and merge to main. When db/mariadb/arabterm.sql.gz changes, arabterm's notify-wikitermbase.yml dispatches an event to this repo.
- wikitermbase's refresh-dump.yml runs make download_dump && make fix_dump and opens a PR titled chore: refresh DB dump from arabterm@<sha>.
- Review the diff to db/arabterm.sql and merge. CI's deploy-db job SSHs into the bastion and runs mariadb ... < db/arabterm.sql automatically.
Manual triggers:
- Re-run the dump regeneration: Actions tab → "Refresh DB dump from arabterm" → "Run workflow".
- Re-import without a code change:
ssh toolforge
become wikitermbase
cd ~/wikitermbase
mariadb --defaults-file=$HOME/replica.my.cnf -h tools.db.svc.wikimedia.cloud s55953__arabterm < db/arabterm.sql
The following dump import issues are all fixed by running make fix_dump:
- https://jira.mariadb.org/browse/MDEV-34183: drop the line /*!999999\- enable the sandbox mode */ or /*M!999999\- enable the sandbox mode */
- ERROR 1273 (HY000) at line 25: Unknown collation: 'utf8mb4_uca1400_ai_ci': replace it with utf8mb4_unicode_520_ci
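To make the two fixes concrete, here is a minimal Python sketch of the same transformations. It is an illustration only and not the implementation behind make fix_dump, which is not shown in this README; the function name `fix_dump_text` is hypothetical.

```python
import re

# Matches the sandbox-mode comment line in either of its two forms:
#   /*!999999\- enable the sandbox mode */
#   /*M!999999\- enable the sandbox mode */
SANDBOX_LINE = re.compile(r"^/\*M?!999999\\- enable the sandbox mode \*/\s*$")

OLD_COLLATION = "utf8mb4_uca1400_ai_ci"
NEW_COLLATION = "utf8mb4_unicode_520_ci"


def fix_dump_text(sql: str) -> str:
    """Drop the sandbox-mode line and swap the unsupported collation."""
    kept = [
        line
        for line in sql.splitlines(keepends=True)
        if not SANDBOX_LINE.match(line)
    ]
    return "".join(kept).replace(OLD_COLLATION, NEW_COLLATION)
```

Applied to the text of db/arabterm.sql, this removes the line that older clients reject and leaves every other statement unchanged apart from the collation name.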
- Project description at Wikipedia: مسرد الويكي
- Database from forzagreen/arabterm
- ويكيبيديا:مصادر موثوقة/معاجم وقواميس وأطالس
- Java client for the API: wiki-connect/WikiTermBaseAPI