Perl-based tools for Wikipedia, Wiktionary etc.

Prerequisites

Software:

Unix-like operating system
Perl
Perl modules: MediaWiki::Bot version 3.3.1 or newer, Array::Compare

You may need to force install MediaWiki::Bot, as its tests fail.
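A minimal sketch of the forced install, assuming the cpanm client (App::cpanminus) is available; other CPAN clients have their own force options. The run helper and the DO_INSTALL variable are hypothetical conveniences, not part of this repository: by default the script only prints the commands it would execute.

```shell
#!/bin/sh
# run: hypothetical helper. Prints the command unless DO_INSTALL=1 is set,
# so the script is safe to try before anything is installed.
run() {
    if [ "${DO_INSTALL:-0}" = "1" ]; then
        "$@"
    else
        printf '%s\n' "$*"
    fi
}

# Array::Compare is installed the usual way.
run cpanm Array::Compare
# --force installs MediaWiki::Bot even though its test suite fails.
run cpanm --force MediaWiki::Bot
```

Run it once as is to review the commands, then again with DO_INSTALL=1 to perform the installation.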

Other:

A MediaWiki account with the bot flag

Copy settings.ini.example to settings.ini and set your bot username and password.

Supported wikis

audiosetter.pl supports the following Wiktionaries:

de
en
pl
simple
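The list above can be turned into a small guard for wrapper scripts. This is only a sketch: is_supported_wiki is a hypothetical helper, not something the repository provides, and it simply hard-codes the four codes listed here.

```shell
#!/bin/sh
# is_supported_wiki: hypothetical helper, not part of the repository.
# Succeeds only for the Wiktionary codes listed above.
is_supported_wiki() {
    case "$1" in
        de|en|pl|simple) return 0 ;;
        *) return 1 ;;
    esac
}

# Example: check two codes before deciding whether to call audiosetter.pl.
for wiki in en fr; do
    if is_supported_wiki "$wiki"; then
        echo "$wiki: supported"
    else
        echo "$wiki: not in the supported list"
    fi
done
```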

Usage scenarios

Adding audio files to Wiktionary

./audio_fetcher.pl
./audiosetter.pl -w en -a
./count_audio.pl -w en

Short description:

./audio_fetcher.pl

Saves information on available pronunciation files to the audio/ directory.

./audiosetter.pl -w en -a

Adds pronunciation files to English Wiktionary (-w en) in all available languages (-a). This takes a long time. You can stop audiosetter.pl at any time with Ctrl+C; it saves its progress in the done/ directory and resumes without repeating any work the next time it is started.

./count_audio.pl -w en

Prints a MediaWiki table summarizing the work done.

./count_audio.pl -w en > /tmp/en.txt && ./audio_errors.pl -w en >> /tmp/en.txt

Saves a summary of added and skipped files to /tmp/en.txt.

Running for chosen languages

./audio_fetcher.pl -r de,fi,ru

Refreshes audio files for the given languages.

./audiosetter.pl -w en -l de,fi,ru

Runs only for the given languages instead of all of them.

Running again after all work is done

./audio_fetcher.pl --cleanstart --cleancache

audio_fetcher.pl caches web pages, so running it again normally won't detect any new files. Use the --cleanstart and --cleancache options to fetch new audio files.

./audiosetter.pl --cleanstart -w en -a

After running audio_fetcher.pl, run audiosetter.pl with the --cleanstart option the first time. This resets the done/ directory and the count of added files; otherwise audiosetter.pl considers all work done and exits without doing anything. Alternatively:

sed -i -e '/=no_entry/ d' -e '/=no_section/ d' done/done_audio_en.txt && ./audiosetter.pl -w en -a
sed -i -e '/=no_pronunciation/ d' -e '/=no_section/ d' -e '/=error/ d' done/done_dewikt_de.txt && ./dewikt_audiosetter_de.pl --recache && ./audio_errors.pl -w de --send
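To see what the first sed command does before running it with -i, the same expressions can be tried on a throwaway file. The sketch below is illustrative only: the sample entries and the "added" status are made up, and the real format of the files in done/ may differ.

```shell
#!/bin/sh
# Hypothetical sample of a done file; the real line format may differ.
sample=$(mktemp)
cat > "$sample" <<'EOF'
word_one=added
word_two=no_entry
word_three=no_section
word_four=added
EOF

# Same expressions as in the first sed command above, but without -i,
# so the result goes to stdout and the file itself stays untouched.
kept=$(sed -e '/=no_entry/ d' -e '/=no_section/ d' "$sample")
printf '%s\n' "$kept"

rm -f "$sample"
```

Only the lines marked no_entry or no_section are dropped, so the subsequent audiosetter.pl run retries those entries while leaving the remaining records in place.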

Files

audio_errors.pl

Prints a list of pronunciation files that must be added manually because audiosetter.pl was unable to add them automatically.

audio_fetcher.pl

Scans Wikimedia Commons for pronunciation files and writes the results to the audio/ directory for later use by the scripts that set pronunciation on Wiktionary.

audiosetter.pl

Reads pronunciation files from the audio/ directory and sets them in Wiktionary.

commons_sort_fixer.pl

Sets the category sorting of media files on Wikimedia Commons.

count_audio.pl

Counts how many audio files have been added by audiosetter.pl.

dewikt_audiosetter_de.pl

Adds German pronunciation to German Wiktionary using an improved algorithm.

Development

test/

Test harness.

Source code, bug reports

GitHub

Copyright

All code was created by Derbeth and is released under the MIT license.
