Wiktionary dump file parser and multilingual data extractor
Updated Aug 28, 2025 - Python
Comprehensive dictionaries built on Wiktionary: universal, multilingual, and monolingual, with bimonthly updates and 180+ languages supported.
Extract data from German Wiktionary XML files.
[LREC 2020] EtymDB, an Etymological DataBase (v2.1)
A comprehensive and extensible Wiktionary parsing framework.
Code for the paper: Wikinflection: Massive semi-supervised generation of multilingual inflectional corpus from Wiktionary (Metheniti and Neumann, 2018)
This repository contains a Python script for parsing an XML dump of the Italian Wiktionary (Wikizionario), the parsed dictionary as a JSON file, and a scraper for ONLI (an Italian database of neologisms) with the scraped data in a CSV file.
A library for parsing the French Wiktionary
Light wiki parser and renderer developed in Java and Lua, from Wiktionary XML dump to HTML
DBnary extractor mirror - See https://gitlab.com/gilles.serasset/dbnary
Extraction of the Russian word forms and their segmentation from the Russian Wiktionary
A Hands-On Guide to Parsing Wikitext with Python
Parses the Russian Wiktionary HTML dumps into JSON and generates ereader dictionaries
Selected data processing scripts, including a language-agnostic multilingual Wiktionary parser
A scraper which extracts data from the German Wiktionary HTML dump.
Wiktionary Parser written in Ruby
A Python package to parse and extract data from the German Wiktionary. It allows users to access wikitext content, either by fetching it directly online or by loading a dump file locally.
Web interface for parsing Wiktionary for results in specific languages
English-German (Sorted by Frequency)