Dev/Command-line/Site meta
From XOWA: the free, open-source, offline wiki application
XOWA can download the metadata for the Wikimedia wikis.
Background
Wikimedia exposes an API for accessing the metadata of a given wiki. For example, for English Wikipedia, the following URL returns most of the metadata about the wiki installation:
https://en.wikipedia.org/w/api.php?action=query&meta=siteinfo&siprop=general|namespaces|statistics|interwikimap|namespacealiases|specialpagealiases|libraries|extensions|skins|magicwords|functionhooks|showhooks|extensiontags|protocols|defaultoptions|languages
XOWA can call this API to download the metadata for each wiki and save it in a database for data processing. XOWA currently uses this information to resolve namespaces, and it will also incorporate other metadata from this API in future releases.
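As a rough illustration of the API call described above, the following Python sketch builds the same siteinfo URL for any wiki domain and extracts the namespace names from a decoded response. It is not XOWA's implementation. The function names, the `format=json` parameter, and the User-Agent string are assumptions added for the example; the property list is shortened from the URL above.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# A subset of the siprop values from the example URL above.
SIPROPS = ["general", "namespaces", "statistics", "interwikimap", "namespacealiases"]


def build_siteinfo_url(domain, props=SIPROPS):
    """Return the siteinfo query URL for a wiki domain such as 'en.wikipedia.org'."""
    params = {
        "action": "query",
        "meta": "siteinfo",
        "siprop": "|".join(props),
        "format": "json",  # assumption: request JSON output; not part of the original URL
    }
    # safe="|" keeps the pipe separators readable instead of encoding them as %7C
    return "https://{}/w/api.php?{}".format(domain, urlencode(params, safe="|"))


def fetch_siteinfo(domain, props=SIPROPS):
    """Download and decode the siteinfo response (requires network access)."""
    # Hypothetical User-Agent; replace it with one that identifies your own tool.
    request = Request(build_siteinfo_url(domain, props),
                      headers={"User-Agent": "site-meta-example/0.1"})
    with urlopen(request) as response:
        return json.load(response)


def namespace_names(siteinfo):
    """Map namespace id -> namespace name from a decoded siteinfo response."""
    namespaces = siteinfo["query"]["namespaces"]
    # In the default JSON format the name is stored under the "*" key;
    # "name" is tried first in case a different format version is in use.
    return {int(ns["id"]): ns.get("name", ns.get("*", "")) for ns in namespaces.values()}
```

Calling `namespace_names(fetch_siteinfo('en.wikipedia.org'))` would return a dictionary keyed by namespace id, which is the kind of data needed to resolve namespaces.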
Process
The following steps assume a Windows system with XOWA installed at C:\xowa:
- Create a plain-text file called "C:\xowa\build_site_meta.gfs"
- Save the following text to the file:
app.bldr.pause_at_end_('n');
app.scripts.run_file_by_type('xowa_cfg_app');
app.bldr.cmds {
  // NOTE: wiki doesn't matter; just use any wiki name that is on your system
  add('simple.wikipedia.org', 'util.site_meta') {
    // path of the database to generate; default is C:\xowa\bin\any\xowa\cfg\wiki\site_meta.sqlite3
    db_url = 'C:\xowa\site_meta__enwiki.sqlite3';

    // skip any wikis which have been downloaded after this time. default is now() - 1 day
    // the purpose of this argument is to avoid recalling the api if it's already been called recently.
    // for example, if the script runs for 800 wikis and fails for 3 wikis,
    // you can rerun the script again and it will only download the 3 failed ones; not all 800
    cutoff_time = '2015-07-01';

    // list of wikis to download; note that each wiki must be separated by a new-line. default is all wikis listed in [[Dashboard/Import/Online]]
    wikis =
'en.wikipedia.org
en.wiktionary.org';
  }
}
app.bldr.run;
- Run the file with the following:
java -jar xowa_windows.jar --app_mode cmd --cmd_file C:\xowa\build_site_meta.gfs
- Open C:\xowa\site_meta__enwiki.sqlite3 in a SQLite shell and run the following:
SELECT * FROM site_statistic;
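If a SQLite shell is not at hand, the same query can be run from a script. The sketch below uses Python's standard sqlite3 module and is only an illustration: the helper name `dump_table` is hypothetical, and because the columns of site_statistic are not listed here, it reads the column names from the query result instead of assuming them.

```python
import os
import sqlite3


def dump_table(db_path, table="site_statistic"):
    """Return (column_names, rows) for every row in the given table."""
    # sqlite3.connect() silently creates an empty database when the file is
    # missing, so fail early instead of leaving a stray file behind.
    if not os.path.exists(db_path):
        raise FileNotFoundError(db_path)
    conn = sqlite3.connect(db_path)
    try:
        # Table names cannot be bound as query parameters, so quote the identifier.
        quoted = '"{}"'.format(table.replace('"', '""'))
        cursor = conn.execute("SELECT * FROM {}".format(quoted))
        columns = [description[0] for description in cursor.description]
        rows = cursor.fetchall()
    finally:
        conn.close()
    return columns, rows


# Example usage with the database generated by the steps above:
#   columns, rows = dump_table(r"C:\xowa\site_meta__enwiki.sqlite3")
#   print(columns)
#   for row in rows:
#       print(row)
```

If the build script has not run successfully, the table will not exist and the query raises sqlite3.OperationalError.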