App/Xtn/Page sync
XOWA can synchronize the latest version of a page from the online wiki to the offline wiki.
Please note that this feature is still a work in progress. For the latest updates and usage notes, please check https://github.com/gnosygnu/xowa/issues/72
Options
The options page is at Options/Page_sync
Background
In general, offline dumps are generated on a semi-frequent basis:
- Wikimedia generates offline Wikitext dumps twice per month. See https://dumps.wikimedia.org/backup-index.html
- XOWA generates HTML dumps once per month. See Special:XowaDownloadCentral
Sometimes though you may only want to update one page without:
- Waiting for the dump to occur
- Downloading and importing a whole new wiki
The Page sync feature allows you to update selected pages
Issues
The Page sync feature is still a work in progress. The following are known issues:
Math, Helper Buttons (enlarge / more info), Musical scores, and other images don't work
Wikipedia stores some images in a separate location. XOWA still needs code to detect this location, copy the images offline, and show them correctly. This should be done in the next few releases, but in the meantime it's strongly recommended that you don't use Automatic sync for all pages. In particular, math pages will lose all equations after a sync. For example, https://en.wikipedia.org/wiki/Pythagorean_theorem
- v3.9.4.1 now syncs Math images
No rollback option
XOWA stores only the latest version of a page. Previous versions of the page will not be available after synchronization. A rollback option will be added in a future version.
If you synchronize and want to roll back, you will need to do it manually for all updates. For example, here's a scenario for English Wikipedia:
- Backup en.wikipedia.org-core.xowa.
- Backup en.wikipedia.org-html.user.xowa and en.wikipedia.org-sync.xowa if they exist.
- Synchronize https://en.wikipedia.org/wiki/Pythagorean_theorem
- Realize that the synchronization is bad and start the rollback.
- Exit XOWA
- Restore en.wikipedia.org-core.xowa
- If backups exist, restore en.wikipedia.org-html.user.xowa and en.wikipedia.org-sync.xowa
- If backups don't exist, delete en.wikipedia.org-html.user.xowa and en.wikipedia.org-sync.xowa
- Run XOWA and go to https://en.wikipedia.org/wiki/Pythagorean_theorem
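The backup and restore steps above can be sketched as a small Python script. This is only an illustration, not part of XOWA: the file names come from the scenario above, while the `backup` and `rollback` helpers and the directory arguments are hypothetical, and you would need to point them at wherever your wiki files actually live. It needs Python 3.8 or later.

```python
import shutil
from pathlib import Path

# File names follow the English Wikipedia scenario above.
CORE = "en.wikipedia.org-core.xowa"
OPTIONAL = ["en.wikipedia.org-html.user.xowa", "en.wikipedia.org-sync.xowa"]


def backup(wiki_dir: Path, backup_dir: Path) -> None:
    """Copy the core database, plus the html.user and sync databases if they exist."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    shutil.copy2(wiki_dir / CORE, backup_dir / CORE)
    for name in OPTIONAL:
        if (wiki_dir / name).exists():
            shutil.copy2(wiki_dir / name, backup_dir / name)


def rollback(wiki_dir: Path, backup_dir: Path) -> None:
    """Restore the backup. Run this only after exiting XOWA."""
    shutil.copy2(backup_dir / CORE, wiki_dir / CORE)
    for name in OPTIONAL:
        saved = backup_dir / name
        if saved.exists():
            shutil.copy2(saved, wiki_dir / name)
        else:
            # No backup exists, so the file was created by the sync: delete it.
            (wiki_dir / name).unlink(missing_ok=True)
```

Call `backup(...)` before synchronizing, and `rollback(...)` with the same two directories after exiting XOWA.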
No synchronization for new pages
A page must exist in the offline wiki in order to be synchronized. New pages cannot be synchronized. Navigating to a new page will just result in a "Page not found" error.
This feature will also be added in a future version
Other issues
Other issues may be present. Please check https://github.com/gnosygnu/xowa/issues/72 for updates. Once the issue is closed, then the Page Sync feature will no longer be marked "Work in progress" and should be fully operational
Usage notes
Manual sync
Manual sync works by doing the following:
- Enable "manual sync" in the options page
- Click the "Sync" link in the left-hand sidebar.
Note the following details:
- online mode required: You must have "Web access enabled" in Options/Security
- exclusions: Page sync does not work for the following pages:
- home wiki pages: For example, this page. The home wiki is updated offline with every release. The latest version can also be viewed online at http://xowa.org
- Special pages: Nearly all special pages have dynamic content and cannot be "mirrored" offline
- Wikia / non-Wikimedia pages: Wikia wikis and non-Wikimedia wikis cannot be synchronized. This may be an option for a future release, but there are currently no plans.
Auto sync
Automatic sync works by doing the following:
- Enable "auto sync" in the options page
- Visit the Main_Page for a wiki. The page will automatically sync
Note the following details:
- default page is Main_Page: Due to the issues above, it's recommended that auto-sync only be enabled for the Main Page. Other pages can be added under custom scope
- default interval is 1440 minutes (24 hours): By default, XOWA will only synchronize a page if the last synchronization check is at least 24 hours old. This interval can also be adjusted
Technical details
Manual sync
This is an overview of what occurs when the Sync link is pressed:
- XOWA calls the Wikipedia API to get the HTML version of the page. For example:
https://en.wikipedia.org/w/api.php?action=parse&format=json&redirects=1&page=Wikipedia:Main%20Page
- XOWA parses the HTML and:
- Removes the Edit links (These aren't implemented in XOWA. They can be but I personally find them distracting and not applicable offline.)
- Identifies images to download.
- XOWA saves the HTML to "en.wikipedia.org-html.user.xowa"
- XOWA updates the core database (en.wikipedia.org-core.xowa) to point to this HTML
- XOWA then downloads the images separately
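The first two steps can be illustrated with a short Python sketch. It is not XOWA's actual code (XOWA is written in Java), and every name in it (`build_parse_url`, `SyncCleaner`, `clean_parse_response`) is made up for this example. It assumes the default JSON layout of `action=parse`, where the HTML sits under `parse.text["*"]`, and it assumes that edit links are the `<span class="mw-editsection">` elements MediaWiki normally emits. Saving to the databases and downloading images are left out.

```python
import json
from html.parser import HTMLParser
from urllib.parse import quote, urlencode

VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


def build_parse_url(domain: str, page: str) -> str:
    """Build the api.php URL that returns the parsed HTML of a page."""
    query = urlencode(
        {"action": "parse", "format": "json", "redirects": 1, "page": page},
        quote_via=quote,
    )
    return f"https://{domain}/w/api.php?{query}"


class SyncCleaner(HTMLParser):
    """Drops edit-section spans and records the src of every <img>."""

    def __init__(self) -> None:
        super().__init__(convert_charrefs=False)
        self.out = []        # pieces of the cleaned HTML
        self.images = []     # image URLs to download later
        self.skip_depth = 0  # > 0 while inside an edit-section span

    def _open(self, tag, attrs, self_closing):
        attr = dict(attrs)
        if self.skip_depth:
            if not self_closing and tag not in VOID_TAGS:
                self.skip_depth += 1
            return
        if tag == "span" and "mw-editsection" in (attr.get("class") or "").split():
            self.skip_depth = 1
            return
        if tag == "img" and attr.get("src"):
            self.images.append(attr["src"])
        self.out.append(self.get_starttag_text())

    def handle_starttag(self, tag, attrs):
        self._open(tag, attrs, False)

    def handle_startendtag(self, tag, attrs):
        self._open(tag, attrs, True)

    def handle_endtag(self, tag):
        if self.skip_depth:
            if tag not in VOID_TAGS:
                self.skip_depth -= 1
            return
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(data)

    def handle_entityref(self, name):
        if not self.skip_depth:
            self.out.append(f"&{name};")

    def handle_charref(self, name):
        if not self.skip_depth:
            self.out.append(f"&#{name};")

    def handle_comment(self, data):
        if not self.skip_depth:
            self.out.append(f"<!--{data}-->")


def clean_parse_response(body: str):
    """Return (cleaned_html, image_urls) for an action=parse JSON response body."""
    html = json.loads(body)["parse"]["text"]["*"]
    cleaner = SyncCleaner()
    cleaner.feed(html)
    cleaner.close()
    return "".join(cleaner.out), cleaner.images
```

To try it, fetch the URL from `build_parse_url("en.wikipedia.org", "Pythagorean theorem")` with any HTTP client and pass the response body to `clean_parse_response`.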
Automatic sync
Automatic sync uses the same process as Manual Sync. The main process is as follows:
- XOWA opens a page
- XOWA checks if auto sync is enabled
- If auto-sync is enabled, then it checks the sync time in "en.wikipedia.org-sync.xowa"
- If the sync time doesn't exist, or is older than the specified interval (24 hours by default), then XOWA kicks off the manual sync process
- XOWA updates the sync time for the page
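The interval check in this process can be sketched in a few lines of Python. The function name `sync_is_due` and its parameters are hypothetical. The sketch only models the decision, not how XOWA reads or writes "en.wikipedia.org-sync.xowa". It treats a missing sync time as "never synced" and uses the 1440-minute default from the usage notes.

```python
from datetime import datetime, timedelta
from typing import Optional

DEFAULT_INTERVAL_MINUTES = 1440  # 24 hours, the default interval


def sync_is_due(last_sync: Optional[datetime], now: datetime,
                interval_minutes: int = DEFAULT_INTERVAL_MINUTES) -> bool:
    """Return True if the page should be synchronized now."""
    if last_sync is None:
        # No sync time is recorded for this page yet.
        return True
    # Sync only if the last check is at least one full interval old.
    return now - last_sync >= timedelta(minutes=interval_minutes)
```

For example, a page last synced 23 hours ago is skipped under the default interval, but it would be synced again if the interval were lowered to 60 minutes.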