<!DOCTYPE html>
<html dir="ltr">
<head>
<meta http-equiv="content-type" content="text/html;charset=UTF-8" />
<title>App/Category - XOWA</title>
<link rel="shortcut icon" href="https://gnosygnu.github.io/xowa/xowa_logo.png" />
<link rel="stylesheet" href="https://gnosygnu.github.io/xowa/xowa_common.css" type="text/css">
</head>
<body class="mediawiki ltr sitedir-ltr ns-0 ns-subject skin-vector action-submit vector-animateLayout" spellcheck="false">
<div id="mw-page-base" class="noprint"></div>
<div id="mw-head-base" class="noprint"></div>
<div id="content" class="mw-body">
<h1 id="firstHeading" class="firstHeading"><span>App/Category</span></h1>
<div id="bodyContent" class="mw-body-content">
<div id="siteSub">From XOWA: the free, open-source, offline wiki application</div>
<div id="contentSub"></div>
<div id="mw-content-text" lang="en" dir="ltr" class="mw-content-ltr">
<p>
As of v3.9.2.1, XOWA has one Category system: v3.
</p>
<ul>
<li>
The first part of this page discusses v3.
</li>
<li>
The second part of this page is an archived copy of the earlier v1 and v2 explanation.
</li>
</ul>
<h2>
<span class="mw-headline" id="Version_3">Version 3</span>
</h2>
<p>
Version 3 was introduced to handle Categories for HTML dumps on PC and Android. The high-level details are as follows:
</p>
<ul>
<li>
<b>Uses the Wikimedia categorylinks and page_props dumps</b>: Like v2, v3 downloads separate MediaWiki dumps for categorylinks.sql and page_props.sql. Both files are needed to generate accurate renditions of the Wikipedia Category system.
</li>
<li>
<b>Generates "*-xtn.category.*.xowa" files</b>: v3 stores all the category info in "*-xtn.category.*.xowa" files. For smaller wikis, the data is stored in the "-core.xowa" file instead.
</li>
<li>
<b>Works with both Wikitext databases and HTML databases</b>: v3 will work with wikis imported by "Import/Online" (Wikitext) as well as "Download Central" (HTML).
</li>
<li>
<b>Is backwards compatible with v1 and v2</b>: XOWA v3.9.2.1 will work with the v1, v2, and v3 category systems.
</li>
<li>
<b>Is generated automatically with import</b>: v3 is now generated automatically when importing a wiki. Previously, v2 required a separate post-processing step under Import/Offline.
</li>
<li>
<b>Smaller size</b>: v3 makes some database changes to reduce file size. For English Wikipedia, that means about 8 GB instead of about 10 GB. 8 GB may sound like a lot for Categories, but keep in mind there are over 100 million page-to-category links.
</li>
<li>
<b>Does not work for text database dumps</b>: XOWA originally stored its data in text files instead of SQLite files. I switched over to SQLite three years ago and phased out text databases two years ago. It's possible that some users with old wikis (three or more years old) may still have these text databases. If so, the new Category system won't work with them.
</li>
</ul>
<hr>
<h2>
<span class="mw-headline" id="Version_1">Version 1</span>
</h2>
<p>
Version 1 is a simplistic category system.
</p>
<ul>
<li>
It relies only on page content inside the XML file. It does not use any of the category*.sql dumps.
</li>
</ul>
<p>
Note the following limitations:
</p>
<ul>
<li>
Does not work with large categories. Load time gets linearly worse with more members (do not use it to load a category with over 10,000 members).
</li>
<li>
Does not support paging. If a category has 1,000 members, it will load title information on all 1,000 (instead of just the first 200).
</li>
<li>
Does not use sortkey. For example, Jimmy Wales will alphabetize under J (for Jimmy Wales) instead of W (for Wales, Jimmy).
</li>
<li>
Does not accurately reflect page membership in categories.
</li>
</ul>
<dl>
<dd>
For example, most hidden categories are added through a template that is then included in a page.
</dd>
<dd>
Specifically, a page called "File:GNU.png" may belong to "All free media". However, the "File:GNU.png" page does not contain [[Category:All_free_media]] directly; instead, it embeds a template, {{All_non_free_media}}, which contains [[Category:All_free_media]].
</dd>
<dd>
Since a full parse (with templates) of the entire XML file would take many hours, this membership data is omitted.
</dd>
</dl>
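<p>
As a rough illustration of why scanning raw page text misses this kind of membership, the sketch below looks only for literal [[Category:...]] links. It is not XOWA's code: the regular expression, the function name direct_categories, and the sample page and template text are all assumptions made for this example.
</p>

```python
import re

# Matches [[Category:Name]] or [[Category:Name|sortkey]] in raw wikitext.
# This pattern is an assumption for illustration, not XOWA's parser.
CATEGORY_LINK = re.compile(r"\[\[Category:([^\]|]+)(?:\|[^\]]*)?\]\]")


def direct_categories(wikitext):
    """Return category names that appear literally in the given text."""
    return [name.strip().replace("_", " ") for name in CATEGORY_LINK.findall(wikitext)]


# Hypothetical page and template contents, modeled on the example above.
page_text = "{{All_non_free_media}} A picture of the GNU mascot."
template_text = "[[Category:All_free_media]]"

# Scanning the page text alone finds nothing.
found_in_page = direct_categories(page_text)

# Only after the template text is substituted in does the category show up.
expanded_text = page_text.replace("{{All_non_free_media}}", template_text)
found_after_expansion = direct_categories(expanded_text)
```

<p>
Here the page text alone yields no categories. The category appears only once the template text is substituted in, and that expansion is the expensive step that v1 omits.
</p>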
<p>
V1 should be considered obsolete. No significant changes will be made to it, as V2 is the official category system.
</p>
<p>
However, because V1 is faster to set up than V2, it still remains the default (with a strong recommendation to upgrade to V2 when time permits).
</p>
<h2>
<span class="mw-headline" id="Version_2">Version 2</span>
</h2>
<p>
Version 2 is an accurate category system.
</p>
<ul>
<li>
It uses the Wikimedia dump files: categorylinks.sql, page_props.sql
</li>
</ul>
<p>
It addresses each of the limitations of version 1, including
</p>
<ul>
<li>
Works with large categories
</li>
<li>
Supports paging
</li>
<li>
Uses sortkey
</li>
<li>
Accurately includes all members of a category
</li>
</ul>
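<p>
The paging and sortkey behavior listed above can be sketched roughly as follows. This is a minimal illustration, not XOWA's implementation: the function name category_page, the sample members, and the simple uppercase comparison are assumptions (MediaWiki's real collation is more involved). The page size of 200 is the figure mentioned in the Version 1 limitations.
</p>

```python
PAGE_SIZE = 200  # members shown per category page, per the v1 discussion

# Hypothetical (page title, sortkey) pairs for illustration.
members = [
    ("Jimmy Wales", "Wales, Jimmy"),
    ("Larry Sanger", "Sanger, Larry"),
    ("Ward Cunningham", "Cunningham, Ward"),
]


def category_page(members, page_index, page_size=PAGE_SIZE):
    """Return the titles on one page of a category, ordered by sortkey."""
    # Order by sortkey rather than by title, so "Jimmy Wales" files under W.
    ordered = sorted(members, key=lambda member: member[1].upper())
    # Slice out only the requested page instead of returning every member.
    start = page_index * page_size
    return [title for title, _sortkey in ordered[start:start + page_size]]


first_page = category_page(members, 0)
small_pages = [category_page(members, i, page_size=2) for i in range(2)]
```

<p>
With these sample members, ordering by sortkey places Ward Cunningham first and Jimmy Wales last, and a page size of 2 splits the three members across two pages.
</p>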
<p>
It has a few limitations:
</p>
<ul>
<li>
It requires additional dump files (as mentioned above).
</li>
<li>
It takes longer to set up. A separate .sql file must be parsed. For English Wikipedia, this process takes about an additional hour.
</li>
<li>
It takes more disk space. The v2 system stores sortkeys individually per entry (just like Wikipedia). However, this text data greatly increases the overall file size. English Wikipedia will have about 10.0 GB of extra data.
</li>
</ul>
<p>
V2 is the official category system and should generate Category pages just like Wikipedia.
</p>
<p>
For more information about V2 setup, see <a href="/wiki/App/Category/Building" id="xolnki_2" title="App/Category/Building">App/Category/Building</a>.
</p>
<p>
For more information about V2 internals, see <a href="/wiki/App/Category/Internals" id="xolnki_3" title="App/Category/Internals">App/Category/Internals</a>.
</p>
</div>
</div>
</div>
<div id="mw-head" class="noprint">
<div id="left-navigation">
<div id="p-namespaces" class="vectorTabs">
<h3>Namespaces</h3>
<ul>
<li id="ca-nstab-main" class="selected"><span><a id="ca-nstab-main-href" href="index.html">Page</a></span></li>
</ul>
</div>
</div>
</div>
<div id='mw-panel' class='noprint'>
<div id='p-logo'>
<a style="background-image: url(https://gnosygnu.github.io/xowa/xowa_logo.png);" href="http://xowa.org/" title="Visit the main page"></a>
</div>
<div class="portal" id='xowa-portal-home'>
<h3>XOWA</h3>
<div class="body">
<ul>
<li><a href="http://xowa.org/index.html" title='Visit the main page'>Main page</a></li>
<li><a href="http://xowa.org/screenshots.html" title='See screenshots of XOWA'>Screenshots</a></li>
<li><a href="https://www.youtube.com/watch?v=q0qbXYXEH6M" title="See a video of XOWA Desktop in action">Video</a></li>
<li><a href="http://xowa.org/home/wiki/Help/Download_XOWA.html" title='Download the XOWA application'>Download XOWA</a></li>
<li><a href="http://xowa.org/home/wiki/Dashboard/Image_databases.html" title='Download offline wikis and image databases'>Download wikis</a></li>
</ul>
</div>
</div>
<div class="portal" id='xowa-portal-started'>
<h3>Getting started</h3>
<div class="body">
<ul>
<li><a href="http://xowa.org/home/wiki/App/Setup/System_requirements.html" title='Get XOWA&apos;s system requirements'>Requirements</a></li>
<li><a href="http://xowa.org/home/wiki/App/Setup/Installation.html" title='Get instructions for installing XOWA'>Installation</a></li>
<li><a href="http://xowa.org/home/wiki/App/Import/Simple_Wikipedia.html" title='Learn how to set up Simple Wikipedia'>Simple Wikipedia</a></li>
<li><a href="http://xowa.org/home/wiki/App/Import/English_Wikipedia.html" title='Learn how to set up English Wikipedia'>English Wikipedia</a></li>
<li><a href="http://xowa.org/home/wiki/App/Import/Other_wikis.html" title='Learn how to set up other Wikipedias'>Other Wikipedias</a></li>
</ul>
</div>
</div>
<div class="portal" id='xowa-portal-android'>
<h3>Android</h3>
<div class="body">
<ul>
<li><a href="http://xowa.org/home/wiki/Android/Setup.html" title='Setup XOWA on your Android device'>Setup</a></li>
<li><a href="https://www.youtube.com/watch?v=jsMTBxGweUw" title="See a video of XOWA Android in action">Video</a></li>
</ul>
</div>
</div>
<div class="portal" id='xowa-portal-help'>
<h3>Help</h3>
<div class="body">
<ul>
<li><a href="http://xowa.org/home/wiki/Help/About.html" title='Get more information about XOWA'>About</a></li>
<li><a href="http://xowa.org/home/wiki/Help/Contents.html" title='View a list of help topics'>Contents</a></li>
<li><a href="http://xowa.org/home/wiki/Help/Media.html" title='Read what others have written about XOWA'>Media</a></li>
<li><a href="http://xowa.org/home/wiki/Help/Feedback.html" title='Questions? Comments? Leave feedback for XOWA'>Feedback</a></li>
</ul>
</div>
</div>
<div class="portal" id='xowa-portal-blog'>
<h3>Blog</h3>
<div class="body">
<ul>
<li><a href="http://xowa.org/home/wiki/Blog.html" title='Follow XOWA&apos;s development process'>Current</a></li>
</ul>
</div>
</div>
<div class="portal" id='xowa-portal-links'>
<h3>Links</h3>
<div class="body">
<ul>
<li><a href="http://dumps.wikimedia.org/backup-index.html" title="Get wiki database dumps directly from Wikimedia">Wikimedia dumps</a></li>
<li><a href="https://archive.org/search.php?query=xowa" title="Search archive.org for XOWA files">XOWA @ archive.org</a></li>
<li><a href="http://en.wikipedia.org" title="Visit Wikipedia (and compare to XOWA!)">English Wikipedia</a></li>
</ul>
</div>
</div>
<div class="portal" id='xowa-portal-donate'>
<h3>Donate</h3>
<div class="body">
<ul>
<li><a href="https://archive.org/donate/index.php" title="Support archive.org!">archive.org</a></li><!-- listed first due to recent fire damages: http://blog.archive.org/2013/11/06/scanning-center-fire-please-help-rebuild/ -->
<li><a href="https://donate.wikimedia.org/wiki/Special:FundraiserRedirector" title="Support Wikipedia!">Wikipedia</a></li>
<li><a href="http://xowa.org/home/wiki/Help/Donate.html" title="Support XOWA!">XOWA</a></li>
</ul>
</div>
</div>
</div>
</body>
</html>