File-sizes.html 11 KB

  1. <!DOCTYPE html>
  2. <html dir="ltr">
  3. <head>
  4. <meta http-equiv="content-type" content="text/html;charset=UTF-8" />
  5. <title>Dev/Design/Sqlite/File-sizes - XOWA</title>
  6. <link rel="shortcut icon" href="https://gnosygnu.github.io/xowa/xowa_logo.png" />
  7. <link rel="stylesheet" href="https://gnosygnu.github.io/xowa/xowa_common.css" type="text/css">
  8. </head>
  9. <body class="mediawiki ltr sitedir-ltr ns-0 ns-subject skin-vector action-submit vector-animateLayout" spellcheck="false">
  10. <div id="mw-page-base" class="noprint"></div>
  11. <div id="mw-head-base" class="noprint"></div>
  12. <div id="content" class="mw-body">
  13. <h1 id="firstHeading" class="firstHeading"><span>Dev/Design/Sqlite/File-sizes</span></h1>
  14. <div id="bodyContent" class="mw-body-content">
  15. <div id="siteSub">From XOWA: the free, open-source, offline wiki application</div>
  16. <div id="contentSub"></div>
  17. <div id="mw-content-text" lang="en" dir="ltr" class="mw-content-ltr">
  18. <p>
  19. The XOWA sqlite import currently defaults to a multi-file format. This format is chosen for two reasons:
  20. </p>
  21. <ul>
  22. <li>
  23. <b>Large wikis and FAT32</b>:
  24. <ul>
  25. <li>
  26. Most flash memory cards use a <a href="http://en.wikipedia.org/wiki/FAT32" rel="nofollow" class="external text">FAT32</a> file system. FAT32 is particularly convenient when exchanging files between Windows, Linux, Mac OS X and Android.
  27. </li>
  28. <li>
  29. FAT32 has a limit of 4 GB for any one file. A large wiki like en.wikipedia.org will easily take 20 GB.
  30. </li>
  31. <li>
  32. Multiple files allow the 20 GB of data to be broken into smaller pieces, each less than 4 GB.
  33. </li>
  34. </ul>
  35. </li>
  36. </ul>
  37. <ul>
  38. <li>
  39. <b>Slight performance gains</b>
  40. </li>
  41. </ul>
  42. <dl>
  43. <dd>
  44. A smaller database file may be easier to query than a large one because its pages will be grouped closer together on disk.
  45. </dd>
  46. <dd>
  47. For example, consider a wiki page that requires 50 template pages.
  48. <dl>
  49. <dd>
  50. With a single-file format, these 50 pages may be scattered anywhere over the 20 GB file.
  51. </dd>
  52. <dd>
  53. With a multi-file format, these 50 pages may be scattered anywhere over a smaller 280 MB file. A disk drive will have to seek over a smaller section of disk. For a smaller wiki, the entire template file may be stored in the hard disk cache.
  54. </dd>
  55. </dl>
  56. </dd>
  57. </dl>
  58. <p>
  59. The file format is controlled by the following arguments:
  60. </p>
  61. <div id="toc" class="toc">
  62. <div id="toctitle" class="toctitle">
  63. <h2>
  64. Contents
  65. </h2>
  66. </div>
  67. <ul>
  68. <li class="toclevel-1 tocsection-1">
  69. <a href="#ns_file_map"><span class="tocnumber">1</span> <span class="toctext">ns_file_map</span></a>
  70. </li>
  71. <li class="toclevel-1 tocsection-2">
  72. <a href="#db_text_max_value"><span class="tocnumber">2</span> <span class="toctext">db_text_max value</span></a>
  73. </li>
  74. <li class="toclevel-1 tocsection-3">
  75. <a href="#db_categorylink_max_and_db_wikidata_max_value"><span class="tocnumber">3</span> <span class="toctext">db_categorylink_max value</span></a>
  76. </li>
  77. <li class="toclevel-1 tocsection-4">
  78. <a href="#db_wikidata_max_value"><span class="tocnumber">4</span> <span class="toctext">db_wikidata_max value</span></a>
  79. </li>
  80. </ul>
  81. </div>
  82. <h2>
  83. <span class="mw-headline" id="ns_file_map">ns_file_map</span>
  84. </h2>
  85. <p>
  86. The ns_file_map argument is a string delimited by newlines and semicolons. The default value is the following:
  87. </p>
  88. <pre>
  89. Template;Module
  90. </pre>
  91. <p>
  92. Note that each line has a list of namespace names. Multiple namespaces can be delimited with the ";". The namespace name must be the "canonical" English name.
  93. </p>
  94. <p>
  95. Note that with an empty string, everything is stored in the core database. If a single-file database is desired, specify "".
  96. </p>
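  <p>
  As an illustration of the format described above, a hypothetical two-line value might look like the following. The second line and its namespace are assumptions made only for this example and are not a default or recommended setting. How each line maps to a database file is not stated on this page, so treat this as a sketch of the syntax only.
  </p>
  <pre>
  Template;Module
  Help
  </pre>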
  97. <h2>
  98. <span class="mw-headline" id="db_text_max_value">db_text_max value</span>
  99. </h2>
  100. <p>
  101. This number is the maximum amount of text data, in MB, that can be stored in one file. Note the following:
  102. </p>
  103. <ul>
  104. <li>
  105. Once a file reaches that number, it will spill over into a new file.
  106. </li>
  107. </ul>
  108. <dl>
  109. <dd>
  110. For example, file 002 is the text database. After 3,000 MB of text data is stored in file 002, the next 3,000 MB of text data will be stored in file 003.
  111. </dd>
  112. </dl>
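  <p>
  The spill-over described above can be sketched with a worked example. The 7,500 MB total is a hypothetical figure, and the name of file 004 is an assumption that simply continues the numbering of files 002 and 003. Actual cut-off points are approximate.
  </p>
  <pre>
  db_text_max                         = 3000
  text data to import (hypothetical)  = 7,500 MB

  file 002: first 3,000 MB
  file 003: next 3,000 MB
  file 004: remaining 1,500 MB
  </pre>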
  113. <ul>
  114. <li>
  115. The number is a rough approximation of total database size. A precise value cannot be used because of the following non-deterministic variables:
  116. <ul>
  117. <li>
  118. Sqlite database page size (data / indexes will not fill up an entire page)
  119. </li>
  120. <li>
  121. Sqlite table / database overhead
  122. </li>
  123. </ul>
  124. </li>
  125. </ul>
  126. <dl>
  127. <dd>
  128. As such, please use a number that is at most 80% of the desired size. For example, if you want a database no greater than 4,000 MB (4.0 GB), use a value no higher than 3,200; a value of 3,000 leaves some extra margin.
  129. </dd>
  130. </dl>
  131. <h2>
  132. <span class="mw-headline" id="db_categorylink_max_and_db_wikidata_max_value">db_categorylink_max value</span>
  133. </h2>
  134. <p>
  135. This is a number that represents the maximum number of MB of categorylink data that can be stored in the file. Note the following:
  136. </p>
  137. <ul>
  138. <li>
  139. This number functions similarly to the db_text_max value above. (Once the max is reached, new data will spill over into a new file)
  140. </li>
  141. <li>
  142. However, it is more precise than db_text_max: the number specified is about 90% of the actual file size (presumably due to less page fragmentation).
  143. </li>
  144. </ul>
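  <p>
  As a worked example, reading the 90% figure above as "value specified = 0.90 x resulting file size" (an interpretation, not something this page confirms), a hypothetical 4,000 MB target would be sized as follows:
  </p>
  <pre>
  desired maximum file size : 4,000 MB
  value to specify          : 4,000 x 0.90 = 3,600
  </pre>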
  145. <h2>
  146. <span class="mw-headline" id="db_wikidata_max_value">db_wikidata_max value</span>
  147. </h2>
  148. <p>
  149. This is a number that represents the maximum number of MB of wikidata label data that can be stored in the file. Note the following:
  150. </p>
  151. <ul>
  152. <li>
  153. This number only affects the www.wikidata.org wiki.
  154. </li>
  155. <li>
  156. This number is treated as a flag: only the difference between 0 and non-zero matters.
  157. <ul>
  158. <li>
  159. To put all wikidata data in one database, use 0
  160. </li>
  161. <li>
  162. To split wikidata data into a separate database, use any number &gt; 0
  163. </li>
  164. </ul>
  165. </li>
  166. </ul>
  167. </div>
  168. </div>
  169. </div>
  170. <div id="mw-head" class="noprint">
  171. <div id="left-navigation">
  172. <div id="p-namespaces" class="vectorTabs">
  173. <h3>Namespaces</h3>
  174. <ul>
  175. <li id="ca-nstab-main" class="selected"><span><a id="ca-nstab-main-href" href="index.html">Page</a></span></li>
  176. </ul>
  177. </div>
  178. </div>
  179. </div>
  180. <div id='mw-panel' class='noprint'>
  181. <div id='p-logo'>
  182. <a style="background-image: url(https://gnosygnu.github.io/xowa/xowa_logo.png);" href="http://xowa.org/" title="Visit the main page"></a>
  183. </div>
  184. <div class="portal" id='xowa-portal-home'>
  185. <h3>XOWA</h3>
  186. <div class="body">
  187. <ul>
  188. <li><a href="http://xowa.org/index.html" title='Visit the main page'>Main page</a></li>
  189. <li><a href="http://xowa.org/screenshots.html" title='See screenshots of XOWA'>Screenshots</a></li>
  190. <li><a href="https://www.youtube.com/watch?v=q0qbXYXEH6M" title="See a video of XOWA Desktop in action">Video</a></li>
  191. <li><a href="http://xowa.org/home/wiki/Help/Download_XOWA.html" title='Download the XOWA application'>Download XOWA</a></li>
  192. <li><a href="http://xowa.org/home/wiki/Dashboard/Image_databases.html" title='Download offline wikis and image databases'>Download wikis</a></li>
  193. </ul>
  194. </div>
  195. </div>
  196. <div class="portal" id='xowa-portal-started'>
  197. <h3>Getting started</h3>
  198. <div class="body">
  199. <ul>
  200. <li><a href="http://xowa.org/home/wiki/App/Setup/System_requirements.html" title='Get XOWA&apos;s system requirements'>Requirements</a></li>
  201. <li><a href="http://xowa.org/home/wiki/App/Setup/Installation.html" title='Get instructions for installing XOWA'>Installation</a></li>
  202. <li><a href="http://xowa.org/home/wiki/App/Import/Simple_Wikipedia.html" title='Learn how to set up Simple Wikipedia'>Simple Wikipedia</a></li>
  203. <li><a href="http://xowa.org/home/wiki/App/Import/English_Wikipedia.html" title='Learn how to set up English Wikipedia'>English Wikipedia</a></li>
  204. <li><a href="http://xowa.org/home/wiki/App/Import/Other_wikis.html" title='Learn how to set up other Wikipedias'>Other Wikipedias</a></li>
  205. </ul>
  206. </div>
  207. </div>
  208. <div class="portal" id='xowa-portal-android'>
  209. <h3>Android</h3>
  210. <div class="body">
  211. <ul>
  212. <li><a href="http://xowa.org/home/wiki/Android/Setup.html" title='Setup XOWA on your Android device'>Setup</a></li>
  213. <li><a href="https://www.youtube.com/watch?v=jsMTBxGweUw" title="See a video of XOWA Android in action">Video</a></li>
  214. </ul>
  215. </div>
  216. </div>
  217. <div class="portal" id='xowa-portal-help'>
  218. <h3>Help</h3>
  219. <div class="body">
  220. <ul>
  221. <li><a href="http://xowa.org/home/wiki/Help/About.html" title='Get more information about XOWA'>About</a></li>
  222. <li><a href="http://xowa.org/home/wiki/Help/Contents.html" title='View a list of help topics'>Contents</a></li>
  223. <li><a href="http://xowa.org/home/wiki/Help/Media.html" title='Read what others have written about XOWA'>Media</a></li>
  224. <li><a href="http://xowa.org/home/wiki/Help/Feedback.html" title='Questions? Comments? Leave feedback for XOWA'>Feedback</a></li>
  225. </ul>
  226. </div>
  227. </div>
  228. <div class="portal" id='xowa-portal-blog'>
  229. <h3>Blog</h3>
  230. <div class="body">
  231. <ul>
  232. <li><a href="http://xowa.org/home/wiki/Blog.html" title='Follow XOWA&apos;s development process'>Current</a></li>
  233. </ul>
  234. </div>
  235. </div>
  236. <div class="portal" id='xowa-portal-links'>
  237. <h3>Links</h3>
  238. <div class="body">
  239. <ul>
  240. <li><a href="http://dumps.wikimedia.org/backup-index.html" title="Get wiki database dumps directly from Wikimedia">Wikimedia dumps</a></li>
  241. <li><a href="https://archive.org/search.php?query=xowa" title="Search archive.org for XOWA files">XOWA @ archive.org</a></li>
  242. <li><a href="http://en.wikipedia.org" title="Visit Wikipedia (and compare to XOWA!)">English Wikipedia</a></li>
  243. </ul>
  244. </div>
  245. </div>
  246. <div class="portal" id='xowa-portal-donate'>
  247. <h3>Donate</h3>
  248. <div class="body">
  249. <ul>
  250. <li><a href="https://archive.org/donate/index.php" title="Support archive.org!">archive.org</a></li><!-- listed first due to recent fire damages: http://blog.archive.org/2013/11/06/scanning-center-fire-please-help-rebuild/ -->
  251. <li><a href="https://donate.wikimedia.org/wiki/Special:FundraiserRedirector" title="Support Wikipedia!">Wikipedia</a></li>
  252. <li><a href="http://xowa.org/home/wiki/Help/Donate.html" title="Support XOWA!">XOWA</a></li>
  253. </ul>
  254. </div>
  255. </div>
  256. </div>
  257. </body>
  258. </html>