123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172 |
- #
- # Run this Tcl script to generate the sqlite.html file.
- #
- set rcsid {$Id: arch.tcl,v 1.7 2001/11/24 13:23:05 drh Exp $}
- puts {<html>
- <head>
- <title>Architecture of SQLite</title>
- </head>
- <body bgcolor=white>
- <h1 align=center>
- The Architecture Of SQLite
- </h1>}
- puts "<p align=center>
- (This page was last modified on [lrange $rcsid 3 4] UTC)
- </p>"
- puts {
- <h2>Introduction</h2>
- <table align="right" border="1" cellpadding="15" cellspacing="1">
- <tr><th>Block Diagram Of SQLite</th></tr>
- <tr><td><img src="arch.png"></td></tr>
- </table>
- <p>This document describes the architecture of the SQLite library.
- The information here is useful to those who want to understand or
- modify the inner workings of SQLite.
- </p>
- <p>
- A block diagram showing the main components of SQLite
- and how they interrelate is shown at the right. The text that
- follows will provide a quick overview of each of these components.
- </p>
- <h2>Interface</h2>
- <p>Most of the public interface to the SQLite library is implemented by
- four functions found in the <b>main.c</b> source file. The
- <b>sqlite_get_table()</b> routine is implemented in <b>table.c</b>.
- The Tcl interface is implemented by <b>tclsqlite.c</b>. More
- information on the C interface to SQLite is
- <a href="c_interface.html">available separately</a>.<p>
- <p>To avoid name collisions with other software, all external
- symbols in the SQLite library begin with the prefix <b>sqlite</b>.
- Those symbols that are intended for external use (in other words,
- those symbols which form the API for SQLite) begin
- with <b>sqlite_</b>.</p>
- <h2>Tokenizer</h2>
- <p>When a string containing SQL statements is to be executed, the
- interface passes that string to the tokenizer. The job of the tokenizer
- is to break the original string up into tokens and pass those tokens
- one by one to the parser. The tokenizer is hand-coded in C.
- All of the code for the tokenizer
- is contained in the <b>tokenize.c</b> source file.</p>
- <p>Note that in this design, the tokenizer calls the parser. People
- who are familiar with YACC and BISON may be used to doing things the
- other way around -- having the parser call the tokenizer. The author
- of SQLite
- has done it both ways and finds things generally work out nicer for
- the tokenizer to call the parser. YACC has it backwards.</p>
- <h2>Parser</h2>
- <p>The parser is the piece that assigns meaning to tokens based on
- their context. The parser for SQLite is generated using the
- <a href="http://www.hwaci.com/sw/lemon/">Lemon</a> LALR(1) parser
- generator. Lemon does the same job as YACC/BISON, but is uses
- a different input syntax which is less error-prone.
- Lemon also generates a parser which is reentrant and thread-safe.
- And lemon defines the concept of a non-terminal destructor so
- that it does not leak memory when syntax errors are encountered.
- The source file that drives Lemon is found in <b>parse.y</b>.</p>
- <p>Because
- lemon is a program not normally found on development machines, the
- complete source code to lemon (just one C file) is included in the
- SQLite distribution in the "tool" subdirectory. Documentation on
- lemon is found in the "doc" subdirectory of the distribution.
- </p>
- <h2>Code Generator</h2>
- <p>After the parser assembles tokens into complete SQL statements,
- it calls the code generator to produce virtual machine code that
- will do the work that the SQL statements request. There are seven
- files in the code generator: <b>build.c</b>, <b>delete.c</b>,
- <b>expr.c</b>, <b>insert.c</b> <b>select.c</b>, <b>update.c</b>,
- and <b>where.c</b>.
- In these files is where most of the serious magic happens.
- <b>expr.c</b> handles code generation for expressions.
- <b>where.c</b> handles code generation for WHERE clauses on
- SELECT, UPDATE and DELETE statements. The files
- <b>delete.c</b>, <b>insert.c</b>, <b>select.c</b>, and
- <b>update.c</b> handle the code generation for SQL statements
- with the same names. (Each of these files calls routines
- in <b>expr.c</b> and <b>where.c</b> as necessary.) All other
- SQL statements are coded out of <b>build.c</b>.</p>
- <h2>Virtual Machine</h2>
- <p>The program generated by the code generator is executed by
- the virtual machine. Additional information about the virtual
- machine is <a href="opcode.html">available separately</a>.
- To summarize, the virtual machine implements an abstract computing
- engine specifically designed to manipulate database files. The
- machine has a stack which is used for intermediate storage.
- Each instruction contains an opcode and
- up to three additional operands.</p>
- <p>The virtual machine is entirely contained in a single
- source file <b>vdbe.c</b>. The virtual machine also has
- its own header file <b>vdbe.h</b> that defines an interface
- between the virtual machine and the rest of the SQLite library.</p>
- <h2>B-tree Driver</h2>
- <p>An SQLite database is maintained on disk using a B-tree implementation
- found in the <b>btree.c</b> source file. A separate B-tree is used for
- each table and index in the database. All B-trees are stored in the
- same disk file. Each page of a B-tree is 1024 bytes in size. The key
- and data for an entry are stored together in an area called "payload".
- Up to 236 bytes of payload can be stored on the same page as the B-tree
- entry. Any additional payload is stored in a chain of overflow pages.</p>
- <p>The interface to the B-tree subsystem is defined by the header file
- <b>btree.h</b>.
- </p>
- <h2>Page Cache</h2>
- <p>The B-tree module requests information from the disk in 1024 byte
- chunks. The page cache is reponsible for reading, writing, and
- caching these chunks at the behest of the B-tree module.
- The page cache also provides the rollback and atomic commit abstraction
- and takes care of reader/writer locking of the database file. The
- B-tree driver requests particular pages from the page cache and notifies
- the page cache when it wants to modify pages or commit or rollback
- changes and the page cache handles all the messy details of making sure
- the requests are handled quickly, safely, and efficiently.</p>
- <p>The code to implement the page cache is contained in the single C
- source file <b>pager.c</b>. The interface to the page cache subsystem
- is defined by the header file <b>pager.h</b>.
- </p>
- <h2>OS Interface</h2>
- <p>
- In order to provide portability between POSIX and Win32 operating systems,
- SQLite uses an abstraction layer to interace with the operating system.
- The <b>os.c</b> file contains about 20 routines used for opening and
- closing files, deleting files, creating and deleting locks on files,
- flushing the disk cache, and so forth. Each of these functions contains
- two implementations separated by #ifdefs: one for POSIX and the other
- for Win32. The interface to the OS abstraction layer is defined by
- the <b>os.h</b> header file.
- </p>
- }
- puts {
- <br clear="both" />
- <p><hr /></p>
- <p><a href="index.html"><img src="/goback.jpg" border=0 />
- Back to the SQLite Home Page</a>
- </p>
- </body></html>}
|