From debcfd2dcb2f2f628dde7b1ae0a1c5359e7df610 Mon Sep 17 00:00:00 2001 From: danielk1977 Date: Fri, 24 Apr 2009 09:27:16 +0000 Subject: [PATCH] Improve comments and documentation of the asynchronous IO VFS module. (CVS 6543) FossilOrigin-Name: 92bc6be2a86f8a68ceded2bc08fe7d6ff23b56fb --- ext/async/README.txt | 164 ++++++++++++++++++++++++++++++++++ ext/async/sqlite3async.c | 105 ++-------------------- ext/async/sqlite3async.h | 185 +++++++++++++++++++++++++++++++++++++-- manifest | 17 ++-- manifest.uuid | 2 +- 5 files changed, 356 insertions(+), 117 deletions(-) create mode 100644 ext/async/README.txt diff --git a/ext/async/README.txt b/ext/async/README.txt new file mode 100644 index 0000000000..05acffe0df --- /dev/null +++ b/ext/async/README.txt @@ -0,0 +1,164 @@ + +Normally, when SQLite writes to a database file, it waits until the write +operation is finished before returning control to the calling application. +Since writing to the file-system is usually very slow compared with CPU +bound operations, this can be a performance bottleneck. This directory +contains an extension that causes SQLite to perform all write requests +using a separate thread running in the background. Although this does not +reduce the overall system resources (CPU, disk bandwidth etc.) at all, it +allows SQLite to return control to the caller quickly even when writing to +the database, eliminating the bottleneck. + + 1. Functionality + + 1.1 How it Works + 1.2 Limitations + 1.3 Locking and Concurrency + + 2. Compilation and Usage + + 3. Porting + + + +1. FUNCTIONALITY + + With asynchronous I/O, write requests are handled by a separate thread + running in the background. This means that the thread that initiates + a database write does not have to wait for (sometimes slow) disk I/O + to occur. The write seems to happen very quickly, though in reality + it is happening at its usual slow pace in the background. + + Asynchronous I/O appears to give better responsiveness, but at a price. 
+ You lose the Durable property. With the default I/O backend of SQLite, + once a write completes, you know that the information you wrote is + safely on disk. With the asynchronous I/O, this is not the case. If + your program crashes or if a power loss occurs after the database + write but before the asynchronous write thread has completed, then the + database change might never make it to disk and the next user of the + database might not see your change. + + You lose Durability with asynchronous I/O, but you still retain the + other parts of ACID: Atomic, Consistent, and Isolated. Many + applications get along fine without Durability. + + 1.1 How it Works + + Asynchronous I/O works by creating a special SQLite "vfs" structure + and registering it with sqlite3_vfs_register(). When files opened via + this vfs are written to (using the vfs xWrite() method), the data is not + written directly to disk, but is placed in the "write-queue" to be + handled by the background thread. + + When files opened with the asynchronous vfs are read from + (using the vfs xRead() method), the data is read from the file on + disk and the write-queue, so that from the point of view of + the vfs reader the xWrite() appears to have already completed. + + The special vfs is registered (and unregistered) by calls to the + API functions sqlite3async_initialize() and sqlite3async_shutdown(). + See section "Compilation and Usage" below for details. + + 1.2 Limitations + + In order to gain experience with the main ideas surrounding asynchronous + IO, this implementation is deliberately kept simple. Additional + capabilities may be added in the future. + + For example, as currently implemented, if writes arrive in a + steady stream that exceeds the I/O capability of the background writer + thread, the queue of pending write operations will grow without bound. + If this goes on for long enough, the host system could run out of memory.
+ A more sophisticated module could keep track of the quantity of + pending writes and stop accepting new write requests when the queue of + pending writes grows too large. + + 1.3 Locking and Concurrency + + Multiple connections from within a single process that use this + implementation of asynchronous IO may access a single database + file concurrently. From the point of view of the user, if all + connections are from within a single process, there is no difference + between the concurrency offered by "normal" SQLite and SQLite + using the asynchronous backend. + + If file-locking is enabled (it is enabled by default), then connections + from multiple processes may also read and write the database file. + However concurrency is reduced as follows: + + * When a connection using asynchronous IO begins a database + transaction, the database is locked immediately. However the + lock is not released until after all relevant operations + in the write-queue have been flushed to disk. This means + (for example) that the database may remain locked for some + time after a "COMMIT" or "ROLLBACK" is issued. + + * If an application using asynchronous IO executes transactions + in quick succession, other database users may be effectively + locked out of the database. This is because when a BEGIN + is executed, a database lock is established immediately. But + when the corresponding COMMIT or ROLLBACK occurs, the lock + is not released until the relevant part of the write-queue + has been flushed through. As a result, if a COMMIT is followed + by a BEGIN before the write-queue is flushed through, the database + is never unlocked, preventing other processes from accessing + the database. + + File-locking may be disabled at runtime using the sqlite3async_control() + API (see below). This may improve performance when using an NFS or other + network file-system, as the synchronous round-trips to the server that would + otherwise be required to establish file locks are avoided.
However, if multiple + connections attempt to access the same database file when file-locking + is disabled, application crashes and database corruption are likely + outcomes. + + +2. COMPILATION AND USAGE + + The asynchronous IO extension consists of a single file of C code + (sqlite3async.c), and a header file (sqlite3async.h) that defines the + C API used by applications to activate and control the module's + functionality. + + To use the asynchronous IO extension, compile sqlite3async.c as + part of the application that uses SQLite. Then use the API defined + in sqlite3async.h to initialize and configure the module. + + The asynchronous IO VFS API is described in detail in comments in + sqlite3async.h. Using the API usually consists of the following steps: + + 1. Register the asynchronous IO VFS with SQLite by calling the + sqlite3async_initialize() function. + + 2. Create a background thread to perform write operations and call + sqlite3async_run() from it. + + 3. Use the normal SQLite API to read and write to databases via + the asynchronous IO VFS. + + Refer to sqlite3async.h for details. + + +3. PORTING + + Currently the asynchronous IO extension is compatible with win32 systems + and systems that support the pthreads interface, including Mac OSX, Linux, + and other varieties of Unix. + + To port the asynchronous IO extension to another platform, the user must + implement mutex and condition variable primitives for the new platform. + Currently there is no externally available interface to allow this, but + modifying the code within sqlite3async.c to include the new platform's + concurrency primitives is relatively easy. Search within sqlite3async.c + for the comment string "PORTING FUNCTIONS" for details.
Then implement + new versions of each of the following: + + static void async_mutex_enter(int eMutex); + static void async_mutex_leave(int eMutex); + static void async_cond_wait(int eCond, int eMutex); + static void async_cond_signal(int eCond); + static void async_sched_yield(void); + + The functionality required of each of the above functions is described + in comments in sqlite3async.c. + diff --git a/ext/async/sqlite3async.c b/ext/async/sqlite3async.c index 0ca78c5738..f1fdb8d87f 100644 --- a/ext/async/sqlite3async.c +++ b/ext/async/sqlite3async.c @@ -10,101 +10,10 @@ ** ************************************************************************* ** -** $Id: sqlite3async.c,v 1.1 2009/04/23 14:58:40 danielk1977 Exp $ +** $Id: sqlite3async.c,v 1.2 2009/04/24 09:27:16 danielk1977 Exp $ ** -** This file contains an example implementation of an asynchronous IO -** backend for SQLite. -** -** WHAT IS ASYNCHRONOUS I/O? -** -** With asynchronous I/O, write requests are handled by a separate thread -** running in the background. This means that the thread that initiates -** a database write does not have to wait for (sometimes slow) disk I/O -** to occur. The write seems to happen very quickly, though in reality -** it is happening at its usual slow pace in the background. -** -** Asynchronous I/O appears to give better responsiveness, but at a price. -** You lose the Durable property. With the default I/O backend of SQLite, -** once a write completes, you know that the information you wrote is -** safely on disk. With the asynchronous I/O, this is not the case. If -** your program crashes or if a power loss occurs after the database -** write but before the asynchronous write thread has completed, then the -** database change might never make it to disk and the next user of the -** database might not see your change. -** -** You lose Durability with asynchronous I/O, but you still retain the -** other parts of ACID: Atomic, Consistent, and Isolated. 
Many -** appliations get along fine without the Durablity. -** -** HOW IT WORKS -** -** Asynchronous I/O works by creating a special SQLite "vfs" structure -** and registering it with sqlite3_vfs_register(). When files opened via -** this vfs are written to (using sqlite3OsWrite()), the data is not -** written directly to disk, but is placed in the "write-queue" to be -** handled by the background thread. -** -** When files opened with the asynchronous vfs are read from -** (using sqlite3OsRead()), the data is read from the file on -** disk and the write-queue, so that from the point of view of -** the vfs reader the OsWrite() appears to have already completed. -** -** The special vfs is registered (and unregistered) by calls to -** function asyncEnable() (see below). -** -** LIMITATIONS -** -** This demonstration code is deliberately kept simple in order to keep -** the main ideas clear and easy to understand. Real applications that -** want to do asynchronous I/O might want to add additional capabilities. -** For example, in this demonstration if writes are happening at a steady -** stream that exceeds the I/O capability of the background writer thread, -** the queue of pending write operations will grow without bound until we -** run out of memory. Users of this technique may want to keep track of -** the quantity of pending writes and stop accepting new write requests -** when the buffer gets to be too big. -** -** LOCKING + CONCURRENCY -** -** Multiple connections from within a single process that use this -** implementation of asynchronous IO may access a single database -** file concurrently. From the point of view of the user, if all -** connections are from within a single process, there is no difference -** between the concurrency offered by "normal" SQLite and SQLite -** using the asynchronous backend. -** -** If connections from within multiple processes may access the -** database file, the ENABLE_FILE_LOCKING symbol (see below) must be -** defined. 
If it is not defined, then no locks are established on -** the database file. In this case, if multiple processes access -** the database file, corruption will quickly result. -** -** If ENABLE_FILE_LOCKING is defined (the default), then connections -** from within multiple processes may access a single database file -** without risking corruption. However concurrency is reduced as -** follows: -** -** * When a connection using asynchronous IO begins a database -** transaction, the database is locked immediately. However the -** lock is not released until after all relevant operations -** in the write-queue have been flushed to disk. This means -** (for example) that the database may remain locked for some -** time after a "COMMIT" or "ROLLBACK" is issued. -** -** * If an application using asynchronous IO executes transactions -** in quick succession, other database users may be effectively -** locked out of the database. This is because when a BEGIN -** is executed, a database lock is established immediately. But -** when the corresponding COMMIT or ROLLBACK occurs, the lock -** is not released until the relevant part of the write-queue -** has been flushed through. As a result, if a COMMIT is followed -** by a BEGIN before the write-queue is flushed through, the database -** is never unlocked,preventing other processes from accessing -** the database. -** -** Defining ENABLE_FILE_LOCKING when using an NFS or other remote -** file-system may slow things down, as synchronous round-trips to the -** server may be required to establish database file locks. +** This file contains the implementation of an asynchronous IO backend +** for SQLite. 
 */ #if !defined(SQLITE_CORE) || defined(SQLITE_ENABLE_ASYNCIO) @@ -113,12 +22,6 @@ #define ENABLE_FILE_LOCKING -#ifndef SQLITE_AMALGAMATION -# include "sqliteInt.h" -# include -# include -#endif - /* Useful macros used in several places */ #define MIN(x,y) ((x)<(y)?(x):(y)) #define MAX(x,y) ((x)>(y)?(x):(y)) @@ -255,6 +158,8 @@ static void asyncTrace(const char *zFormat, ...){ #endif /* +** PORTING FUNCTIONS +** ** There are two definitions of the following functions. One for pthreads ** compatible systems and one for Win32. These functions isolate the OS ** specific code required by each platform. diff --git a/ext/async/sqlite3async.h b/ext/async/sqlite3async.h index c057432af8..100dfff0fc 100644 --- a/ext/async/sqlite3async.h +++ b/ext/async/sqlite3async.h @@ -5,32 +5,201 @@ #define SQLITEASYNC_VFSNAME "sqlite3async" /* -** Install the asynchronous IO VFS. +** THREAD SAFETY NOTES: +** +** Of the four API functions in this file, the following are not threadsafe: +** +** sqlite3async_initialize() +** sqlite3async_shutdown() +** +** Care must be taken that neither of these functions is called while +** another thread may be calling either an sqlite3async_XXX() function +** or an sqlite3_XXX() API function related to a database handle that +** is using the asynchronous IO VFS. +** +** These functions: +** +** sqlite3async_run() +** sqlite3async_control() +** +** are threadsafe. It is quite safe to call either of these functions even +** if another thread may also be calling one of them or an sqlite3_XXX() +** function related to a database handle that uses the asynchronous IO VFS. +*/ + +/* +** Initialize the asynchronous IO VFS and register it with SQLite using +** sqlite3_vfs_register(). If the asynchronous VFS is already initialized +** and registered, this function is a no-op. The asynchronous IO VFS +** is registered as "sqlite3async". +** +** The asynchronous IO VFS does not make operating system IO requests +** directly.
Instead, it uses an existing VFS implementation for all +** required file-system operations. If the first parameter to this function +** is NULL, then the current default VFS is used for IO. If it is not +** NULL, then it must be the name of an existing VFS. In other words, the +** first argument to this function is passed to sqlite3_vfs_find() to +** locate the VFS to use for all real IO operations. This VFS is known +** as the "parent VFS". +** +** If the second parameter to this function is non-zero, then the +** asynchronous IO VFS is registered as the default VFS for all SQLite +** database connections within the process. Otherwise, the asynchronous IO +** VFS is only used by connections opened using sqlite3_open_v2() that +** specifically request VFS "sqlite3async". +** +** If a parent VFS cannot be located, then SQLITE_ERROR is returned. +** In the unlikely event that operating system specific initialization +** fails (win32 systems create the required critical section and event +** objects within this function), then SQLITE_ERROR is also returned. +** Finally, if the call to sqlite3_vfs_register() returns an error, then +** the error code is returned to the user by this function. In all three +** of these cases, initialization has failed and the asynchronous IO VFS +** is not registered with SQLite. +** +** Otherwise, if no error occurs, SQLITE_OK is returned. */ int sqlite3async_initialize(const char *zParent, int isDefault); /* -** Uninstall the asynchronous IO VFS. +** This function unregisters the asynchronous IO VFS using +** sqlite3_vfs_unregister(). +** +** On win32 platforms, this function also releases the small number of +** critical section and event objects created by sqlite3async_initialize(). */ void sqlite3async_shutdown(); /* -** Process events on the write-queue. +** This function may only be called when the asynchronous IO VFS is +** installed (after a call to sqlite3async_initialize()).
It processes +** zero or more queued write operations before returning. It is expected +** (but not required) that this function will be called by a different +** thread than those threads that use SQLite: the "background thread" +** that performs IO. +** +** How many queued write operations are performed before returning +** depends on the global setting configured by passing the SQLITEASYNC_HALT +** verb to sqlite3async_control() (see below for details). By default +** this function never returns - it processes all pending operations and +** then blocks waiting for new ones. +** +** If multiple simultaneous calls are made to sqlite3async_run() from two +** or more threads, then the calls are serialized internally. */ void sqlite3async_run(); /* -** Control/configure the asynchronous IO system. +** This function may only be called when the asynchronous IO VFS is +** installed (after a call to sqlite3async_initialize()). It is used +** to query or configure various parameters that affect the operation +** of the asynchronous IO VFS. At present there are three parameters +** supported: +** +** * The "halt" parameter, which configures the circumstances under +** which the sqlite3async_run() function returns. +** +** * The "delay" parameter. Setting the delay parameter to a non-zero +** value causes the sqlite3async_run() function to sleep for the +** configured number of milliseconds between each queued write +** operation. +** +** * The "lockfiles" parameter. This parameter determines whether or +** not the asynchronous IO VFS locks the database files it operates +** on. Disabling file locking can improve throughput. +** +** This function is always passed two arguments. When setting the value +** of a parameter, the first argument must be one of SQLITEASYNC_HALT, +** SQLITEASYNC_DELAY or SQLITEASYNC_LOCKFILES. The second argument must +** be the new value for the parameter, passed as type "int".
+** +** When querying the current value of a parameter, the first argument must +** be one of SQLITEASYNC_GET_HALT, GET_DELAY or GET_LOCKFILES. The second +** argument to this function must be of type (int *). The current value +** of the queried parameter is copied to the memory pointed to by the +** second argument. For example: +** +** int eCurrentHalt; +** int eNewHalt = SQLITEASYNC_HALT_IDLE; +** +** sqlite3async_control(SQLITEASYNC_HALT, eNewHalt); +** sqlite3async_control(SQLITEASYNC_GET_HALT, &eCurrentHalt); +** assert( eNewHalt==eCurrentHalt ); +** +** See below for more detail on each configuration parameter. +** +** SQLITEASYNC_HALT: +** +** This is used to set the value of the "halt" parameter. The second +** argument must be one of the SQLITEASYNC_HALT_XXX symbols defined +** below (one of NEVER, IDLE or NOW). +** +** If the parameter is set to NEVER, then calls to sqlite3async_run() +** never return. This is the default setting. If the parameter is set +** to IDLE, then calls to sqlite3async_run() return as soon as the +** queue of pending write operations is empty. If the parameter is set +** to NOW, then calls to sqlite3async_run() return as quickly as +** possible, without processing any pending write requests. +** +** If an attempt is made to set this parameter to an integer value other +** than SQLITEASYNC_HALT_NEVER, IDLE or NOW, then sqlite3async_control() +** returns SQLITE_MISUSE and the current value of the parameter is not +** modified. +** +** Modifying the "halt" parameter affects calls to sqlite3async_run() +** made by other threads that are currently in progress. +** +** SQLITEASYNC_DELAY: +** +** This is used to set the value of the "delay" parameter. If set to +** a non-zero value, then after completing a pending write request, the +** sqlite3async_run() function sleeps for the configured number of +** milliseconds.
+** +** If an attempt is made to set this parameter to a negative value, +** sqlite3async_control() returns SQLITE_MISUSE and the current value +** of the parameter is not modified. +** +** Modifying the "delay" parameter affects calls to sqlite3async_run() +** made by other threads that are currently in progress. +** +** SQLITEASYNC_LOCKFILES: +** +** This is used to set the value of the "lockfiles" parameter. This +** parameter must be set to either 0 or 1. If set to 1, then the +** asynchronous IO VFS uses the xLock() and xUnlock() methods of the +** parent VFS to lock database files being read and/or written. If +** the parameter is set to 0, then these locks are omitted. +** +** This parameter may only be set when there are no open database +** connections using the VFS and the queue of pending write requests +** is empty. Attempting to set it when this is not true, or to set it +** to a value other than 0 or 1 causes sqlite3async_control() to return +** SQLITE_MISUSE and the value of the parameter to remain unchanged. +** +** If this parameter is set to zero, then it is only safe to access the +** database via the asynchronous IO VFS from within a single process. If +** while writing to the database via the asynchronous IO VFS the database +** is also read or written from within another process, or via another +** connection that does not use the asynchronous IO VFS within the same +** process, the results are undefined (and may include crashes or database +** corruption). +** +** Alternatively, if this parameter is set to 1, then it is safe to access +** the database from multiple connections within multiple processes using +** either the asynchronous IO VFS or the parent VFS directly. */ int sqlite3async_control(int op, ...); /* ** Values that can be used as the first argument to sqlite3async_control(). 
*/ -#define SQLITEASYNC_HALT 1 -#define SQLITEASYNC_DELAY 2 -#define SQLITEASYNC_GET_HALT 3 -#define SQLITEASYNC_GET_DELAY 4 +#define SQLITEASYNC_HALT 1 +#define SQLITEASYNC_GET_HALT 2 +#define SQLITEASYNC_DELAY 3 +#define SQLITEASYNC_GET_DELAY 4 +#define SQLITEASYNC_LOCKFILES 5 +#define SQLITEASYNC_GET_LOCKFILES 6 /* ** If the first argument to sqlite3async_control() is SQLITEASYNC_HALT, diff --git a/manifest b/manifest index ea67f1fa0a..a39099d4bd 100644 --- a/manifest +++ b/manifest @@ -1,5 +1,5 @@ -C os_win.c,\swinOpen(),\schanged\sto\shandle\sthe\sSQLITE_OPEN_EXCLUSIVE\sflag\sand\ssharing\smodes\sin\sthe\ssame\smanner\sas\sos_unix.c.\sTicket\s#3821.\s(CVS\s6542) -D 2009-04-23T19:08:33 +C Improve\scomments\sand\sdocumentation\sof\sthe\sasynchronous\sIO\sVFS\smodule.\s(CVS\s6543) +D 2009-04-24T09:27:16 F Makefile.arm-wince-mingw32ce-gcc fcd5e9cd67fe88836360bb4f9ef4cb7f8e2fb5a0 F Makefile.in 583e87706abc3026960ed759aff6371faf84c211 F Makefile.linux-gcc d53183f4aa6a9192d249731c90dbdffbd2c68654 @@ -24,8 +24,9 @@ F contrib/sqlitecon.tcl 210a913ad63f9f991070821e599d600bd913e0ad F doc/lemon.html f0f682f50210928c07e562621c3b7e8ab912a538 F doc/report1.txt a031aaf37b185e4fa540223cb516d3bccec7eeac F ext/README.txt 913a7bd3f4837ab14d7e063304181787658b14e1 -F ext/async/sqlite3async.c d59701cc27f8a7a2bf6ffa997af12b32f05633e6 -F ext/async/sqlite3async.h 73d37b60bc37cbd86f836a172fb57a59a7dd2697 +F ext/async/README.txt 0c541f418b14b415212264cbaaf51c924ec62e5b +F ext/async/sqlite3async.c 2975386c0422dc44e8beebbb85988524141ef8b9 +F ext/async/sqlite3async.h b6d74dbf9aa5a0ac4e79aa15a4d987f3552a0f75 F ext/fts1/README.txt 20ac73b006a70bcfd80069bdaf59214b6cf1db5e F ext/fts1/ft_hash.c 3927bd880e65329bdc6f506555b228b28924921b F ext/fts1/ft_hash.h 1a35e654a235c2c662d3ca0dfc3138ad60b8b7d5 @@ -722,7 +723,7 @@ F tool/speedtest16.c c8a9c793df96db7e4933f0852abb7a03d48f2e81 F tool/speedtest2.tcl ee2149167303ba8e95af97873c575c3e0fab58ff F tool/speedtest8.c 
2902c46588c40b55661e471d7a86e4dd71a18224 F tool/speedtest8inst1.c 293327bc76823f473684d589a8160bde1f52c14e -P 1e2c71596e3f7a69afc5b745c20b2e4e81bffda5 -R 76d5774a4fbfd2590aa81cedc88a2f67 -U shane -Z ae8755c671bd4929f4c005dcef86f4c8 +P 18fef3fcf61c137a89a83352f6769ed06845434a +R 848783e3f0d8b2c2ee5a000cbb3565d6 +U danielk1977 +Z a232dc6b75d11d81cd6e819ebf60dfe9 diff --git a/manifest.uuid b/manifest.uuid index d0313ab81c..d550700345 100644 --- a/manifest.uuid +++ b/manifest.uuid @@ -1 +1 @@ -18fef3fcf61c137a89a83352f6769ed06845434a \ No newline at end of file +92bc6be2a86f8a68ceded2bc08fe7d6ff23b56fb \ No newline at end of file
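
Illustrative note (not part of the patch): README section 1.1 says that writes are placed on a write-queue and that reads combine the file on disk with the queue, and the SQLITEASYNC_HALT text says that IDLE drains the queue while NOW returns without processing anything. The single-threaded toy model below is a sketch of those two ideas only. Every name in it (toy_write, toy_read, toy_run, toy_demo, the TOY_ constants) is hypothetical and does not appear in sqlite3async.c. It uses a fixed-size array where the real queue is unbounded, and it leaves out threads, locking, and the NEVER mode, which blocks waiting for new work.

```c
#include <assert.h>
#include <string.h>

/* Toy model only: none of these names exist in sqlite3async.c. */
#define TOY_DISK_SIZE 64
#define TOY_QUEUE_MAX 8
#define TOY_DATA_MAX  16
enum { TOY_HALT_IDLE, TOY_HALT_NOW };  /* Loosely mirror SQLITEASYNC_HALT_XXX */

typedef struct ToyWrite {
  int iOffset;               /* Byte offset within the "file" */
  int nByte;                 /* Number of bytes queued */
  char aData[TOY_DATA_MAX];  /* Copy of the data to write */
} ToyWrite;

static char toyDisk[TOY_DISK_SIZE];       /* Stands in for the file on disk */
static ToyWrite toyQueue[TOY_QUEUE_MAX];  /* Pending writes, oldest first */
static int toyPending = 0;                /* Number of entries in toyQueue */

/* Like xWrite(): queue the data and return without touching the "disk".
** Returns 0 on success, -1 if the request does not fit in this toy. */
static int toy_write(const char *zData, int nByte, int iOffset){
  ToyWrite *p;
  if( toyPending>=TOY_QUEUE_MAX || nByte<0 || nByte>TOY_DATA_MAX ) return -1;
  if( iOffset<0 || iOffset+nByte>TOY_DISK_SIZE ) return -1;
  p = &toyQueue[toyPending++];
  p->iOffset = iOffset;
  p->nByte = nByte;
  memcpy(p->aData, zData, (size_t)nByte);
  return 0;
}

/* Like xRead(): start from the "disk" contents, then overlay every queued
** write in order, so the reader sees writes that have not been flushed. */
static int toy_read(char *zBuf, int nByte, int iOffset){
  int i;
  if( nByte<0 || iOffset<0 || iOffset+nByte>TOY_DISK_SIZE ) return -1;
  memcpy(zBuf, &toyDisk[iOffset], (size_t)nByte);
  for(i=0; i<toyPending; i++){
    const ToyWrite *p = &toyQueue[i];
    int iStart = p->iOffset>iOffset ? p->iOffset : iOffset;
    int iEnd = p->iOffset+p->nByte;
    if( iEnd>iOffset+nByte ) iEnd = iOffset+nByte;
    if( iStart<iEnd ){
      memcpy(&zBuf[iStart-iOffset], &p->aData[iStart-p->iOffset],
             (size_t)(iEnd-iStart));
    }
  }
  return 0;
}

/* Like one call to sqlite3async_run() made by the background thread. With
** TOY_HALT_NOW nothing is processed; with TOY_HALT_IDLE the queue is drained.
** Returns the number of writes applied to the "disk". */
static int toy_run(int eHalt){
  int i, nDone = 0;
  if( eHalt==TOY_HALT_NOW ) return 0;
  for(i=0; i<toyPending; i++){
    memcpy(&toyDisk[toyQueue[i].iOffset], toyQueue[i].aData,
           (size_t)toyQueue[i].nByte);
    nDone++;
  }
  toyPending = 0;
  return nDone;
}

/* Walk through the model. Returns the number of writes flushed, or -1 if
** any step behaves unexpectedly. */
static int toy_demo(void){
  char aBuf[5];
  int nFlushed;
  memset(toyDisk, '.', sizeof(toyDisk));
  toyPending = 0;
  if( toy_write("hello", 5, 0) || toy_write("HE", 2, 0) ) return -1;
  if( toy_read(aBuf, 5, 0) || memcmp(aBuf, "HEllo", 5) ) return -1;
  if( toyDisk[0]!='.' ) return -1;           /* Not yet on "disk" */
  if( toy_run(TOY_HALT_NOW)!=0 ) return -1;  /* NOW: nothing processed */
  nFlushed = toy_run(TOY_HALT_IDLE);         /* IDLE: drain the queue */
  if( memcmp(toyDisk, "HEllo", 5) ) return -1;
  return nFlushed;
}
```

Calling toy_demo() from a main() function exercises the whole sequence. The read succeeds before anything reaches the "disk", which is the same window in which the README warns that a crash or power loss would lose the change.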