cdb: Alternatives

There are many choices of persistent data structures (i.e., of "database formats"). Persistent means that the data structure does not disappear when your program exits.

A good default choice of persistent data structure is your operating system's filesystem, which maps strings (filenames) to strings (file contents). This is an example of a persistent associative array (a "key-value store"), and you can use it in essentially the same way that you use associative arrays in your favorite programming language.
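As a minimal sketch of this idea, the following Python class wraps a directory so that it can be used like an associative array. The class name `FileStore` and the restriction on keys are assumptions made for this illustration; they are not part of any standard interface.

```python
import os
import tempfile


class FileStore:
    """Associative array: keys are filenames, values are file contents (bytes)."""

    def __init__(self, directory):
        self.directory = directory
        os.makedirs(directory, exist_ok=True)

    def _path(self, key):
        # Assumption for this sketch: keys are plain filenames,
        # so path separators and the special names are rejected.
        if "/" in key or os.sep in key or key in ("", ".", ".."):
            raise ValueError("key is not usable as a filename: %r" % key)
        return os.path.join(self.directory, key)

    def __setitem__(self, key, value):
        with open(self._path(key), "wb") as f:
            f.write(value)

    def __getitem__(self, key):
        try:
            with open(self._path(key), "rb") as f:
                return f.read()
        except FileNotFoundError:
            raise KeyError(key)

    def __delitem__(self, key):
        try:
            os.remove(self._path(key))
        except FileNotFoundError:
            raise KeyError(key)

    def __contains__(self, key):
        return os.path.isfile(self._path(key))


store = FileStore(tempfile.mkdtemp())
store["greeting"] = b"hello"      # add or replace
value = store["greeting"]         # look up
```

The stored data outlives the program because it lives in ordinary files, which other tools can read and write directly.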

Choosing the filesystem means that you can use and interoperate with a vast range of existing tools that also use the filesystem. Filesystem software is also better tested than most database software. However, filesystem reliability is far from perfect, and in any case there are good reasons to use other database formats, as explained below.

Transactions

A reader carrying out a series of related associative-array lookups can easily confuse itself if the associative array is being modified in the meantime. To address this source of bugs, databases often support atomic "transactions" that arrange for a series of modifications to happen all at once, with a reader consistently seeing the old state or consistently seeing the new state.

Filesystems typically do not provide direct support for transactions: transactions are extra work that happens on top of a filesystem. Some examples of key-value stores supporting transactions:

There are many more examples.
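To make the idea of an atomic transaction concrete, here is a small sketch using Python's standard `sqlite3` module, with a two-column table standing in for a key-value store. SQLite is chosen only because it ships with Python; the table name `kv` and the helper names `open_accounts`, `transfer`, and `balance` are hypothetical. The sketch shows only the all-or-nothing half of a transaction: if the second modification fails, the first is undone.

```python
import sqlite3


def open_accounts():
    """Create an in-memory key-value table with two entries."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE kv (k TEXT PRIMARY KEY, v INTEGER)")
    conn.executemany("INSERT INTO kv VALUES (?, ?)",
                     [("alice", 100), ("bob", 0)])
    conn.commit()
    return conn


def balance(conn, key):
    row = conn.execute("SELECT v FROM kv WHERE k = ?", (key,)).fetchone()
    return None if row is None else row[0]


def transfer(conn, src, dst, amount):
    """Apply two related modifications as one transaction."""
    with conn:  # commits on success, rolls back if an exception escapes
        conn.execute("UPDATE kv SET v = v - ? WHERE k = ?", (amount, src))
        cur = conn.execute("UPDATE kv SET v = v + ? WHERE k = ?", (amount, dst))
        if cur.rowcount != 1:
            # The destination key does not exist: abort, undoing the debit.
            raise KeyError(dst)


conn = open_accounts()
transfer(conn, "alice", "bob", 30)
```

After a failed call such as `transfer(conn, "alice", "nobody", 10)`, the balance stored under `alice` is unchanged, because the debit was rolled back together with the failed credit.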

Support for transactions usually makes database software more complicated and raises the question of how well tested the software is. However, database software becomes simpler if the implementation requires an entire database to be rewritten all at once, again with a reader consistently seeing the old database state or consistently seeing the new database state. The rewrite might apply just one transaction, perhaps changing just one database entry, but internally this is implemented as a full database rewrite. Some examples of these "constant databases" (specifically, constant key-value stores):

Efficiency

Another traditional reason to move from the filesystem to another database format is to improve space efficiency or time efficiency (but keep in mind that for most applications the filesystem is fast enough!). One can also try to improve the efficiency of the filesystem itself, but the evolution of new filesystems is slowed down by the complexity of typical operating-system interfaces to filesystems.

Key-value stores designed for efficiency include dbm, ndbm, gdbm, Berkeley DB, tkrzw, and many more. Typically the top efficiency priorities are fast single reads (found or not found), fast single writes (add or replace or delete), and low space consumption.
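The single-read and single-write operations listed above can be tried with Python's standard `dbm` module, which is a generic interface to whichever dbm-style backend is available on the system. This is only a sketch: the file name `example` is arbitrary, and which backend `dbm.open` selects depends on the Python installation.

```python
import dbm
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "example")

with dbm.open(path, "c") as db:   # "c": create the database if it is missing
    db[b"apple"] = b"red"         # single write: add
    db[b"apple"] = b"green"       # single write: replace
    db[b"pear"] = b"yellow"
    del db[b"pear"]               # single write: delete

with dbm.open(path, "r") as db:   # reopen read-only
    found = db.get(b"apple")      # single read: found
    missing = db.get(b"pear")     # single read: not found, returns None
```

Each write touches only the affected entry instead of rewriting the database, which is the main contrast with the constant databases discussed next.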

Constant databases are normally also designed for efficiency, but with different priorities: fast single reads (found or not found), fast rewrites of the entire database, and low space consumption. Beware that space consumption is higher during a rewrite. If several previous versions of a constant database are still being read, then space consumption is several times higher. Compression is possible, but at the expense of simplicity.
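The rewrite-everything approach can be sketched in a few lines of Python: write the new database to a temporary file, then rename it over the old one, so that a reader opening the file sees either the old database or the new one. The JSON file format and the function names `rewrite`, `lookup`, and `update_one` are assumptions chosen for brevity. A real constant database format would include an index so that a lookup does not have to load the whole file, and this sketch assumes a single writer at a time.

```python
import json
import os
import tempfile


def rewrite(path, entries):
    """Replace the whole database at path with entries in one atomic step."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(entries, f)
            f.flush()
            os.fsync(f.fileno())   # push the data to disk before publishing it
        os.replace(tmp, path)      # atomic rename: readers see old or new file
    except BaseException:
        os.unlink(tmp)             # the rename did not happen; discard the temp file
        raise


def lookup(path, key):
    """Return the value for key, or None if the key is not present."""
    with open(path) as f:
        return json.load(f).get(key)


def update_one(path, key, value):
    """Change a single entry, implemented internally as a full rewrite."""
    with open(path) as f:
        entries = json.load(f)
    entries[key] = value
    rewrite(path, entries)


db_path = os.path.join(tempfile.mkdtemp(), "db.json")
rewrite(db_path, {"apple": "red"})
update_one(db_path, "pear", "yellow")
```

While `rewrite` is running, the old file and the temporary file exist side by side, which is the extra space consumption during a rewrite mentioned above.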

Whether constant databases are faster than non-constant databases depends on the application:

Advanced queries

People talking about database queries are often talking about more advanced types of searches than simply looking up a key. Some data structures for associative arrays naturally allow fast range queries such as "show me every key between 314159000000 and 314160000000". Other data structures support more advanced queries such as "show me all files containing the word ossifrage".
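As an illustration of why some data structures make range queries fast, the sketch below keeps keys in sorted order and uses binary search to find both ends of a range. The class name `SortedStore` is hypothetical, and "between" is assumed here to include both endpoints.

```python
import bisect


class SortedStore:
    """Read-only associative array with keys kept in sorted order."""

    def __init__(self, items):
        pairs = sorted(items.items())
        self.keys = [k for k, _ in pairs]
        self.values = [v for _, v in pairs]

    def get(self, key):
        """Single lookup by binary search; returns None if not found."""
        i = bisect.bisect_left(self.keys, key)
        if i < len(self.keys) and self.keys[i] == key:
            return self.values[i]
        return None

    def range(self, low, high):
        """Return all (key, value) pairs with low <= key <= high."""
        start = bisect.bisect_left(self.keys, low)
        stop = bisect.bisect_right(self.keys, high)
        return list(zip(self.keys[start:stop], self.values[start:stop]))


store = SortedStore({
    314158999999: "a",
    314159000000: "b",
    314159265358: "c",
    314160000000: "d",
    314160000001: "e",
})
matches = store.range(314159000000, 314160000000)
```

Two binary searches locate the range, and the cost after that is proportional to the number of matching keys. A hash table, by contrast, would have to examine every key to answer the same query.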

"Relational databases" typically support queries in SQL, the Structured Query Language, and internally arrange data to try to handle typical queries efficiently. An example of a relational database is SQLite, which is also a famous example of software in the public domain.
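For a feel of what an SQL query looks like, here is a short sketch using Python's standard `sqlite3` module. The table `files`, its columns, the sample rows, and the helper names are invented for this example.

```python
import sqlite3


def make_files_db():
    """Create an in-memory SQLite database with one small table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE files (name TEXT PRIMARY KEY, size INTEGER, owner TEXT)")
    conn.executemany("INSERT INTO files VALUES (?, ?, ?)", [
        ("notes.txt", 1200, "alice"),
        ("photo.jpg", 250000, "bob"),
        ("video.mp4", 9000000, "alice"),
    ])
    # An index lets the database answer size-based queries without a full scan.
    conn.execute("CREATE INDEX files_by_size ON files (size)")
    conn.commit()
    return conn


def large_files(conn, min_size):
    """Names of files with size >= min_size, largest first."""
    rows = conn.execute(
        "SELECT name FROM files WHERE size >= ? ORDER BY size DESC",
        (min_size,))
    return [name for (name,) in rows]


conn = make_files_db()
names = large_files(conn, 100000)
```

The query states what is wanted, and the database decides how to use the available indexes to find it.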

One can think of, and implement, a relational database as an extra layer on top of a key-value store. Presumably a constant relational database would be able to gain efficiency in many applications compared to a non-constant relational database.
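The layering can be sketched as follows: rows are stored under one family of keys, and a secondary index is stored under another family, all in the same key-value mapping. The class `TinyTable`, its key layout, and the JSON encoding are assumptions made for this illustration. A real relational database does far more, and this sketch assumes each primary key is inserted only once.

```python
import json


class TinyTable:
    """One table with a primary key and one indexed column, on top of a mapping."""

    def __init__(self, kv, name, indexed_column):
        self.kv = kv                  # any mapping from str to str
        self.name = name
        self.column = indexed_column

    def _row_key(self, pk):
        return "%s/row/%s" % (self.name, pk)

    def _index_key(self, value):
        return "%s/index/%s" % (self.name, value)

    def insert(self, pk, row):
        # Store the row itself under a key derived from the primary key.
        self.kv[self._row_key(pk)] = json.dumps(row)
        # Record the primary key under the indexed column's value.
        index_key = self._index_key(row[self.column])
        pks = json.loads(self.kv.get(index_key, "[]"))
        pks.append(pk)
        self.kv[index_key] = json.dumps(pks)

    def get(self, pk):
        raw = self.kv.get(self._row_key(pk))
        return None if raw is None else json.loads(raw)

    def where(self, value):
        """Rows whose indexed column equals value."""
        pks = json.loads(self.kv.get(self._index_key(value), "[]"))
        return [self.get(pk) for pk in pks]


kv = {}                               # stand-in for a persistent key-value store
users = TinyTable(kv, "users", "city")
users.insert("1", {"name": "Ann", "city": "Oslo"})
users.insert("2", {"name": "Bo", "city": "Rome"})
users.insert("3", {"name": "Cy", "city": "Oslo"})
in_oslo = users.where("Oslo")
```

Here a plain dictionary stands in for the key-value store. Any persistent store with the same get and set operations could take its place.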


Version: This is version 2025.10.23 of the "Alternatives" web page.