MJD Perl Course - 15th of May 2003

"How do I delete a line from a file?" Lightweight data storage techniques

Many programs need cheap, convenient access to small amounts of data. There are two commonly used solutions: Flat text files and DBM files. This class will look at these in detail. Whether you're looking for a good solution for storage of your own data, or you have to deal with data stored in one of these formats by another program, this class will equip you with valuable tools for solving your problems.

In the first half, we'll look at techniques for managing flat text databases and the systems programming that underlies them. We'll examine the tradeoffs of variable-length vs. fixed-length records and sorted vs. unsorted files. We'll take a detailed look at Tie::File, a new standard module that provides easy access to text databases.
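As a taste of the first half, here is a minimal sketch of how the title question might be answered with Tie::File. The scratch file and its three sample lines are invented for illustration; only `tie`, `splice`, and `untie` are the point.

```perl
use strict;
use warnings;
use Tie::File;
use File::Temp qw(tempfile);

# Create a small scratch file to work on (hypothetical sample data).
my ($fh, $path) = tempfile(UNLINK => 1);
print {$fh} "alpha\nbeta\ngamma\n";
close $fh or die "close: $!";

# Tie the file to an array: each element is one line, without its newline.
tie my @lines, 'Tie::File', $path or die "Cannot tie $path: $!";

# Deleting a line is an ordinary array operation; Tie::File updates the file.
splice @lines, 1, 1;          # remove the second line ("beta")

my @remaining = @lines;       # copy the contents before untying
untie @lines;

print "$_\n" for @remaining;  # prints "alpha" then "gamma"
```

The file is never read into memory as a whole; Tie::File fetches and rewrites only what the array operation requires, which is why it suits files too large to slurp.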

The second half will be an overview of Perl's 'DBM' feature, including a comparison of the standard DBM modules. We'll see several extremely useful but little-known features of DB_File, the only one of these standard modules that doesn't have serious defects.
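To give a flavor of the second half, the following sketch shows one way an ordered hash with a chosen sort order might be set up using DB_File. It assumes that DB_File and the underlying Berkeley DB library are installed; the hash name `%size_of` and its sample keys are made up for the example.

```perl
use strict;
use warnings;
use Fcntl;
use DB_File;

# A private BTREE configuration with a numeric comparison function,
# so that keys sort as numbers rather than as strings.
my $btree = DB_File::BTREEINFO->new;
$btree->{compare} = sub {
    my ($k1, $k2) = @_;
    return $k1 <=> $k2;
};

# An undef filename asks DB_File for an in-memory database.
tie my %size_of, 'DB_File', undef, O_RDWR|O_CREAT, 0666, $btree
    or die "Cannot create in-memory BTREE: $!";

$size_of{10}  = 'ten';
$size_of{9}   = 'nine';
$size_of{100} = 'hundred';

# keys() walks the tree in the order defined by the compare function.
my @ordered = keys %size_of;
untie %size_of;

print "@ordered\n";   # prints "9 10 100"
```

Without the custom comparison, the default ordering is by string, so the same keys would come back as 10, 100, 9.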

  • Plain text: perl -i.bak; Variable-length records: appending, inserting and deleting, modifying records in-place, buffering issues; sorted files. XML.
  • Files with fixed-length records. 'seek'. Case study: The 'lastlog' file; building auxiliary indices.
  • Variable-length records revisited. Tie::File and how to use it. Offsets; read cache; immediate writing; deferred writing.
  • DBM. Comparison of DBM packages. The small ones: ODBM, NDBM, SDBM; data length limits; file extent problems. GDBM and why you should avoid it. DB_File.
  • DB_File in depth. Buffering and locking issues. In-memory databases. DB_RECNO. DB_BTREE: At last, an ordered hash! Choosing your sort order; multiple values for a single key. Little-known but useful DB_File features.
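The fixed-length record topic above can be sketched in a few lines. This is only an illustration under an invented record layout (an 8-byte padded name plus a 32-bit counter), not the real 'lastlog' format; the helper names `read_record` and `write_record` are hypothetical.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical layout: 8-byte space-padded name + 32-bit unsigned counter.
my $TEMPLATE = 'A8 N';
my $RECLEN   = length(pack($TEMPLATE, '', 0));    # 12 bytes per record

# Build a scratch file holding three records.
my ($fh, $path) = tempfile(UNLINK => 1);
binmode $fh;
print {$fh} pack($TEMPLATE, @$_) for ['alice', 1], ['bob', 2], ['carol', 3];
close $fh or die "close: $!";

open my $db, '+<', $path or die "open $path: $!";
binmode $db;

# Read record $n by seeking straight to its byte offset.
sub read_record {
    my ($n) = @_;
    seek $db, $n * $RECLEN, 0 or die "seek: $!";
    my $got = read $db, my $buf, $RECLEN;
    die "short read" unless defined $got && $got == $RECLEN;
    return unpack $TEMPLATE, $buf;
}

# Overwrite record $n in place; the file's length does not change.
sub write_record {
    my ($n, $name, $count) = @_;
    seek $db, $n * $RECLEN, 0 or die "seek: $!";
    print {$db} pack($TEMPLATE, $name, $count) or die "write: $!";
}

write_record(1, 'bob', 42);
my ($name, $count) = read_record(1);
print "$name $count\n";   # prints "bob 42"
close $db or die "close: $!";
```

Because every record has the same size, record number times record length gives the offset directly, so no scan of the file is needed to find or update an entry.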