Tuesday, January 03, 2006

Welcome to my Audio Database

Welcome to the Audio Database blog



The purpose of this blog is to highlight the pleasures and pains I encountered during the development of an RDBMS written entirely within the confines of the Microsoft .NET Framework.

The project started life six months ago while I was re-reading a copy of Inside MS SQL Server 6.5... The description of the internal server architecture proved too close to pseudo-code for me not to have a go at pulling together my own version of such a beast.

Futile though this task may seem, I do have an end goal! The application is designed to stream audio/video data and to handle multiple requests from multiple clients. It will also support the attachment of multiple audio streams to a given video clip.

The database itself supports full ACID properties and a fully recoverable transaction log which records page changes.
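To give a feel for what a log of page changes buys you, here is a minimal Python sketch (not the .NET code from this project). The names `LogRecord`, `TransactionLog` and `recover` are made up for illustration, and the sketch assumes whole-page before/after images and that a page is only ever changed by one uncommitted transaction at a time.

```python
from dataclasses import dataclass, field


@dataclass
class LogRecord:
    txn_id: int
    kind: str                 # "begin", "update" or "commit"
    page_id: int = -1
    before: bytes = b""       # page image before the change (used for undo)
    after: bytes = b""        # page image after the change (used for redo)


@dataclass
class TransactionLog:
    records: list = field(default_factory=list)

    def begin(self, txn_id):
        self.records.append(LogRecord(txn_id, "begin"))

    def update(self, txn_id, page_id, before, after):
        self.records.append(LogRecord(txn_id, "update", page_id, before, after))

    def commit(self, txn_id):
        self.records.append(LogRecord(txn_id, "commit"))


def recover(log, pages):
    """Redo committed changes in log order, then undo uncommitted ones in reverse."""
    committed = {r.txn_id for r in log.records if r.kind == "commit"}
    for r in log.records:
        if r.kind == "update" and r.txn_id in committed:
            pages[r.page_id] = r.after
    for r in reversed(log.records):
        if r.kind == "update" and r.txn_id not in committed:
            pages[r.page_id] = r.before
    return pages
```

Feeding `recover` a log with one committed and one uncommitted transaction leaves the committed page change in place and rolls the other page back to its before image.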

Databases can span multiple files and can be organised into sub-groups known as file-groups to increase performance by ensuring certain allocations are placed on a given group of devices.

The page engine manages an in-memory representation of the underlying physical data stores and keeps it consistent and synchronised by way of a transaction log and suitable locking mechanisms, which ensure that only a single transaction can update a given page at any given time.
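The "one updater per page" rule can be sketched in a few lines of Python. This is only an illustration under my own assumptions: `PageLockManager` and its methods are hypothetical names, and the sketch simply refuses a conflicting request where a real lock manager would queue or block the caller.

```python
import threading


class PageLockManager:
    """Grants at most one transaction the right to update a page at a time."""

    def __init__(self):
        self._guard = threading.Lock()   # protects the owner table itself
        self._owners = {}                # page_id -> txn_id holding the page

    def try_acquire(self, txn_id, page_id):
        """Return True if txn_id now holds (or already held) the page."""
        with self._guard:
            owner = self._owners.get(page_id)
            if owner is None or owner == txn_id:
                self._owners[page_id] = txn_id
                return True
            return False

    def release_all(self, txn_id):
        """Drop every page lock held by txn_id, e.g. at commit or rollback."""
        with self._guard:
            held = [p for p, t in self._owners.items() if t == txn_id]
            for page_id in held:
                del self._owners[page_id]
```

Holding the locks until `release_all` is called at the end of the transaction is what stops a second transaction from touching a page that still carries uncommitted changes.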

As you can imagine this is no small undertaking!

The task list is daunting:

  1. Device class hierarchy
  2. Page class hierarchy
  3. Page cache
  4. Lazy writer
  5. File-groups
  6. Locking
  7. Transactions
  8. Logging
  9. Recovery
  10. Physical Page Allocation
  11. Logical Page Allocation
  12. Database Page Allocation
  13. Table/Sample/Video Index Managers
  14. Table Manager
  15. Sample Manager
  16. Video Manager


Those tasks are all needed to complete the database engine alone!

Over the past few months items 1 through 8, 10 and 11 have been completed and currently lie untested (gulp!), with item 9 (recovery) on the critical path.

Once the recovery implementation is complete I can start testing the paging and recovery logic - this will be extensive and will require, among other things, being able to disable the lazy writer to check the behaviour of the system during recovery scenarios.
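Why a switch on the lazy writer helps with testing is easiest to see in a toy model. The Python sketch below is an assumption-laden stand-in rather than the real design: `PageCache`, `LazyWriter`, the `enabled` flag and `tick` are hypothetical names, and a plain dict plays the part of the device layer.

```python
class PageCache:
    def __init__(self):
        self.pages = {}        # page_id -> bytes held in memory
        self.dirty = set()     # page ids changed since the last flush

    def write(self, page_id, data):
        self.pages[page_id] = data
        self.dirty.add(page_id)


class LazyWriter:
    def __init__(self, cache, disk):
        self.cache = cache
        self.disk = disk       # dict standing in for the device layer
        self.enabled = True    # tests switch this off to hold pages in memory

    def tick(self):
        """Flush dirty pages to disk; returns how many pages were written."""
        if not self.enabled:
            return 0
        flushed = 0
        for page_id in sorted(self.cache.dirty):
            self.disk[page_id] = self.cache.pages[page_id]
            flushed += 1
        self.cache.dirty.clear()
        return flushed
```

With the writer disabled, every change lives only in memory and in the log, so a simulated crash forces recovery to rebuild the pages from the log alone.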

After that the allocation logic can be finished off. Currently the device expansion is implemented and the mechanism for allocating a new logical ID is there together with a simple algorithm for picking the best device when expanding a file-group. The only logic missing from the allocation support is the updates needed for the distribution pages (which track database page allocations).
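For illustration only, a "pick the best device" rule might look like the Python sketch below. The rule itself is an assumption on my part (grow the smallest device that still has room, so the file-group stays balanced), and `Device`, `max_pages` and `pick_device_to_expand` are hypothetical names, not the ones used in the engine.

```python
from dataclasses import dataclass


@dataclass
class Device:
    device_id: int
    total_pages: int    # current size of the file in pages
    max_pages: int      # hypothetical growth ceiling for the file

    @property
    def can_grow(self):
        return self.total_pages < self.max_pages


def pick_device_to_expand(file_group):
    """Return the smallest device that can still grow, or None if none can."""
    candidates = [d for d in file_group if d.can_grow]
    if not candidates:
        return None
    # Ties on size are broken by the lowest device id to keep the choice stable.
    return min(candidates, key=lambda d: (d.total_pages, d.device_id))
```

Other rules (most free disk space, fastest device, round-robin) would slot into the same function by changing the sort key.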

With allocation complete, work on the index managers can be finalised. Unlike SQL Server I will not be supporting clustered indices, although I do share the B-tree implementation - the test harness for proving the algorithm was fun to write, I can tell you...

With completed index managers I can finish the three object managers - easy!
