Liveblogging: Architecture of Maria By Monty Widenius

Architecture of Maria: A New Storage Engine with a Transactional Design

Goals of Maria:

ACID compliant
MVCC, transactional
default non-transactional and default transactional storage engine for MySQL
MyISAM replacement, including temporary table use
Storage engine good for data warehousing.
Allow any length transactions to take place
all indexes should have equal speed (clustered indexes are not in the plan)
log shipping — incremental backups just by copying the logs
used as a standalone library
fast count(*)
allow copying of Maria tables between different Maria servers
Better blob handling (than MyISAM) — no memory copying, or extra memory used for blobs on INSERT/UPDATE
BLOBs are allocated in big sequential blocks means that there is less fragmentation over time.
BLOBs are stored so Maria can be easily extended to have access to any part of a BLOB with a single fetch in the future

Maria is actually faster than MyISAM.
Will support READ COMMITTED and REPEATABLE READ (was a specific question).

“You should be able to upgrade Maria and have it just work, without ever having to backup and restore as part of the upgrade.”

Why do we need 2 storage engines, Maria and Falcon?
More storage engines = more scenarios to be able to handle. Falcon good for lots of memory, and shorter queries, Maria is to take care of long transactions and data warehousing side, and also to be an overall “good” engine for most purposes (at least as good as InnoDB).

Project plan — MySQL 5.1 and higher. Maria strives to be bug-free.

Maria 1.0 = “crash safe”, released in Jan 2008. Cacheable row format.
Maria 1.5 = “concurrent insert/select”, Apr/May 2008. Will be merged into the MySQL 6.0 release. Non-transactional.
Supports:

External/internal row CHECK/REPAIR/compressing
Different row formats — fixed size, dynamic length, compressed, page
Fast checksum of a table
maria_check

MySQL 6.0 (includes Maria)
Maria 2.0 = “Transactional and ACID” alpha, Q4 2008
Maria 3.0 = “High concurrency + Online backup” alpha, Q1 2009; GA, Q2 2009

Advantages of Maria compared to MyISAM

Data, index and statistics are crash safe. On crash, everything will rollback to the state of the start of the statement or last LOCK TABLES commands.
Can replay everything from the log. Still need to backup the .frm files though.
LOAD INDEX can skip index blocks for unwanted indexes
most parts of Maria have unit tests.
Supports all MyISAM formats and a new PAGE format, where data is stored in pages — this is:
- Only crashsafe, transactional row format for Maria
- Versioned, so you can have concurrent insert/select
- Cached by page cache so will give noticeable speed improvement on systems with poor data caching
MyISAM is non-transactional only, Maria can support transactional (crash-safe) and non-transactional formats
Maria uses big log files by default — 1G
Log control file = maria_log_control
Log files = maria_log.?????
If a log file is already in use by a transaction, create a new one, otherwise, reuse it. SHOW LOGS will show which logfiles exist, which ones are in use, and you can PURGE the logs.
Maria uses 8K pages by default (MyISAM uses 1K pages by default. This should be faster on static size indexes but slower on variable length keys, until Maria 2.0 where there will be a directory for index pages.

Before Apr/May release of Maria 1.5

Statistics and indexes versioned (data is already versioned)
FULLTEXT (boolean mode) and R-TREE indexes need to be made crash-safe
Parallel recovery

Disadvantages, not likely to be fixed

No external locking
One page size for both index and data
only one page cache
index number and checksums require an extra 5 bytes per page
doesn’t support RAID
Minimum data file size for BLOCK format is 16K, with 8K pages
Storage of rows < 25 bytes are not efficient for the PAGE format

XDB indexes — group records, and get the min and max for each group (groups are not sorted), and save that in a separate block. This means that stats like min and max are very fast. This is something that will help data warehousing queries.

Resources:

Maria worklog (tasks to do)
5.1 Storage Engine documentation — note that Maria docs have not made it to the manual yet.
http://monty-says.blogspot.com
Download Maria
FAQ and answers about Maria
Maria Documentation on the MySQL Forge.

Best quote from Monty: “You don’t steal ideas in open source. You respect people.”