One thing MERGE Tables Are Good For

Many people have some kind of reporting or auditing on their database. The problem is that the data grows very large, and lots of times there is data that can be purged. Sure, theoretically one never needs to purge data, but sometimes a “delete” flag just won’t work — when you search on the delete flag, a full table scan may be the most efficient way to go.

Of course, that’s not acceptable. And in many cases, say when you have users who no longer use the site but did in the past (and perhaps have billing data associated with them), you never want to get rid of them.

So what to do? Make a special reporting database, that gathers information from the production database(s). Use MyISAM tables, because a reporting server can afford to be behind the master, and MyISAM is better for reporting — better metadata. For something like a “Users” table, make 2 more tables:

1) DeletedUsers
2) AllUsers

Where Deleted Users is where you put information about the users you delete (something like INSERT INTO DeletedUsers SELECT * FROM Users WHERE [parameters of deletion] and then run DELETE FROM Users WHERE [parameters of deletion] on the master. On the reporting slave, make a MERGE table called “AllUsers” and run your reports from that, when you might need to gather historical table.

Thoughts?

Many people have some kind of reporting or auditing on their database. The problem is that the data grows very large, and lots of times there is data that can be purged. Sure, theoretically one never needs to purge data, but sometimes a “delete” flag just won’t work — when you search on the delete flag, a full table scan may be the most efficient way to go.

Of course, that’s not acceptable. And in many cases, say when you have users who no longer use the site but did in the past (and perhaps have billing data associated with them), you never want to get rid of them.

So what to do? Make a special reporting database, that gathers information from the production database(s). Use MyISAM tables, because a reporting server can afford to be behind the master, and MyISAM is better for reporting — better metadata. For something like a “Users” table, make 2 more tables:

1) DeletedUsers
2) AllUsers

Where Deleted Users is where you put information about the users you delete (something like INSERT INTO DeletedUsers SELECT * FROM Users WHERE [parameters of deletion] and then run DELETE FROM Users WHERE [parameters of deletion] on the master. On the reporting slave, make a MERGE table called “AllUsers” and run your reports from that, when you might need to gather historical table.

Thoughts?