I am not a wizard with infographics, but I can do a few pie charts. I copied the data to the right of the pie charts for those who want to see the numbers. Overall, there are almost 400 databases at Mozilla, in 11 different categories. Here is how each category fares in number of databases:
Here is how each category measures up with regard to database size. Clearly, our crash-stats database (which is on Postgres, not MySQL) is the largest:
So here is another pie chart with the relative sizes of the MySQL databases:
I’m sure I’ve miscategorized some things (for instance, are metrics on AMO classified under AMO/Marketplace or internal tools?), but here are the categories I used:
Categories:
air.m.o: air.mozilla.org
AMO/Marketplace: addons/marketplace
blog/web page: It's a db behind a blog or mostly static webpage.
bugzilla: Bugzilla
Crash-stats: Socorro, crash-stats.mozilla.com, where apps like Firefox send crash details.
Internal tool: If the db behind this is down, moco/mofo people may not be able to do their work. This covers applications from graphs.mozilla.org to inventory.mozilla.org to the PTO app.
release tool: If this db is down, releases cannot happen (but this db is not a tree-closing db).
SUMO: support.mozilla.org
Tree-closing: If this db is down, the tree closes (and releases can't happen).
World-facing: If this db is down, non-moco/mofo people will notice. These are specifically tools that folks interact with, including the Mozilla Developer Network and sites like gameon.mozilla.org.
World-interfacing: This db is critical to tools we use to interface with the world, though not necessarily world-visible. Examples: basket.mozilla.org, Mozillians, etc.
The count of databases includes all production/dev/stage servers. The size is the size of the database on one machine in each environment (production/dev/stage). For example, Bugzilla has 6 servers in use: 4 in production and 2 in stage. Its size is the size of the master in production and the master in stage, combined. This way we have not grossly inflated the size of the database, even though technically speaking we do have to manage the data on each of the servers.
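To make the counting rule concrete, here is a minimal Python sketch of it. Everything in it is an assumption for illustration: the `Server` record, the `summarize` helper, the split of each environment into one master plus replicas, and especially the sizes, which are placeholder numbers and not real measurements. Only the 4 production plus 2 stage layout for Bugzilla comes from the description above.

```python
from collections import defaultdict
from typing import NamedTuple


class Server(NamedTuple):
    """Hypothetical inventory record for one database server."""
    category: str
    environment: str  # "production", "stage", or "dev"
    role: str         # "master" or "replica" (assumed roles)
    size_gb: float    # placeholder value, not a real measurement


def summarize(servers):
    """Return {category: (database_count, size_gb)}.

    Every server adds to the count, but only the master in each
    environment adds to the size, so replicas do not inflate it.
    """
    counts = defaultdict(int)
    sizes = defaultdict(float)
    for server in servers:
        counts[server.category] += 1
        if server.role == "master":
            sizes[server.category] += server.size_gb
    return {category: (counts[category], sizes[category]) for category in counts}


# Assumed layout: 4 production servers and 2 stage servers,
# with one master per environment. Sizes are made up.
bugzilla = (
    [Server("bugzilla", "production", "master", 100.0)]
    + [Server("bugzilla", "production", "replica", 100.0)] * 3
    + [Server("bugzilla", "stage", "master", 20.0)]
    + [Server("bugzilla", "stage", "replica", 20.0)]
)

summary = summarize(bugzilla)
print(summary)
```

With these made-up inputs, Bugzilla is counted as 6 databases, while its size is only the production master plus the stage master.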
For next year, I hope to be able to gather this kind of information automatically, and have easily accessible comprehensive numbers for bandwidth, number of queries per day on each server, and more.