Liveblogging: Performance is Overrated, by Mark Callaghan

Mark Callaghan speaks at the New England Database Summit about how data manageability is more important than performance.

Peak performance is thrown out; 95%-98% is what's important.

Variance shouldn’t be large.
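
A minimal sketch of what these two notes could mean in practice, assuming "95%-98%" refers to response-time percentiles (that reading is an assumption, as are the sample numbers and the function names). It reports the 95th and 98th percentile next to the peak, plus the standard deviation as a rough measure of variance:

```python
import math
import statistics


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(latencies_ms):
    """Summarize response times, looking at percentiles rather than the extreme."""
    return {
        "peak": max(latencies_ms),
        "p95": percentile(latencies_ms, 95),
        "p98": percentile(latencies_ms, 98),
        "stdev": statistics.pstdev(latencies_ms),
    }


# Hypothetical response times in milliseconds: 99 fast queries and one stall.
latencies = [4] * 99 + [900]
summary = summarize(latencies)
print(summary)
```

With these made-up numbers a single stall dominates the extreme value and inflates the standard deviation, while the 95th and 98th percentiles stay at the typical response time.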

Data manageability is rate of interrupts per server for the operations team. Rate of server growth much bigger than rate of new hires for the systems teams. A lot of the db team is from University of Wisconsin-Madison!

Why MySQL? Because it was there when he came. Mark and ops/engineering peers made it scale 10x. He likes MySQL for OLTP, InnoDB is “An amazing piece of software.”

They can get 500,000 qps using a cached workload, which is on par with memcached.

What Facebook really does is OLTP for the social graph. The workload is secondary indexes, index-only queries, small joins but most queries use one table, multi-row transactions, majority of workload does not need SQL/optimizer, they do a physical and logical backup.

Most of this does not require SQL [blogger’s note – they built Cassandra]. Why is the grass greener on the other side? Automated replacement of failed nodes, less downtime on schema changes and/or fewer schema changes, multi-master, better compression, etc.

Circa 2010, 13 million queries per second, 4 ms reads, 5 ms writes, 38GB peak network per second, etc.

Why so many servers? Big data, high queries per second. They add servers to add IOPS, so they’re interested in compression and flash, so they can get more IOPS. If they do remain on disk, write-optimized dbs are interesting too. About 10 people on the db team, which is very small for a company that size.

How to scale MySQL? Fix stalls to make use of capacity, improve efficiency to use fewer queries/less data. Fixing stalls doesn’t make MySQL faster, it makes it less slow.

[blogger’s note – I stopped taking notes here because this is a rehash of the “How Facebook Does MySQL” talk that has been done over and over…]

[restarted when he started talking about data manageability again]

How Facebook got its data manageable.

pylander – sheds load during a query pileup – kills duplicate queries, limits # of queries from some specific accounts. The name is a takeoff on Highlander: there can be only one.
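
The talk did not show how pylander is implemented, so the following is only a sketch of the behavior described: given a snapshot of running queries, choose which ones to kill. The dictionary keys (`id`, `user`, `time`, `sql`), the function name, and the per-account limits are all hypothetical.

```python
from collections import defaultdict


def pick_queries_to_kill(processlist, per_user_limit):
    """Return ids of queries to kill: every duplicate of a statement that is
    already running, plus anything over a per-account concurrency limit."""
    seen_sql = set()
    running_per_user = defaultdict(int)
    to_kill = []
    # Visit the longest-running queries first, so the oldest copy of each
    # statement is the one that survives.
    for query in sorted(processlist, key=lambda q: q["time"], reverse=True):
        limit = per_user_limit.get(query["user"])  # None means no limit
        if query["sql"] in seen_sql:
            to_kill.append(query["id"])  # there can be only one
        elif limit is not None and running_per_user[query["user"]] >= limit:
            to_kill.append(query["id"])  # this account is over its limit
        else:
            seen_sql.add(query["sql"])
            running_per_user[query["user"]] += 1
    return to_kill


# Hypothetical pileup: two copies of the same statement, and a batch
# account that is limited to one running query.
processlist = [
    {"id": 1, "user": "web", "time": 30, "sql": "SELECT a"},
    {"id": 2, "user": "web", "time": 12, "sql": "SELECT a"},
    {"id": 3, "user": "batch", "time": 9, "sql": "SELECT b"},
    {"id": 4, "user": "batch", "time": 5, "sql": "SELECT c"},
]
kill_list = pick_queries_to_kill(processlist, {"batch": 1})
print(kill_list)
```

In a real tool, the returned ids would presumably be handed to something like MySQL's `KILL QUERY`; that step is left out here.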

dogpile – collects data during a query pileup – gets perf counters and list of running queries, generates HTML page with interesting results.

Online schema change tool, for frequent schema changes, especially adding indexes. This briefly locks the table to set up triggers to track changes, copies data to a new table with the new desired schema, replays changes to the new table, then briefly locks the table again to rename the new table as the target table.
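
The steps above can be modeled with plain dictionaries. This is a toy simulation rather than the actual tool: tables are dicts keyed by primary key, the "trigger" is a Python list, and every name in it is made up for illustration.

```python
def online_schema_change(table, transform, writes_during_copy):
    """Simulate copy-and-replay. `table` maps primary key -> row (a dict),
    `transform` converts a row to the new schema, and `writes_during_copy`
    is a list of (op, key[, row]) tuples that arrive while the copy runs."""
    change_log = []  # stands in for the trigger-maintained change table

    def apply_write(op, key, row=None):
        # Writes still go to the original table; the "trigger" records them.
        if op == "delete":
            table.pop(key, None)
        else:
            table[key] = row
        change_log.append((op, key, row))

    # 1. Brief lock: set up change tracking and note which rows to copy.
    keys = list(table)
    half = len(keys) // 2

    # 2. Copy rows into a new table with the desired schema, without a lock.
    new_table = {}
    for key in keys[:half]:
        new_table[key] = transform(table[key])
    for write in writes_during_copy:  # traffic arriving mid-copy
        apply_write(*write)
    for key in keys[half:]:
        if key in table:  # the row may have been deleted mid-copy
            new_table[key] = transform(table[key])

    # 3. Replay the tracked changes onto the new table.
    for op, key, row in change_log:
        if op == "delete":
            new_table.pop(key, None)
        else:
            new_table[key] = transform(row)

    # 4. Brief lock again: the new table takes over the original name.
    return new_table


def add_email(row):
    """Hypothetical schema change: add an `email` column."""
    return {**row, "email": None}


users = {1: {"name": "a"}, 2: {"name": "b"}, 3: {"name": "c"}}
writes = [("update", 1, {"name": "A"}), ("delete", 3), ("insert", 4, {"name": "d"})]
migrated = online_schema_change(users, add_email, writes)
print(migrated)
```

The replay step is written to be idempotent (set or remove by key), so it does not matter whether a changed row was copied before or after the change happened.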

Manageability is a work in progress — working on:
– make InnoDB compression work for OLTP
– Faker – tool for prefetching for replication slaves – replay workload is: page read, do some modification, page write. bottleneck might be disk reads, work is done by a single thread, transactions on master are concurrent. Faker has multiple threads replay transactions in “fake-changes” mode, no undo, no rollback, read-only, fetches into the buffer pool the pages needed for that transaction. Captures about 70% of disk reads for replication, they’re working on fixes to get it up to 80-90%.
– auto replacement – replace failed and unhealthy MySQL servers.
– Auto resharding – sharding is easy, re-sharding is hard.
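
To illustrate the prefetching idea behind Faker, here is a toy model in which several threads warm a shared buffer pool ahead of a single-threaded replay. The real tool replays transactions in "fake-changes" mode inside MySQL; this sketch only models the effect on disk reads, and the class and function names are invented for it.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class BufferPool:
    """Toy buffer pool: remembers which pages are cached and counts misses."""

    def __init__(self):
        self._pages = set()
        self._lock = threading.Lock()
        self.disk_reads = 0

    def fetch(self, page_id):
        with self._lock:
            if page_id not in self._pages:
                self.disk_reads += 1  # cache miss: would be a disk read
                self._pages.add(page_id)


def prefetch(pool, transactions, workers=4):
    """Several threads touch the pages each transaction needs, changing nothing."""

    def warm(txn_pages):
        for page_id in txn_pages:
            pool.fetch(page_id)

    with ThreadPoolExecutor(max_workers=workers) as executor:
        list(executor.map(warm, transactions))


def replay(pool, transactions):
    """Single-threaded replay; returns how many disk reads it had to wait for."""
    before = pool.disk_reads
    for txn_pages in transactions:
        for page_id in txn_pages:
            pool.fetch(page_id)
    return pool.disk_reads - before


# Each hypothetical transaction is the list of page ids it reads and modifies.
transactions = [[1, 2], [2, 3], [4]]

cold_reads = replay(BufferPool(), transactions)

warm_pool = BufferPool()
prefetch(warm_pool, transactions)
warm_reads = replay(warm_pool, transactions)
print(cold_reads, warm_reads)
```

In this idealized model the prefetcher knows every page in advance, so the warmed replay waits on no disk reads at all; the roughly 70% figure from the talk reflects that the real prefetcher cannot predict everything.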

open issues in manageability:
diagnose why one host is slow, others are not.
….and some more.

Presenting on Security Topics at Percona’s MySQL Conference

I am honored to have been selected to give a 3-hour tutorial on white-hat Google hacking and other MySQL security topics at the 2012 Percona Live: MySQL Conference and Expo. The tutorial schedule can be found here and for more about the presentation I will be giving, you can see Google Hacking MySQL and More MySQL Security. I am also pleased that Percona has given a discount code for OurSQL: The MySQL Database Community Podcast listeners – 10% off any ticket price with code PL-pod.

The discount is on top of early bird prices, so you get two discounts this way!

I cannot wait for the conference — all the tutorials and sessions look very top-notch.

Do not attribute to malice….

Yesterday, during a session at a User Group Leader’s conference, I suggested to the MySQL Community Team (Keith Larson and Dave Stokes) that it would be nice to see all the events that Oracle does that are MySQL-related, because everyone else is posting their MySQL events to Planet MySQL and Oracle was talking about events I had never heard of. I noted that http://events.oracle.com/search/search?group=Events&keyword=mysql has an RSS feed.

Well, Keith and Dave also thought that was a good idea, so they added the feed. Looks like Oracle’s feed isn’t so great, though, and they’re working on the fix with the appropriate tech folks within Oracle.

I think it is ridiculous that some folks say things like “I hope planet mysql has not been hijacked!”. Of course it has not — this is a silly technical issue, which is pretty obvious once you think about it — when the feed was added, it added the most recent posts. It looks like those posts might be in the future or something.

It’s just a broken RSS feed, folks. Why is everyone so quick to try to accuse Oracle of bad stuff? They certainly have not been perfect, but they have done a lot of good, sprinkled in with some bad stuff here and there. Just take a deep breath, relax, and think about e-mailing the community team when that happens.

That’s what I did.

MySQL Community and User Group slides

This week I have been at the IOUC User Group Leader’s Conference, and I have met a ton of great folks who are user group leaders and made some great contacts for future speaking engagements. Follow this space to learn about calls for papers for international conferences! First up is the call for papers for the OUG Harmony conference.

The OUG is the Oracle Users Group for Finland and Latvia, in conjunction with Estonia and Russia. The Finland conference is specifically looking for MySQL content, and is May 30-31st in Hämeenlinna, Finland. Talks can be in Finnish or English, and they’re looking for good basic MySQL information.

I spoke today at the conference about the MySQL community – who it is, how it’s grown, and what the challenges are as we try to find our place under the Oracle banner. Slides are at http://bit.ly/mysqlcomm2012.

I have to say that I love doing slides in HTML and CSS, as I’m already very familiar with it, and I don’t have to worry about a separate Office program. I marvel at how open Mozilla is, and you too can make slides as I did, though I’d recommend changing the CSS so you can represent your company. It took me about 5 minutes to figure out how to work things, thanks to our public documentation at https://wiki.mozilla.org/HTML_Slides — which, by the way, I found by Google searching for “Mozilla slides”.

Note that anyone can develop slides on Mozilla’s site or export them to work on offline. I found it handy to develop the slides right on the Mozilla site, and whenever I saved, the display version automatically updated. Pretty nifty stuff.

Replication and Data Integrity

Last week, Baron pointed out that semi-synchronous replication is not synchronous. I learned a lot reading that post, but I was surprised it was used to pimp the Percona cluster, with no comparison to MySQL’s own cluster solution — that would be a much more fair comparison. There is one critical point Baron did not make, though….

Whether it’s semi-synchronous replication or regular asynchronous replication, there is no guarantee of data integrity. I saw this over and over when I was consulting. Just because replication is not failing does *not* mean that the data on the master and slave are in sync.

There is no form of replication that verifies data integrity. You can check if the data on the slave is in sync with the data on the master with pt-table-checksum and pt-table-sync, from the Percona Toolkit. I use those tools widely.

I have not yet started using the new version, which boasts of just working out of the box. Right now there are many options I use: in addition to the ones listed in the blog post that even I reference to this day, I have also started using --chunk-size-limit, to avoid pt-table-checksum skipping chunks that are just a bit too large.

I am excited about the rewrite of the tools and have it in my plan to use them. I hope they will save me a lot of time.

In my mind, checksumming is as critical as backups (in fact, if you backup from a slave, you must verify that the slave is in sync with the master and has no data integrity issues). It is not optional. Hopefully you already know that replication does not verify data integrity, but if you did not know, now you know and you also know how to check for that.

What I do is have the checksum run with the modulo/offset feature every so often on the master, and within a week or 2 of a data discrepancy happening, I find it, using a daily monitoring check to see if the checksum table on the slave has matching values for the data on the master and the slave.
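
A minimal sketch of the kind of daily monitoring check described above. It assumes the rows have already been read from the checksum table on the slave and that each row carries the columns `db`, `tbl`, `chunk`, `this_crc`, `this_cnt`, `master_crc` and `master_cnt`; how the rows are fetched, and the function name, are left as assumptions.

```python
def out_of_sync_chunks(checksum_rows):
    """Return (db, tbl, chunk) for every chunk whose values computed on the
    slave differ from the values recorded for the master."""
    mismatched = []
    for row in checksum_rows:
        if (row["this_cnt"] != row["master_cnt"]
                or row["this_crc"] != row["master_crc"]):
            mismatched.append((row["db"], row["tbl"], row["chunk"]))
    return mismatched


# Hypothetical rows as they might look on a slave.
rows = [
    {"db": "app", "tbl": "users", "chunk": 1,
     "this_crc": "a1b2", "this_cnt": 1000,
     "master_crc": "a1b2", "master_cnt": 1000},
    {"db": "app", "tbl": "users", "chunk": 2,
     "this_crc": "ffff", "this_cnt": 998,
     "master_crc": "c3d4", "master_cnt": 1000},
    {"db": "app", "tbl": "orders", "chunk": 1,
     "this_crc": None, "this_cnt": 0,
     "master_crc": "e5f6", "master_cnt": 0},
]
problems = out_of_sync_chunks(rows)
print(problems)
```

A monitoring check could alert whenever the returned list is not empty; the affected tables are then candidates for pt-table-sync.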

Working for an Open Company

Many of us know what it’s like to work at an open source company. About 6 weeks ago I started my job as a Senior MySQL DB Admin/Architect (DBA but the “A” stands for both) at Mozilla. And I have to say, working for an open company is a lot different from working for an open source company.

There’s so much more that’s, well, open.

I can point to the Bugzilla bugs database, where all our ticket tracking is done. It’s open to the public, although on the systems side we mark a lot of bugs private because they contain important information like hostnames and IP addresses and what ports are open vs. not.

Or I could point to my director’s quest to make the IT department more open – one that I think is possible, although it does make our legal team try to figure out exactly what constitutes an employee and what does not (or rather, what could be argued in a court of law if it comes down to that).

The greatest example that I have seen of how open the company is came from a coworker’s blog about our anti-SOPA/PIPA efforts – the blog points to an etherpad document used to collaborate on what needed to be done. You can see that it was a lot of work, but the part that stands out to me is this:

You can see it.

It’s amazing to me how open the company is, and how very little suffers from it. Plans are divulged, there are very few secrets — most of those are required by the folks we work with, not by Mozilla itself.

A great SOPA/PIPA analogy

Today, English versions of major sites like Wikipedia and Mozilla (my employer) are going dark to protest SOPA and the PROTECT IP Act. If you think SOPA/PROTECT IP Act is not a big problem, or if you are having trouble explaining why it is a problem to folks who are not as aware, a post by Mitchell Baker should help clear things up. Baker illuminates the problem, complete with references. You should read the article, it is not very long, but here is a snippet:

“Assume there’s a corner store in your neighborhood that rents movies. But the movie industry believes that some or even all of the videos in that store are unauthorized copies, so that they’re not being paid when people watch their movies. What should be done?

SOPA/PIPA don’t aim at the people trying to get to the store. SOPA/ PIPA don’t penalize or regulate the store itself. SOPA and PIPA penalize us if we don’t block the people trying to get to the store.

The solution under the proposed bills is to make it as difficult as possible to find or interact with the store. Maps showing the location of the store must be changed to hide it(1). The road to the store must be blocked off so that it’s difficult to physically get to there(2). Directory services must unlist the store’s phone number and address(3). Credit card companies(4) would have to cease providing services to the store. Local newspapers would no longer be allowed to place ads for the video store(5). And to make sure it all happens, any person or organization who doesn’t do this is subject to penalties(6). Even publishing a newsletter that tells people where the store is would be prohibited by this legislation(7).”

(footnotes left out, this is just a copy/paste; click through to the article to see the notes).

Renormalize, Encrypt, MongoDB videos online

Thanks to the extraordinary work of volunteer superstar Richard Laskey, the videos for the September, October and November Boston MySQL user group meetings are available. Whether you want to encrypt, renormalize or learn how MTV uses MongoDB, there are now full-length videos available:

MongoDB @MTV by Jeff Yemin of MTV Networks, September 2011

Renormalize – Solving Performance Problems in MySQL without Denormalization by Ari Weil of Akiban, October 2011.

MySQL Encryption with Gazzang by Mike Frank of Gazzang, November 2011.

Enjoy the videos!

When Disaster Strikes, You Can Learn a Lot

My first week at Mozilla was relatively uneventful. I spent it at the Mountain View office, meeting people in person, having a few meetings, and actually doing a few tasks in addition to all of the setup and overhead that comes with being a new employee.

After traveling back home to Boston, my second week of work was a bit more eventful. We had a RAID array fail when we went to replace a disk. The RAID array held all of our e-mail, so until we could restore a backup and work on getting as much post-backup data as we could out of corrupt databases (LDAP and MySQL), nobody had e-mail.

E-mail is a very big deal in any company, and Mozilla, with around 600 employees, is no exception. It was fascinating to watch my new coworkers deal with a crisis of this magnitude. A person’s true self is brought out under pressure, and I got to see how everyone acted and reacted.

Folks who wanted to help but could not stepped aside to let those with the requisite knowledge/experience fix things. Higher-ups asked for statuses. Employees found ways to communicate without e-mail.

I watched this all unfold with an observant eye – especially within my team. I am happy to learn that my coworkers (including my boss) are smart, helpful, know when to jump in and when to stay out of the way, sensitive to those who are under pressure, do not spend time rat-holing on the past, and avoid the blame game.

Of course, as Tim Callaghan pointed out at the December Boston MySQL User Group on Monday evening, what would I have done if I found out that my team was a bunch of finger-pointers, my boss was a jerk, and everyone complained selfishly? Luckily, that’s not a scenario I have to worry about.
