Why does the MySQL optimizer not do what I think it should?

In May, I presented two talks – one called “Are you getting the best out of your indexes?” and “Optimizing Queries Using EXPLAIN”. I now have slides and video for both of them.

The first talk about indexing should probably be titled “Why is MySQL doing this?!!?!!?” It gives insight into why the MySQL optimizer chooses indexes that you do not expect; especially when it does not use an index you expect it to.

The talk has something for everyone – for beginners it explains B-trees and how they work, and for the more seasoned DBA it explains concepts like average value group size, and how the optimizer uses those concepts applied to metadata to make decisions.

Slides are at http://technocation.org/files/doc/2017_05_MySQLindexes.pdf.
Click the slide image below to go to the video at https://www.youtube.com/watch?v=e39-UfxQCCsSlide from MySQL indexing talk

The EXPLAIN talk goes through everything in EXPLAIN – both the regular and JSON formats – and describes what the fields mean, and how you can use them to figure out how to best optimize your query. There are examples that show where you can find red flags, so that when you EXPLAIN your own queries, you can be better prepared for gotchas. The EXPLAIN talk references the indexing talk in a few places (both talks were given to the same audience, about a week apart), so I highly recommend you watch that one first.

Slides are at http://technocation.org/files/doc/2017_05_EXPLAIN.pdf.
Click the slide image below to go to the video at https://www.youtube.com/watch?v=OlclCoWXplgSlide image from the EXPLAIN talk

Google Summer of Code: MySQL Auditing Software

On Monday August 20th, 2007, the Google Summer of Code officially ended. I have had a great time this summer, although it has not always been sunshine and flowers! Because of the nature of the Summer of Code, setbacks due to lack of knowledge were not a problem. It’s expected that the students don’t know everything!

So mostly the setbacks were organizational. I had 2 students working on MySQL Auditing Software, which I have tentatively (and very geekily) called OughtToAudit. One student was working on the administrative interface, where access to the auditing program and the auditing rules themselves are defined. As well, reporting on suspicious activity as well as the rule-breaking activity could be seen. The other student was working on a pcap (libpcap, winpcap) engine to store all database traffic. Why pcap? One of the main tenets of auditing is that the auditing system is independent of the system to be audited. Part of this is for control purposes, so that the DBA is not the final arbiter of what’s in the auditing system — that can be owned by someone else, so that the DBA can be watched, too (just 2 months ago a report came out about a DBA stealing sensitive data, http://tinyurl.com/2xpjmz).

The community bonding period was great. I did not want to code during that time, I wanted to have the students learn more about auditing, and get to be part of the community. Well, only one student had time during that period, and looking back on it, he had more to learn, so I should have had him start. I also wasn’t as organized as I could have been and was planning on using the community bonding time to write up a spec, which was late.

The coding started a bit late because both students had finals the first week in June. And then I got married the 2nd week in June and went on a 2-week honeymoon, which did not help matters. I thought my vacation would be 3 solid weeks into the Summer of Code, but it ended up being about 2 non-solid weeks (say, 1.5 actual weeks). So just when the questions started coming to the forefront, I was gone. The best laid plans and all that, I guess.

After my honeymoon it was July, and I scrambled to get organized and be the best help I could. I succeeded, but I really needed a push to get myself more motivated. Basically I did not do as much as I should have in the first half. During or just after the midterm, we established a schedule of twice-weekly conference calls (5 pm my time, 10 pm for one student, 11 pm for another, on Wednesdays and Sundays). This helped a lot, and sometimes one or more folks couldn’t make it, and that’s OK, because we had them twice a week.

From my point of view, there were not any surprises, though things did take longer than I expected, as I misjudged skills and knowledge of both students at different points, in different directions — that is, I thought both students were both better and worse at different parts of their projects, so some parts went faster and others went slower.

The outcome so far is this: we are at about an 0.7 or 0.8 release, not ready even for alpha until we can integrate a few things. We have overcome a lot of challenges, and both students know a lot more about MySQL and auditing than they did before, and got good coding experience. Which was the point of the Google Summer of Code. MySQL is closer to having auditing software, though I’d have hoped we’d have gotten a bit further than we have. But we’ve agreed to meet once a month, now that the students go back to jobs and school, and continue to work on it.

All in all, it was a good experience. Had I to do it over, I’d have done many things similarly. I would start with the conference calls from the beginning and not been overconfident in the beginning, and used the community bonding period to do what the students wanted instead of holding them back.

On Monday August 20th, 2007, the Google Summer of Code officially ended. I have had a great time this summer, although it has not always been sunshine and flowers! Because of the nature of the Summer of Code, setbacks due to lack of knowledge were not a problem. It’s expected that the students don’t know everything!

So mostly the setbacks were organizational. I had 2 students working on MySQL Auditing Software, which I have tentatively (and very geekily) called OughtToAudit. One student was working on the administrative interface, where access to the auditing program and the auditing rules themselves are defined. As well, reporting on suspicious activity as well as the rule-breaking activity could be seen. The other student was working on a pcap (libpcap, winpcap) engine to store all database traffic. Why pcap? One of the main tenets of auditing is that the auditing system is independent of the system to be audited. Part of this is for control purposes, so that the DBA is not the final arbiter of what’s in the auditing system — that can be owned by someone else, so that the DBA can be watched, too (just 2 months ago a report came out about a DBA stealing sensitive data, http://tinyurl.com/2xpjmz).

The community bonding period was great. I did not want to code during that time, I wanted to have the students learn more about auditing, and get to be part of the community. Well, only one student had time during that period, and looking back on it, he had more to learn, so I should have had him start. I also wasn’t as organized as I could have been and was planning on using the community bonding time to write up a spec, which was late.

The coding started a bit late because both students had finals the first week in June. And then I got married the 2nd week in June and went on a 2-week honeymoon, which did not help matters. I thought my vacation would be 3 solid weeks into the Summer of Code, but it ended up being about 2 non-solid weeks (say, 1.5 actual weeks). So just when the questions started coming to the forefront, I was gone. The best laid plans and all that, I guess.

After my honeymoon it was July, and I scrambled to get organized and be the best help I could. I succeeded, but I really needed a push to get myself more motivated. Basically I did not do as much as I should have in the first half. During or just after the midterm, we established a schedule of twice-weekly conference calls (5 pm my time, 10 pm for one student, 11 pm for another, on Wednesdays and Sundays). This helped a lot, and sometimes one or more folks couldn’t make it, and that’s OK, because we had them twice a week.

From my point of view, there were not any surprises, though things did take longer than I expected, as I misjudged skills and knowledge of both students at different points, in different directions — that is, I thought both students were both better and worse at different parts of their projects, so some parts went faster and others went slower.

The outcome so far is this: we are at about an 0.7 or 0.8 release, not ready even for alpha until we can integrate a few things. We have overcome a lot of challenges, and both students know a lot more about MySQL and auditing than they did before, and got good coding experience. Which was the point of the Google Summer of Code. MySQL is closer to having auditing software, though I’d have hoped we’d have gotten a bit further than we have. But we’ve agreed to meet once a month, now that the students go back to jobs and school, and continue to work on it.

All in all, it was a good experience. Had I to do it over, I’d have done many things similarly. I would start with the conference calls from the beginning and not been overconfident in the beginning, and used the community bonding period to do what the students wanted instead of holding them back.

‘Tis the Season of Code

Well, it’s official:

http://code.google.com/soc/mysql/about.html

I am officially mentoring 2 students for MySQL, AB for the Google Summer of Code. I have great hopes for the MySQL Auditing Software. My first tasks: familiarize myself with different types of regulations such as Sarbanes-Oxley and HIPAA, and the MySQL Coding Standards.

This summer is going to be great!

Well, it’s official:

http://code.google.com/soc/mysql/about.html

I am officially mentoring 2 students for MySQL, AB for the Google Summer of Code. I have great hopes for the MySQL Auditing Software. My first tasks: familiarize myself with different types of regulations such as Sarbanes-Oxley and HIPAA, and the MySQL Coding Standards.

This summer is going to be great!

OurSQL Episode 4: Cluster From Down Under

This week’s podcast is an interview with MySQL’s Stewart Smith, software engineer for MySQL Cluster. Straight from the warm southern hemisphere, Stewart talks about Cluster. Because we gabbed on and on for 19 minutes, we’ve stripped the format down a bit, having the feature as pretty much the only segment.

You can download all the oursql podcasts at:
http://technocation.org/podcasts/oursql/

Direct play this edition at:
http://technocation.org/content/oursql-episode-4%3A-cluster-down-under-0

Feature

Main cluster website:
http://www.mysql.com/cluster

Cluster documentation (in the manual):
http://dev.mysql.com/doc/refman/5.1/en/mysql-cluster.html

Cluster changes in MySQL 5.1:
http://dev.mysql.com/doc/refman/5.1/en/mysql-cluster-5-1-changes.html

Cluster mailing list (for support, preferred):
http://lists.mysql.com/cluster

The cluster forum (with RSS feed):
http://forums.mysql.com/list.php?25

Acknowledgements

http://www.technocation.org

http://music.podshow.com

http://www.russellwolff.com

http://www.smallfishadventures.com/Home.html “The Thank you song” — Smallfish

Feedback

If you have any feedback about this podcast, or want to suggest topics to cover in future podcasts, please email

podcast@technocation.org

You can also:

Call the comment line at +1 617-674-2369

Or use Odeo to leave a voice mail through your computer:
http://odeo.com/sendmeamessage/Sheeri

Or use the Technocation forums:
http://technocation.org/forum

This week’s podcast is an interview with MySQL’s Stewart Smith, software engineer for MySQL Cluster. Straight from the warm southern hemisphere, Stewart talks about Cluster. Because we gabbed on and on for 19 minutes, we’ve stripped the format down a bit, having the feature as pretty much the only segment.

You can download all the oursql podcasts at:
http://technocation.org/podcasts/oursql/

Direct play this edition at:
http://technocation.org/content/oursql-episode-4%3A-cluster-down-under-0

Feature

Main cluster website:
http://www.mysql.com/cluster

Cluster documentation (in the manual):
http://dev.mysql.com/doc/refman/5.1/en/mysql-cluster.html

Cluster changes in MySQL 5.1:
http://dev.mysql.com/doc/refman/5.1/en/mysql-cluster-5-1-changes.html

Cluster mailing list (for support, preferred):
http://lists.mysql.com/cluster

The cluster forum (with RSS feed):
http://forums.mysql.com/list.php?25

Acknowledgements

http://www.technocation.org

http://music.podshow.com

http://www.russellwolff.com

http://www.smallfishadventures.com/Home.html “The Thank you song” — Smallfish

Feedback

If you have any feedback about this podcast, or want to suggest topics to cover in future podcasts, please email

podcast@technocation.org

You can also:

Call the comment line at +1 617-674-2369

Or use Odeo to leave a voice mail through your computer:
http://odeo.com/sendmeamessage/Sheeri

Or use the Technocation forums:
http://technocation.org/forum