On Monday, October 12, 2009* from 7-9 pm at MIT, I will be giving a presentation explaining SHOW ENGINE INNODB STATUS for the Boston MySQL User Group. There is information about foreign keys, transactions, deadlocks and mutexes just waiting to be discovered, and I will show how to decipher the information.
For all those in the Boston area, I hope to see you there! For those who cannot make it, we will record the presentation and make it available online, and will post here when the video and slides are up.
*Yes, I realize that this is a bank holiday in the US.
The easiest way to see the differences between two schemas on a non-Windows machine is to run diff against the output of mysqldump from each server. However, this will also show trivial differences, such as the value of AUTO_INCREMENT, and it does not give you a way to patch one schema to be like another.
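For example, using bash process substitution (a minimal sketch; the server and database names are placeholders):

diff <(mysqldump --no-data -h server1 mydb) <(mysqldump --no-data -h server2 mydb)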
We frequently are asked to “do a schema diff and create a script that will ‘patch’ one server.” Usually this is done to take a development or test schema and move it to production for a release.
We like to use the best tool for the job, and while diff is good, I like to use MySQL Workbench. The OSS (Community) edition provides all the functionality we need for creating a schema diff and patch: it can both reverse engineer a schema and forward engineer an ALTER script, for free.
Step by step, here’s how to do it:
First, create a script with the database schema that you want to be the final result. For example, if you want to take an existing production schema and change it to be like the development schema, then your “final result” is the development schema:
mysqldump --no-data -h dev_server --all-databases > final.sql
You may want to specify which databases instead of --all-databases.
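For example (db1 and db2 are placeholder database names):

mysqldump --no-data -h dev_server --databases db1 db2 > final.sql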
Open MySQL Workbench.
Right-click on the “mydb” schema and delete it.
File->Import->Reverse Engineer MySQL Create script and choose the final.sql script
Then, “forward engineer” using the current schema. Export the schema you want to change (in our example, the production schema):
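For example, mirroring the earlier dump (prod_server is a placeholder host name):

mysqldump --no-data -h prod_server --all-databases > currentschema.sql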
File->Export->Forward Engineer SQL ALTER Script and choose the currentschema.sql script
View the alter script and/or save to a file. You can also copy and paste the information.
Note that depending on your setup, there may be some “noise”: if fields are in a different order, or if indexes have different names, the script will include statements to make the schemas exactly the same. Index names usually do not matter; field order matters for some organizations and not for others. In general it is good to have the same field order, but if it is a lot of work to change (for example, ALTERing a table with 2 million rows), it may not be necessary. I strongly advise not to have an environment where field order matters.
I spent Friday examining the systems for a more traditional consulting gig (in case you did not know, Pythian has offered traditional consulting and emergency services, in addition to our remote DBA offering, for our entire 12-year history). It is a familiar story to many people: the database performance was unacceptable.
The company had a few log tables with huge numbers of inserts and foreign keys, so they used InnoDB. Unfortunately, they also used a GUID as a primary key, stored as a utf8 varchar(32). GUID primary keys are getting more and more common these days, but long primary keys in InnoDB slow down INSERT, UPDATE and DELETE commands a great deal.
That’s right — their primary key for many of these tables was 96 bytes long (32 characters * 3 bytes per character), and as an InnoDB table, the primary key is clustered with every other index. One of the more frequently used tables had 8 indexes other than the PRIMARY KEY — 7 were indexes on varchar(32) fields, the other was on an int(11) field (signed integer).
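To put rough numbers on that (a back-of-the-envelope estimate assuming fully packed utf8 values and ignoring page and row overhead): every secondary index entry in InnoDB carries the indexed value plus a copy of the full primary key. Each of the 7 varchar(32) indexes therefore costs up to 96 + 96 = 192 bytes per row, and the int index costs 4 + 96 = 100 bytes per row, so every single-row insert maintains on the order of 7 * 192 + 100 = 1,444 bytes of secondary index data, on top of the clustered index itself.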
The query load on the server was 99% writes, 1% reads. The log tables only had rows inserted, not deleted or updated. Thus, we offered a few different solutions (in no particular order):
1) Use a BLACKHOLE table on the master and replicate to a slave server. This would make the INSERTs to the master fast, though the slave would lag behind. The slave could be changed to MyISAM or ARCHIVE or some other storage engine.
2) Use MyISAM on the master and slave (see the sketch after this list). This was the solution we went with, because the log tables could be re-created with no data and moved into place, and the loss of foreign keys was not a big deal, since the application checked for them anyway. ARCHIVE could also be used; it is faster than MyISAM on inserts because it only allows one index, the PRIMARY KEY. concurrent_insert was set at the default of 1, but we changed it to 2 just in case an errant UPDATE or DELETE caused a gap in the MyISAM table.
3) Add an AUTO_INCREMENT PRIMARY KEY, keeping the GUID as a UNIQUE KEY (also sketched after this list). This would speed up the writes without sacrificing any data, and can be combined with dropping unnecessary foreign keys to make the inserts even faster.
4) Move the log tables to a different machine entirely. This would require the application to have two different database handlers.
5) Use INSERT DELAYED into the logging tables.
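As a sketch of what options #2 and #3 look like in SQL (the table and column names here are hypothetical, and on a real system you would first drop any foreign keys that reference the table, or the ALTER statements will fail):

-- Option 2: convert a log table to MyISAM and relax concurrent inserts
ALTER TABLE log_table ENGINE=MyISAM;
SET GLOBAL concurrent_insert = 2;

-- Option 3: surrogate integer primary key, demoting the GUID to a unique key
ALTER TABLE log_table
DROP PRIMARY KEY,
ADD COLUMN id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
ADD UNIQUE KEY guid_uniq (guid);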
We discussed the pros and cons of these scenarios, and eventually chose #2, which we could do right away, and indeed, it had an immediate positive impact. Almost all of the unacceptable performance went away.
We were also able to audit the system and provide several good recommendations for ways they can improve it. It is exciting for me to be able to help people out in a very tangible way. We often do not see some of the really crazy problems that affect few people (such as slave lag getting further and further behind), because our remote DBA clients have the benefit of ongoing help from us, and we generally find the smaller performance degradations before they become a large issue.
Being able to spend a few hours with a new environment and really help them out was extremely fulfilling for me. I love that MySQL has many different options (and there are likely more that I did not think of) that all would work to achieve better performance. I think people like to have a choice, too, so they do not feel they were forced to do something they did not want to do. Especially when there are solutions on both the DBA and developer side.
However, the tests are skewed — not on purpose, I am sure, and Mark admits he has not used the query cache before — but they are skewed all the same. Mark’s error was that he assumed he could just turn on the query cache and see if it works. Most features of MySQL do not work that way — you have to understand the strengths and weaknesses of the feature in order to use it properly.
Mark’s benchmark definitely reinforces that turning on the query cache without any knowledge of your system is a bad idea, and I agree with him on that. But it does not in any way mean that the query cache is always a bad idea. In fact, the MySQL manual page at http://dev.mysql.com/doc/refman/5.1/en/query-cache.html has some caveats about when the query cache is useful and when it is not useful.
It is important to know how the MySQL query cache works, so I will first explain that, and then explain why Mark’s test is not a very good broad generalization. MySQL’s query cache is not useful all the time, but it can be quite useful! If you are already familiar with the query cache, you can skip ahead to why Mark’s benchmark is skewed.
How to use the MySQL query cache
Firstly, there are a few ways to use the MySQL query cache. You can disable it, which means no queries are cached; you can set it to cache all queries, and use a special keyword to not cache queries (replace SELECT with SELECT SQL_NO_CACHE). query_cache_type=1 or ON will set the query cache in this way.
You can also set it to cache no queries except those that have a special keyword — you guessed it, replace SELECT with SELECT SQL_CACHE and a query will attempt to be cached. To set the query cache in this way, set query_cache_type=2 or DEMAND.
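For example (tbl is a hypothetical table):

-- with query_cache_type=1 (ON), everything cacheable is cached unless excluded:
SELECT SQL_NO_CACHE c FROM tbl WHERE id = 5;
-- with query_cache_type=2 (DEMAND), nothing is cached unless requested:
SELECT SQL_CACHE c FROM tbl WHERE id = 5;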
What queries are cached
Regardless of which method you choose, not all possible queries are cached. Only deterministic SELECT statements are cached — a statement such as SELECT count(*) from tbl is deterministic, while SELECT NOW() is non-deterministic for the purpose of the query cache.* Queries that use user-defined functions, temporary tables, LOCK IN SHARE MODE, FOR UPDATE, INFORMATION_SCHEMA or mysql system tables are not cached. Most SELECT subqueries are not cached. SELECT queries that are part of stored functions, triggers, views and events are not cached. However, in MySQL 5.1, SELECT queries that contain a view can be cached. The MySQL manual page at http://dev.mysql.com/doc/refman/5.1/en/query-cache-operation.html has more detailed information on what is allowed and what is not allowed.
In the query cache, a hash of the query is stored along with the result. There is a variable, query_cache_limit, which sets an upper limit on the size of the result set that can be cached. So even if a query can be cached, it may not be cached if the result set is larger than query_cache_limit.
How the query cache is used
The lifecycle of a non-prepared SELECT statement when the query cache is on (query_cache_type > 0) is:
1) hash the SELECT query
2) check the hash against the query cache
3) if the hash matches, return the result set from the query cache. If not, the query is sent to the parser and executed in the usual manner, and the result is stored in the query cache if the query is cacheable.
This means that having the query cache on has some overhead, even if no queries are ever cached.
A query (and its result) is removed from the query cache when it is invalidated or pruned. A query is invalidated when any table it refers to has been changed, even if the changes do not affect the actual result set in the query cache. A query is pruned when a new query needs to be inserted into the query cache and there is no room; the query cache prunes queries using a least recently used (LRU) algorithm.
In other words, there will be a lot of unnecessary overhead with the query cache if a table is often written to, because queries using that table will be constantly removed from the query cache. This does not mean that if you have some tables that are frequently written to, you should turn the query cache off — you can use the query_cache_type of ON with the SQL_NO_CACHE keyword to cache all except some queries, or you can use the query_cache_type of DEMAND and the SQL_CACHE keyword in SELECT queries to cache no queries except those you specify.
I will explain what the important sysbench variables are in context.
1) The benchmark used sysbench in OLTP mode against an InnoDB table. The InnoDB buffer pool is hot after the table is created and populated, while the query cache is empty and cold.
InnoDB has its own cache, the InnoDB buffer pool. sysbench is run with --oltp-read-only, meaning there are no INSERT, DELETE or UPDATE statements. This is good; it means that the results are not skewed by query cache invalidations, though there may be pruning.
The first thing sysbench does is preparation — it creates the table and populates it with data (2,000,000 rows as per oltp-table-size). After this is done, the InnoDB buffer pool (2000M in size) has gone from empty to containing data. I do not know how large the row size of the sysbench table is; however, I can do some calculations:
CREATE TABLE `sbtest` (
`id` int(10) unsigned NOT NULL auto_increment,
`k` int(10) unsigned NOT NULL default '0',
`c` char(120) NOT NULL default '',
`pad` char(60) NOT NULL default '',
PRIMARY KEY (`id`),
KEY `k` (`k`)
(note that you at least need to add a final parenthesis “)” for the query to work, and should also specify ENGINE=INNODB if you want to test on your system).
So, that's two 4-byte integers and two character fields. For the sake of argument, let's say you are using UTF-8 in MySQL 5.1, where utf8 is 3 bytes per character. That's:
4 + 4 + 120*3 + 60*3 = 548 bytes
Let’s say you are using a version of MySQL that uses 4 bytes per character for utf8 (I know 6.0 was using that, not sure about 5.4):
4 + 4 + 120*4 + 60*4 = 728 bytes
So either way, that is roughly 1.1 to 1.5 GB of row data for 2,000,000 rows, and all of the data in the table is already loaded into the 2000M InnoDB buffer pool when the test begins. The InnoDB buffer pool is already hot, whereas the query cache is empty.
On systems where a query can be cached and used instead of executing and having to look on disk for data, using the MySQL query cache will give you *much* better performance.
2) The query cache statistics are not given. It’s very possible that the only thing that was measured was how much overhead the query cache provides if it is never used.
The queries that are run are the default (oltp-test-mode is not set, and it defaults to “complex”) which can be seen under the “Advanced transactional” heading in the sysbench manual at http://sysbench.sourceforge.net/docs/#database_mode. Only read queries are done, so the queries are:
SELECT c FROM sbtest WHERE id=N
SELECT c FROM sbtest WHERE id BETWEEN N AND M
SELECT SUM(K) FROM sbtest WHERE id BETWEEN N and M
SELECT c FROM sbtest WHERE id between N and M ORDER BY c
SELECT DISTINCT c FROM sbtest WHERE id BETWEEN N and M ORDER BY c
The values for the variables in the query are chosen at random, though they are sensible values. The testing time is very short, only 3 minutes of running time (sysbench is called with max-time=180). It is quite possible that all the SELECT queries were put into the query cache and there was never a query cache hit. Of course there will be a lot of overhead if the query cache is never used! Particularly if it has to prune for more memory.
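For reference, an invocation along these lines reproduces the shape of the test described (a sketch using sysbench 0.4 flag spellings; the thread count and connection options are my guesses, not Mark's actual command):

sysbench --test=oltp --oltp-table-size=2000000 --mysql-user=user --mysql-password=pass prepare
sysbench --test=oltp --oltp-read-only=on --max-time=180 --max-requests=0 --num-threads=16 --mysql-user=user --mysql-password=pass run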
I would like to see Qcache_hits, Qcache_inserts, Com_select (number of SELECT statements that did not use the query cache) and Qcache_lowmem_prunes. That is what I look at for any system that has the query cache running, so I can determine the effectiveness of the query cache.
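Pulling those counters takes one statement each, and the usual approximation for the hit rate follows from them:

SHOW GLOBAL STATUS LIKE 'Qcache%';
SHOW GLOBAL STATUS LIKE 'Com_select';
-- hit rate, approximately: Qcache_hits / (Qcache_hits + Com_select)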
There is more in-depth information on how to gauge the usefulness of the query cache, query cache fragmentation, and a few ways to cache more queries in the MySQL Administrator's Bible, pages 427-434, part of the “Caching with MySQL” chapter. I used the book to help me write this blog post, so I do think it is fair to cite my sources, even though it happens to be a source I created (and that is why I consider it a shameless plug: I have no shame citing my sources)!
—————-
So, all that being said, the query cache is extremely useful when it serves repeated queries. The benchmark Mark performed most likely never exercised it. I have seen production environments fall apart when someone turned on the query cache for all queries; Mark's benchmark proves that it is not a good idea to randomly turn on the query cache. You have to know what you are doing.
* for the purpose of replication, it is deterministic, but that’s another topic.
Mike Hogan, CEO of ScaleDB spoke at the Boston MySQL User Group in September 2009:
ScaleDB is a storage engine for MySQL that delivers shared-disk clustering. It has been described as the Oracle RAC of MySQL. Using ScaleDB, you can scale your cluster by simply adding nodes, without partitioning your data. Each node has full read/write capability, eliminating the need for slaves, while delivering cluster-level load balancing. ScaleDB is looking for additional beta testers; there is a sign-up at http://www.scaledb.com.
A broken VIEW was caused by the view’s definer being non-existent. In this particular system, backups are done by replicating all the machines (production, development, test, etc) to one server and doing cold physical backups off that server, which currently has 12 instances running.
So, in order to find on which machine the user might still be defined, I went to the backup server. All the data directories are under one path, e.g.:
instance 1 has a datadir of /data/mysql/instance1
instance 2 has a datadir of /data/mysql/instance2
Now, the Unix tool strings can be used against many types of files. In particular, you can use strings on the mysql/user.MYD file to see the username, host, and password hash. (Note that strings only shows strings longer than 3 characters by default, so if your host or username is 3 characters or less, it will not show up in the output; you can change this with the -n option to strings.)
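For example, against the layout above (the instance path and username are placeholders):

strings /data/mysql/instance1/mysql/user.MYD | grep some_user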
So do not underestimate the power of basic tools such as strings and grep. They can really help you! (I often use strings mysql/user.MYD to see if a particular MySQL user has been set up, especially when I cannot seem to log in. This way I know whether I am typing an incorrect password or the user just does not exist at all.)
In record time, less than a week after the conference (thanks to the free Pinnacle Video Spin and YouTube), all 11 videos that were taken at OpenSQLCamp Europe are online.
For those who missed the sessions, or just want to relive the fun!
The YouTube videos have the descriptions and resources from the official conference pages, plus links to those pages. If there is more information to add (for example, the slides from a talk are now online), or if you spot an error, please feel free to add a comment on the YouTube video or on this blog post.
It's been just over three months since the April 2009 MySQL Users Conference and Expo. It took a while for the files to be processed and uploaded to www.technocation.org, and then I found out that the online streaming was not working properly. I started playing with things, re-encoding some videos and updating the software, but to no avail.
Just as I was about to give up I got notification that Technocation, Inc. was accepted into YouTube’s not-for-profit program, which allows movies larger than 10 minutes to be uploaded and viewed advertisement-free.
So then I had to upload the videos to YouTube and add descriptions.
So with no *further* delay, here are all the videos from the 2009 MySQL Conference and 2009 MySQL Camp:
The longer description follows: each video with title, presenter(s), and link. MySQL Camp:
Keynote: The State of Open Source Databases
Brian Aker (Sun Microsystems) http://www.youtube.com/watch?v=m2V-hvD_icA
Using the Event Scheduler: The Friendly Behind-the-Scenes Helper
Giuseppe Maxia (Sun Microsystems Inc), Andrey Hristov (SUN Microsystems) http://www.youtube.com/watch?v=-CmIcEXrf3g
Starring Sakila: Data Warehousing Explained, Illustrated, and Subtitled
Roland Bouman (XCDSQL Solutions / Strukton Rail), Matt Casters (Pentaho Corp.) http://www.youtube.com/watch?v=cSXWTNYn3es
The ScaleDB Storage Engine: Enabling High Performance and Scalability Using Materialized Views and a Shared-Disk Clustering Architecture
Moshe Shadmon (ScaleDB) http://www.youtube.com/watch?v=_5QNSxfpYSU
At the May 2009 Boston MySQL User Group, Giuseppe Maxia of Sun Microsystems gave a presentation about MySQL 5.4 with use cases and benchmarks to show how it outperforms all other current MySQL releases (including the Google patches and the OurDelta/Percona releases).
The video can be watched below, or directly on YouTube at http://www.youtube.com/watch?v=giT4SYXuVqA&hl
At the July MySQL User Group, Eric Day and Patrick Galbraith spoke about Drizzle, a lightweight, microkernel, open source database for high-performance scale-out applications, and Gearman, an open source, distributed job queuing system.
The second part, about 1.5 hours, in which Eric and Patrick talk about Gearman and then illustrate Gearman and Drizzle working together in a custom search application called Narada, is at http://www.youtube.com/watch?v=Auatakyl5eM and below:
There’s a lot of information in those presentations, and I would like to wholeheartedly thank Eric and Patrick for such a wonderful user group!