So, recently we had a runaway query eat up all sorts of temporary table space on our machines. Several machines had several terabytes in their ibtmp1 file after this happened. So I set out to find out more about why the InnoDB temporary tablespace is used, why it is better than using regular files, which was what was used prior to MySQL 5.7, and how to make sure that runaway queries do not end up filling up disk space.
Unfortunately, the manual does not go into why ibtmp1 is better than one file per temporary query, which disappears once the query ends. There are a few sections to look into:
Temporary Table Undo Logs – has one paragraph that states that these are the undo logs for temporary tablespaces. Given that these are undo logs, my guess is that this makes MySQL more crash-safe. But that is just a guess.
There is also InnoDB Temporary Tablespace which is two paragraphs, with some more details, but again, no reasoning why.
And finally, the documentation for the innodb_temp_data_file_path system variable sheds a bit of light on the subject – It explains “Metadata about active InnoDB temporary tables is located in INFORMATION_SCHEMA.INNODB_TEMP_TABLE_INFO.”
There is a manual page on Innodb temp table info table as well, which shows both compressed and uncompressed tables – uncompressed tables live in the ibtmp tablespaces, and compressed temporary tables live in the .ibd tablespace of the compressed table in question – as shown in the “PER_TABLE_SPACE” and “IS_COMPRESSED” fields.
Sadly, the table does not give useful information such as which process ID or user is generating the data. And of course it is only the active temporary space usage at the time – if you have a large temporary tablespace but no active queries using the tablespace, INNODB_TEMP_TABLE_INFO is empty.
I can imagine a scenario with more than one long-running query using a lot of space in the temporary tablespace. But I do not see how the INNODB_TEMP_TABLE_INFO would help me determine any useful information as to which query I should kill. I guess it is useful to see if there is an active query currently using temporary tablespace, but when you have a large file with nothing in it, it is just that much more depressing.
# mysql INFORMATION_SCHEMA -p -e “SELECT * FROM INNODB_TEMP_TABLE_INFO”
# ls -rlth /var/lib/mysql/ibtmp1
-rw-r—– 1 mysql mysql 2.3T Oct 31 10:50 /var/lib/mysql/ibtmp1
9 thoughts on “MySQL 5.7 InnoDB Temporary Tablespace – but why?”
Having multiple temp tablespaces for undo (instead of ibdata1) allows mysql to truncate the inactive one. This fixes the issue where ibdata1 grows and then can’t shrink again eating up disk space
Ah, so it does not replace the temporary files in tmpdir, just the temporary stuff that used to happen in ibdata1. Thank you!
Daniël, I believe that you may be confusing persistent undo tablespaces and the temporary InnoDB tablespace.
There can be multiple persistent undo tablespaces. These are for the transaction logs of persistent tables, and indeed, MySQL 5.7 can truncate a persistent undo tablespace when multiple undo tablespaces are being used.
Temporary tables always use the temporary tablespace for undo logs. To my knowledge, there is no logic inside MySQL 5.7 or MariaDB that would allow the temporary tablespace to shrink. The only way to shrink it should be to restart the server.
Marko – so, I’m still left confused as to what ibtmp1 replaces – in other words, what do temp tablespaces look like in 5.6 vs. 5.7?
It looks to me like it replaces the files in tmpdir, and if so, I’m curious as to why that was done. More crash-safe? but at the expense of never being able to shrink? (with files in tmpdir, they were deleted once the query was over, successful or not, so the free space cleared up).
In MySQL 5.6, InnoDB treats all tables as persistent ones, and it is not possible for the SQL query executor to create InnoDB tables for processing JOIN or ORDER BY or similar. The only special handling for temporary tables is a flag in SYS_TABLES that suppresses some error log printout for missing tables at startup.
In MySQL 5.6, the tmpdir/#sql*.ibd files created for CREATE TEMPORARY TABLE would be treated as persistent tables. The undo logs would be written to ibdata or undo* files, and also redo log records would be written to ib_logfile* files.
I was not involved with the design, implementation or review of these changes, even though I was working in the InnoDB team at Oracle during the time.
You make a good point that the ibtmp1 file cannot shrink. To add insult to the injury, it seems that the file is being extended much more than necessary: https://jira.mariadb.org/browse/MDEV-13013
I have not yet had time to investigate this problem.
Another pain point of ibtmp1 is that the writes from the buffer pool are still tied to LSN and the flushing of persistent files. I believe that this could mean that even after DROP TEMPORARY TABLE, the garbage pages from the dropped table could be written back to the ibtmp1 file. Ideally, freed pages in the temporary tablespace would never be written to the file, and no temporary pages would be written unless the buffer pool is close to full and we may need to evict pages.
OK, so it seems Daniël’s point wasn’t totally wrong – it is taking the undo logs for the temporary tables from the .ibd files, and taking the redo logs from the ib_logfile files.
There are two kinds of temporary tables that land in the InnoDB temporary tablespace. One is user-created tables (CREATE TEMPORARY TABLE). These support ROLLBACK, so they require undo log records.
In MySQL 5.7, InnoDB is also used as the default for internal_tmp_disk_storage_engine. The other option is MyISAM. This is used during query execution for ORDER BY, JOIN, and sometimes buffering results for INFORMATION_SCHEMA tables.
In MariaDB, there is no setting for internal_tmp_disk_storage_engine. The Aria engine (improved MyISAM) is used by default. In MariaDB 10.2, the InnoDB code for dealing with internal temporary tables (called ‘intrinsic’ tables) was removed by me.
My understanding is that MyISAM or Aria would suit better to the needs of the query executor than InnoDB. The main benefit of using InnoDB would be that reads can happen via the buffer pool. But writes to ibtmp1 will not be optimized away even for short-lived temporary tables. And the index-oriented storage and delete-marking activity of InnoDB could cause bloat.
The point of temporary files is that there is no redo log written and no crash recovery for them. In MySQL 5.7, undo logs for persistent tables can be stored in 2 places, both of which are covered by the redo log (“ib_logfile*”):
* In the system tablespace (called “ibdata1” by default)
* In the persistent undo tablespace files (“undo001” to “undo127”)
I would not call the dedicated undo tablespaces “temporary” files. Even though the lifetime of the contained data is typically much shorter than the lifetime of the data in the InnoDB tables, these files are persistent and redo-logged.
Temporary tables as well as all related undo log is stored in the temporary tablespace (by default “ibtmp1”) and not redo-logged (nothing written to “ib_logfile*”).
One more point occurred to me when I was preparing a talk for https://mariadb.org/2017-2-developers-unconference-and-related-events-shenzhen/ next week:
MariaDB 10.2 fixed MDEV-12219 Discard temporary undo logs at transaction commit
which could help keep the ibtmp1 file smaller.
Comments are closed.