Twice recently I have had to import data from a flat file into a database; in both cases the point of the import was to make better use of the data in the files. Importing a file is not that difficult using mysqlimport, particularly since in both cases the data was tab-delimited. I also could have used LOAD DATA INFILE.
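For instance, a minimal LOAD DATA INFILE (the file and table names here are made up for illustration) is nearly a one-liner, since tab-delimited fields and newline-terminated lines are the defaults:

-- Hypothetical file and table names; tab/newline are the default
-- field and line terminators, so no FIELDS/LINES clauses needed.
LOAD DATA LOCAL INFILE '/tmp/mydata.txt'
INTO TABLE mydata;

The mysqlimport equivalent, which derives the table name from the file’s basename:

mysqlimport --local mydb /tmp/mydata.txt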
The problem with mysqlimport and LOAD DATA INFILE is that you need to create the table beforehand. Often, though, I may not know the exact data types. Folks have written about PROCEDURE ANALYSE(), but they use it in the context of “you can check to see that your tables have the correct data types.”
Now, that’s always a good thing: doing a schema review, say quarterly or yearly, that covers indexes, foreign keys and data types. However, where PROCEDURE ANALYSE() really comes in handy for me is when I am importing data.
Basically, I create a table with the correct number of fields, all with the datatype of tinytext. Then I run SELECT * FROM tbl PROCEDURE ANALYSE()\G to find out what types my data really want to be, as in the sketch below.
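Sketched out with hypothetical table and column names, the whole workflow looks like this:

-- Every column starts life as TINYTEXT.
CREATE TABLE import_raw (
  c1 TINYTEXT,
  c2 TINYTEXT,
  c3 TINYTEXT
);

LOAD DATA LOCAL INFILE '/tmp/mydata.txt' INTO TABLE import_raw;

-- One row of output per column; the Optimal_fieldtype field
-- holds the suggested data type.
SELECT * FROM import_raw PROCEDURE ANALYSE()\G

PROCEDURE ANALYSE() also takes optional arguments, ANALYSE(max_elements, max_memory); lowering max_elements from its default of 256 reins in its enthusiasm for suggesting ENUM columns.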
Of course, you want to pay attention to warnings when importing, especially truncation warnings. Generally I run this process over the data a few times, refining the column types on each pass before I get it right.
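After a load, a quick way to inspect those warnings in the same session:

SHOW COUNT(*) WARNINGS;
SHOW WARNINGS LIMIT 10;

LOAD DATA also prints a summary line (Records / Deleted / Skipped / Warnings), so a nonzero warning count there is the cue to dig in.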
My one complaint is that I wish PROCEDURE ANALYSE() would state the current column datatype as well as the “ideal” data type. As it stands, I waste a lot of time confirming that the “ideal” data type is the one I already have, and it should not be that difficult for the function to do that comparison itself.
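A partial workaround, sketched here against the hypothetical table from above, is to pull the current types out of information_schema and eyeball the two lists side by side:

-- Current column types, in table order; 'mydb' and 'import_raw'
-- are the hypothetical schema and table names from the sketch.
SELECT COLUMN_NAME, COLUMN_TYPE
FROM information_schema.COLUMNS
WHERE TABLE_SCHEMA = 'mydb'
  AND TABLE_NAME   = 'import_raw'
ORDER BY ORDINAL_POSITION;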