Ok, I know this is going to seem odd. but, I am trying to figure out how to dynamically determine what data type to set a column to using an SQL query in a MySql database.
I have a number of tables that get created dynamically via script. I initially create them, and size all of columns as 'text' data types with a length of 2000. I do this so that I don't have to worry about what is actually going into the table during the insert process. Once I'm done inserting the data for the table, I go through the table, examine the maximum length of the data stored and what kind of data is stored for each and every column and table. There are a lot of tables -and a lot of columns. I've come up with a process -but, I'm a little concerned that it is going to miss something one of these times. The DB has about 1250+ tables and about 300+ million rows in total. And the really unfortunate thing is that data types are never guaranteed to be consistent. Ever. Which is why I have to handle them after the import.
Basically, I run a query to get the MAX CHAR_LENGTH for the column and store that. Then, I run a query that selects N records in the database and then checks to see if they are numeric or characters. I know, I know. Not the best way to go about it.
Since there is absolutely no way that I can be certain of the incoming data (truly a nightmare) I am trying to figure if there is a way to query to column to see what kind of data resides in it? I am not looking for a query to see how the column has been defined (obviously, as I already defined it with a 'generic' type of "text").
I know I could handle this in the script during the import process but am hoping to just run an ALTER TABLE command instead of writing a bunch of extra code.
Any thoughts or ideas are greatly appreciated!
Use the PROCEDURE ANALYSE()
option:
SELECT * FROM `tablename` PROCEDURE ANALYSE();
Here's an example of the output:
mysql> select * from zip procedure analyse() \G
*************************** 1. row ***************************
Field_name: zip.zip
Min_value: 00501
Max_value: 99950
Min_length: 5
Max_length: 5
Empties_or_zeros: 0
Nulls: 0
Avg_value_or_avg_length: 5.0000
Std: NULL
Optimal_fieldtype: MEDIUMINT(5) UNSIGNED NOT NULL
*************************** 2. row ***************************
Field_name: zip.city
Min_value: Aaronsburg
Max_value: Zwolle
Min_length: 3
Max_length: 26
Empties_or_zeros: 1009
Nulls: 0
Avg_value_or_avg_length: 8.3869
Std: NULL
Optimal_fieldtype: VARCHAR(26) NOT NULL
*************************** 3. row ***************************
Field_name: zip.state_id
Min_value: AA
Max_value: WY
Min_length: 2
Max_length: 2
Empties_or_zeros: 0
Nulls: 0
Avg_value_or_avg_length: 2.0000
Std: NULL
Optimal_fieldtype: ENUM('AA','AE','AK','AL','AP','AR','AS','AZ','CA','CO','CT','DC','DE','FL','FM','GA','GU','HI','IA','ID','IL','IN','KS','KY','LA','MA','MD','ME','MH','MI','MN','MO','MP','MS','MT','NC','ND','NE','NH','NJ','NM','NV','NY','OH','OK','OR','PA','PR','PW','RI','SC','SD','TN','TX','UT','VA','VI','VT','WA','WI','WV','WY') NOT NULL
*************************** 4. row ***************************
Field_name: zip.latitude
Min_value: -7.209975
Max_value: 71.299525
Min_length: 2
Max_length: 8
Empties_or_zeros: 1009
Nulls: 0
Avg_value_or_avg_length: 37.599173975674866
Std: 7.949323125673274
Optimal_fieldtype: FLOAT NOT NULL
*************************** 5. row ***************************
Field_name: zip.longitude
Min_value: -176.63675
Max_value: -64.734694
Min_length: 3
Max_length: 8
Empties_or_zeros: 1009
Nulls: 0
Avg_value_or_avg_length: -88.79028976104503
Std: 20.6017874416888
Optimal_fieldtype: FLOAT NOT NULL
*************************** 6. row ***************************
Field_name: zip.timezone_id
Min_value: 3
Max_value: 11
Min_length: 1
Max_length: 2
Empties_or_zeros: 0
Nulls: 1009
Avg_value_or_avg_length: 8.1563
Std: 1.0430
Optimal_fieldtype: ENUM('3','5','6','7','8','9','11')
*************************** 7. row ***************************
Field_name: zip.dst
Min_value: 1
Max_value: 1
Min_length: 1
Max_length: 1
Empties_or_zeros: 2618
Nulls: 0
Avg_value_or_avg_length: 0.9376
Std: 0.2419
Optimal_fieldtype: ENUM('0','1') NOT NULL
*************************** 8. row ***************************
Field_name: zip.status
Min_value: Active
Max_value: Active
Min_length: 6
Max_length: 6
Empties_or_zeros: 0
Nulls: 0
Avg_value_or_avg_length: 6.0000
Std: NULL
Optimal_fieldtype: ENUM('Active') NOT NULL
*************************** 9. row ***************************
Field_name: zip.created
Min_value: 2010-09-25 11:43:41
Max_value: 2011-05-13 15:56:00
Min_length: 19
Max_length: 19
Empties_or_zeros: 0
Nulls: 0
Avg_value_or_avg_length: 19.0000
Std: NULL
Optimal_fieldtype: ENUM('2010-09-25 11:43:41','2010-09-25 11:43:56','2010-11-01 09:49:32','2011-05-13 15:56:00') NOT NULL
*************************** 10. row ***************************
Field_name: zip.updated
Min_value: 2010-09-24 23:13:41
Max_value: 2011-05-13 15:56:00
Min_length: 19
Max_length: 19
Empties_or_zeros: 0
Nulls: 0
Avg_value_or_avg_length: 19.0000
Std: NULL
Optimal_fieldtype: ENUM('2010-09-24 23:13:41','2010-09-24 23:13:56','2010-09-29 12:40:56','2010-11-01 09:49:32','2011-05-13 15:56:00') NOT NULL
10 rows in set (0.15 sec)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With