There are times I can use force index option to use particular index on a query to make the query faster. Then after some times the data in that table may change. And the force index which I used may not be the right index search for that query. My questions are <ul> <li>Is there anyway to suggest the Query Optimiser to use a force index as a possible option during its plan generation. If it is a slow plan on using the forced index then it can use normal query planning. </li> <li>Or Else Is there anyway to edit the Code of MySql/PSQL to suggest the Optimiser to use a force index as a possible option during its plan generation.</li> </ul> Additional Info: I wanted to add my plan to the optimiser plan list( that Optimiser already created many plan for a query). So that while choosing the best plan for a query I wanted the optimiser to consider my plan also. If this way is possible then the optimiser need not consider force index every time. It can keep the force index as a possible option to the plan generation I tried to make the question clear as of my knowledge. If anyone couldn't understand comment your queries.

The general concept with which you are dealing is "internal database statistics" (not an official term). That is, the information the database engine uses to make the query plan. The statistics store details like the key distribution within the indexes, table row sizes, counts, percentage of null rows, distinctness of data, and so on. For historical (and relevant) performance reasons, the internal database statistics are not updated on-the-fly, but are typically updated periodically at the the DBA's behest. In Postgres land, this information is succinctly available via the <code>pg_statistic</code> table. Unfortunately, MySQL does not make this information readily available, but you can see what you can glean from the INFORMATION_SCHEMA and SHOW INDEXES. To your problem at hand -- the fleeting usefulness of query plan hints -- is that the DB's statistics are not representative of the table data. The general response then is: update the DB's statistics of the tables in question, and you won't need to provide optimizer hints in your queries: <pre class="prettyprint"><code>ANALYZE TABLE <table_name>; </code></pre> There are some caveats of which to be aware: <ul> <li>Without extra measures, this analysis by the DB engine is ephemeral: it will not persist across a database restart/reboot. The options are to reanalyze tables upon DB start, or to look in to statistics persistence.</li> <li>This generates statistics by sampling the data -- looking at a subset of random table rows. By nature, a random sampling means that the statistics gathered might, on any particular run, not be a good representation of the actual data, leading to poor query plans. You can increase the number of sample pages via <code>innodb_stats_persistent_sample_pages</code></li> <li><code>ANALYZE</code> is not <code>OPTIMIZE</code>, and your tables might also need optimization. <code>OPTIMIZE</code> is an expensive operation, requiring a lock on the table, but can be hugely beneficial under certain conditions.</li> <li><code>ANALYZE</code> is table specific, and cannot make up for poor table layout. For example, I was recently tasked with speeding up a slow running set of report queries. The culprit? The schema was storing time and date columns as strings, rather than as native data types. This had huge implications as the queries first had to cast the strings to dates (including with errors in the data) before comparison, resulting in multiple entire table scans. Very bad.</li> <li><code>ANALYZE</code> cannot make up for poor queries. For example, as with the previous bullet point, something like <code>WHERE SOME_FUNCTION( col ) > some_value</code> means that the query planner likely cannot utilize the indexes as each row must first execute the FUNCTION to get the condition result. Not always the case, but often found in naive SQL queries.</li> <li><code>ANALYZE</code> cannot make up for poor normalization. Not withstanding the potentially large algorithmic constants, you will get far bigger big-O algorithmic wins by having proper normalization.</li> <li><code>ANALYZE</code> does not create indexes. If a query accesses columns that have high-cardinality but no indexes, too bad. Knowing on which columns to put indexes per the known or expected queries is up to the DBA.</li> </ul> And as usual in cases like these, take all of StackOverflow advices with grains of salt: you are in charge of the data or programs and have the actual running product in front of you. You can test, you can measure, you can poke around. We can't.

Suggest Plan to Query Optimiser

Tags:

database

mysql

postgresql

psql

innodb

There are times I can use force index option to use particular index on a query to make the query faster.

Then after some times the data in that table may change. And the force index which I used may not be the right index search for that query.

My questions are

Is there anyway to suggest the Query Optimiser to use a force index as a possible option during its plan generation. If it is a slow plan on using the forced index then it can use normal query planning.
Or Else Is there anyway to edit the Code of MySql/PSQL to suggest the Optimiser to use a force index as a possible option during its plan generation.

Additional Info: I wanted to add my plan to the optimiser plan list( that Optimiser already created many plan for a query). So that while choosing the best plan for a query I wanted the optimiser to consider my plan also. If this way is possible then the optimiser need not consider force index every time. It can keep the force index as a possible option to the plan generation

I tried to make the question clear as of my knowledge. If anyone couldn't understand comment your queries.

444

asked Apr 12 '18 06:04

vinieth

2 Answers

The only way is to remove FORCE INDEX from your query.

You can override the optimizer's plan, or else you let the optimizer choose the plan.

The optimizer can't tell if using the index will be slow. It tries to make its best guess at choosing a plan that will be fast.

The only way the optimizer could be more accurate is to run both queries, and compare the performance difference, and then run the fast query. That's obviously not practical, so the optimizer is meant to estimate the cost of the query, based on some statistical model. Most of the time this is accurate enough that the optimizer's choice is best.

answered Sep 28 '22 12:09

Bill Karwin

The general concept with which you are dealing is "internal database statistics" (not an official term). That is, the information the database engine uses to make the query plan. The statistics store details like the key distribution within the indexes, table row sizes, counts, percentage of null rows, distinctness of data, and so on.

For historical (and relevant) performance reasons, the internal database statistics are not updated on-the-fly, but are typically updated periodically at the the DBA's behest. In Postgres land, this information is succinctly available via the pg_statistic table. Unfortunately, MySQL does not make this information readily available, but you can see what you can glean from the INFORMATION_SCHEMA and SHOW INDEXES.

To your problem at hand -- the fleeting usefulness of query plan hints -- is that the DB's statistics are not representative of the table data. The general response then is: update the DB's statistics of the tables in question, and you won't need to provide optimizer hints in your queries:

ANALYZE TABLE <table_name>;

There are some caveats of which to be aware:

Without extra measures, this analysis by the DB engine is ephemeral: it will not persist across a database restart/reboot. The options are to reanalyze tables upon DB start, or to look in to statistics persistence.
This generates statistics by sampling the data -- looking at a subset of random table rows. By nature, a random sampling means that the statistics gathered might, on any particular run, not be a good representation of the actual data, leading to poor query plans. You can increase the number of sample pages via innodb_stats_persistent_sample_pages
ANALYZE is not OPTIMIZE, and your tables might also need optimization. OPTIMIZE is an expensive operation, requiring a lock on the table, but can be hugely beneficial under certain conditions.
ANALYZE is table specific, and cannot make up for poor table layout. For example, I was recently tasked with speeding up a slow running set of report queries. The culprit? The schema was storing time and date columns as strings, rather than as native data types. This had huge implications as the queries first had to cast the strings to dates (including with errors in the data) before comparison, resulting in multiple entire table scans. Very bad.
ANALYZE cannot make up for poor queries. For example, as with the previous bullet point, something like WHERE SOME_FUNCTION( col ) > some_value means that the query planner likely cannot utilize the indexes as each row must first execute the FUNCTION to get the condition result. Not always the case, but often found in naive SQL queries.
ANALYZE cannot make up for poor normalization. Not withstanding the potentially large algorithmic constants, you will get far bigger big-O algorithmic wins by having proper normalization.
ANALYZE does not create indexes. If a query accesses columns that have high-cardinality but no indexes, too bad. Knowing on which columns to put indexes per the known or expected queries is up to the DBA.

And as usual in cases like these, take all of StackOverflow advices with grains of salt: you are in charge of the data or programs and have the actual running product in front of you. You can test, you can measure, you can poke around. We can't.

answered Sep 28 '22 11:09

hunteke

Related questions
                            
                                MySQL: How to create trigger for setting creation date for new rows
                            
                                Can't optimize innoDB table
                            
                                why we give in query sql 1=0
                            
                                Java MySQL JDBC Memory Leak
                            
                                Mysql "where" and "or" statement
                            
                                DBCP: removeAbandoned VS eviction
                            
                                Query when parameter is none django
                            
                                Comparing 2 exact same strings returns false
                            
                                DELETE FROM HAVING COUNT(*) in MySQL
                            
                                Mysql: What is the difference between "slow_query_log" vs "log_slow_queries"
                            
                                mysql UNION result has less rows than sum of table row counts
                            
                                MySQL reverse-i-search
                            
                                WAMP Virtual Host not working
                            
                                Laravel MySQL how to order results in the same order as in whereIn clause
                            
                                Laravel 5: return number of affected rows MySQL
                            
                                Error in MySQL library for Node.js
                            
                                How to fallback to a different value in 'WHERE' clause sql?
                            
                                laravel 5 - How can i add ->index() to existing foreign key
                            
                                SQL Insert Into with Inner Join
                            
                                node.js mysql pool beginTransaction & connection

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With