Assume I have a single mySQL table (users) with the following fields:
userid gender region age ethnicity income
I want to be able to return the number of total records based on the number a user enters. Furthermore, they will also be providing additional criteria.
In the simplest example, they may ask for 1,000 records, where 600 records should have gender = 'Male' and 400 records where gender = 'Female'. That's simple enough to do.
Now, go one step further. Assume they now want to specify Region:
GENDER Male: 600 records Female: 400 records REGION North: 100 records South: 200 records East: 300 records West: 400 records
Again, only 1000 records should be returned, but in the end, there must be 600 males, 400 females, 100 Northerners, 200 Southerners, 300 Easterners and 400 Westerners.
I know this isn't valid syntax, but using pseudo-mySQL code, it hopefully illustrates what I'm trying to do:
(SELECT * FROM users WHERE gender = 'Male' LIMIT 600 UNION SELECT * FROM users WHERE gender = 'Female' LIMIT 400) INTERSECT (SELECT * FROM users WHERE region = 'North' LIMIT 100 UNION SELECT * FROM users WHERE region = 'South' LIMIT 200 UNION SELECT * FROM users WHERE region = 'East' LIMIT 300 UNION SELECT * FROM users WHERE region = 'West' LIMIT 400)
Note that I'm not looking for a one-time query. The total number of records and the number of records within each criteria will constantly be changing based on input by the user. So, I'm trying to come up with a generic solution that can be re-used over and over, not a hard-coded solution.
To make things more complicated, now add more criteria. There could also be age, ethnicity and income each with their own set number of records for each group, additional code appended to above:
INTERSECT (SELECT * FROM users WHERE age >= 18 and age <= 24 LIMIT 300 UNION SELECT * FROM users WHERE age >= 25 and age <= 36 LIMIT 200 UNION SELECT * FROM users WHERE age >= 37 and age <= 54 LIMIT 200 UNION SELECT * FROM users WHERE age >= 55 LIMIT 300) INTERSECT etc.
I'm not sure if this is possible to write in one query or if this requires multiple statements and iterations.
INTERSECT Operator using IN and Subquery The following syntax uses the IN and Subquery clause for returning the distinct rows from both tables: mysql> SELECT DISTINCT column_list FROM table_name1. WHERE column_name IN (SELECT column_list FROM table_name2);
Example - With Single Expression SELECT category_id FROM products INTERSECT SELECT category_id FROM inventory; Since you can't use the INTERSECT operator in MySQL, you will use the IN operator to simulate the INTERSECT query as follows: SELECT products. category_id FROM products WHERE products.
AFAIR, MySQL implements INTERSECT through INNER JOIN. No, an INNER JOIN produces a Cartesian product. That means that every permutation of (row-from-first-table, row-from-second-table) is generated. However, with an appropriate WHERE clause this can be used to apply the same logic as INTERSECT would, see my answer.
The UNION operation combines the results of two subqueries into a single result that comprises the rows that are returned by both queries. The INTERSECT operation combines the results of two queries into a single result that comprises all the rows common to both queries.
You can flatten your multi-dimensional criteria into a single level criteria
Now this criteria can be achieved in one query as follow
(SELECT * FROM users WHERE gender = 'Male' AND region = 'North' LIMIT 40) UNION ALL (SELECT * FROM users WHERE gender = 'Male' AND region = 'South' LIMIT 80) UNION ALL (SELECT * FROM users WHERE gender = 'Male' AND region = 'East' LIMIT 120) UNION ALL (SELECT * FROM users WHERE gender = 'Male' AND region = 'West' LIMIT 160) UNION ALL (SELECT * FROM users WHERE gender = 'Female' AND region = 'North' LIMIT 60) UNION ALL (SELECT * FROM users WHERE gender = 'Female' AND region = 'South' LIMIT 120) UNION ALL (SELECT * FROM users WHERE gender = 'Female' AND region = 'East' LIMIT 180) UNION ALL (SELECT * FROM users WHERE gender = 'Female' AND region = 'West' LIMIT 240)
Problem
Let say that there is less than 40 users whose are male and from north. Then, you need to adjust other criteria quantity to cover the missing quantity from "Male" and "North". I believe it is not possible to do it with bare SQL. This is pseudo code that I have in mind. For sake of simplification, I think we will only query for Male, Female, North, and South
conditions.add({ gender: 'Male', region: 'North', limit: 40 }) conditions.add({ gender: 'Male', region: 'South', limit: 80 }) conditions.add({ gender: 'Female', region: 'North', limit: 60 }) conditions.add({ gender: 'Female', region: 'South', limit: 120 }) foreach(conditions as condition) { temp = getResultFromDatabaseByCondition(condition) conditions.remove(condition) // there is not enough result for this condition, // increase other condition quantity if (temp.length < condition.limit) { adjust(...); } }
Let say that there are only 30 northener male. So we need to adjust +10 male, and +10 northener.
To Adjust --------------------------------------------------- Male +10 North +10 Remain Conditions ---------------------------------------------------- { gender: 'Male', region: 'South', limit: 80 } { gender: 'Female', region: 'North', limit: 60 } { gender: 'Female', region: 'South', limit: 120 }
'Male' + 'South' is the first condition that match the 'Male' adjustment condition. Increase it by +10, and remove it from the "remain condition" list. Since, we increase the South, we need to decrease it back at other condition. So add "South" condition into "To Adjust" list
To Adjust --------------------------------------------------- South -10 North +10 Remain Conditions ---------------------------------------------------- { gender: 'Female', region: 'North', limit: 60 } { gender: 'Female', region: 'South', limit: 120 } Final Conditions ---------------------------------------------------- { gender: 'Male', region: 'South', limit: 90 }
Find condition that match the 'South' and repeat the same process.
To Adjust --------------------------------------------------- Female +10 North +10 Remain Conditions ---------------------------------------------------- { gender: 'Female', region: 'North', limit: 60 } Final Conditions ---------------------------------------------------- { gender: 'Female', region: 'South', limit: 110 } { gender: 'Male', region: 'South', limit: 90 }
And finally
{ gender: 'Female', region: 'North', limit: 70 } { gender: 'Female', region: 'South', limit: 110 } { gender: 'Male', region: 'South', limit: 90 }
I haven't come up with the exact implementation of adjustment yet. It is more difficult than I have expected. I will update once I can figure out how to implement it.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With