I am trying to join two tables, products and products_markets. While products is under a million records, product_markets is closer to 20 million records. The data has been changed so there might be a typo or two in the schema create tables:
CREATE TABLE `products_markets` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`product_id` int(10) unsigned NOT NULL,
`country_code_id` int(10) unsigned NOT NULL,
`created_at` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
`updated_at` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (`id`),
UNIQUE KEY `unique_index` (`product_id`,`country_code_id`)
) ENGINE=InnoDB AUTO_INCREMENT=21052102 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
CREATE TABLE `products` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`manufacturer_id` int(10) unsigned NOT NULL,
`department_id` int(10) unsigned NOT NULL,
`code` varchar(100) COLLATE utf8mb4_unicode_ci NOT NULL,
`popularity` int(11) DEFAULT NULL,
`name` varchar(255) COLLATE utf8mb4_unicode_ci NOT NULL,
`value` bigint(20) unsigned NOT NULL,
`created_at` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
`updated_at` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (`id`),
UNIQUE KEY `products_code_unique` (`code`),
KEY `products_department_id_foreign` (`department_id`),
KEY `products_manufacturer_id_foreign` (`manufacturer_id`),
CONSTRAINT `products_department_id_foreign`
FOREIGN KEY (`department_id`) REFERENCES `departments` (`id`),
CONSTRAINT `products_manufacturer_id_foreign`
FOREIGN KEY (`manufacturer_id`) REFERENCES `manufacturers` (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=731563 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
I am trying to return 50 records of the most popular products available in a specific country and I am running into times around ~50 seconds which seems higher than expected.
I've tried a few different queries with no success:
select `products_markets`.`product_id`
from products_markets
left join
( SELECT products.id, products.popularity
from products
) p ON p.id = products_markets.product_id
where products_markets.country_code_id = 121
order by `popularity` desc, `p`.`id` asc
limit 50
and
select `products`.*
from `products`
where products.id in (
SELECT product_id
from products_markets
where products_markets.country_code_id = 121
)
group by `products`.`name`, `products`.`manufacturer_id`
order by `popularity` desc, `products`.`id` asc
limit 50
this query's explain is:
id select_type table type possible_keys key key_len refs rows extra
1 PRIMARY products ALL PRIMARY NULL NULL NULL 623848 Using temporary; Using filesort
1 PRIMARY products_markets ref unique_index unique_index 4 main.products.id 14 Using where; Using index; FirstMatch(products)
One option I am entertaining is splitting up products_markets into individual tables for each country to lessen the query. I've tried adding more memory to the server without much success. Can anyone identify anything glaringly wrong with the database design/query?
What other options are available to make this query a fraction of its current ~50 seconds?
Get rid of id in products_markets and add
PRIMARY KEY(country_code_id, product_id)
Then get rid of the UNIQUE key unless it is needed for some other query.
This will shrink the disk footprint of that large table significantly, thereby potentially speeding up all queries touching it.
And it will help with Hamaza's suggested reformulation.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With