I am trying to learn some (advance or more complex) SQL.
Let's say I have a table of cars, with information of every car.
Then I have another table with cars being sold, some are new, and some are used.
I want the user to be able to look up a car, a Honda civic 2016 for instance, and see the car info.
But also want the user to see all Honda civic 2016 cars being sold, including the highest and lowest price for that particular year/model, organized by new and used.
What would be the most efficient way to retrieve all the information - the car info and the ones being sold to display on a page!
These are my tables.
CREATE TABLE Users(
id BIGINT(20) NOT NULL AUTO_INCREMENT PRIMARY KEY,
name VARCHAR(16) NOT NULL,
last VARCHAR(16) NOT NULL,
email VARCHAR(128) NOT NULL,
phone CHAR(10) NOT NULL,
joined DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE Cars(
id BIGINT(20) NOT NULL AUTO_INCREMENT PRIMARY KEY,
make VARCHAR(32) NOT NULL,
model VARCHAR(32) NOT NULL,
year INT(4) NOT NULL,
trim VARCHAR(16) NOT NULL
);
CREATE TABLE Market(
id BIGINT(20) NOT NULL AUTO_INCREMENT,
user_id BIGINT(20) NOT NULL,
car_id BIGINT(20) NOT NULL,
condition VARCHAR(5) NOT NULL,
notes VARCHAR(1024) NOT NULL,
PRIMARY KEY(id),
CONSTRAINT cfk FOREIGN KEY (car_id) REFERENCES cars(id) ON DELETE CASCADE ON UPDATE CASCADE,
CONSTRAINT ufk FOREIGN KEY (user_id) REFERENCES users(id) ON DELETE CASCADE ON UPDATE CASCADE
);
It looks redundant to me and doomed to get slow as the tables get bigger. So I would appreciate if somebody shows me a better way.
/* Get car information*/
SELECT *
FROM Cars
WHERE make = 'Honda'
AND model = 'Civic'
AND year = '2017'
AND trim = 'EX';
/* I also would like to get the min and max price for this particular car*/
/* ?? How ?? */
/* Get (new) cars being sold and sellers */
SELECT M.*, U.*
FROM Market M
INNER JOIN Users ON M.user_id = U.id
WHERE make = 'Honda'
AND model = 'Civic'
AND year = '2017'
AND color = 'white'
AND trim = 'EX'
AND condition = 'NEW';
/* Get (used) cars being sold and sellers */
SELECT M.*, U.*
FROM Market M
INNER JOIN Users ON M.user_id = U.id
WHERE make = 'Honda'
AND model = 'Civic'
AND year = '2017'
AND color = 'white'
AND trim = 'EX'
AND condition = 'USED';
I ultimately would like to get something like the following using PHP:
{
car: {
make: "Honda",
model: "Civic",
year: 2017,
trim: "EX"
},
market: {
new: {
min: 'overall min',
max: 'overall max',
data: [{
seller:{
name: "John",
last: "Smith",
phone: "xxx-xxx-xxxx",
email: "[email protected]",
},
car: {
price: 15000,
color: "white",
condition: "used",
notes: "Some notes about the car"
}
}]
},
used: {
min: 'overall min',
max: 'overall max',
data: [{
seller:{
name: "John",
last: "Smith",
phone: "xxx-xxx-xxxx",
email: "[email protected]",
},
car: {
price: 15000,
color: "white",
condition: "new",
notes: "Some notes about the car"
}
}]
}
}
}
I am able to put into that format once I retrieve the information. Also I would have to paginate through the database.
Basically I am interested on knowing the best way to do what Amazon does. An item is for sale from different vendors, and at different prices. Amazon provides information about the item being sold; its condition, price, seller, etc. Also, Amazon gives you the lowest, highest price and the information about the item in general. What is the best approach?
Four queries should be fine:
For the min/max sales, you are needing to do an aggregation, so GROUP BY
is your friend. Try this:
SELECT `condition`, MIN(price) min_price, MAX(price) max_price
FROM Cars
JOIN Market ON (Cars.id = Market.car_id)
WHERE make = 'Honda'
AND model = 'Civic'
AND year = '2017'
AND trim = 'EX'
GROUP BY `condition`;
Your other queries look good. As the tables grow and you want to keep the queries quick, indexing will help. Basic rule is that the fields that are part of your WHERE
predicate are good to have indexed. Also any JOIN
keys between tables are usually good to have indexed. Try an index on Cars(make, model, year, trim)
.
Also, condition
is a reserved word in MySQL 5.7, which is why I escaped using backticks. Consider using cond
instead, and if you only have a few conditions {"new", "used"}, consider using an ENUM
data type. Beware MIN
and MAX
are also reserved db words.
You can retrieve the information you need with two queries. For example - You can get the lowest price for new and used cars along with the car info in one query using correlated subqueries in the SELECT clause:
SELECT c.*,
(SELECT MIN(m.price) FROM Market m WHERE m.car_id = c.id and m.condition = 'NEW') as new_min_price,
(SELECT MIN(m.price) FROM Market m WHERE m.car_id = c.id and m.condition = 'USED') as used_min_price
FROM Cars c
WHERE c.make = 'Honda'
AND c.model = 'Civic'
AND c.year = '2017'
AND c.trim = 'EX'
For best performance I would create composite indexes Cars(make, model, year, trim)
and Market(car_id, condition, price)
.
The order of the columns in the first index isn't important and you can change it.
The optimizer can adjust the execution plan to the order you define.
However the order for the second index must be this way to get the lowest price most efficiently.
If you have an index on Market(car_id)
(which is probably created by the FOREIGN KEY definition), you can drop it.
The new composite index can be used instead.
Note that I didn't include the highest price, since I don't think anybody cares.
But you can get it the same way as you get the lowest price by useing MAX()
instead of MIN()
.
There is another way to get the same data using "conditional aggregation":
SELECT c.*,
MIN(CASE m.condition = 'NEW' THEN m.price END) as new_min_price,
MIN(CASE m.condition = 'USED' THEN m.price END) as used_min_price
FROM Cars c
JOIN Market m ON m.car_id = c.id
WHERE c.make = 'Honda'
AND c.model = 'Civic'
AND c.year = '2017'
AND c.trim = 'EX'
GROUP BY c.id
But this query can't use the indexes in the best way, since the engine will need to loop through all offers in order to find the lowest price.
Now to get all the offers from the Market
table, you don't need to execute two queries (one for each condition).
You can fetch them all with one query and group the result by the condition
field in your application language.
Also since you already know the car_id
from the first query, you can use it as the search criteria, so you don't need to touch the cars
table again.
(Note that your last two queries wouldn't work without a JOIN with the cars
table, since the market table doesn't have the columns make
, model
, year
and trim
.)
SELECT
m.condition
m.id as market_id,
m.user_id,
u.name,
u.last
FROM Market m
INNER JOIN Users u ON m.user_id = u.id
WHERE m.car_id = ?
ORDER BY price
Note that if you use SELECT m*, u*
, the id
field will be ambiguous since it is defined in both tables.
So you should list all fields you want to fetch and use an alias for ambiguous columns like m.id as market_id
.
You didn't mention the language you use - So I can't tell you exactly how to group the result by the condition
field.
For example with PHP PDO you could use the PDO::FETCH_GROUP
mode.
However if your language doesn't support such functionality you can group/split the result in a simple loop.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With