I have this query for gathering the information about a single order, and it's become quite complex.
I don't have any data to test with, so I'm asking: if anyone has experience with this on small and large data sets, is there a limit to how many joins you can or should make in a single query? Would it be advisable to split large queries into smaller parts, or does this not make a significant difference?
Also, is it legal to have a WHERE clause after each INNER JOIN?
Thanks for your advice.
Here is the query:
# Order: Get Order
function getOrder($order_id) {
$sql = "SELECT (order.id, order.created, o_status.status,
/* payment info */
order.total, p_status.status,
/* ordered by */
cust_title.title, cust.forename, cust.surname,
customer.phone, customer.email,
cust.door_name, cust.street1,
cust.street2, cust.town,
cust.city, cust.postcode,
/* deliver to */
recip_title.title, recipient.forename, recipient.surname,
recipient.door_name, recipient.street1,
recipient.street2, recipient.town,
recipient.city, recipient.postcode,
/* deliver info */
shipping.name, order.memo,
/* meta data */
order.last_update)
FROM tbl_order AS order
INNER JOIN tbl_order_st AS o_status
ON order.order_status_id = o_status.id
INNER JOIN tbl_payment_st AS p_status
ON order.payment_status_id = p_status.id
INNER JOIN (SELECT (cust_title.title, cust.forename, cust.surname,
customer.phone, customer.email,
/* ordered by */ cust.door_name, cust.street1,
cust.street2, cust.town,
cust.city, cust.postcode)
FROM tbl_customer AS customer
INNER JOIN tbl_contact AS cust
ON customer.contact_id = cust.id
INNER JOIN tbl_contact_title AS cust_title
ON cust.contact_title_id = cust_title.id
WHERE order.customer_id = customer.id)
ON order.customer_id = customer.id
INNER JOIN (SELECT (recip_title.title, recipient.forename, recipient.surname,
/* deliver to */ recipient.door_name, recipient.street1,
recipient.street2, recipient.town,
recipient.city, recipient.postcode)
FROM tbl_contact AS recipient
INNER JOIN tbl_contact_title AS recip_title
ON recipient.contact_title_id = recip_title.id
WHERE order.contact_id = recipient.id)
ON order.contact_id = recipient.id
INNER JOIN tbl_shipping_opt AS shipping
ON order.shipping_option_id = shipping.id
WHERE order.id = '?';";
dbQuery($sql, array((int)$order_id));
$rows = dbRowsAffected();
if ($rows == 1)
return dbFetchAll();
else
return null;
}
Since someone requested the schema for this query, here it is:
# TBL_CONTACT_TITLE
DROP TABLE IF EXISTS tbl_contact_title;
CREATE TABLE tbl_contact_title(
id INT NOT NULL AUTO_INCREMENT,
PRIMARY KEY(id),
title CHAR(3)
) ENGINE = InnoDB;
INSERT INTO tbl_contact_title
(title)
VALUES ('MR'),
('MRS'),
('MS');
# TBL_CONTACT
DROP TABLE IF EXISTS tbl_contact;
CREATE TABLE tbl_contact(
id INT NOT NULL AUTO_INCREMENT,
PRIMARY KEY(id),
contact_title_id INT,
FOREIGN KEY(contact_title_id) REFERENCES tbl_contact_title(id) ON DELETE SET NULL,
forename VARCHAR(50),
surname VARCHAR(50),
door_name VARCHAR(25),
street1 VARCHAR(40),
street2 VARCHAR(40),
town VARCHAR(40),
city VARCHAR(40),
postcode VARCHAR(10),
currency_id INT,
FOREIGN KEY(currency_id) REFERENCES tbl_currency(id) ON DELETE SET NULL
) ENGINE = InnoDB;
# TBL_CUSTOMER
DROP TABLE IF EXISTS tbl_customer;
CREATE TABLE tbl_customer(
id INT NOT NULL AUTO_INCREMENT,
PRIMARY KEY(id),
contact_id INT,
FOREIGN KEY(contact_id) REFERENCES tbl_contact(id) ON DELETE SET NULL,
birthday DATE,
is_male TINYINT,
phone VARCHAR(20),
email VARCHAR(50) NOT NULL
) ENGINE = InnoDB, AUTO_INCREMENT = 1000;
# TBL_ORDER_ST
DROP TABLE IF EXISTS tbl_order_st;
CREATE TABLE tbl_order_st(
id INT NOT NULL AUTO_INCREMENT,
PRIMARY KEY(id),
status VARCHAR(25)
) ENGINE = InnoDB;
INSERT INTO tbl_order_st
(status)
VALUES
('NEW'),
('PROCESSING'),
('SHIPPED'),
('COMPLETED'),
('CANCELLED');
# TBL_SHIPPING_OPT
DROP TABLE IF EXISTS tbl_shipping_opt;
CREATE TABLE tbl_shipping_opt(
id INT NOT NULL AUTO_INCREMENT,
PRIMARY KEY(id),
name VARCHAR(50),
description VARCHAR(255),
cost DECIMAL(6,3)
) ENGINE = InnoDB;
INSERT INTO tbl_shipping_opt
(name, description, cost)
VALUES
('UK Premier', 'U.K. Mainland upto 30KG, Next Working Day', 8.00),
('Europe Standard', 'Most European Destinations* upto 30KG, 2 to 5 Working Days *please check before purchase', 15.00);
# TBL_PAYMENT_ST
DROP TABLE IF EXISTS tbl_payment_st;
CREATE TABLE tbl_payment_st(
id INT NOT NULL AUTO_INCREMENT,
PRIMARY KEY(id),
status VARCHAR(25)
) ENGINE = InnoDB;
INSERT INTO tbl_payment_st
(status)
VALUES
('UNPAID'),
('PAID');
# TBL_ORDER
DROP TABLE IF EXISTS tbl_order;
CREATE TABLE tbl_order(
id INT NOT NULL AUTO_INCREMENT,
PRIMARY KEY(id),
customer_id INT,
FOREIGN KEY(customer_id) REFERENCES tbl_customer(id) ON DELETE SET NULL,
contact_id INT,
FOREIGN KEY(contact_id) REFERENCES tbl_contact(id) ON DELETE SET NULL,
created DATETIME,
last_update TIMESTAMP,
memo VARCHAR(255),
order_status_id INT,
FOREIGN KEY(order_status_id) REFERENCES tbl_order_st(id),
shipping_option_id INT,
FOREIGN KEY(shipping_option_id) REFERENCES tbl_shipping_opt(id),
coupon_id INT,
FOREIGN KEY(coupon_id) REFERENCES tbl_coupon(id) ON DELETE SET NULL,
total DECIMAL(9,3),
payment_status_id INT,
FOREIGN KEY(payment_status_id) REFERENCES tbl_payment_st(id)
) ENGINE = InnoDB, AUTO_INCREMENT = 1000;
You don't have anywhere near the limit of JOINs for MySQL. Your number of joins isn't bad. However, joining on a derived table (your inner subquery) as you're doing can cause performance issues, since derived tables don't have indexes. Performing a join on a derived table without indexes can be slow.
You should consider making a real temporary table with indexes for joining, or figure out a way to avoid the subquery.
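One way to avoid the subqueries here is to join the underlying tables directly. The following is only a sketch against the schema posted in the question, not a tested query: the alias o and the column aliases (order_status, cust_forename and so on) are my own naming, and only a subset of the original columns is selected to keep it short. The order table is aliased as o because ORDER is a reserved word in MySQL.

```sql
-- Sketch: the same joins as the question, without derived tables.
-- Assumes the schema posted in the question; all aliases are illustrative.
SELECT o.id, o.created, o_status.status AS order_status,
       /* payment info */
       o.total, p_status.status AS payment_status,
       /* ordered by */
       cust_title.title AS cust_title, cust.forename AS cust_forename,
       cust.surname AS cust_surname, customer.phone, customer.email,
       /* deliver to */
       recip_title.title AS recip_title, recipient.forename AS recip_forename,
       recipient.surname AS recip_surname, recipient.postcode AS recip_postcode,
       /* deliver info */
       shipping.name AS shipping_name, o.memo, o.last_update
FROM tbl_order AS o
INNER JOIN tbl_order_st AS o_status
        ON o.order_status_id = o_status.id
INNER JOIN tbl_payment_st AS p_status
        ON o.payment_status_id = p_status.id
INNER JOIN tbl_customer AS customer
        ON o.customer_id = customer.id
INNER JOIN tbl_contact AS cust
        ON customer.contact_id = cust.id
INNER JOIN tbl_contact_title AS cust_title
        ON cust.contact_title_id = cust_title.id
INNER JOIN tbl_contact AS recipient
        ON o.contact_id = recipient.id
INNER JOIN tbl_contact_title AS recip_title
        ON recipient.contact_title_id = recip_title.id
INNER JOIN tbl_shipping_opt AS shipping
        ON o.shipping_option_id = shipping.id
WHERE o.id = 1000;  -- example id; bind a parameter in real code
```

Every join condition in this version compares a foreign key column with a primary key, so each join can be resolved through an index. Because the title, customer and contact columns are nullable in the posted schema, an INNER JOIN drops the order row when one of them is NULL; a LEFT JOIN would keep it.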
A JOIN in MySQL is basically like doing a lookup (seek) for each joined row. So, MySQL will have to perform many lookups if you are joining many records. It's less about how many tables you join than the number of rows that you join that can be a problem.
Anyway, MySQL will only perform so many seeks before it will give up and just read the whole table. It does a pretty good job at deciding which will be less expensive.
Perhaps the best thing you can do is help it guess by updating the index statistics with ANALYZE TABLE.
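As a rough illustration, the statistics refresh looks like the following. The table names come from the schema in the question. The EXPLAIN statement is my own addition, included because it is the usual way to check which plan MySQL picks; the query it wraps is a cut-down example, not the full query from the question.

```sql
-- Refresh the index statistics for the tables involved in the join.
ANALYZE TABLE tbl_order, tbl_customer, tbl_contact;

-- Ask the optimizer how it would execute a join (nothing is run).
-- The type, key and rows columns of the output show whether an index
-- is used and roughly how many rows MySQL expects to examine.
EXPLAIN
SELECT o.id, customer.email
FROM tbl_order AS o
INNER JOIN tbl_customer AS customer
        ON o.customer_id = customer.id
WHERE o.id = 1000;  -- example id
```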
You can have one WHERE clause per SELECT. So your inner subquery will have a WHERE clause and your outer query will have a WHERE clause, and these get applied after the JOIN (at least logically, though MySQL will generally apply them first for performance).
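A small sketch of what "one WHERE clause per SELECT" looks like, using tables from the question. The filter on email and the alias c are made up for the example. Note that a derived table needs an alias, and in the MySQL versions I am familiar with its WHERE clause cannot refer to tables of the outer query (such as order.customer_id in the question) unless the derived table is declared LATERAL, which only newer MySQL 8.0 releases support.

```sql
SELECT o.id, c.email
FROM tbl_order AS o
INNER JOIN (
    SELECT id, email
    FROM tbl_customer
    WHERE email LIKE '%@example.com'  -- WHERE of the inner SELECT
) AS c                                -- a derived table must have an alias
        ON o.customer_id = c.id
WHERE o.id = 1000;                    -- WHERE of the outer SELECT
```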
Also, all of this is assuming you know how to properly use indexes.
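For the schema in the question, a quick way to check this is to list the indexes on a table. InnoDB creates an index on a foreign key column automatically when none exists, so the join columns in tbl_order should already be covered. The CREATE INDEX statement below is purely hypothetical: it assumes you also search orders by creation date, which the question does not say, and the index name is my own.

```sql
-- List the existing indexes, including those created for foreign keys.
SHOW INDEX FROM tbl_order;

-- Hypothetical: add an index only if you filter or sort on this column.
CREATE INDEX idx_order_created ON tbl_order (created);
```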
I once tried this myself to find the limit on the number of joins. I tested a join with 100 tables on MySQL 5.0.4, if I recall correctly.
I was given the following error:
Too many tables. MySQL can only use 61 tables in a join.
I think the limit for MySQL 4 was 31 tables.
A word of advice: if you plan to use that many tables in a query, it's time to start thinking about simplifying your query design.
I was searching for this as well, and found another solution: adding an "AND" condition as part of the JOIN.
INNER JOIN kopplaMedlemIntresse
ON intressen.id = kopplaMedlemIntresse.intresse_id AND kopplaMedlemIntresse.medlem_id = 3
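To put that fragment in context, here is a sketch of a complete query. The table and column names are taken from the fragment; only intressen.id is selected because the other columns of those tables are not shown. For an INNER JOIN, a condition in ON and the same condition in WHERE return the same rows, so the choice is mostly about readability.

```sql
-- Filter written as part of the join condition.
SELECT intressen.id
FROM intressen
INNER JOIN kopplaMedlemIntresse
        ON intressen.id = kopplaMedlemIntresse.intresse_id
       AND kopplaMedlemIntresse.medlem_id = 3;

-- Same result for an INNER JOIN, with the filter in WHERE instead.
SELECT intressen.id
FROM intressen
INNER JOIN kopplaMedlemIntresse
        ON intressen.id = kopplaMedlemIntresse.intresse_id
WHERE kopplaMedlemIntresse.medlem_id = 3;
```

With a LEFT JOIN the two forms are no longer equivalent: the ON version keeps rows from intressen that have no match, while the WHERE version filters them out.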
I do not have any general advice, just my own experience.
I once had very bad performance on a query where I had added a lot of tables using JOIN, around 20 JOINs in total. When I removed a few JOINs in a test, the speed went up again. Because I only needed a single piece of information from each table, I was able to replace most of the JOINs with subselects. This brought my query down from 25 seconds to less than 1 second.
It probably depends on your table design, the number of columns in the joined tables, and the indexes on the columns used in your join and WHERE conditions.
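Applied to the schema in the question, replacing lookup JOINs with subselects could look roughly like this. It is a sketch, not the query I actually used back then; the aliases s, p and sh and the column aliases are made up for the example.

```sql
-- Each lookup table is read by a scalar subselect instead of a JOIN.
SELECT o.id, o.total,
       (SELECT s.status
          FROM tbl_order_st AS s
         WHERE s.id = o.order_status_id)     AS order_status,
       (SELECT p.status
          FROM tbl_payment_st AS p
         WHERE p.id = o.payment_status_id)   AS payment_status,
       (SELECT sh.name
          FROM tbl_shipping_opt AS sh
         WHERE sh.id = o.shipping_option_id) AS shipping_name
FROM tbl_order AS o
WHERE o.id = 1000;  -- example id
```

One difference from an INNER JOIN: if a lookup finds no matching row, the subselect returns NULL and the order row is still returned, whereas the INNER JOIN would drop it.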