Currently have a database formatted as follows:
id (unique id)
url: http://domain.com/page.html
Urls are both from the same domain and from different domains.
This also needs to be accomplished in one query so that I can use things like limits when displaying the data to the end user.
Data
1, http://domain.com/about.html
2, http://domain.com/index.html
3, http://anotherdomain.com/index.html
4, http://anotherdomain.com/contact.html
Expected Output
( I only want to return the first url (first being in which ever order the data is being sorted in, this example being "id ASC") of each group, where a groups is made up of urls that share a root domain.)
1, http://domain.com/about.html
3, http://anotherdomain.com/index.html
This should work for you:
SELECT T2.id, url FROM (
SELECT MIN(id) AS id FROM Table1
GROUP BY SUBSTRING_INDEX(url, '/', 3)
) AS T1
JOIN Table1 AS T2
ON T1.id = T2.id
For your test data I get this:
1, 'http://domain.com/about.html'
3, 'http://anotherdomain.com/index.html'
This should work but may run into issues when joining tables
SELECT REPLACE(REPLACE(SUBSTRING_INDEX(LOWER(table.url), '/', 3), 'www.', ''), 'http://', '') AS domain FROM table GROUP BY domain
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With