I have a large database that contains many urls, there are many domains repeating and i;m trying to get only the domain. eg:
http://example.com/someurl.html
http://example.com/someurl_on_the_same_domain.html
http://example.net/myurl.php
http://example.org/anotherurl.php
and i want to get only domains, eg:
http://example.com
http://example.net
http://example.org
My query is:
SELECT
id
,site
FROMtable
GROUP BYsite
ORDER BYid
DESC LIMIT 50
I need to use regex i think but i'm not mysql guru.
SELECT
SUBSTR(site, 1 , LOCATE('/', site, 8)-1)
as OnlyDomain
FROM table
GROUP BY OnlyDomain
ORDER BY id DESC LIMIT 50
[EDIT] : After OP request, here's the updated answer that will show correct results even if domain names does not have trailing slashes:
SELECT
SUBSTR(site, 1 , IF(LOCATE('/', site, 8), LOCATE('/', site, 8)-1, LENGTH(site)))
as OnlyDomain
FROM tablename
GROUP BY OnlyDomain
ORDER BY id DESC LIMIT 50
SELECT
COUNT(*) AS nCount,
SUBSTRING_INDEX(REPLACE(REPLACE(REPLACE(site,'http://',''),'https://',''),'www.',''),'/',1) AS sDomain
FROM tbl_table
GROUP BY sDomain
ORDER BY 1 DESC
Addon after JQman sollution with also the www. replaced and the group by + count
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With