I found this answer, but I want to expand on the question and couldn't find a solution here on Stack Overflow or by searching Google.
Substring domainname from URL SQL
Basically, the link above solves my problem for a simple URL: parsing 'www.google.com' returns 'google'.
What it doesn't handle is URLs like 'www.maps.google.com', for which it returns just 'maps'.
What I would like is for it to return 'google' from the URL 'www.maps.google.com', or 'example' from 'www.test.example.com'.
If anyone has a solution to this, I would greatly appreciate it.
Update: To be more specific, I also need to handle second-level domains, e.g. 'www.maps.google.com.au' should return 'google'.
Here is my SQL function.
CREATE FUNCTION [dbo].[parseURL] (@strURL varchar(1000))
RETURNS varchar(1000)
AS
BEGIN
    -- Strip 'www.' and keep everything before the first remaining dot.
    IF CHARINDEX('.', REPLACE(@strURL, 'www.', '')) > 0
        SELECT @strURL = LEFT(REPLACE(@strURL, 'www.', ''), CHARINDEX('.', REPLACE(@strURL, 'www.', '')) - 1)
    ELSE
        -- No dot left: return the value with 'www.' removed.
        SELECT @strURL = REPLACE(@strURL, 'www.', '')
    RETURN @strURL
END
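For reference, this is how the function above behaves on the two inputs described earlier. The column alias Result is only illustrative.

```sql
-- Simple host name: 'www.' is stripped and the text before the first dot is returned.
SELECT dbo.parseURL('www.google.com')      AS Result; -- google

-- With a subdomain, the first label after 'www.' is returned instead of the domain.
SELECT dbo.parseURL('www.maps.google.com') AS Result; -- maps (the desired result is 'google')
```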
In MySQL you can use the SUBSTRING_INDEX() function to extract part of a URL. SQL Server, which the function above is written for, has no SUBSTRING_INDEX().
I'd suggest this
DECLARE @URL nvarchar(max) = 'www.maps.google.com'
DECLARE @X xml = CONVERT(xml,'<root><part>' + REPLACE(@URL, '.','</part><part>') + '</part></root>')
SELECT [Domain] = T.c.value('.','varchar(20)')
FROM @X.nodes('/root/part[position() = last() - 1]') T(c)
The approach is to convert the URL to XML and then use XPath to find the domain.
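To make the XML step concrete, here is a small sketch of the intermediate string before it is converted. The alias XmlText is just my own label. One assumption to keep in mind: the input must not contain XML-special characters such as & or <, so a full URL with a query string would need escaping first.

```sql
DECLARE @URL nvarchar(max) = N'www.maps.google.com';

-- Each dot becomes a </part><part> boundary, so every label ends up in its own element.
SELECT N'<root><part>' + REPLACE(@URL, N'.', N'</part><part>') + N'</part></root>' AS XmlText;
-- <root><part>www</part><part>maps</part><part>google</part><part>com</part></root>

-- part[position() = last() - 1] then selects the second-to-last element, here 'google'.
```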
UPDATE
Regarding second-level domains, I believe the only reliable way is to have them all in a table (top-level domains should probably be in a table too), and then you could use this query:
DECLARE @URL nvarchar(max) = 'www.maps.google.com'
-- Reverse the URL so the TLD becomes part 1, the next label part 2, and so on.
DECLARE @X xml = CONVERT(xml,'<root><part>' + REPLACE(REVERSE(@URL), '.','</part><part>') + '</part></root>')
;WITH SplitCTE AS
(
    SELECT
        (SELECT REVERSE(T.c.value('.', 'nvarchar(256)')) FROM @X.nodes('/root/part[1]') T(c)) AS TLD,
        (SELECT REVERSE(T.c.value('.', 'nvarchar(256)')) FROM @X.nodes('/root/part[2]') T(c)) AS D2,
        (SELECT REVERSE(T.c.value('.', 'nvarchar(256)')) FROM @X.nodes('/root/part[3]') T(c)) AS D3
)
SELECT
    CASE
        -- If the second label is a known second-level domain (e.g. 'co'), use the third label.
        WHEN SLD.Domain IS NULL THEN S.D2 ELSE S.D3
    END AS Domain
FROM
    SplitCTE AS S
    LEFT JOIN TLD ON TLD.Domain = S.TLD
    LEFT JOIN SLD ON SLD.Domain = S.D2
The TLD/SLD tables I used for this example are below. The full list of domains is in this wiki. Be careful to use NVARCHAR, as some domains are localized.
CREATE TABLE dbo.TLD
(
Domain nvarchar(10)
)
GO
CREATE TABLE dbo.SLD
(
Domain nvarchar(10)
)
GO
INSERT TLD VALUES ( 'com')
INSERT TLD VALUES ( 'uk')
INSERT SLD VALUES ( 'co')
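If you want this as a function like the one in the question, a minimal sketch could look like the following. The name dbo.parseDomain is hypothetical, it assumes the dbo.SLD table above exists and is populated, and it reads the labels from the end with last() rather than reversing the string. Treat it as a starting point, not a complete solution.

```sql
-- Hypothetical helper; assumes dbo.SLD from the example above.
CREATE FUNCTION dbo.parseDomain (@strURL nvarchar(1000))
RETURNS nvarchar(256)
AS
BEGIN
    -- Split the host name into one <part> element per label.
    DECLARE @X xml = CONVERT(xml, N'<root><part>' + REPLACE(@strURL, N'.', N'</part><part>') + N'</part></root>');

    -- Second-to-last and third-to-last labels.
    DECLARE @D2 nvarchar(256) = @X.value('(/root/part[position() = last() - 1])[1]', 'nvarchar(256)');
    DECLARE @D3 nvarchar(256) = @X.value('(/root/part[position() = last() - 2])[1]', 'nvarchar(256)');

    -- Default to the second-to-last label; step one further left for known second-level domains.
    DECLARE @Result nvarchar(256) = @D2;
    IF EXISTS (SELECT 1 FROM dbo.SLD WHERE Domain = @D2)
        SET @Result = @D3;

    RETURN @Result;
END
GO

SELECT dbo.parseDomain(N'www.maps.google.com')   AS Domain; -- google
SELECT dbo.parseDomain(N'www.maps.google.co.uk') AS Domain; -- google ('co' is in dbo.SLD)
```

For 'www.maps.google.com.au' from the question to return 'google', 'com' would also have to be listed in dbo.SLD.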