Basically, I need to get those rows which contain domain and subdomain name from a URL or the whole website name excluding www
.
My DB table looks like this:
+----------+------------------------+
| id | website |
+----------+------------------------+
| 1 | https://www.google.com |
+----------+------------------------+
| 2 | http://www.google.co.in|
+----------+------------------------+
| 3 | www.google.com |
+----------+------------------------+
| 4 | www.google.co.in |
+----------+------------------------+
| 5 | google.com |
+----------+------------------------+
| 6 | google.co.in |
+----------+------------------------+
| 7 | http://google.co.in |
+----------+------------------------+
Expected output:
google.com
google.co.in
google.com
google.co.in
google.com
google.co.in
google.co.in
My Postgres Query looks like this:
select id, substring(website from '.*://([^/]*)') as website_domain from contacts
But above query give blank websites. So, how I can get the desired output?
You must use the "non capturing" match ?:
to cope with the non "http://" websites.
like
select
id,
substring(website from '(?:.*://)?(?:www\.)?([^/?]*)') as website_domain
from contacts;
SQL Fiddle: http://sqlfiddle.com/#!17/f890c/2/0
PostgreSQL's regular expressions: https://www.postgresql.org/docs/9.3/functions-matching.html#POSIX-ATOMS-TABLE
You may use
SELECT REGEXP_REPLACE(website, '^(https?://)?(www\.)?', '') from tbl;
See the regex demo.
Details
^
- start of string(https?://)?
- 1 or 0 occurrences of http://
or https://
(www\.)?
- 1 or 0 occurrences of www.
See the PostgreSQL demo:
CREATE TABLE tb1
(website character varying)
;
INSERT INTO tb1
(website)
VALUES
('https://www.google.com'),
('http://www.google.co.in'),
('www.google.com'),
('www.google.co.in'),
('google.com'),
('google.co.in'),
('http://google.co.in')
;
SELECT REGEXP_REPLACE(website, '^(https?://)?(www\.)?', '') from tb1;
Result:
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With