
Scala regex extract domain from urls

Tags: regex, scala

I want to extract bell.com from each of the following inputs using a Scala regex. I have tried a few variations without success.

"www.bell.com"
"bell.com"
"http://www.bell.com"
"https://www.bell.com"
"https://bell.com/about"
"https://www.bell.com?token=123"

This is my code, but it is not working.

val pattern = """(?:([http|https]://)?)(?:(www\.)?)([A-Za-z0-9._%+-]+)[/]?(?:.*)""".r
url match {
  case pattern(domain) =>
    print(domain)
  case _ => print("not found!")
}

EDIT: My regex was wrong. Thanks to @Tabo, this is the correct one:

(?:https?://)?(?:www\.)?([A-Za-z0-9._%+-]+)/?.*
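For reference, here is a minimal sketch of how the corrected pattern might be used in a Scala pattern match. The object name `DomainRegex` and the helper `extractDomain` are made up for illustration. Because the scheme and `www.` groups are non-capturing, the pattern has exactly one capture group, which is what `case DomainPattern(domain)` needs.

```scala
object DomainRegex {
  // Regex from the EDIT above: one capture group holding the host without "www.".
  val DomainPattern = """(?:https?://)?(?:www\.)?([A-Za-z0-9._%+-]+)/?.*""".r

  // Hypothetical helper: returns None when the whole input does not match.
  def extractDomain(url: String): Option[String] = url match {
    case DomainPattern(domain) => Some(domain)
    case _                     => None
  }

  def main(args: Array[String]): Unit = {
    val inputs = Seq(
      "www.bell.com",
      "bell.com",
      "http://www.bell.com",
      "https://www.bell.com",
      "https://bell.com/about",
      "https://www.bell.com?token=123"
    )
    // Print each input next to the extracted domain.
    inputs.foreach(u => println(s"$u -> ${extractDomain(u).getOrElse("not found!")}"))
  }
}
```

Note that only a leading `www.` is dropped; any other subdomain (for example `shop.bell.com`) would be returned unchanged by this pattern.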
asked Dec 14 '22 15:12 by angelokh


1 Answer

You can use Java's java.net.URL class to get the host, or you can look at an Apache library.

import java.net.URL
new URL("https://www.bell.com?token=123").getHost // "www.bell.com"
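Two caveats apply to the inputs in the question: `java.net.URL` throws a `MalformedURLException` for strings without a protocol (such as "www.bell.com"), and `getHost` keeps the `www.` prefix. The sketch below works around both. The object name `HostExtractor`, the helper `host`, and the choice of `http://` as the default scheme are assumptions for illustration, not part of the answer.

```scala
import java.net.URL
import scala.util.Try

object HostExtractor {
  // Hypothetical helper: prepend a default scheme when none is present
  // (an assumption), parse with java.net.URL, then drop a leading "www.".
  def host(input: String): Option[String] = {
    val withScheme = if (input.contains("://")) input else s"http://$input"
    Try(new URL(withScheme).getHost).toOption
      .filter(_.nonEmpty)           // treat an empty host as "not found"
      .map(_.stripPrefix("www."))
  }

  def main(args: Array[String]): Unit = {
    Seq("www.bell.com", "https://bell.com/about", "https://www.bell.com?token=123")
      .foreach(u => println(s"$u -> ${host(u).getOrElse("not found!")}"))
  }
}
```

Wrapping the call in `Try` turns a malformed input into `None` instead of an exception, which mirrors the `case _` fallback in the regex version.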
answered Jan 06 '23 19:01 by Eugene Zhulenev