As far as I understand, a URL consists of the following fields:
protocol://user:password@host:port/path/document?arg1=val1&arg2=val2#part
I need code to get the value (or a null/empty value if not set) of any of these fields from any given URL string. Do I have to implement this myself, or is there existing code for this so that I don't need to reinvent the wheel?
I am particularly interested in Scala or Java code. C#, PHP, Python or Perl code can also be useful.
URL Parsing. The URL parsing functions focus on splitting a URL string into its components, or on combining URL components into a URL string.
Method 1: Use the createElement() method to create an HTML anchor element, then use it to parse the given URL. Method 2: Use URL() to create a new URL object, then use it to parse the provided URL.
The URL class provides several methods that let you query URL objects. You can get the protocol, authority, host name, port number, path, query, filename, and reference from a URL using these accessor methods. For example, getProtocol returns the protocol identifier component of the URL.
The URL class gives you everything you need. See http://download.oracle.com/javase/6/docs/api/java/net/URL.html
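// The protocol must have a registered handler (e.g. http) and the port must be numeric,
// otherwise the constructor throws java.net.MalformedURLException.
URL url = new URL("http://user:password@host:8080/path/document?arg1=val1&arg2=val2#part");
url.getProtocol();  // "http"
url.getUserInfo();  // "user:password"
url.getAuthority(); // "user:password@host:8080"
url.getHost();      // "host"
url.getPort();      // 8080, or -1 if the URL has no explicit port
url.getPath();      // "/path/document"; the document part is contained within the path field
url.getQuery();     // "arg1=val1&arg2=val2"
url.getRef();       // "part" (the fragment, without the leading '#')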
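Note that getUserInfo() and getQuery() return raw strings, so the user, the password and the individual argument values still have to be split out by hand. Below is a minimal sketch of one way to do that. The class name UrlParts and its methods are hypothetical helpers, not part of the JDK, and the sketch assumes Java 10 or later (for URLDecoder.decode with a Charset), UTF-8 encoded values, and that the last value wins when a parameter name repeats.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper class; the names are illustrative only.
final class UrlParts {

    // Splits a raw query such as "arg1=val1&arg2=val2" into decoded
    // name/value pairs, preserving their order. A null or empty query
    // yields an empty map; a name without '=' maps to an empty string.
    static Map<String, String> splitQuery(String rawQuery) {
        Map<String, String> params = new LinkedHashMap<>();
        if (rawQuery == null || rawQuery.isEmpty()) {
            return params;
        }
        for (String pair : rawQuery.split("&")) {
            if (pair.isEmpty()) {
                continue; // skip empty segments, e.g. from "a=1&&b=2"
            }
            int eq = pair.indexOf('=');
            String name = eq < 0 ? pair : pair.substring(0, eq);
            String value = eq < 0 ? "" : pair.substring(eq + 1);
            params.put(decode(name), decode(value));
        }
        return params;
    }

    // Splits user info such as "user:password" into {user, password}.
    // A missing part is returned as null.
    static String[] splitUserInfo(String userInfo) {
        if (userInfo == null) {
            return new String[] {null, null};
        }
        int colon = userInfo.indexOf(':');
        if (colon < 0) {
            return new String[] {userInfo, null};
        }
        return new String[] {userInfo.substring(0, colon), userInfo.substring(colon + 1)};
    }

    // Decodes percent-escapes (and '+' as a space) assuming UTF-8.
    private static String decode(String s) {
        return URLDecoder.decode(s, StandardCharsets.UTF_8);
    }
}
```

With these helpers, UrlParts.splitQuery(url.getQuery()).get("arg1") would give the value of arg1, and UrlParts.splitUserInfo(url.getUserInfo()) would give the user and password separately.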
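Use the java.net.URI class for this. java.net.URL is meant for real resources and real protocols: it rejects any protocol it has no handler for. java.net.URI only parses the syntax, so it also works for possibly non-existent protocols and resources.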
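As a rough sketch of the URI approach, the following collects every field into a map, with null for anything that is not set. The class name UriFields and the map keys are made up for illustration. The sketch also assumes a numeric port in the input: with a literal "port" placeholder, java.net.URI cannot parse the authority as a server, so getHost() and getUserInfo() would return null.

```java
import java.net.URI;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper class; the name and the map keys are illustrative only.
final class UriFields {

    // Parses the text with java.net.URI and returns each component by name.
    // Components that are not present are stored as null.
    // URI.create throws IllegalArgumentException if the text is not a valid URI.
    static Map<String, String> parse(String text) {
        URI uri = URI.create(text);
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("scheme", uri.getScheme());
        fields.put("userInfo", uri.getUserInfo());
        fields.put("host", uri.getHost());
        // getPort() returns -1 when no port is given; map that to null.
        fields.put("port", uri.getPort() == -1 ? null : String.valueOf(uri.getPort()));
        fields.put("path", uri.getPath());
        fields.put("query", uri.getQuery());
        fields.put("fragment", uri.getFragment());
        return fields;
    }
}
```

For example, UriFields.parse("protocol://user:password@host:8080/path/document?arg1=val1&arg2=val2#part") is accepted even though "protocol" is not a real scheme, which is the case where new URL(...) would throw MalformedURLException.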