Can anyone help me understand how this line of code works:
String s = new Scanner(new URL("http://example.com").openStream(), "UTF-8").useDelimiter("\\A").next();
The code is used to read directly from a webpage. How exactly is the Scanner object converted to a String, and why do we use a delimiter?
Thanks.
Here is what happens, with abuse of indentation:
new Scanner(                          // A new Scanner is created
    new URL("http://example.com")     // The Scanner takes a stream,
                                      // which is obtained from a URL
        .openStream(),                // - openStream() returns the stream
    "UTF-8")                          // Now the Scanner can decode the
                                      // stream's bytes into characters
                                      // using the UTF-8 encoding
    .useDelimiter("\\A")              // Now the Scanner uses the regexp \A
                                      // as its delimiter
                                      // \A stands for: start of the input!
    .next();                          // Here it returns the first (next)
                                      // token, which ends at the next
                                      // delimiter match. \A matches only at
                                      // the very start, so the token is the
                                      // whole content of the stream
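To try the idiom without touching the network, here is a minimal sketch that feeds the Scanner an in-memory stream instead of a URL. The class name ReadWholeStream and the helper readAllFromText are made up for this illustration; the hasNext() check is my addition, since next() throws NoSuchElementException on an empty stream.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Scanner;

public class ReadWholeStream {

    // Reads the whole stream as one String; returns "" for an empty stream.
    public static String readAll(InputStream in) {
        // try-with-resources closes the Scanner, which also closes the stream.
        try (Scanner scanner = new Scanner(in, "UTF-8").useDelimiter("\\A")) {
            // \A matches only at the start of the input, so the first token
            // runs to the end of the stream.
            return scanner.hasNext() ? scanner.next() : "";
        }
    }

    // Hypothetical helper: wraps a String in a stream so readAll can be tried offline.
    public static String readAllFromText(String text) {
        InputStream in = new ByteArrayInputStream(text.getBytes(StandardCharsets.UTF_8));
        return readAll(in);
    }

    public static void main(String[] args) {
        // Whitespace and newlines are kept, because whitespace is no longer the delimiter.
        System.out.print(readAllFromText("line one\nline two\n"));
    }
}
```

With a real page you would pass new URL("http://example.com").openStream() to readAll instead of the in-memory stream.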
From the Java documentation
A simple text scanner which can parse primitive types and strings using regular expressions. A Scanner breaks its input into tokens using a delimiter pattern, which by default matches whitespace. The resulting tokens may then be converted into values of different types using the various next methods.
So you just set \A as the delimiter (instead of whitespace). BUT \A has a specific meaning when evaluated as a regular expression!
If your stream contains only the following text
\Ahello world!\A Goodbye!\A
Your code will return the entire line \Ahello world!\A Goodbye!\A, because the regexp \A does not match the literal characters \A in the text.
If you wanted to split on the literal sequence of a backslash followed by an uppercase A, then you should use "\\\\A" as the delimiter string.
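The difference between the two delimiters can be checked with a small sketch like the one below. DelimiterDemo and its tokens method are names invented for this example; it scans a String directly rather than a stream, just to keep it short.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public class DelimiterDemo {

    // Collects every token the Scanner produces for the given delimiter regex.
    public static List<String> tokens(String input, String delimiterRegex) {
        List<String> result = new ArrayList<>();
        try (Scanner scanner = new Scanner(input).useDelimiter(delimiterRegex)) {
            while (scanner.hasNext()) {
                result.add(scanner.next());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // The text \Ahello world!\A Goodbye!\A written as a Java string literal.
        String input = "\\Ahello world!\\A Goodbye!\\A";

        // "\\A" is the regex \A (start of input): a single token, the whole text.
        System.out.println(tokens(input, "\\A"));

        // "\\\\A" is the regex \\A (a literal backslash followed by A):
        // the text is split wherever those two characters occur.
        System.out.println(tokens(input, "\\\\A"));
    }
}
```

The first call yields one token containing the whole text; the second yields the two tokens "hello world!" and " Goodbye!".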
Thanks to @Faux Pas for pointing that out!
Adding to Kuzeko's answer, \A matches the beginning of the entire text. So, I don't think his 'hello world' example is valid.
Scanner is not "converted". On the freshly created instance, useDelimiter is called, which sets the delimiter and returns that same Scanner instance; then next is called on it, which returns a String.
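If the chaining is what confuses you, here is the same call sequence written out one step at a time, as a rough sketch. UnchainedSteps and its method names are made up, and an in-memory stream stands in for the URL stream so it runs offline.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Scanner;

public class UnchainedSteps {

    // Shows that useDelimiter hands back the very Scanner it was called on.
    public static boolean useDelimiterReturnsSameInstance() {
        Scanner scanner = new Scanner("some text");
        Scanner returned = scanner.useDelimiter("\\A");
        boolean same = (returned == scanner);
        scanner.close();
        return same;
    }

    // The one-liner from the question, split into separate statements.
    public static String read(String text) {
        // Stand-in for new URL("http://example.com").openStream()
        InputStream stream = new ByteArrayInputStream(text.getBytes(StandardCharsets.UTF_8));
        Scanner scanner = new Scanner(stream, "UTF-8"); // step 1: create the Scanner
        scanner.useDelimiter("\\A");                    // step 2: set the delimiter
        String s = scanner.next();                      // step 3: next() returns a String
        scanner.close();
        return s;
    }

    public static void main(String[] args) {
        System.out.println(useDelimiterReturnsSameInstance());
        System.out.println(read("<html>hello</html>"));
    }
}
```

So nothing is converted: the String is simply the return value of the last method in the chain.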
You may want to look up Scanner in the Javadoc for further reading: https://docs.oracle.com/javase/7/docs/api/java/util/Scanner.html