
URL, Scanner & Delimiter: How Does This Line of Java Code Work?

Can anyone help me understand how this line of code works?

String s = new Scanner(new URL("http://example.com").openStream(), "UTF-8").useDelimiter("\\A").next();

The code is used to read directly from a web page. How exactly is the Scanner object converted to a String, and why do we use a delimiter?

Thanks.

Nuhman asked Mar 29 '16 07:03


3 Answers

Here is what happens, step by step (with some abuse of indentation):

     new Scanner(                           // A new Scanner is created
             new URL("http://example.com")  // from an InputStream
                     .openStream(),         // that openStream() obtains from the URL.
             "UTF-8")                       // The Scanner decodes the stream's bytes
                                            // into characters as UTF-8.

         .useDelimiter("\\A")               // The delimiter is set to the regex \A,
                                            // which matches only at the beginning
                                            // of the input.

         .next();                           // Returns the next token. Since \A never
                                            // matches again after the start, that
                                            // token is the entire content of the
                                            // stream.
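
The chain above can also be written as a small helper. This is a minimal sketch, not the asker's code: the class name ReadWholeStream and the method readAll are hypothetical, and an in-memory ByteArrayInputStream stands in for the stream returned by URL.openStream() so that the example runs without a network connection.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Scanner;

class ReadWholeStream {

    // Reads everything from the stream as a single token, decoding bytes as UTF-8.
    // (Hypothetical helper, introduced only for illustration.)
    static String readAll(InputStream in) {
        // try-with-resources closes the Scanner, which also closes the stream.
        try (Scanner scanner = new Scanner(in, "UTF-8")) {
            scanner.useDelimiter("\\A");
            // An empty stream has no token and next() would throw
            // NoSuchElementException, so check hasNext() first.
            return scanner.hasNext() ? scanner.next() : "";
        }
    }

    public static void main(String[] args) {
        // Stand-in for new URL("http://example.com").openStream()
        InputStream in = new ByteArrayInputStream(
                "<html>\n  <body>Hello</body>\n</html>".getBytes(StandardCharsets.UTF_8));
        System.out.println(readAll(in));
    }
}
```

Compared with the one-liner, this version closes the stream and does not throw on an empty response. Both differences are design choices of the sketch, not something the original line does.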

From the Java documentation

A simple text scanner which can parse primitive types and strings using regular expressions. A Scanner breaks its input into tokens using a delimiter pattern, which by default matches whitespace. The resulting tokens may then be converted into values of different types using the various next methods.

So you have set \A as the delimiter (instead of whitespace). But \A has a specific meaning when evaluated as a regular expression: it matches the beginning of the input, not a literal backslash followed by an A.

If your stream contains only the following text:

\Ahello world!\A Goodbye!\A

then your code will return the entire line \Ahello world!\A Goodbye!\A, because the regular expression \A does not match those literal characters.

If you wanted to split on the literal sequence of a backslash followed by an uppercase A, you should use \\\\A in the Java string instead, which is the regular expression \\A.
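
The difference between the two delimiters can be checked with a short program. This is an illustrative sketch: the class DelimiterDemo and the helper firstToken are hypothetical names, and a String source is used in place of a network stream.

```java
import java.util.Scanner;

class DelimiterDemo {

    // Returns the first token of the text under the given delimiter regex.
    // (Hypothetical helper, introduced only for illustration.)
    static String firstToken(String text, String delimiterRegex) {
        try (Scanner scanner = new Scanner(text)) {
            return scanner.useDelimiter(delimiterRegex).next();
        }
    }

    public static void main(String[] args) {
        // The input holds literal backslash-A sequences, as in the example text.
        String text = "\\Ahello world!\\A Goodbye!\\A";

        // Regex \A matches only the beginning of the input,
        // so the whole text comes back as one token.
        System.out.println(firstToken(text, "\\A"));   // \Ahello world!\A Goodbye!\A

        // Regex \\A matches a literal backslash followed by 'A',
        // so the text is split and the first token is shorter.
        System.out.println(firstToken(text, "\\\\A")); // hello world!
    }
}
```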

Thanks to @Faux Pas for pointing that out!

Kuzeko answered Sep 22 '22 02:09


Adding to Kuzeko's answer, \A matches the beginning of the entire text. So, I don't think his 'hello world' example is valid.

Faux Pas answered Sep 21 '22 02:09


The Scanner is not "converted". useDelimiter is called on the freshly created instance and returns that same Scanner with its delimiter set accordingly. next is then called on that instance, and it is next that returns a String.
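
Writing the chain as one statement per step makes the types visible. The following is a sketch with hypothetical names (ChainDemo, unrolled), using a String source so that it runs without a network connection.

```java
import java.util.Scanner;

class ChainDemo {

    // Unrolls the chained call into one statement per step.
    // (Hypothetical helper, introduced only for illustration.)
    static String unrolled(String input) {
        Scanner created = new Scanner(input);              // a Scanner
        Scanner configured = created.useDelimiter("\\A");  // still a Scanner
        if (configured != created) {
            // useDelimiter is documented to return "this scanner".
            throw new IllegalStateException("expected the same instance");
        }
        String token = configured.next();                  // a String
        created.close();
        return token;
    }

    public static void main(String[] args) {
        System.out.println(unrolled("line one\nline two"));
    }
}
```

The String only appears at the last step. Everything before it operates on one and the same Scanner object.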

You may want to look up Scanner in the Javadoc for further reading: https://docs.oracle.com/javase/7/docs/api/java/util/Scanner.html

Fildor answered Sep 20 '22 02:09