How do I extract all URLs from a plain text file in Ruby?
I tried some libraries but they fail in some cases. What's the best way?
URLs can be extracted from a text file with a regular expression: the expression returns every piece of text that matches the URL pattern. In Ruby, regular expressions are built into the language, so no extra library is required for this.
If you like using what's already provided for you in Ruby:
require "uri"

URI.extract("text here http://foo.example.org/bla and here mailto:[email protected] and here also.")
# => ["http://foo.example.org/bla", "mailto:[email protected]"]
Read more: http://railsapi.com/doc/ruby-v1.8/classes/URI.html#M004495
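Since the question is about a file rather than a string literal, here is a minimal sketch of how that might look. The helper names `extract_urls` and `extract_urls_from_file` are made up for illustration and are not part of the URI library. The sketch assumes you only want http and https links, and that punctuation such as a period or comma at the very end of a match belongs to the surrounding sentence rather than the URL. Note also that recent Ruby versions mark `URI.extract` as obsolete (it warns when run with `-w`), so treat this as a starting point rather than a definitive solution.

```ruby
require "uri"

# Hypothetical helper (not part of the URI library): returns the unique
# http/https URLs found in a string, in order of first appearance.
def extract_urls(text)
  URI.extract(text, %w[http https])              # second argument limits the schemes matched
     .map { |url| url.sub(/[.,;:!?]+\z/, "") }   # assumption: trailing punctuation is prose, not URL
     .uniq                                       # drop duplicates
end

# Hypothetical helper: same as above, but reads the text from a file path.
def extract_urls_from_file(path)
  extract_urls(File.read(path))
end
```

Calling `extract_urls_from_file("input.txt")` (with whatever path your file has) would then return an array of URL strings. The punctuation trimming is a judgment call: it will also shorten a URL that genuinely ends in one of those characters, so adjust the character class to suit your data.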