
How to extract URLs from text


How do I extract all URLs from a plain text file in Ruby?

I tried some libraries but they fail in some cases. What's the best way?

tapioco123 asked Sep 08 '10 06:09



1 Answer

If you like using what's already provided for you in Ruby:

require "uri"
URI.extract("text here http://foo.example.org/bla and here mailto:[email protected] and here also.")
# => ["http://foo.example.org/bla", "mailto:[email protected]"]

Read more: http://railsapi.com/doc/ruby-v1.8/classes/URI.html#M004495
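
If you only want web links, the same standard-library extraction can be limited to particular schemes. Below is a minimal sketch, not a definitive implementation: the helper name extract_web_urls and the sample text are made up for illustration, and it calls URI::RFC2396_Parser directly on the assumption that your Ruby is 2.2 or newer, since newer Ruby versions flag URI.extract as obsolete.

```ruby
require "uri"

# Hypothetical helper (not part of the original answer):
# returns the unique http/https URLs found in a string.
def extract_web_urls(text)
  # Assumption: Ruby 2.2+, where the RFC 2396 parser is exposed under this name.
  parser = URI::RFC2396_Parser.new
  # The second argument restricts matches to the listed schemes.
  parser.extract(text, %w[http https]).uniq
end

# Illustrative sample text; the ftp link is skipped by the scheme filter.
text = "text here http://foo.example.org/bla and here https://example.com/a?b=1 " \
       "and also ftp://files.example.org/x"
p extract_web_urls(text)
```

Punctuation that directly follows a URL (a trailing period or comma, for example) may end up inside the match, so you might still need to clean up the results. That could be one of the cases where libraries seem to fail.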

behe answered Sep 21 '22 16:09