
Extract Hyperlinks From Rich Text Clipboard Contents Or Text Selection On The Mac

I would like to be able to get a list of all the hyperlinked URLs in any formatted text that I select on the Mac (formatted text such as a web page or a word processor document).

Preferably I'd like to use Applescript or Automator to extract this list of hyperlinks from the text (so that I can then use Applescript to perform further processing on these URLs).

Note that I am talking about hyperlinks being extracted from formatted text, not just extracting URLs from text containing plaintext URLs.

This hyperlink extraction from formatted text seems like it should be a simple programming task, but I have been struggling to find a way to do this in either Applescript or Automator.

Automator can be set to accept rich text input from a text selection, or can input rich text from the clipboard, but I cannot find any way to access this rich text as a string within Automator or Applescript, such that I can then extract the hyperlinked URLs from the string of rich text data.

Once I get access to the rich text data as a string, there will be no problem in extracting the URLs.

Any suggestions on how I might solve this issue are gratefully received.

Ash90 asked Sep 18 '15 02:09



1 Answer

Applescript itself cannot unpack the links embedded in rich text, so you'll have to use a helper tool one way or another. You can call 'textutil' through do shell script to convert the RTF to HTML, which exposes the embedded links:

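# Reads a line of the form «data RTF <hex>» on stdin and decodes the hex to raw RTF,
# then converts that RTF to HTML so the links show up as href attributes.
perl -ne 'print chr foreach unpack("C*",pack("H*",substr($_,11,-3)))' |
textutil -stdin -stdout -convert html -format rtf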

Then you'll have to extract the URLs. I would suggest using the Automator action 'Extract Data' to do this. If you set the whole thing up as an Automator Workflow, you can invoke it from Applescript. Or, if you save it as a Service, you can run the whole thing from the Service. Here's a screenshot of that method that should show what you need: Workflow Example
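If you would rather stay in the shell than use an Automator action, the URL extraction could be sketched roughly as follows. This is an alternative to the 'Extract Data' action, not the method shown in the screenshot. It assumes the HTML writes each link as href="..." with double quotes; the function name `extract_hrefs` and the sample HTML are hypothetical.

```shell
# Print the target of every href="..." attribute found on stdin, one per line.
# The last sed expression turns the HTML entity &amp; back into a plain &.
extract_hrefs() {
  grep -oE 'href="[^"]*"' | sed -e 's/^href="//' -e 's/"$//' -e 's/&amp;/\&/g'
}

# Hypothetical sample standing in for the HTML that textutil would produce.
printf '%s\n' '<p>See <a href="https://example.com/a?x=1&amp;y=2">this</a> and <a href="https://example.org/">that</a>.</p>' | extract_hrefs
```

In the real pipeline you would append `| extract_hrefs` after the textutil command. A regular expression like this is only a rough filter; it does not parse HTML properly and would miss single-quoted or unquoted attributes.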

Let me know if you have questions. You can see variations on this technique here.

Update: If you want to turn this into a Service, note that it is difficult to coerce Automator's built-in input into RTF. An effective workaround is to ignore the input and run

tell application "System Events" to keystroke "c" using command down

to copy the selected text to the clipboard and then use the workflow from there. See example: Service Workflow Example

jweaks answered Sep 24 '22 15:09
