How do I take a cell in Excel, which has text that is hyperlinked, and extract the hyperlink part?
Select the cell containing the hyperlink and press Ctrl + K to open the Edit Hyperlink dialog. You can then copy the URL from the Address field.
To read in the first tab of your Excel workbook, pass the file path to the read_excel() function. To read a different tab, use the sheet argument, which takes either the sheet's name or its index (number).
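For illustration, here is a minimal sketch of those read_excel() calls. It assumes the readxl package is installed and uses the datasets.xlsx example workbook that ships with readxl, so the sheet name "mtcars" and index 2 apply to that file only; swap in your own path and sheet.

```r
library(readxl)

# Path to an example workbook bundled with readxl (stand-in for your own file)
path <- readxl_example("datasets.xlsx")

# No sheet argument: reads the first tab
first_sheet <- read_excel(path)

# Pick a tab by name ...
by_name <- read_excel(path, sheet = "mtcars")

# ... or by position (1-based index)
by_index <- read_excel(path, sheet = 2)
```

Note that read_excel() returns cell values only; it does not return the hyperlink behind a cell, which is why the workaround further down is needed.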
Importing Excel files into R using readxl package
The readxl package, developed by Hadley Wickham, can be used to easily import Excel files (.xls and .xlsx) into R without any external dependencies.
I found a super convoluted way to extract the hyperlinks:
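library(XML)
# copy the workbook to a .zip file (an .xlsx file is a zip archive)
# anchor the pattern so only the file extension is replaced
my.zip.file <- sub("\\.xlsx$", ".zip", my.excel.file)
file.copy(from = my.excel.file, to = my.zip.file)
# unzip the archive into the working directory
unzip(my.zip.file)
# unzipping produces a bunch of XML files which we can read using the XML package
# assume sheet1 has our data
xml <- xmlParse("xl/worksheets/sheet1.xml")
# finally grab the hyperlinks; "x" is the prefix bound to the sheet's default namespace
# note: `display` is the text shown for each link and may differ from the URL;
# targets of external links are stored in xl/worksheets/_rels/sheet1.xml.rels
hyperlinks <- xpathApply(xml, "//x:hyperlink/@display", namespaces = "x")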
Derived from this blogpost.
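If the display text is not the URL you are after, a rough sketch of the same idea that looks up the actual link targets might look like the following. It assumes the workbook has already been unzipped as above, that the links are external (each hyperlink element carries an r:id pointing into the sheet's relationships file), and that the XML package is installed. The function name hyperlink_targets is made up for this example.

```r
library(XML)

# Hypothetical helper: pair each hyperlinked cell with its target URL.
# sheet_xml: path to a worksheet part, e.g. "xl/worksheets/sheet1.xml"
# rels_xml:  path to its relationships part, e.g. "xl/worksheets/_rels/sheet1.xml.rels"
hyperlink_targets <- function(sheet_xml, rels_xml) {
  # Namespaces used by worksheet parts in the Office Open XML format
  ns <- c(
    x = "http://schemas.openxmlformats.org/spreadsheetml/2006/main",
    r = "http://schemas.openxmlformats.org/officeDocument/2006/relationships"
  )

  # Cell references and relationship ids of hyperlinks that have an r:id
  sheet <- xmlParse(sheet_xml)
  cells <- unlist(xpathApply(sheet, "//x:hyperlink[@r:id]/@ref", namespaces = ns))
  ids   <- unlist(xpathApply(sheet, "//x:hyperlink[@r:id]/@r:id", namespaces = ns))

  # Relationship id -> Target lookup from the .rels file
  rels    <- xmlParse(rels_xml)
  rel_ids <- unlist(xpathApply(rels, "//x:Relationship/@Id", namespaces = "x"))
  targets <- unlist(xpathApply(rels, "//x:Relationship/@Target", namespaces = "x"))

  data.frame(
    cell = as.character(unname(cells)),
    url  = as.character(unname(targets[match(ids, rel_ids)])),
    stringsAsFactors = FALSE
  )
}
```

Calling hyperlink_targets("xl/worksheets/sheet1.xml", "xl/worksheets/_rels/sheet1.xml.rels") should give one row per linked cell. Links that point inside the workbook use a location attribute instead of r:id and are skipped by this sketch.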