I'm trying to convert, for example, '9¼"' to '9.25', but I cannot seem to read the fraction correctly.
Here's the data I'm working with:
library(XML)
url <- paste("http://mockdraftable.com/players/2014/", sep = "")
combine <- readHTMLTable(url,which=1, header=FALSE, stringsAsFactors=F)
names(combine) <- c("Name", "Pos", "Hght", "Wght", "Arms", "Hands",
"Dash40yd", "Dash20yd", "Dash10yd", "Bench", "Vert", "Broad",
"Cone3", "ShortShuttle20")
As an example, the Hands column in the first row is '9¼"'. How would I make combine$Hands become 9.25? The same goes for all of the other fractions, 1/8 through 7/8.
Any help would be appreciated.
You can try to transform the unicode encoding to ASCII directly when reading the XML using a special return function:
library(stringi)
readHTMLTable(url, which = 1, header = FALSE, stringsAsFactors = FALSE,
              elFun = function(node) {
                # transliterate each cell's text to ASCII as it is read
                val <- xmlValue(node)
                stri_trans_general(val, "latin-ascii")
              })
You can then use @Metrics' suggestion to convert it to numbers.
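The suggestion itself is not reproduced here, so as a rough base-R sketch of that conversion step: the helper below (mixed_to_decimal is a hypothetical name, not from any package) assumes the transliterated values look like '9 1/4"', i.e. an optional whole number, an optional num/den fraction, and an optional inch mark. Check a few of your own values first, since the exact output of the transliteration is an assumption here.

```r
# Hypothetical helper: parse strings such as '9 1/4"', '31 7/8', '1/2' or '30'
# (assumed format: optional whole number, optional num/den, optional inch mark).
mixed_to_decimal <- function(x) {
  x <- trimws(gsub("\"", "", x, fixed = TRUE))   # drop the inch mark
  vapply(strsplit(x, "[ /]+"), function(p) {
    n <- as.numeric(p)
    if (length(n) == 1) {
      n                       # whole number only (or NA)
    } else if (length(n) == 2) {
      n[1] / n[2]             # bare fraction
    } else if (length(n) == 3) {
      n[1] + n[2] / n[3]      # whole number plus fraction
    } else {
      NA_real_                # empty or unexpected input
    }
  }, numeric(1))
}

mixed_to_decimal(c("9 1/4\"", "31 7/8\"", "30\"", "1/2"))
# expected values: 9.25, 31.875, 30, 0.5
```

Empty cells come back as NA rather than 0, which is probably what you want for missing measurements.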
For example, using @G. Grothendieck's function from this post, you could clean up the Arms data:
library(XML)
library(stringi)
library(gsubfn)
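# The calc function is by @G. Grothendieck.
# s holds the digit groups of one value: whole; num, den; or whole, num, den.
calc <- function(s) {
  # Prepend 0 for a bare fraction; append 0:1 so a whole number becomes whole + 0/1.
  x <- c(if (length(s) == 2) 0, as.numeric(s), 0:1)
  x[1] + x[2] / x[3]
}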
url <- paste("http://mockdraftable.com/players/2014/", sep = "")
combine <- readHTMLTable(url, which = 1, header = FALSE, stringsAsFactors = FALSE,
                         elFun = function(node) {
                           # transliterate each cell's text to ASCII as it is read
                           val <- xmlValue(node)
                           stri_trans_general(val, "latin-ascii")
                         })
names(combine) <- c("Name", "Pos", "Hght", "Wght", "Arms", "Hands",
"Dash40yd", "Dash20yd", "Dash10yd", "Bench", "Vert", "Broad",
"Cone3", "ShortShuttle20")
sapply(strapplyc(gsub('\"',"",combine$Arms), "\\d+"), calc)
#[1] 30.000 31.500 30.000 31.750 31.875 29.875 31.000 31.000 30.250 33.000 32.500 31.625 32.875
There might be some encoding issues depending on your machine (see the comments).
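If the transliteration step misbehaves on your setup, one alternative is to skip it and map the fraction characters to their values directly in base R. This is only a sketch: frac_chars, frac_values and frac_to_decimal are hypothetical names, and it assumes each cell holds at most one of the seven fraction characters (¼, ½, ¾, ⅛, ⅜, ⅝, ⅞) after an optional whole number.

```r
# Fraction characters written as Unicode escapes, so the source file stays ASCII.
frac_chars  <- c("\u00bc", "\u00bd", "\u00be",            # 1/4, 1/2, 3/4
                 "\u215b", "\u215c", "\u215d", "\u215e")  # 1/8, 3/8, 5/8, 7/8
frac_values <- c(0.25, 0.5, 0.75, 0.125, 0.375, 0.625, 0.875)

# Hypothetical helper: convert strings such as '9¼"' to 9.25.
frac_to_decimal <- function(x) {
  x <- gsub("\"", "", x, fixed = TRUE)        # drop the inch mark
  frac_part <- numeric(length(x))
  for (i in seq_along(frac_chars)) {
    hit <- grepl(frac_chars[i], x, fixed = TRUE)
    frac_part[hit] <- frac_values[i]          # remember the fraction's value
    x <- gsub(frac_chars[i], "", x, fixed = TRUE)
  }
  x <- trimws(x)
  whole <- as.numeric(x)                      # empty strings become NA
  whole[x == "" & frac_part > 0] <- 0         # bare fraction, no whole part
  whole + frac_part
}

frac_to_decimal(c("9\u00bc\"", "31\u215e\"", "30\""))
# expected values: 9.25, 31.875, 30
```

This still relies on the page having been read as UTF-8, so it may not remove every encoding problem, but it avoids the extra dependency on stringi and gsubfn.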