
I get this error message when I'm using XML in R

Tags:

r

xml

Hi, I'm working with XML in RStudio. The objective is to convert an XML file to an R data frame, and I'm trying it on the sample file called tides.xml in the package folder.

tides = system.file("exampleData", "tides.xml", package = "XML")
tides.str = xmlParse(tides)

The values in the first few columns are the same for every row, something like this:

                       origin
                   NOAA/NOS/CO-OPS
                   NOAA/NOS/CO-OPS
                   NOAA/NOS/CO-OPS
                   NOAA/NOS/CO-OPS
                   NOAA/NOS/CO-OPS
                   NOAA/NOS/CO-OPS
                   NOAA/NOS/CO-OPS

Therefore when I use

xmlToDataFrame(xmlRoot(tides.str))

it returns an error:

Error in `[<-.data.frame`(`*tmp*`, i, names(nodes[[i]]), value = c("2010/11/13Sat06:08    AM4.74H",  : 
duplicate subscripts for columns
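
The call shown in the error assigns into a data frame using names(nodes[[i]]) as the column subscript. One plausible reading, stated as an assumption rather than confirmed behaviour of xmlToDataFrame, is that the children of the data node are all named "item", so the subscript contains the same column name more than once. The base-R sketch below reproduces the same message without the XML package; the data frame and the child names are hypothetical stand-ins.

```r
# Base-R sketch: assign into one row of a data frame using a column
# subscript that repeats a name (hypothetical stand-in data).
df <- data.frame(item = NA_character_, origin = NA_character_,
                 stringsAsFactors = FALSE)

# Assumed child names of a <data>-like node: every child is called "item".
child_names <- c("item", "item")

# Capture the error message instead of stopping the script.
msg <- tryCatch({
    df[1, child_names] <- c("first", "second")
    "no error"
}, error = function(e) conditionMessage(e))

print(msg)
```

If this reading is right, the error comes from the shape of the document (one wrapper node with many same-named children) rather than from the data values.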

I know I can do something like this:

xmlToDataFrame(nodes = xmlChildren(xmlRoot(tides.str)[["data"]]))

to produce a data frame, but it is just a subset and I need to manually insert the first few columns.

So I am wondering: is there any way to avoid the error just by changing some of the arguments to xmlToDataFrame(), while still using the whole XML document?

Thanks in advance.

Lambo asked Nov 10 '22 04:11


1 Answer

I'm not sure if it's possible with xmlToDataFrame alone. But you can extract all the non-data nodes and turn them into a data.frame yourself without too much trouble.

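library(XML)
tides <- system.file("exampleData", "tides.xml", package = "XML")

# Parse the file, then build one row per <item> under <data>
tides.str <- xmlParse(tides)
detaildf <- xmlToDataFrame(nodes = getNodeSet(tides.str, "/datainfo/data/item"))

# Collect every child of <datainfo> except <data> into a one-row data frame
header <- getNodeSet(tides.str, "/datainfo/*[not(self::data)]")
headerdf <- as.data.frame(as.list(setNames(xmlSApply(header, xmlValue),
    xmlSApply(header, xmlName))))

# Combine the header row with the detail rows
merge(headerdf, detaildf)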

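At the end we just merge() the two parts. Because the two data frames have no column names in common, merge() pairs the single header row with every detail row, which repeats the header for each line in the detail.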
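
To make the last two steps concrete, here is a minimal base-R sketch of the same pattern on made-up data, so it runs without the XML package. The field name producttype and its value are hypothetical stand-ins, and the detail rows are illustrative rather than taken from tides.xml.

```r
# Hypothetical stand-ins for the header values and their node names.
header_values <- c("NOAA/NOS/CO-OPS", "Annual Tide Prediction")
header_names  <- c("origin", "producttype")

# Named vector -> named list -> one-row data frame (one column per field).
headerdf <- as.data.frame(as.list(setNames(header_values, header_names)),
                          stringsAsFactors = FALSE)

# Illustrative detail rows, one per <item>-like record.
detaildf <- data.frame(date = c("2010/11/13", "2010/11/13"),
                       highlow = c("H", "L"),
                       stringsAsFactors = FALSE)

# With no shared column names, merge() returns every combination of rows:
# 1 header row x 2 detail rows = 2 rows, with the header values repeated.
combined <- merge(headerdf, detaildf)
print(combined)
```

The cross join only behaves this way while the header and detail data frames share no column names. If they did share one, merge() would join on it instead, so it may be worth checking intersect(names(headerdf), names(detaildf)) first.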

MrFlick answered Nov 15 '22 06:11