I am trying to parse the XML file below in R so that I can analyze the data. I would like to get the mean and standard deviation of the price, and also the rate of change of the share price over time. I have tried entering the data by hand but am having problems with the date structure (I have tried the following:
z <- strptime ("HH:MM:SS.ms, "%H:%m:%S.%f")
but it failed to work). I know the XML file only has a few values, but could this process be automated, and if so, which packages would I need? (I am new to R.) Any help would be much appreciated.
Thanks, Anthony.
<?xml version = "1.0"?>
<Company >
<shareprice>
<timeStamp> 12:00:00:01</timeStamp>
<Price> 25.02</Price>
</shareprice>
<shareprice>
<timeStamp> 12:00:00:02</timeStamp>
<Price> 15</Price>
</shareprice>
<shareprice>
<timeStamp> 12:00:00:025</timeStamp>
<Price> 15.02</Price>
</shareprice>
<shareprice>
<timeStamp> 12:00:00:031</timeStamp>
<Price> 18.25</Price>
</shareprice>
<shareprice>
<timeStamp> 12:00:00:039</timeStamp>
<Price> 18.54</Price>
</shareprice>
<shareprice>
<timeStamp> 12:00:00:050</timeStamp>
<Price> 16.52</Price>
</shareprice>
<shareprice>
<timeStamp> 12:00:01:01</timeStamp>
<Price> 17.50</Price>
</shareprice>
</Company>
In
z <- strptime ("HH:MM:SS.ms, "%H:%m:%S.%f")
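you are missing a closing " after the first string,
so it is invalid syntax. Note also that %m is the month (minutes are %M), and R's strptime does not document a %f conversion.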
Next, the data is non-standard, as we would normally use a dot between seconds and sub-seconds, i.e. 12:23:34.567, to denote a timestamp. With a dot in place, the milliseconds can be parsed this way:
> ts <- "12:00:00.050"
> strptime(ts, "%H:%M:%OS")
[1] "2010-07-09 12:00:00 CDT"
>
So you not only need to get the data out of the XML first, you also need to convert the string (replacing the last colon with a dot). Alternatively, you can parse the string and fill a POSIXlt
time structure 'by hand'.
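As a rough sketch of that conversion, the last colon can be swapped for a dot with sub() before calling strptime. This assumes the final field is a decimal fraction of a second (so 12:00:00:050 is read as 12:00:00.050); if it is actually a millisecond count, the values would need rescaling instead. The variable names (raw, fixed, parsed, secs) are only illustrative.

```r
# Timestamps as they appear in the XML, leading whitespace included
raw <- c(" 12:00:00:01", " 12:00:00:050", " 12:00:01:01")

# Assumption: the last field is a decimal fraction of a second,
# so the final ":" is replaced by "."
fixed <- sub(":([0-9]+)$", ".\\1", trimws(raw))

# Option 1: let strptime do the work (the date part defaults to today)
parsed <- strptime(fixed, "%H:%M:%OS", tz = "UTC")

# Option 2, "by hand": split the fields and compute seconds since midnight
parts <- strsplit(fixed, ":", fixed = TRUE)
secs <- sapply(parts, function(p) sum(as.numeric(p) * c(3600, 60, 1)))
```

The numeric seconds-since-midnight form is often the more convenient one when only time differences are needed.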
Postscriptum: Forgot to mention that you need to enable printing of sub-second times:
> options("digits.secs"=3) # shows milliseconds (three digits)
> strptime(ts, "%H:%M:%OS")
[1] "2010-07-09 12:00:00.05 CDT" # suppresses trailing zero
>
Postscriptum 2: You are also in luck with respect to your file thanks to the XML package:
> library(XML)
> xmlToDataFrame("c:/Temp/foo.xml") # save your data as c:/Temp/foo.xml
timeStamp Price
1 12:00:00:01 25.02
2 12:00:00:02 15
3 12:00:00:025 15.02
4 12:00:00:031 18.25
5 12:00:00:039 18.54
6 12:00:00:050 16.52
7 12:00:01:01 17.50
>
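From a data frame like this, the summary statistics asked about in the question could be computed roughly as follows. This is only a sketch: the data frame is rebuilt by hand here so the example runs without the XML file, the columns are treated as text that needs converting (depending on settings, xmlToDataFrame may return character or factor columns), and the last timestamp field is again assumed to be a decimal fraction of a second.

```r
# Hand-built stand-in for the data frame printed above
df <- data.frame(
  timeStamp = c("12:00:00:01", "12:00:00:02", "12:00:00:025", "12:00:00:031",
                "12:00:00:039", "12:00:00:050", "12:00:01:01"),
  Price = c("25.02", "15", "15.02", "18.25", "18.54", "16.52", "17.50"),
  stringsAsFactors = FALSE
)

# Convert the price column to numbers (as.character guards against factors)
price <- as.numeric(trimws(as.character(df$Price)))
mean_price <- mean(price)
sd_price <- sd(price)

# Assumption: the last timestamp field is fractional seconds
fixed <- sub(":([0-9]+)$", ".\\1", trimws(as.character(df$timeStamp)))
secs <- sapply(strsplit(fixed, ":", fixed = TRUE),
               function(p) sum(as.numeric(p) * c(3600, 60, 1)))

# Rate of change: price difference per second between consecutive ticks
rate <- diff(price) / diff(secs)
```

Here rate has one entry fewer than the number of rows, since each value describes the step from one tick to the next.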