Now the string looks like this:
"Interest.USD,Vol=[Integrated,(0,0.101),(0.2,0.108),(1,0.110),(2,0.106),
(3,0.102),(4,0.09),(5,0.091),(6,0.09128272)],Drift=[Integrated,(0.002,0.09),
(0.24,0.0007),(0.4,0.007),(1,-0.033),(2,-0.005),(3,-0.0041),
(4,-0.3505),(5,-0.65),(7,-0.08346),(8,-0.049),(9,-0.0613),(10,-0.019)],
Risk_Neutral=YES,Lambda=0.09,FX_Volatility=0.01,FX_Correlation=0.9"
I want to grab the data following the "Vol" and "Drift" in a matrix format like:
Vol matrix:
0,0.101
0.2,0.108
1,0.110
2,0.106
3,0.102
4,0.09
5,0.091
6,0.09128272
and also a single value such as 0.09 for Lambda. I guess I should use a regular expression, but I'm not that familiar with them. Any suggestions? :)
P.S. I tried using:
str_extract_all(text,'[ .+? ]')
trying to get the data between [ and ], but it returns "."
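A likely reason for the "." result: unescaped square brackets in a regular expression define a character class, so '[ .+? ]' matches any single space, dot, plus or question mark rather than text between literal brackets. The sketch below uses base R (regmatches and gregexpr) instead of stringr to stay self-contained, and the shortened `text` value is an assumption standing in for the full string.

```r
# Shortened stand-in for the string in the question (assumption).
text <- "Vol=[Integrated,(0,0.101),(0.2,0.108)],Lambda=0.09"

# Unescaped brackets: a character class, so each hit is one character.
class_hits <- regmatches(text, gregexpr("[ .+? ]", text))[[1]]

# Escaped brackets: literal "[" and "]", with a lazy match in between.
bracket_hits <- regmatches(text, gregexpr("\\[.+?\\]", text, perl = TRUE))[[1]]

print(class_hits)
print(bracket_hits)
```

With the brackets escaped, the whole "[Integrated,...]" section comes back as one match, which can then be split further.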
Here's a way to extract those values in R. Let's assume that the string you posted is stored in a variable named a. To make things easier, I'm going to use a helper function, regcapturedmatches(). It is not part of base R, so it has to be defined before the code below will run. Then you can do
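# first pass: capture the section name and the text inside its brackets
expr <- "(Vol|Drift)=\\[Integrated,([^\\]]*)\\]"
mm <- regcapturedmatches(a, gregexpr(expr, a, perl=TRUE))[[1]]
# second pass: capture both numbers of every "(x,y)" pair in each section
expr <- "\\(([^,]+),([^,]+)\\)"
vv <- regcapturedmatches(mm[,2], gregexpr(expr, mm[,2], perl=TRUE))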
First we do a pass to extract the Vol and Drift elements into mm, and then we split the comma-delimited lists into vv. Now we can combine the data into one large data.frame
tt <- Map(data.frame, col=mm[,1], val=lapply(vv,
function(x) {class(x)<-"numeric"; x}))
dd<-do.call(rbind, unname(tt))
In the end, dd will look like
col val.1 val.2
1 Vol 0.000 0.10100000
2 Vol 0.200 0.10800000
3 Vol 1.000 0.11000000
4 Vol 2.000 0.10600000
5 Vol 3.000 0.10200000
6 Vol 4.000 0.09000000
7 Vol 5.000 0.09100000
8 Vol 6.000 0.09128272
9 Drift 0.002 0.09000000
10 Drift 0.240 0.00070000
11 Drift 0.400 0.00700000
12 Drift 1.000 -0.03300000
13 Drift 2.000 -0.00500000
14 Drift 3.000 -0.00410000
15 Drift 4.000 -0.35050000
16 Drift 5.000 -0.65000000
17 Drift 7.000 -0.08346000
18 Drift 8.000 -0.04900000
19 Drift 9.000 -0.06130000
20 Drift 10.000 -0.01900000
This method allows for any number of repeated values in each of those sections.
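If the helper function is not available, the same two-pass idea can be sketched with base R only. This is a rough alternative under stated assumptions, not the code used above: the shortened string `a`, the function `to_matrix` and the variable names are all made up for illustration, and regexec() with perl = TRUE needs a reasonably recent R version.

```r
# Shortened sample with the same layout as the question's string (assumption).
a <- paste0(
  "Interest.USD,Vol=[Integrated,(0,0.101),(0.2,0.108),(1,0.110)],",
  "Drift=[Integrated,(0.002,0.09),(1,-0.033)],",
  "Risk_Neutral=YES,Lambda=0.09"
)

# Pass 1: find each "Name=[Integrated,...]" section, then split every match
# into full match, section name and bracket body with regexec().
section_re <- "(Vol|Drift)=\\[Integrated,([^\\]]*)\\]"
sections <- regmatches(a, gregexpr(section_re, a, perl = TRUE))[[1]]
parts <- regmatches(sections, regexec(section_re, sections, perl = TRUE))

# Pass 2: turn every "(x,y)" pair of a section body into a numeric matrix row.
pair_re <- "\\(([^,]+),([^,)]+)\\)"
to_matrix <- function(body) {
  pairs <- regmatches(body, gregexpr(pair_re, body, perl = TRUE))[[1]]
  nums <- strsplit(gsub("[()]", "", pairs), ",", fixed = TRUE)
  matrix(as.numeric(unlist(nums)), ncol = 2, byrow = TRUE)
}

mats <- lapply(parts, function(p) to_matrix(p[3]))
names(mats) <- vapply(parts, function(p) p[2], character(1))

print(mats$Vol)
print(mats$Drift)
```

Each element of `mats` is a two-column numeric matrix named after its section, so the number of pairs per section can vary freely here as well.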
If you just wanted simple matrices, then
Map(function(a,b) {class(b)<-"numeric"; b}, mm[,1],
lapply(vv, function(x) {class(x)<-"numeric"; x}))
will give you a named list of the matrices.
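The question also asks for single values such as Lambda, which the code above does not cover. One possible approach, sketched with base R, is to match "key=value" up to the next comma. The function name `get_value` and the shortened string are assumptions for illustration, and the key is pasted into the pattern unescaped, so it should not contain regex metacharacters.

```r
# Tail of the string from the question, shortened for the example (assumption).
a <- "Risk_Neutral=YES,Lambda=0.09,FX_Volatility=0.01,FX_Correlation=0.9"

# Hypothetical helper: return the text after "key=" up to the next comma,
# or NA if the key does not occur at the start or right after a comma.
get_value <- function(s, key) {
  m <- regmatches(s, regexec(paste0("(^|,)", key, "=([^,]+)"), s))[[1]]
  if (length(m) == 0) NA_character_ else m[3]
}

lambda <- as.numeric(get_value(a, "Lambda"))
risk_neutral <- get_value(a, "Risk_Neutral")

print(lambda)
print(risk_neutral)
```

Values come back as character strings, so numeric fields need as.numeric() while flags like Risk_Neutral can be used as they are.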