If I have a vector v
with names:
John Murray Lisa Mike Joe Ann
0.0832090 0.0475580 -0.2797860 0.1086225 0.0104590 -0.0028250
What is time complexity of v['Joe']
versus v[4]
? I guess the former would take O(log n) as it should involves binary search, but I'm still not sure whether the latter is O(1) or not.
Also, does the result generalize to the case when v
is a list/data frame rather than an atomic vector?
It seems to be approximately O(n), i.e. a vector scan, in the case of name look ups. Your conjecture of O(1) for lookup using indices seems sound...
# Unique names for longish vector
nms <- apply( expand.grid( letters , letters , letters , letters ) , 1 , paste , collapse = "" )
length(nms)
#[1] 456976
length(unique(nms))
#[1] 456976
# Start of names
head(nms)
#[1] "aaaa" "baaa" "caaa" "daaa" "eaaa" "faaa"
# End of names
tail(nms)
#[1] "uzzz" "vzzz" "wzzz" "xzzz" "yzzz" "zzzz"
# Large named vector
x <- setNames( runif( 456976 ) , nms )
# Small named vector
y <- setNames( runif(26) , letters )
# Timing information
require( microbenchmark )
bm <- microbenchmark( x['daaa'] , x[4] , x['vzzz'] , x[456972] , y['d'] , y[4] )
print( bm , order = 'median' , unit = 'relative' , digits = 3 )
#Unit: relative
# expr min lq median uq max neval
# x[456972] NaN 1.00e+00 1.00 1.00 1.000 100
# x[4] Inf 1.00e+00 1.33 1.07 0.957 100
# y[4] NaN 5.01e-01 1.33 1.14 0.191 100
# y["d"] Inf 1.00e+00 2.00 1.25 0.265 100
# x["vzzz"] Inf 6.57e+04 44412.24 9969.64 3439.154 100
# x["daaa"] Inf 6.59e+04 44582.73 10049.63 1207.337 100
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With