I have a list of named numeric vectors, where each vector contains tickers (names) and their values. The tickers are the same in every vector, but the values differ. I want the average value of each ticker, but I don't know how to look up a specific ticker in each vector and extract its value. For instance, I want the mean value of "jpm" across these 3 vectors, which would be mean(c(0.08620690, 0.10000000, 0.10000000)) = 0.095402. How can I do so?
What I have so far:
dput(degree.l)
list(c(schwab = 0, pnc = 0.0344827586206897, jpm = 0.0862068965517241,
amex = 0.0862068965517241, gs = 0.103448275862069, ms = 0.103448275862069,
bofa = 0.103448275862069, citi = 0.103448275862069, wf = 0.120689655172414,
spgl = 0.120689655172414, brk = 0.137931034482759), c(schwab = 0.0166666666666667,
pnc = 0.05, ms = 0.0666666666666667, spgl = 0.0833333333333333,
jpm = 0.1, bofa = 0.1, wf = 0.1, amex = 0.1, gs = 0.116666666666667,
brk = 0.116666666666667, citi = 0.15), c(schwab = 0.0428571428571429,
gs = 0.0714285714285714, pnc = 0.0714285714285714, citi = 0.0857142857142857,
amex = 0.0857142857142857, spgl = 0.0857142857142857, jpm = 0.1,
brk = 0.1, ms = 0.114285714285714, wf = 0.114285714285714, bofa = 0.128571428571429
))
degree.unl <- unlist(degree.l)
We can use aggregate with stack in base R:
aggregate(values ~ ind, do.call(rbind, lapply(degree.l, stack)), FUN = mean)
Output:
ind values
1 schwab 0.01984127
2 pnc 0.05197044
3 jpm 0.09540230
4 amex 0.09064039
5 gs 0.09718117
6 ms 0.09480022
7 bofa 0.11067323
8 citi 0.11305419
9 wf 0.11165846
10 spgl 0.09657909
11 brk 0.11819923
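Another option is Reduce, which does elementwise addition (+) and then divides by the length of the list. This assumes no NAs and that the tickers appear in the same order in every vector, because the addition is positional and the result simply takes its names from the first vector. In the OP's data the order differs between vectors, so the values below do not match the name-based results above (e.g. jpm is 0.07476738 here versus 0.09540230):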
Reduce(`+`, degree.l)/length(degree.l)
schwab pnc jpm amex gs ms bofa citi wf spgl brk
0.01984127 0.05197044 0.07476738 0.08508484 0.09638752 0.09638752 0.10114943 0.10114943 0.11721401 0.11721401 0.13883415
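Because Reduce adds by position, one way to make it safe when the tickers come in a different order is to reorder every vector by name first. This is a minimal sketch on a three-ticker subset of the OP's values; the names deg, ref, aligned and reduce_means are made up for illustration.

```r
# Three-ticker subset of the OP's data, deliberately in different orders
deg <- list(
  c(schwab = 0, pnc = 0.0344827586206897, jpm = 0.0862068965517241),
  c(pnc = 0.05, jpm = 0.1, schwab = 0.0166666666666667),
  c(jpm = 0.1, schwab = 0.0428571428571429, pnc = 0.0714285714285714)
)

# Use the first vector's names as the reference order
ref <- names(deg[[1]])

# Reorder each vector by name, then add positionally and divide by the count
aligned <- lapply(deg, function(x) x[ref])
reduce_means <- Reduce(`+`, aligned) / length(aligned)
```

After the reordering step, reduce_means["jpm"] should agree with the name-based answers (about 0.0954) rather than the positional value.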
Or, as the OP unlisted the dataset, use that object, group by the names, and apply tapply:
tapply(degree.unl, names(degree.unl), FUN = mean)
amex bofa brk citi gs jpm ms pnc schwab spgl wf
0.09064039 0.11067323 0.11819923 0.11305419 0.09718117 0.09540230 0.09480022 0.05197044 0.01984127 0.09657909 0.11165846
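Note that tapply returns the groups in alphabetical order. To pull out a single ticker, or to put the result back in the original ticker order, you can index the result by name. A small sketch on a subset of the OP's values (deg, deg_unl, by_name and the other names are illustrative, not from the question):

```r
# Three-ticker subset of the OP's data, in different orders per vector
deg <- list(
  c(schwab = 0, pnc = 0.0344827586206897, jpm = 0.0862068965517241),
  c(pnc = 0.05, jpm = 0.1, schwab = 0.0166666666666667),
  c(jpm = 0.1, schwab = 0.0428571428571429, pnc = 0.0714285714285714)
)

# The outer list is unnamed, so unlist() keeps the ticker names as-is
deg_unl <- unlist(deg)

# Group by ticker name; groups come back sorted alphabetically
by_name <- tapply(deg_unl, names(deg_unl), FUN = mean)

# Mean for one ticker, as a plain number
jpm_mean <- unname(by_name["jpm"])

# Same means, reordered to match the first vector's ticker order
in_first_order <- by_name[names(deg[[1]])]
```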
Another option:
get_ticker <- function(t) mean(sapply(degree.l, "[[", t))
sapply(names(degree.l[[1]]), get_ticker)
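For the single-ticker case asked about in the question, the same lookup by name can be written inline. This sketch uses vapply instead of sapply so that a non-numeric or missing value fails loudly; the data is a subset of the OP's values and the names deg, jpm_values and jpm_mean are illustrative.

```r
# Three-ticker subset of the OP's data, in different orders per vector
deg <- list(
  c(schwab = 0, pnc = 0.0344827586206897, jpm = 0.0862068965517241),
  c(pnc = 0.05, jpm = 0.1, schwab = 0.0166666666666667),
  c(jpm = 0.1, schwab = 0.0428571428571429, pnc = 0.0714285714285714)
)

# Extract "jpm" by name from each vector; numeric(1) enforces one number each
jpm_values <- vapply(deg, function(x) x[["jpm"]], numeric(1))

# Average the extracted values
jpm_mean <- mean(jpm_values)
```

Since the lookup is by name, the position of "jpm" inside each vector does not matter.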
Before unlisting:
apply(do.call(rbind, degree.l), 2, mean)
# schwab pnc jpm amex gs ms bofa
# 0.01984127 0.05197044 0.07476738 0.08508484 0.09638752 0.09638752 0.10114943
# citi wf spgl brk
# 0.10114943 0.11721401 0.11721401 0.13883415
Edit: since you say you can't assume that tickers are in order, we can fix that:
nms <- unique(unlist(lapply(degree.l, names)))
nms
# [1] "schwab" "pnc" "jpm" "amex" "gs" "ms" "bofa" "citi"
# [9] "wf" "spgl" "brk"
apply(do.call(rbind, lapply(degree.l, `[`, nms)), 2, mean)
# schwab pnc jpm amex gs ms bofa
# 0.01984127 0.05197044 0.09540230 0.09064039 0.09718117 0.09480022 0.11067323
# citi wf spgl brk
# 0.11305419 0.11165846 0.09657909 0.11819923
For fun, we can jumble them to confirm this works:
set.seed(42)
degree.l.jumbled <- lapply(degree.l, sample)
degree.l.jumbled
# [[1]]
# schwab gs brk wf pnc amex bofa
# 0.00000000 0.10344828 0.13793103 0.12068966 0.03448276 0.08620690 0.10344828
# spgl citi ms jpm
# 0.12068966 0.10344828 0.10344828 0.08620690
# [[2]]
# amex wf spgl schwab jpm bofa gs
# 0.10000000 0.10000000 0.08333333 0.01666667 0.10000000 0.10000000 0.11666667
# pnc brk citi ms
# 0.05000000 0.11666667 0.15000000 0.06666667
# [[3]]
# ms bofa citi amex jpm brk spgl
# 0.11428571 0.12857143 0.08571429 0.08571429 0.10000000 0.10000000 0.08571429
# wf gs pnc schwab
# 0.11428571 0.07142857 0.07142857 0.04285714
apply(do.call(rbind, lapply(degree.l.jumbled, `[`, nms)), 2, mean)
# schwab pnc jpm amex gs ms bofa
# 0.01984127 0.05197044 0.09540230 0.09064039 0.09718117 0.09480022 0.11067323
# citi wf spgl brk
# 0.11305419 0.11165846 0.09657909 0.11819923
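As a variation, the apply(..., 2, mean) step can be swapped for base R's colMeans, which computes column means of a matrix directly. A sketch on a subset of the OP's values, assuming every vector contains every ticker (deg, nms_sub, mat and col_avg are made-up names):

```r
# Three-ticker subset of the OP's data, in different orders per vector
deg <- list(
  c(schwab = 0, pnc = 0.0344827586206897, jpm = 0.0862068965517241),
  c(pnc = 0.05, jpm = 0.1, schwab = 0.0166666666666667),
  c(jpm = 0.1, schwab = 0.0428571428571429, pnc = 0.0714285714285714)
)

# Collect the ticker names in order of first appearance
nms_sub <- unique(unlist(lapply(deg, names)))

# Reorder each vector by name, then stack them as rows of a matrix
mat <- do.call(rbind, lapply(deg, `[`, nms_sub))

# One mean per column, i.e. per ticker
col_avg <- colMeans(mat)
```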
A data.table option using rbindlist + colMeans:
library(data.table)
colMeans(rbindlist(Map(function(x) data.frame(t(x)), degree.l), use.names = TRUE))
schwab pnc jpm amex gs ms bofa
0.01984127 0.05197044 0.09540230 0.09064039 0.09718117 0.09480022 0.11067323
citi wf spgl brk
0.11305419 0.11165846 0.09657909 0.11819923
Then, if you want to retrieve the mean by name, e.g. schwab, you can index the result like this:
colMeans(rbindlist(Map(function(x) data.frame(t(x)), degree.l), use.names = TRUE))["schwab"]