
How to compute descriptive statistics on a set of differently sized vectors

Tags:

r

I have a set of vectors. Each vector holds sensor readings, but the vectors are of different lengths. I'd like to compute the same descriptive statistics on each of them. How should I store them in R? Using c() concatenates the vectors into one. Using list() seems to cause functions like mean() to misbehave. Is a data frame the right object?

What is the best practice for applying the same function to vectors of different sizes? Supposing the data resides in a SQL server, how should it be imported?

speciousfool asked Dec 23 '22 01:12



1 Answer

Vectors of different sizes should be combined into a list: a data.frame expects each column to be the same length.
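
As a minimal sketch of why a list fits here (the sensor names and readings below are made up for illustration): calling mean() on the list itself returns NA with a warning, because mean() expects a numeric vector, so the function has to be applied to each element instead.

```r
# Three hypothetical sensors with different numbers of readings
readings <- list(
  sensor_a = c(1.2, 3.4, 2.2),
  sensor_b = c(5.0, 4.1, 4.8, 5.3, 4.9),
  sensor_c = c(0.7, 0.9)
)

# data.frame(a = 1:3, b = 1:5) would fail: columns must have equal length.
# mean(readings) would return NA with a warning: a list is not numeric.

# Apply the function to each element of the list instead
means <- sapply(readings, mean)  # named numeric vector, one value per sensor
n_obs <- lengths(readings)       # number of readings per sensor
```

sapply simplifies the result to a named vector when every element returns a single value; lapply keeps the result as a list, which is the safer choice when the function returns several values.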

Use lapply to fetch your data. Then use lapply again to get the descriptive statistics.

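# Fetch one vector of readings per id; the result is a list
x <- lapply(ids, sqlfunction)
# Compute the same summary statistics for every vector
stats <- lapply(x, summary)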

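Here, sqlfunction is a function you write to query your database for a single id, and ids is the vector of ids to fetch. You can collapse the stats list into a single table by calling do.call(rbind, stats), which returns a matrix that as.data.frame() converts to a data.frame, or get a data.frame directly by using plyr: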

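library(plyr)
# llply returns a list, like lapply
x <- llply(ids, sqlfunction)
# ldply applies summary to each element and returns a data.frame
stats <- ldply(x, summary)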
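
For a runnable sketch of the base R route, the example below replaces the database with an in-memory list. fake_store, the ids, and this version of sqlfunction are stand-ins invented for illustration; a real sqlfunction would run a query and return the numeric column for that id.

```r
# Hypothetical stand-in for the database: readings keyed by id
fake_store <- list(
  a = c(1, 2, 3, 4),
  b = c(10, 20),
  c = c(5, 5, 5, 5, 5, 6)
)

# Hypothetical query function: returns the readings for one id
sqlfunction <- function(id) fake_store[[id]]

ids <- c("a", "b", "c")
x <- lapply(ids, sqlfunction)
names(x) <- ids  # keep the ids as names so they become row names below

# One summary per vector, then stack the summaries row by row
stats <- lapply(x, summary)
stats_table <- do.call(rbind, stats)    # matrix: one row per id
stats_df <- as.data.frame(stats_table)  # convert if a data.frame is needed
```

This assumes the vectors contain no missing values. With NAs, summary() adds an NA's entry for the affected vectors, so the rows would no longer have equal length and rbind would not line up cleanly.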
Shane answered Jan 31 '23 01:01
