
What is the difference between a data frame and a list in R?

Tags: list, dataframe, r

What is the difference between a data frame and a list in R? Which one should be used when, and which is easier to loop over?

Exact problem: I first have to store 3 string elements like "a", "b", "c". Later, for each of these, I need to append 3 more elements; for instance, for "a" I have to add "a1", "a2", "a3". After that I have to use nested for loops to access these elements.

So I am unsure whether to use a data frame, a list, or some other data type in which I could first store the elements and then append to each of them (a bit like adding to each column).

Currently I am getting errors like "number of items to replace is not a multiple of replacement length".

asked Apr 09 '13 11:04 by ShazSimple



1 Answer

The question isn't as stupid as some people think it is. I know plenty of people who struggle with that difference and with what to use where. To summarize:

Lists are by far the most flexible data structure in R. They can be seen as a collection of elements without any restriction on the class, length or structure of each element. The only thing you need to take care of is that you don't give two elements the same name. That can cause a lot of confusion, and R doesn't give an error for it:

> X <- list(a=1,b=2,a=3)
> X$a
[1] 1
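Since R stays silent about duplicated names, it can help to check for them yourself. The following is a minimal sketch using base R only; it reuses the `X` from the example above.

```r
X <- list(a = 1, b = 2, a = 3)

# `$` returns only the first element whose name matches.
first_a <- X$a

# anyDuplicated() gives the index of the first duplicated name, or 0 if there is none.
dup_index <- anyDuplicated(names(X))

# Subsetting by a logical vector retrieves every element named "a".
all_a <- X[names(X) == "a"]

print(first_a)
print(dup_index)
print(all_a)
```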

Data frames are lists as well, but they have a few restrictions:

  • you can't use the same name for two different variables (by default, data.frame() renames duplicates to keep the names unique)
  • all elements of a data frame are vectors
  • all elements of a data frame have an equal length.
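These restrictions can be checked directly in a session. The snippet below is only an illustrative sketch; the names `df` and `bad` and the column contents are made up for the example.

```r
# A data frame is a list underneath, with one element per column.
df <- data.frame(id = 1:3, label = c("a", "b", "c"))
is_a_list <- is.list(df)          # TRUE
n_columns <- length(df)           # 2, one list element per column
col_lengths <- sapply(df, length) # every column has 3 elements

# Columns whose lengths differ and cannot be recycled are rejected with an error.
bad <- tryCatch(
  data.frame(x = 1:3, y = 1:2),
  error = function(e) conditionMessage(e)
)

print(is_a_list)
print(n_columns)
print(col_lengths)
print(bad)  # the error message, captured as a string
```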

Due to these restrictions and the resulting two-dimensional structure, data frames can mimic some of the behaviour of matrices. You can select rows and do operations on rows. You can't do that with lists, as a row is undefined there.
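As a small sketch of that difference (the objects `df` and `lst` and their contents are invented for illustration), row selection works on a data frame with matrix-style indexing, whereas a list only offers access to whole elements:

```r
df <- data.frame(name = c("a", "b", "c"), value = c(10, 20, 30))

# Matrix-style indexing: rows before the comma, columns after it.
second_row <- df[2, ]
big_rows <- df[df$value > 15, ]

# A list has no notion of a row; its elements may even differ in length.
lst <- list(name = c("a", "b", "c"), value = 1:5)
second_element <- lst[[2]]

print(second_row)
print(big_rows)
print(second_element)
```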

All this implies that you should use a data frame for any dataset that fits in that two-dimensional structure. Essentially, you use data frames for any dataset where a column coincides with a variable and a row coincides with a single observation in the broad sense of the word. For all other structures, lists are the way to go.

Note that if you want a nested structure, you have to use lists. As elements of a list can be lists themselves, you can create very flexible structured objects.
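For the problem in the question, one possible approach is a named list with one character vector per key. This is a sketch under assumptions, not the only way to do it: the names `keys` and `store` are hypothetical, and the appended values are generated with `paste0()` purely for illustration.

```r
# Start with three named slots; each one is NULL until something is appended.
keys <- c("a", "b", "c")
store <- setNames(vector("list", length(keys)), keys)

# Append three elements to each slot. `[[` replaces the whole slot,
# so the new value may have any length.
for (k in keys) {
  store[[k]] <- c(store[[k]], paste0(k, 1:3))
}

# Nested loops: the outer loop walks the keys, the inner loop walks each key's elements.
for (k in names(store)) {
  for (item in store[[k]]) {
    cat(k, "->", item, "\n")
  }
}
```

A message like the one quoted in the question typically shows up when several values are assigned into a position that holds a single value, for example with single-bracket indexing; assigning with `[[` as above avoids that situation.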

answered Oct 12 '22 18:10 by Joris Meys