
What are R levels?

Tags: r, csv

I am trying to read a CSV file with R. I can read the file, but when I print a variable the output ends with a line of levels. What are these levels, and how can I remove them? The file can be downloaded here.

> data=read.csv("Documents/bet/I1.csv",sep=",")
> data$HomeTeam
  [1] Sampdoria  Verona     Cagliari   Inter      Lazio      Livorno    Napoli     Parma     
  [9] Torino     Fiorentina Chievo     Juventus   Atalanta   Bologna    Catania    Genoa     
 [17] Milan      Roma       Sassuolo   Udinese    Inter      Napoli     Torino     Fiorentina
 [25] Lazio      Livorno    Sampdoria  Udinese    Verona     Parma      Cagliari   Chievo    
 [33] Genoa      Atalanta   Bologna    Catania    Juventus   Milan      Roma       Sassuolo  
 [41] Udinese    Bologna    Chievo     Lazio      Livorno    Napoli     Parma      Sampdoria 
 [49] Torino     Inter      Genoa      Milan      Atalanta   Cagliari   Catania    Roma      
 [57] Sassuolo   Torino     Verona     Fiorentina Bologna    Catania    Napoli     Parma     
 [65] Sampdoria  Udinese    Juventus   Lazio      Chievo     Inter      Roma       Cagliari  
 [73] Milan      Atalanta   Fiorentina Genoa      Livorno    Sassuolo   Verona     Torino    
 [81] Inter      Sampdoria  Bologna    Catania    Chievo     Juventus   Lazio      Napoli    
 [89] Parma      Udinese    Atalanta   Cagliari   Fiorentina Genoa      Juventus   Livorno   
 [97] Milan      Sassuolo   Verona     Roma       Milan      Napoli     Parma      Lazio     
[105] Livorno    Sampdoria  Torino     Udinese    Verona     Bologna    Catania    Inter     
[113] Atalanta   Cagliari   Chievo     Genoa      Parma      Roma       Fiorentina Juventus  
[121] Milan      Napoli     Verona     Bologna    Livorno    Sampdoria  Sassuolo   Torino    
[129] Udinese    Roma      
20 Levels: Atalanta Bologna Cagliari Catania Chievo Fiorentina Genoa Inter Juventus ... Verona
Donbeo asked Dec 01 '13 16:12


2 Answers

When you use read.csv to read a file, the argument stringsAsFactors was set to TRUE by default in R versions before 4.0.0 (since R 4.0.0 the default is FALSE). To avoid this result, set it to FALSE explicitly. This should work:

data = read.csv("Documents/bet/I1.csv", sep=",", stringsAsFactors=FALSE)

With stringsAsFactors=TRUE, columns (variables) in the file that contain strings are converted to factors. A factor is a categorical variable that can take only one of a fixed, finite set of values. Those possible categories are the levels. You can read more about factors in the R Intro manual.
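To make the idea of levels concrete, here is a minimal sketch. The vector `teams` is made-up sample data, not taken from the asker's file; it only illustrates how a factor stores its categories.

```r
# Hypothetical sample data: a few team names, not the asker's CSV.
teams <- c("Roma", "Milan", "Roma", "Inter")

# Turn the character vector into a factor (a categorical variable).
f <- factor(teams)

# The levels are the distinct categories, sorted alphabetically by default.
team_levels <- levels(f)

# Internally each element is stored as an integer code pointing at a level.
team_codes <- as.integer(f)

print(f)            # values followed by a "Levels:" line, as in the question
print(team_levels)
print(team_codes)
```

Printing `f` shows the same kind of trailing "Levels:" line as in the question, which is simply how R displays a factor.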

In addition, since you are using read.csv, adding the sep="," is redundant. It doesn't harm anything, but the comma is taken as the separator by default.
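As a quick check of both points, the sketch below reads a tiny inline CSV instead of the asker's file. The column names and rows are invented for illustration, and the `text` argument is used only so the example runs without a file on disk.

```r
# Hypothetical inline CSV standing in for the real file.
csv_text <- "HomeTeam,AwayTeam\nRoma,Milan\nInter,Lazio"

# read.csv already uses "," as the separator, so sep = "," is optional.
with_sep    <- read.csv(text = csv_text, sep = ",", stringsAsFactors = FALSE)
without_sep <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# Both calls give the same data frame.
same_result <- identical(with_sep, without_sep)

# With stringsAsFactors = FALSE the string columns stay plain character
# vectors, so no "Levels:" line is printed.
column_class <- class(without_sep$HomeTeam)

print(same_result)
print(column_class)
print(without_sep$HomeTeam)
```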

gung - Reinstate Monica answered Nov 17 '22 01:11


The presence of levels for your variable HomeTeam indicates that it is a factor (with 20 levels). You can pass the stringsAsFactors=FALSE argument to the read.csv function so that the column is read as plain character strings instead.
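If the data frame is already loaded and re-reading the file is inconvenient, one possible alternative is to convert the factor column in place with as.character(). The sketch below uses a small made-up data frame in place of the asker's data.

```r
# Hypothetical data frame standing in for the asker's data,
# built with factors on purpose to mimic the situation in the question.
data <- data.frame(HomeTeam = c("Roma", "Milan", "Roma"),
                   stringsAsFactors = TRUE)

was_factor <- is.factor(data$HomeTeam)

# Convert the factor to a plain character vector; the values are kept,
# only the levels attribute goes away.
data$HomeTeam <- as.character(data$HomeTeam)

print(was_factor)
print(data$HomeTeam)
```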

alittleboy answered Nov 16 '22 23:11
