
R: Filtering out non-numerical values in a dataframe

Tags: dataframe, r, csv

I have the following dataframe, read from a .csv file. The file contains more data, but to keep it simple I've narrowed it down to these three columns. You can also access the csv file here: https://dl.dropboxusercontent.com/u/16277659/filter.csv

NAME;       YEAR;   VALUE
SAMPLE1;    1969;   6
SAMPLE1;    1970;   -6
SAMPLE1;    1971;   -7
SAMPLE1;    1972;   =-X
SAMPLE1;    1972;   ST
SAMPLE1;    1972;   3
SAMPLE1;    1975;   -7
SAMPLE1;    1976;   3
SAMPLE1;    1977;   3
SAMPLE1;    1978;   0
SAMPLE2;    1991;   -15
SAMPLE2;    1992;   =X
SAMPLE2;    1992;   -58
SAMPLE2;    1994;   -40

What I'd like to do is the following: I sometimes have qualitative values (like =-X, ST etc.) which I don't necessarily want to lose. But if there is a numerical value for the same NAME and YEAR (as for SAMPLE1 in 1972, which also has =-X and ST), I would like to keep only the numerical value and get rid of the other values.

How would you do this? Thanks for your help.

kurdtc asked Sep 17 '25 22:09


1 Answer

I haven't mastered regex, so my mind first goes here:

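# read.csv2 expects ';' as the separator; as.is=TRUE keeps VALUE as character
dat <- read.csv2("filter.csv", as.is=TRUE)
# as.numeric() returns NA for non-numeric strings; the coercion warning is expected
dat$IsNum <- !is.na(suppressWarnings(as.numeric(dat$VALUE)))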

> dat
      NAME YEAR VALUE IsNum
1  SAMPLE1 1969     6  TRUE
2  SAMPLE1 1970    -6  TRUE
3  SAMPLE1 1971    -7  TRUE
4  SAMPLE1 1972   =-X FALSE
5  SAMPLE1 1972    ST FALSE
6  SAMPLE1 1972     3  TRUE
7  SAMPLE1 1975    -7  TRUE
8  SAMPLE1 1976     3  TRUE
9  SAMPLE1 1977     3  TRUE
10 SAMPLE1 1978     0  TRUE
11 SAMPLE2 1991   -15  TRUE
12 SAMPLE2 1992    =X FALSE
13 SAMPLE2 1992   -58  TRUE
14 SAMPLE2 1994   -40  TRUE

From there it's a matter of using IsNum within each NAME/YEAR group: keep the numeric rows when the group has any, and keep the qualitative rows otherwise.
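One way that last step might look is sketched below. It is not part of the original answer: the GroupHasNum column and the result name are made up for illustration, and the data is typed inline from the question so the snippet runs without the file. It uses base R's ave() to flag every NAME/YEAR group that contains at least one numeric value.

```r
# Sample data copied from the question, typed inline so no file is needed
dat <- data.frame(
  NAME  = c(rep("SAMPLE1", 10), rep("SAMPLE2", 4)),
  YEAR  = c(1969, 1970, 1971, 1972, 1972, 1972, 1975,
            1976, 1977, 1978, 1991, 1992, 1992, 1994),
  VALUE = c("6", "-6", "-7", "=-X", "ST", "3", "-7",
            "3", "3", "0", "-15", "=X", "-58", "-40"),
  stringsAsFactors = FALSE
)

# TRUE where VALUE parses as a number (the coercion warning is expected)
dat$IsNum <- !is.na(suppressWarnings(as.numeric(dat$VALUE)))

# GroupHasNum (hypothetical helper column): TRUE for every row whose
# NAME/YEAR group contains at least one numeric value
dat$GroupHasNum <- ave(dat$IsNum, dat$NAME, dat$YEAR, FUN = any)

# Keep numeric rows, plus qualitative rows from groups with no numeric value
result <- dat[dat$IsNum | !dat$GroupHasNum, c("NAME", "YEAR", "VALUE")]
print(result)
```

With the sample data, the three qualitative rows (=-X, ST and =X) are dropped because each shares its NAME and YEAR with a numeric value. A qualitative value that is the only entry for its NAME and YEAR would be kept.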

Adrian answered Sep 19 '25 10:09