I have the following data frame, read from a .csv file. The real file has more rows, but to keep it simple I've narrowed it down to the sample below. You can also access the csv file here: https://dl.dropboxusercontent.com/u/16277659/filter.csv
NAME;       YEAR;   VALUE
SAMPLE1;    1969;   6
SAMPLE1;    1970;   -6
SAMPLE1;    1971;   -7
SAMPLE1;    1972;   =-X
SAMPLE1;    1972;   ST
SAMPLE1;    1972;   3
SAMPLE1;    1975;   -7
SAMPLE1;    1976;   3
SAMPLE1;    1977;   3
SAMPLE1;    1978;   0
SAMPLE2;    1991;   -15
SAMPLE2;    1992;   =X
SAMPLE2;    1992;   -58
SAMPLE2;    1994;   -40
What I'd like to do is the following: I sometimes have qualitative values (like =-X, ST etc.) which I don't necessarily want to lose. But if there is a numerical value for the same year (for example, SAMPLE1 in 1972 has =-X and ST alongside the numerical value 3), I would like to keep only the numerical value and get rid of the other values.
How would you do this? Thanks for your help.
I haven't mastered regex, so my mind first goes here:
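# read.csv2 expects ";" as the field separator; as.is = TRUE keeps VALUE as character
dat <- read.csv2("filter.csv", as.is = TRUE)
# Non-numeric strings become NA (with a coercion warning), so !is.na() flags the numeric rows
dat$IsNum <- !(is.na(as.numeric(dat$VALUE)))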
> dat
      NAME YEAR VALUE IsNum
1  SAMPLE1 1969     6  TRUE
2  SAMPLE1 1970    -6  TRUE
3  SAMPLE1 1971    -7  TRUE
4  SAMPLE1 1972   =-X FALSE
5  SAMPLE1 1972    ST FALSE
6  SAMPLE1 1972     3  TRUE
7  SAMPLE1 1975    -7  TRUE
8  SAMPLE1 1976     3  TRUE
9  SAMPLE1 1977     3  TRUE
10 SAMPLE1 1978     0  TRUE
11 SAMPLE2 1991   -15  TRUE
12 SAMPLE2 1992    =X FALSE
13 SAMPLE2 1992   -58  TRUE
14 SAMPLE2 1994   -40  TRUE
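From there it's a simple matter of checking whether IsNum is TRUE within each NAME/YEAR pair: keep the numeric rows, and keep a non-numeric row only when its NAME/YEAR pair has no numeric value at all.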
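One way that check could look is sketched below. This is only a sketch, not necessarily what the answer had in mind: it rebuilds the sample rows inline so it runs without filter.csv, and the AnyNum column and the filtered data frame are names made up for the illustration. It uses base R's ave() to mark every NAME/YEAR group that contains at least one numeric value.

```r
# Inline copy of the sample rows from the question, so the sketch runs
# without filter.csv
dat <- data.frame(
  NAME  = c(rep("SAMPLE1", 10), rep("SAMPLE2", 4)),
  YEAR  = c(1969, 1970, 1971, 1972, 1972, 1972, 1975, 1976, 1977, 1978,
            1991, 1992, 1992, 1994),
  VALUE = c("6", "-6", "-7", "=-X", "ST", "3", "-7", "3", "3", "0",
            "-15", "=X", "-58", "-40"),
  stringsAsFactors = FALSE
)

# Flag numeric values; suppressWarnings() hides the expected coercion warning
dat$IsNum <- !is.na(suppressWarnings(as.numeric(dat$VALUE)))

# AnyNum (hypothetical helper column): TRUE for every row whose NAME/YEAR
# group contains at least one numeric value
dat$AnyNum <- ave(dat$IsNum, dat$NAME, dat$YEAR, FUN = any)

# Keep numeric rows, plus qualitative rows from groups with no numeric value
filtered <- dat[dat$IsNum | !dat$AnyNum, c("NAME", "YEAR", "VALUE")]
filtered
```

On the sample data this drops the =-X, ST and =X rows, because 1972 and 1992 each have a numeric value. A year that had only qualitative values would be kept as-is, which seems to be what the question asks for.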