I have the following data frame, read from a .csv file. The real file has more rows, but to keep it simple I've narrowed it down to the sample below. You can also access the csv file here: https://dl.dropboxusercontent.com/u/16277659/filter.csv
NAME; YEAR; VALUE
SAMPLE1; 1969; 6
SAMPLE1; 1970; -6
SAMPLE1; 1971; -7
SAMPLE1; 1972; =-X
SAMPLE1; 1972; ST
SAMPLE1; 1972; 3
SAMPLE1; 1975; -7
SAMPLE1; 1976; 3
SAMPLE1; 1977; 3
SAMPLE1; 1978; 0
SAMPLE2; 1991; -15
SAMPLE2; 1992; =X
SAMPLE2; 1992; -58
SAMPLE2; 1994; -40
What I'd like to do is the following: I sometimes have qualitative values (like =-X, ST, etc.) which I don't necessarily want to lose. But if there is a numerical value for the same year (as in SAMPLE1 1972, which also has =-X and ST), I would like to keep only the numerical value and get rid of the other values for that year.
How would you do this? Thanks for your help.
I haven't mastered regex, so my mind first goes here:
dat <- read.csv2("filter.csv", as.is=TRUE)
dat$IsNum <- !(is.na(as.numeric(dat$VALUE)))
> dat
NAME YEAR VALUE IsNum
1 SAMPLE1 1969 6 TRUE
2 SAMPLE1 1970 -6 TRUE
3 SAMPLE1 1971 -7 TRUE
4 SAMPLE1 1972 =-X FALSE
5 SAMPLE1 1972 ST FALSE
6 SAMPLE1 1972 3 TRUE
7 SAMPLE1 1975 -7 TRUE
8 SAMPLE1 1976 3 TRUE
9 SAMPLE1 1977 3 TRUE
10 SAMPLE1 1978 0 TRUE
11 SAMPLE2 1991 -15 TRUE
12 SAMPLE2 1992 =X FALSE
13 SAMPLE2 1992 -58 TRUE
14 SAMPLE2 1994 -40 TRUE
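From there it's a simple matter of checking IsNum: within each NAME and YEAR, keep the rows where IsNum is TRUE, and keep the qualitative rows only when that year has no numeric value at all.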
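One way that last step might look, as a minimal sketch rather than the only approach: it rebuilds the sample data inline (so it runs without filter.csv), wraps as.numeric() in suppressWarnings() to silence the coercion warnings, and uses ave() to flag every NAME/YEAR group that contains at least one numeric value. The names has_num and result are made up for this illustration.

```r
# Sample data from the question, inlined so the sketch is self-contained.
dat <- read.csv2(text = "NAME;YEAR;VALUE
SAMPLE1;1969;6
SAMPLE1;1970;-6
SAMPLE1;1971;-7
SAMPLE1;1972;=-X
SAMPLE1;1972;ST
SAMPLE1;1972;3
SAMPLE1;1975;-7
SAMPLE1;1976;3
SAMPLE1;1977;3
SAMPLE1;1978;0
SAMPLE2;1991;-15
SAMPLE2;1992;=X
SAMPLE2;1992;-58
SAMPLE2;1994;-40", as.is = TRUE, strip.white = TRUE)

# TRUE where VALUE can be parsed as a number, FALSE for qualitative codes.
dat$IsNum <- !is.na(suppressWarnings(as.numeric(dat$VALUE)))

# For each row: does its NAME/YEAR group contain at least one numeric value?
# (has_num is a hypothetical helper name.)
has_num <- ave(dat$IsNum, dat$NAME, dat$YEAR, FUN = any)

# Keep numeric rows, plus qualitative rows from groups with no numeric value.
result <- dat[dat$IsNum | !has_num, ]
result
```

With the sample above, every qualitative entry shares its year with a numeric one, so all three qualitative rows are dropped. A year that had only qualitative entries would be kept as it is, which a plain IsNum == TRUE filter would not do.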