I have a dataframe where some of the values are NA. I would like to remove these columns.
My data.frame looks like this
v1 v2 1 1 NA 2 1 1 3 2 2 4 1 1 5 2 2 6 1 NA
I tried to estimate the col mean and select the column means !=NA. I tried this statement, it does not work.
data=subset(Itun, select=c(is.na(colMeans(Itun))))
I got an error,
error : 'x' must be an array of at least two dimensions
Can anyone give me some help?
If we need to drop such columns that contain NA, we can use the axis=column s parameter of DataFrame. dropna() to specify deleting the columns. By default, it removes the column where one or more values are missing.
To remove observations with missing values in at least one column, you can use the na. omit() function. The na. omit() function in the R language inspects all columns from a data frame and drops rows that have NA's in one or more columns.
The dropna() function is used to remove missing values. Determine if rows or columns which contain missing values are removed. 0, or 'index' : Drop rows which contain missing values.
There are two easy methods to select columns of an R data frame without missing values, first one results in a vector and other returns a matrix. For example, if we have a data frame called df then the first method can be used as df[,colSums(is.na(df))==0] and the second method will be used as t(na.
The data:
Itun <- data.frame(v1 = c(1,1,2,1,2,1), v2 = c(NA, 1, 2, 1, 2, NA))
This will remove all columns containing at least one NA
:
Itun[ , colSums(is.na(Itun)) == 0]
An alternative way is to use apply
:
Itun[ , apply(Itun, 2, function(x) !any(is.na(x)))]
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With