generating a vector of difference between two vectors

Tags:

r

I have two CSV files, each of which consists of one column of data.

For instance, vecA.csv looks like this:

id
1
2

vecB.csv looks like this:

id
3
2

I read the data sets as follows:

vectorA <- read.table("vecA.csv", sep=",", header=T)
vectorB <- read.table("vecB.csv", sep=",", header=T)

I want to generate a vector consisting of the elements that belong to B only.

asked Feb 19 '13 04:02

user785099

2 Answers

You are looking for the function setdiff

setdiff(vectorB$id, vectorA$id)

If you do not want the result reduced to unique values, you can create a "not in" operator (kudos to @joran; see "Match with negation"):

'%nin%' <- Negate('%in%')
vectorB$id[vectorB$id %nin% vectorA$id]
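To make the difference between the two approaches concrete, here is a minimal sketch. It uses small in-memory data frames as hypothetical stand-ins for the two files, and it adds a repeated 3 to `vectorB` (not present in the question's data) so that the handling of duplicates is visible. The variable names `only_in_B_unique` and `only_in_B_all` are made up for this illustration.

```r
# Hypothetical stand-ins for vecA.csv and vecB.csv.
# The extra 3 in vectorB is an assumption added to show duplicate handling.
vectorA <- data.frame(id = c(1, 2))
vectorB <- data.frame(id = c(3, 2, 3))

# setdiff() treats its arguments as sets, so duplicates are dropped.
only_in_B_unique <- setdiff(vectorB$id, vectorA$id)

# Negating %in% filters element by element, so duplicates are kept.
`%nin%` <- Negate(`%in%`)
only_in_B_all <- vectorB$id[vectorB$id %nin% vectorA$id]

print(only_in_B_unique)  # 3
print(only_in_B_all)     # 3 3
```

If defining a new operator feels like too much, `vectorB$id[!(vectorB$id %in% vectorA$id)]` should give the same result as the `%nin%` version.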

answered Oct 11 '22 08:10

mnel

If your vectors are instead data.tables, then all you need are five characters:

B[!A]

library(data.table)

# read in your data, wrap in data.table(..., key="id")
A <- data.table(read.table("vecA.csv",sep=",",header=T), key="id")
B <- data.table(read.table("vecB.csv",sep=",",header=T), key="id")

# Then this is all you need
B[!A]

[Matthew] And in v1.8.7 it is also simpler and faster to read the files with fread:

A <- setkey(fread("vecA.csv"), id)
B <- setkey(fread("vecB.csv"), id)
B[!A]
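Note that `B[!A]` returns a data.table rather than a plain vector. The following sketch, which uses in-memory tables as hypothetical stand-ins for the two files, shows one way to pull the `id` column out of the result. It also shows the `on =` argument, which, as far as I know, is available from data.table 1.9.6 onward and should give the same anti-join without relying on keys. The names `only_in_B`, `only_in_B_vec`, and `same_without_keys` are made up for this illustration.

```r
library(data.table)

# Hypothetical in-memory stand-ins for vecA.csv and vecB.csv.
A <- data.table(id = c(1, 2), key = "id")
B <- data.table(id = c(3, 2), key = "id")

# Anti-join: keep the rows of B whose key has no match in A.
only_in_B <- B[!A]

# The result is still a data.table; extract the column to get a vector.
only_in_B_vec <- only_in_B$id

# Assumes data.table >= 1.9.6: the same anti-join, naming the join column explicitly.
same_without_keys <- B[!A, on = "id"]$id

print(only_in_B_vec)      # 3
print(same_without_keys)  # 3
```

Unlike `setdiff`, this keeps duplicate rows of B that have no match in A, so it behaves like the `%nin%` approach in that respect.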

answered Oct 11 '22 08:10

Ricardo Saporta
