I'm struggling to solve this problem in R. I have data like this:
item id
1 500
2 500
2 600
2 700
3 500
3 600
x <- data.frame(item = c(1, 2, 2, 2, 3, 3),
                id = c(500, 500, 600, 700, 500, 600))
And I want to count the number of times a pair of items is linked to the same id. So I want this output:
item1 item2 count
1 2 1
2 3 2
1 3 1
I've tried approaching this with commands like:
x_agg = aggregate(x, by=list(x$id), c)
and then
x_agg_id = lapply(x_agg$item, unique)
thinking that I could then count the occurrences of each item. But aggregate with the by argument seems to create an object of lists, which I don't know how to manipulate. I am hoping there is a simpler way.
# your data
df<-read.table(text="item id
1 500
2 500
2 600
2 700
3 500
3 600",header=TRUE)
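library(tnet)
# project the two-mode (item, id) edge list onto the items;
# method = "sum" counts the ids that each pair of items shares
item_item <- projecting_tm(df, method = "sum")
names(item_item) <- c("item1", "item2", "count")
item_item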
#item1 item2 count
#1 1 2 1
#2 1 3 1
#3 2 1 1
#4 2 3 2
#5 3 1 1
#6 3 2 2
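For comparison, here is a minimal base-R sketch that avoids the extra package. It is not part of the answer above: it swaps in the crossprod(table(...)) idiom, and the names inc, co, idx and pairs are only illustrative. It assumes every (item, id) row should count at most once, hence the unique() call.

```r
# Same toy data as in the question
df <- data.frame(item = c(1, 2, 2, 2, 3, 3),
                 id   = c(500, 500, 600, 700, 500, 600))

# Incidence table: rows are ids, columns are items.
# unique() drops repeated (item, id) rows so every cell is 0 or 1.
inc <- table(unique(df)[, c("id", "item")])

# t(inc) %*% inc: cell [i, j] is the number of ids shared by items i and j
co <- crossprod(inc)

# Keep each unordered pair once (upper triangle) and reshape to long format
idx <- which(upper.tri(co), arr.ind = TRUE)
pairs <- data.frame(item1 = as.numeric(rownames(co)[idx[, "row"]]),
                    item2 = as.numeric(colnames(co)[idx[, "col"]]),
                    count = as.numeric(co[idx]))
pairs
```

Unlike the projecting_tm output, this lists each pair only once. The diagonal of co (dropped here) holds the number of distinct ids per item.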
EDIT
How many ids and items do you have? You could always recode them, e.g.:
numberitems<-length(unique(df$item))+9000
items<-data.frame(item=unique(df$item),newitems=c(9000:(numberitems-1)))
numberids<-length(unique(df$id))+1000
ids<-data.frame(id=unique(df$id),newids=c(1000:(numberids-1)))
newdf<-merge(df,items,by="item")
newdf<-merge(newdf,ids,by="id")
DF<-data.frame(item=newdf$newitems,id=newdf$newids)
library(tnet)
item_item<-projecting_tm(DF, method="sum")
names(item_item)<-c("item1","item2","count")
Then merge the original names back in afterwards.
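The last step might look roughly like the sketch below. It uses match() against the lookup table rather than a literal merge(), which is a substitution on my part. The item_item data frame here is a hand-written, hypothetical stand-in for the projection result, assuming the recoded item codes end up in its first two columns.

```r
# Lookup table in the same shape as the `items` data frame built above
items <- data.frame(item = c(1, 2, 3), newitems = 9000:9002)

# Hypothetical stand-in for the projection result on the recoded data
item_item <- data.frame(item1 = c(9000, 9000, 9001),
                        item2 = c(9001, 9002, 9002),
                        count = c(1, 1, 2))

# match() returns the lookup row of each recoded code;
# indexing items$item with it restores the original labels
item_item$item1 <- items$item[match(item_item$item1, items$newitems)]
item_item$item2 <- items$item[match(item_item$item2, items$newitems)]
item_item
```

A match()-based lookup keeps the row order of item_item, whereas merge() may reorder rows and would need to be applied once per column.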