
What does ".N" mean in data.table in R?


I have a data.table dt:

library(data.table)
dt = data.table(a=LETTERS[c(1,1:3)],b=4:7)

   a b
1: A 4
2: A 5
3: B 6
4: C 7

The result of dt[, .N, by=a] is

   a N
1: A 2
2: B 1
3: C 1

I know that by=a or by="a" groups the rows by column a, and that the N column counts how many times each value of a occurs. However, I never call nrow(), yet I still get the counts. Is .N more than just a column name? I can't find any documentation with ??".N" in R. I tried .K, but it doesn't work. What does .N mean?

Eric Chang asked Oct 13 '15 12:10



People also ask

What is .N in data.table?

In data.table, the .N symbol stands for "number of rows." It can be the total number of rows, or the number of rows per group if you're aggregating with by. This expression returns the total number of rows in the data.table: mydt[, .N]

What does .SD mean in data.table?

.SD stands for "Subset of Data.table". The leading dot has no special meaning; it only keeps the symbol from clashing with a user-defined column name.



1 Answer

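Think of .N as a built-in variable that holds a number of rows: the total row count of the table when used in i, and the row count of the current group when used in j together with by. For example: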

dt <- data.table(a = LETTERS[c(1,1:3)], b = 4:7)

dt[.N] # returns the last row
#    a b
# 1: C 7

Combined with by, as in your example, .N is the number of rows in each group. Here it is assigned by reference to a new column:

dt[, new_var := .N, by = a]
dt
#    a b new_var
# 1: A 4       2 # 2 'A's
# 2: A 5       2
# 3: B 6       1 # 1 'B'
# 4: C 7       1 # 1 'C'
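
To tie the cases together, here is a minimal sketch that puts the three uses of .N side by side and cross-checks them against base R. The variable names (total_rows, per_group, last_row) are only illustrative and are not part of data.table.

```r
library(data.table)

dt <- data.table(a = LETTERS[c(1, 1:3)], b = 4:7)

total_rows <- dt[, .N]          # in j without by: total number of rows
per_group  <- dt[, .N, by = a]  # in j with by: number of rows in each group
last_row   <- dt[.N]            # in i: .N is the index of the last row

# Cross-check against base R: nrow() and table() should agree with .N
stopifnot(total_rows == nrow(dt))
stopifnot(all(per_group$N == as.vector(table(dt$a)[per_group$a])))
```

Unlike the := version above, dt[, .N, by = a] returns a new summary table with one row per group and leaves dt itself unchanged.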

For a list of all special symbols of data.table, see also https://www.rdocumentation.org/packages/data.table/versions/1.10.0/topics/special-symbols

David answered Sep 19 '22 16:09
