data.table := assigning and grouping

I'm working on country-specific data and need to assign countries to pre-defined country groups. I wrote the code below. Is there a more efficient way to do this, so that I don't have to type each new country into the NON-CORE section by hand whenever one appears in the database? It sounds like an if/else, but I don't know how to code that.

library(data.table)
data <- data.table(data)
setkey(data,Region.Group)
data[list(c(
  "Australia",
  "Bangladesh",
  "Cambodia",
  "Estonia",
  "Finland",
  "France",
  "India",
  "Indonesia",
  "Korea",
  "Lithuania",
  "Malaysia",
  "Middle East",
  "Norway",
  "Philippines",
  "Poland",
  "Russia",
  "Spain",
  "Sri Lanka",
  "Sweden",
  "Switzerland",
  "TAT Region",
  "Thailand",
  "Ukraine",
  "Vietnam",
  "New Zealand",
  "Israel",
  "Myanmar",
  "Pakistan",
  "Portugal",
  "Turkey")), Core:="NON-CORE"]
data[list(c(
  "Belgium",
  "Netherlands")), Core:="Benelux"]
data[list(c(
  "China Group")), Core:="China"]
data[list(c(
  "Germany")), Core:="Germany"]
data[list(c(
  "Hong Kong Group")), Core:="Hong Kong"]
data[list(c(
  "Italy")), Core:="Italy"]
data[list(c(
  "Japan")), Core:="Japan"]
data[list(c(
  "North America Central",
  "North America East",
  "North America North",
  "North America South",
  "North America West")), Core:="N.America"]
data[list(c(
  "Singapore")), Core:="Singapore"]
data[list(c(
  "Taiwan")), Core:="Taiwan"]
data[list(c(
  "United Kingdom")), Core:="UK"]
KFB asked Jan 22 '26 22:01

1 Answer

You do need to put each country in the correct group at some point. How about a list (shortened here) that leaves out the NON-CORE countries:

coregroup <- list(
    Benelux     =   c("Belgium","Netherlands"),
    Germany     =   "Germany"
)

Then you can make a data.table out of this list:

dt_coregroup <- data.table(
    Core=rep(names(coregroup),lengths(coregroup)),
    Region.Group=unlist(coregroup)
)
#       Core Region.Group
# 1: Benelux      Belgium
# 2: Benelux  Netherlands
# 3: Germany      Germany
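
Since each country should belong to exactly one group, it may be worth checking the lookup table for repeats before joining, because a country listed twice would produce duplicated rows in the joined result. A minimal sketch of such a check, rebuilding the shortened list from above (the name `dupes` is only illustrative):

```r
library(data.table)

coregroup <- list(
    Benelux = c("Belgium", "Netherlands"),
    Germany = "Germany"
)
dt_coregroup <- data.table(
    Core = rep(names(coregroup), lengths(coregroup)),
    Region.Group = unlist(coregroup, use.names = FALSE)
)

# Countries that appear more than once in the lookup (character(0) if none)
dupes <- dt_coregroup[duplicated(Region.Group), unique(Region.Group)]

# Stop early if any country was assigned to more than one group
stopifnot(length(dupes) == 0)
```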

and merge it back into your original data. I've made up some dummy data and named it "dt_start", because "data" is already the name of an R function.

dt_start <- data.table(Region.Group=c("Germany","Belgium","Australia"),Period=rep("2013Q1",3),Qty1=1:3)
setkey(dt_start,Region.Group)
setkey(dt_coregroup,Region.Group)

dt_new <- dt_coregroup[dt_start]
#    Region.Group    Core Period Qty1
# 1:    Australia      NA 2013Q1    3
# 2:      Belgium Benelux 2013Q1    2
# 3:      Germany Germany 2013Q1    1

Finally, we assign any ungrouped countries to NON-CORE:

dt_new[is.na(Core),Core:="NON-CORE"]
#    Region.Group     Core Period Qty1
# 1:    Australia NON-CORE 2013Q1    3
# 2:      Belgium  Benelux 2013Q1    2
# 3:      Germany  Germany 2013Q1    1
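
If you would rather keep modifying the original table with `:=`, as in the question, an update join is one possible alternative. This is only a sketch: it assumes a data.table version that supports the `on=` argument (1.9.6 or later), and the names `lookup` and `dt` are made up for the example.

```r
library(data.table)

coregroup <- list(
    Benelux = c("Belgium", "Netherlands"),
    Germany = "Germany"
)
lookup <- data.table(
    Core = rep(names(coregroup), lengths(coregroup)),
    Region.Group = unlist(coregroup, use.names = FALSE)
)

dt <- data.table(
    Region.Group = c("Germany", "Belgium", "Australia"),
    Period = "2013Q1",
    Qty1 = 1:3
)

# Start with the default group, so new countries need no extra typing
dt[, Core := "NON-CORE"]

# Overwrite the default for every country found in the lookup;
# i.Core refers to the Core column of `lookup`
dt[lookup, on = "Region.Group", Core := i.Core]

dt$Core
# "Germany" "Benelux" "NON-CORE"
```

This updates `dt` by reference and leaves its row order unchanged, since no key is set.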
Frank answered Jan 24 '26 12:01


