I created these vectors:
Letters <- c("A","C","E","G","H","J","K")
Numbers <- c(0,1,2,3,4,6,7,9)
AlphaNumeric <- c(Letters, Numbers)
I would like to get a data frame of all 3-element combinations (e.g. AA1, G26) built from the elements above that satisfy three conditions:
1.) The first element is a letter
2.) The second element is a number or the SAME letter as the first element
3.) The third element is a number
Approach:
I tried expand.grid() and managed to get ALL combinations with 3 elements. Then I tried expand.grid(x = Letters, y = AlphaNumeric, z = Numbers), which satisfies 1.) and 3.), but I have not managed to enforce 2.) so far.
Unsatisfying solution: I have figured out a way of doing this with a for loop, but I suspect there is a much easier way than this:
LNN <- expand.grid(x = Letters, y = Numbers, z = Numbers)
for (Element in Letters) {
  currentLLN <- expand.grid(x = Element, y = Element, z = Numbers)
  LNN <- merge(LNN, currentLLN, all = TRUE)
}
Any help would be greatly appreciated, thank you, Christian
You could create two data frames, one where the second element is a number and one where the second element is the same as the first element, and then rbind those. An example is given below; note that I have reduced your example data for illustration purposes.
Letters <- LETTERS[1:3]
Numbers <- c(1,2)
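# Case 1: the second element repeats the first letter
df1 <- expand.grid(v1 = Letters, v3 = Numbers, stringsAsFactors = FALSE)
df1$v2 <- df1$v1
df1 <- df1[, c('v1', 'v2', 'v3')]
# Case 2: the second element is a number (as character, to match df1$v2)
df2 <- expand.grid(v1 = Letters, v2 = as.character(Numbers), v3 = Numbers, stringsAsFactors = FALSE)
df <- rbind(df1, df2)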
Output:
> df
   v1 v2 v3
1   A  A  1
2   B  B  1
3   C  C  1
4   A  A  2
5   B  B  2
6   C  C  2
7   A  1  1
8   B  1  1
9   C  1  1
10  A  2  1
11  B  2  1
12  C  2  1
13  A  1  2
14  B  1  2
15  C  1  2
16  A  2  2
17  B  2  2
18  C  2  2
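As a sketch of how the same idea carries over to the full vectors from the question, the snippet below builds both pieces, binds them, and pastes each row into a single string such as AA1 or G26. The names same_letter, number_second and combos, and the extra code column, are my own additions for illustration and not part of the question.

```r
Letters <- c("A", "C", "E", "G", "H", "J", "K")
Numbers <- c(0, 1, 2, 3, 4, 6, 7, 9)

# Case 1: the second element repeats the first letter
same_letter <- expand.grid(v1 = Letters, v3 = Numbers, stringsAsFactors = FALSE)
same_letter$v2 <- same_letter$v1
same_letter <- same_letter[, c("v1", "v2", "v3")]

# Case 2: the second element is a number (stored as character so both pieces match)
number_second <- expand.grid(v1 = Letters, v2 = as.character(Numbers),
                             v3 = Numbers, stringsAsFactors = FALSE)

combos <- rbind(same_letter, number_second)

# Hypothetical extra column: collapse each row into one string
combos$code <- paste0(combos$v1, combos$v2, combos$v3)

# 7 letters * (8 numbers + 1 repeated letter) * 8 numbers = 504 rows
nrow(combos)
```

Checking the row count against the hand calculation is a cheap way to confirm that no combination was dropped or duplicated.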
Hope this helps!
Both answers run very fast, and Parfait's solution is a nice one that I certainly do not want to discredit. Still, I think it is worth pointing out that creating extra combinations and then subsetting becomes a bigger issue as your data grows. A benchmark is shown below.
Letters <- c(LETTERS[1:26],letters[1:4])
Numbers <- seq(30)
AlphaNumeric <- c(Letters, Numbers)
f_flo <- function() {
  df1 <- expand.grid(v1 = Letters, v3 = Numbers, stringsAsFactors = FALSE)
  df1$v2 <- df1$v1
  df1 <- df1[, c('v1', 'v2', 'v3')]
  df2 <- expand.grid(v1 = Letters, v2 = as.character(Numbers), v3 = Numbers, stringsAsFactors = FALSE)
  df <- rbind(df1, df2)
}
f_parfait <- function() {
  df <- expand.grid(x = Letters, y = AlphaNumeric, z = Numbers, stringsAsFactors = FALSE)
  sub <- subset(df, (x == y | grepl("[0-9]", y)) & grepl("[0-9]", z))
  sub <- with(sub, sub[order(x, y, z), ])  # SORT DATAFRAME
  rownames(sub) <- NULL                    # RESET ROWNAMES
}
library(dplyr)
one_letter <- function(l) {
  expand.grid(l, c(l, Numbers), Numbers, stringsAsFactors = FALSE)
}
f_stibu <- function() {
  df <- bind_rows(lapply(Letters, one_letter))
}
library(microbenchmark)
library(ggplot2)
run_times = microbenchmark(f_flo(),f_parfait(),f_stibu())
autoplot(run_times)
Results:
Unit: milliseconds
        expr        min         lq       mean     median         uq       max neval cld
     f_flo()   1.900719   2.047591   3.666935   2.314258   3.922053  78.74793   100 a
 f_parfait() 138.028364 142.529904 152.876116 144.159444 146.835958 246.92318   100  b
   f_stibu()   4.130464   4.333130   5.169664   4.585028   6.209233  10.23139   100 a
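To put a number on the "extra combinations", here is a rough sketch that counts how many rows the subsetting approach has to build compared with how many survive the filter, using the benchmark vectors. The variable names are mine, and the anchored pattern "^[0-9]+$" is my own choice for this check rather than the pattern used in the benchmark.

```r
Letters <- c(LETTERS[1:26], letters[1:4])
Numbers <- seq(30)
AlphaNumeric <- c(Letters, Numbers)

# Rows the subsetting approach materialises before filtering
rows_full_grid <- length(Letters) * length(AlphaNumeric) * length(Numbers)

# Rows that satisfy the three conditions (what direct construction builds)
rows_needed <- length(Letters) * (length(Numbers) + 1) * length(Numbers)

# Confirm both counts against the actual data frames
full <- expand.grid(x = Letters, y = AlphaNumeric, z = Numbers,
                    stringsAsFactors = FALSE)
kept <- subset(full, x == y | grepl("^[0-9]+$", y))

nrow(full)                    # 30 * 60 * 30 = 54000
nrow(kept)                    # 30 * 31 * 30 = 27900
rows_needed / rows_full_grid  # share of the full grid that is kept
```

Only about half of the grid is discarded here, so part of the gap in the timings presumably comes from the grepl calls and the sort rather than from the extra rows alone. The discarded share grows as the number of letters grows relative to the number of numbers.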
Simply subset your expand.grid() data frame using grepl:
df <- expand.grid(x = Letters, y = AlphaNumeric, z = Numbers, stringsAsFactors = FALSE)
sub <- subset(df, (x == y | grepl("[0-9]", y)) )
sub <- with(sub, sub[order(x, y, z),]) # SORT DATAFRAME
rownames(sub) <- NULL # RESET ROWNAMES
head(sub, 10)
#    x y z
# 1  A 0 0
# 2  A 0 1
# 3  A 0 2
# 4  A 0 3
# 5  A 0 4
# 6  A 0 6
# 7  A 0 7
# 8  A 0 9
# 9  A 1 0
# 10 A 1 1
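The grepl pattern treats any value that contains a digit as a number, which is fine for these vectors. As a minimal sketch of a variant that avoids pattern matching, assuming the vectors from the question, membership can be tested with %in% instead. The name sub_in is mine and purely illustrative.

```r
Letters <- c("A", "C", "E", "G", "H", "J", "K")
Numbers <- c(0, 1, 2, 3, 4, 6, 7, 9)
AlphaNumeric <- c(Letters, Numbers)

df <- expand.grid(x = Letters, y = AlphaNumeric, z = Numbers,
                  stringsAsFactors = FALSE)

# Keep rows whose second element repeats the first letter or is one of Numbers.
# y is character, so Numbers is converted to character before the comparison.
sub_in <- df[df$x == df$y | df$y %in% as.character(Numbers), ]
sub_in <- sub_in[order(sub_in$x, sub_in$y, sub_in$z), ]
rownames(sub_in) <- NULL

# 7 letters * (8 numbers + 1 repeated letter) * 8 numbers = 504 rows
nrow(sub_in)
```

This would also keep working if a "letter" ever contained a digit (say "A1"), a case where the regex would misclassify it as a number.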