
All combinations of letters/numbers under specific conditions

I created these vectors:

Letters <- c("A","C","E","G","H","J","K")  
Numbers <- c(0,1,2,3,4,6,7,9) 
AlphaNumeric <- c(Letters, Numbers)

I would like to get a data frame of all 3-element combinations (e.g. AA1, G26) built from the elements above that satisfy three conditions:

1.) The first element is a letter

2.) The second element is a number or the SAME letter as the first element

3.) The third element is a number

Approach: I tried expand.grid() and managed to get ALL combinations of 3 elements. I then tried expand.grid(x = Letters, y = AlphaNumeric, z = Numbers), which satisfies 1.) and 3.), but I have not managed to enforce 2.) so far.

Unsatisfying Solution: I have found a way of doing this with a for-loop, but I suspect there is a much simpler way than this:

LNN <- expand.grid(x = Letters, y = Numbers, z = Numbers)

for (Element in Letters) {
  currentLLN <- expand.grid(x = Element, y = Element, z = Numbers)
  LNN <- merge(LNN, currentLLN, all = TRUE)
}

Any help would be greatly appreciated, thank you, Christian

Christian Schano asked Feb 23 '18 15:02


2 Answers

You could create two data frames, one where the second element is a number and one where the second element is the same as the first element, and then rbind them. An example is given below. Note that I have reduced your example data for illustration purposes.

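Letters <- LETTERS[1:3]
Numbers <- c(1, 2)

# Case 1: the second element repeats the first letter
df1 <- expand.grid(v1 = Letters, v3 = Numbers, stringsAsFactors = FALSE)
df1$v2 <- df1$v1
df1 <- df1[, c("v1", "v2", "v3")]
# Case 2: the second element is a number (as character, to match df1$v2)
df2 <- expand.grid(v1 = Letters, v2 = as.character(Numbers), v3 = Numbers, stringsAsFactors = FALSE)
df <- rbind(df1, df2)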

Output:

> df
   v1 v2 v3
1   A  A  1
2   B  B  1
3   C  C  1
4   A  A  2
5   B  B  2
6   C  C  2
7   A  1  1
8   B  1  1
9   C  1  1
10  A  2  1
11  B  2  1
12  C  2  1
13  A  1  2
14  B  1  2
15  C  1  2
16  A  2  2
17  B  2  2
18  C  2  2
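
Applied to the full vectors from the question, the same two-part construction should give 7 × 8 = 56 rows with a repeated letter plus 7 × 8 × 8 = 448 rows with a number in the middle, so 504 rows in total. A minimal sketch of that check, assuming the vectors exactly as defined in the question (the names `same_letter`, `number_mid` and `all_combos` are illustrative only):

```r
Letters <- c("A", "C", "E", "G", "H", "J", "K")
Numbers <- c(0, 1, 2, 3, 4, 6, 7, 9)

# Part 1: the second element repeats the first letter
same_letter <- expand.grid(v1 = Letters, v3 = Numbers, stringsAsFactors = FALSE)
same_letter$v2 <- same_letter$v1
same_letter <- same_letter[, c("v1", "v2", "v3")]

# Part 2: the second element is a number, stored as character so that
# the column type matches part 1
number_mid <- expand.grid(v1 = Letters, v2 = as.character(Numbers),
                          v3 = Numbers, stringsAsFactors = FALSE)

all_combos <- rbind(same_letter, number_mid)

# Compare the row count with the expected 7 * 8 * (1 + 8) = 504
n_expected <- length(Letters) * length(Numbers) * (1 + length(Numbers))
nrow(all_combos) == n_expected
```

Because the two parts cannot overlap (a letter is never equal to a number), no de-duplication step is needed after `rbind`.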

Hope this helps!


Both answers run very fast, and Parfait's solution is a nice one that I certainly do not want to discredit. Still, I think it is good to point out that creating extra combinations and then subsetting becomes a bigger cost as your data grows. A benchmark is shown below.

Letters <- c(LETTERS[1:26],letters[1:4])
Numbers <- seq(30)
AlphaNumeric <- c(Letters, Numbers)


f_flo <- function()
{
  df1 = expand.grid(v1=Letters,v3=Numbers,stringsAsFactors = F)
  df1$v2 = df1$v1
  df1 = df1[,c('v1','v2','v3')]
  df2 = expand.grid(v1=Letters,v2=as.character(Numbers),v3=Numbers, stringsAsFactors = F)
  df = rbind(df1,df2)
}

f_parfait <- function()
{
  df <- expand.grid(x = Letters, y = AlphaNumeric, z = Numbers, stringsAsFactors = FALSE)
  sub <- subset(df,  (x == y | grepl("[0-9]", y)) &  grepl("[0-9]", z) )
  sub <- with(sub, sub[order(x, y, z),])   # SORT DATAFRAME
  rownames(sub) <- NULL                    # RESET ROWNAMES
}

library(dplyr)
one_letter <- function(l) {
  expand.grid(l, c(l, Numbers), Numbers, stringsAsFactors = FALSE)
}

f_stibu <- function(){
  df <- bind_rows(lapply(Letters, one_letter))
}


library(microbenchmark)
library(ggplot2)

run_times = microbenchmark(f_flo(),f_parfait(),f_stibu())
autoplot(run_times)

Results:

Unit: milliseconds
        expr        min         lq       mean     median         uq       max neval cld
     f_flo()   1.900719   2.047591   3.666935   2.314258   3.922053  78.74793   100  a 
 f_parfait() 138.028364 142.529904 152.876116 144.159444 146.835958 246.92318   100   b
   f_stibu()   4.130464   4.333130   5.169664   4.585028   6.209233  10.23139   100  a 

[Image: plot of the benchmark run times produced by autoplot(run_times)]
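
Before comparing timings, it can be worth confirming that the approaches return the same set of rows. A rough sketch on the question's original vectors, using base R only: `do.call(rbind, ...)` stands in for `dplyr::bind_rows`, and the helper `as_codes` and the `res_*` names are hypothetical, introduced here for illustration.

```r
Letters <- c("A", "C", "E", "G", "H", "J", "K")
Numbers <- c(0, 1, 2, 3, 4, 6, 7, 9)
AlphaNumeric <- c(Letters, Numbers)

# Hypothetical helper: collapse each row to a string such as "AA1" and
# sort, so that row order and column names do not affect the comparison
as_codes <- function(df) sort(paste0(df[[1]], df[[2]], df[[3]]))

# Approach 1: build two grids and rbind them
df1 <- expand.grid(v1 = Letters, v3 = Numbers, stringsAsFactors = FALSE)
df1$v2 <- df1$v1
df1 <- df1[, c("v1", "v2", "v3")]
df2 <- expand.grid(v1 = Letters, v2 = as.character(Numbers), v3 = Numbers,
                   stringsAsFactors = FALSE)
res_rbind <- rbind(df1, df2)

# Approach 2: build the full grid, then subset
full <- expand.grid(x = Letters, y = AlphaNumeric, z = Numbers,
                    stringsAsFactors = FALSE)
res_subset <- subset(full, x == y | grepl("[0-9]", y))

# Approach 3: one grid per letter, then stack the pieces
res_per_letter <- do.call(rbind, lapply(Letters, function(l) {
  expand.grid(x = l, y = c(l, Numbers), z = Numbers, stringsAsFactors = FALSE)
}))

# Both comparisons are expected to be TRUE
identical(as_codes(res_rbind), as_codes(res_subset))
identical(as_codes(res_rbind), as_codes(res_per_letter))
```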

Florian answered Nov 02 '22 22:11


Simply subset your expand.grid() dataframe with grepl calls:

df <- expand.grid(x = Letters, y = AlphaNumeric, z = Numbers, stringsAsFactors = FALSE)

sub <- subset(df,  (x == y | grepl("[0-9]", y)) )

sub <- with(sub, sub[order(x, y, z),])   # SORT DATAFRAME
rownames(sub) <- NULL                    # RESET ROWNAMES

head(sub, 10)    
#    x y z
# 1  A 0 0
# 2  A 0 1
# 3  A 0 2
# 4  A 0 3
# 5  A 0 4
# 6  A 0 6
# 7  A 0 7
# 8  A 0 9
# 9  A 1 0
# 10 A 1 1
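
If single strings such as AA1 or G26 (the examples in the question) are wanted rather than three separate columns, the columns can be pasted together after subsetting. A small sketch, assuming the vectors from the question; the `code` column name is made up for illustration:

```r
Letters <- c("A", "C", "E", "G", "H", "J", "K")
Numbers <- c(0, 1, 2, 3, 4, 6, 7, 9)
AlphaNumeric <- c(Letters, Numbers)

df <- expand.grid(x = Letters, y = AlphaNumeric, z = Numbers,
                  stringsAsFactors = FALSE)
sub <- subset(df, x == y | grepl("[0-9]", y))

# Paste the three columns into one string per row
sub$code <- paste0(sub$x, sub$y, sub$z)

# Both examples from the question should be present ...
c("AA1", "G26") %in% sub$code
# ... while a code with two different letters should not be
"AC1" %in% sub$code
```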
Parfait answered Nov 02 '22 23:11