Find all possible substrings of length n

Tags: r

I have an interesting (perhaps only to me :)) question. I have a string like:

"abbba"

The question is to find all possible substrings of length n in this string. For example, if n = 2, the substrings are

'ab','bb','ba'

and if n = 3, the substrings are

'abb','bbb','bba'

I thought to use something like this:

x <- 'abbba'
m <- matrix(strsplit(x, '')[[1]], nrow=2)
apply(m, 2, paste, collapse='')

But I get a warning, and it doesn't work for n = 3.

asked Feb 22 '16 18:02

Lionir

1 Answer

We may use

x <- "abbba"
allsubstr <- function(x, n) unique(substring(x, 1:(nchar(x) - n + 1), n:nchar(x)))
allsubstr(x, 2)
# [1] "ab" "bb" "ba"
allsubstr(x, 3)
# [1] "abb" "bbb" "bba"

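where substring extracts the part of x between given start and stop positions. Because substring is vectorized over those positions, passing 1:(nchar(x) - n + 1) as the start positions and n:nchar(x) as the stop positions extracts every window of length n in a single call. unique then drops repeated windows, such as the second "bb" when n = 2.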
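
To make the pairing of start and stop positions concrete, here is a minimal sketch that first prints the windows for n = 3 and then wraps the same idea in a guarded helper. The name allsubstr_safe and the choice to return an empty character vector when n is out of range are assumptions made for illustration, not part of the answer above. The guard is there because 1:0 evaluates to c(1, 0) rather than to an empty vector, so the one-liner does not return an empty result when n is larger than nchar(x).

```r
x <- "abbba"
n <- 3

# Start and stop positions that substring() pairs up element by element
starts <- 1:(nchar(x) - n + 1)   # 1 2 3
stops  <- n:nchar(x)             # 3 4 5
substring(x, starts, stops)
# [1] "abb" "bbb" "bba"

# Hypothetical guarded variant (allsubstr_safe is not from the answer).
# seq_len(k) is empty when k is 0, whereas 1:0 counts down to c(1, 0).
allsubstr_safe <- function(x, n) {
  k <- nchar(x) - n + 1                      # number of windows of length n
  if (n < 1 || k < 1) return(character(0))   # assumed behaviour: no windows
  starts <- seq_len(k)
  unique(substring(x, starts, starts + n - 1))
}

allsubstr_safe(x, 2)
# [1] "ab" "bb" "ba"
allsubstr_safe(x, 6)
# character(0)
```

For the inputs in the question, this returns the same results as allsubstr; it only differs when n is less than 1 or greater than the length of the string.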

answered Nov 15 '22 05:11

Julius Vainora
