How to get the second sub element of every element in a list

Tags:

r

I know I've come across this problem before, but I'm having a bit of a mental block at the moment, and as I can't find it on SO, I'll post it here so I can find it next time.

I have a dataframe that contains a field representing an ID label. This label has two parts, an alpha prefix and a numeric suffix. I want to split it apart and create two new fields with these values in.

df <- structure(list(lab = c("N00", "N01", "N02", "B00", "B01", "B02",
"Z21", "BA01", "NA03")), .Names = "lab", row.names = c(NA, -9L),
class = "data.frame")

df$pre <- strsplit(df$lab, "[0-9]+")
df$suf <- strsplit(df$lab, "[A-Z]+")

Which gives

   lab pre  suf
1  N00   N , 00
2  N01   N , 01
3  N02   N , 02
4  B00   B , 00
5  B01   B , 01
6  B02   B , 02
7  Z21   Z , 21
8 BA01  BA , 01
9 NA03  NA , 03

So the first strsplit works fine, but the second gives a list in which each element has two parts, an empty string and the result I want, and both get stuffed into the dataframe column.

How can I select the second sub-element from each element of the list? (Or is there a better way to do this?)

792

asked May 10 '10 14:05

PaulHurleyuk

3 Answers

To select the second element of each list item:

R> sapply(df$suf, "[[", 2)
[1] "00" "01" "02" "00" "01" "02" "21" "01" "03"
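As a minimal, self-contained sketch of the same idea (the `parts` name and the use of `vapply()` instead of `sapply()` are my own choices, not part of the answer above), the data from the question can be rebuilt and the second piece extracted with a declared result type:

```r
# Rebuild the example data from the question. stringsAsFactors = FALSE keeps
# `lab` a character vector on R versions before 4.0.
df <- data.frame(
  lab = c("N00", "N01", "N02", "B00", "B01", "B02", "Z21", "BA01", "NA03"),
  stringsAsFactors = FALSE
)

# Splitting on the letters leaves an empty string in front of the digits,
# because the match sits at the start of each label.
parts <- strsplit(df$lab, "[A-Z]+")
str(parts[1:2])

# vapply() additionally checks that every extracted piece is a single string.
suf <- vapply(parts, "[[", character(1), 2)
print(suf)
```

`vapply()` is used here only because it stops with an error if an extracted piece is not a single string, whereas `sapply()` would quietly return a different shape.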

An alternative approach using regular expressions:

df$pre <- sub("^([A-Z]+)[0-9]+", "\\1", df$lab)
df$suf <- sub("^[A-Z]+([0-9]+)", "\\1", df$lab)
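A possible variation, sketched here under the assumption that every label is one or more capital letters followed by one or more digits, is to capture both parts with a single pattern using base R's `regexec()` and `regmatches()`. The variable names are illustrative only.

```r
lab <- c("N00", "N01", "N02", "B00", "B01", "B02", "Z21", "BA01", "NA03")

# regexec() records the position of the full match and of each capture group;
# regmatches() turns those positions into strings.
m <- regmatches(lab, regexec("^([A-Z]+)([0-9]+)$", lab))

# Each element of `m` is c(full match, prefix, suffix).
# A label that does not match yields character(0), so x[2] and x[3] become NA.
pre <- vapply(m, function(x) x[2], character(1))
suf <- vapply(m, function(x) x[3], character(1))

print(data.frame(lab, pre, suf))
```

Note that the prefix of the last label is the string "NA", not a missing value.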

190

answered Oct 05 '22 04:10

rcs

With purrr::map this would be

library(purrr)
df$suf %>% map_chr(2)

For further information, see the purrr::map documentation.

answered Oct 05 '22 06:10

Uwe Sterr

First of all: if you use str(df) you'll see that df$pre is a list. I think you want a vector (but I might be wrong).
Returning to the problem: in this case I would use gsub:

df$pre <- gsub("[0-9]", "", df$lab)
df$suf <- gsub("[A-Z]", "", df$lab)

This guarantees that both columns are vectors, but it fails if a label does not follow the letters-then-digits pattern (e.g. 'AB01B').
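To illustrate that caveat, here is a small sketch. The three labels and the `grepl()` check are hypothetical additions for demonstration and do not come from the question's data.

```r
lab <- c("N00", "BA01", "AB01B")

# gsub() deletes every matching character wherever it occurs,
# so the trailing "B" of "AB01B" is folded into the prefix.
pre <- gsub("[0-9]", "", lab)
suf <- gsub("[A-Z]", "", lab)
print(pre)  # "N"  "BA"  "ABB"
print(suf)  # "00" "01"  "01"

# Hypothetical guard: flag labels that are not letters followed by digits.
ok <- grepl("^[A-Z]+[0-9]+$", lab)
print(lab[!ok])  # "AB01B"
```

Checking the pattern first makes it possible to handle or report malformed labels before splitting them.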

answered Oct 05 '22 06:10

Marek
