How to split a number into digits in R

Tags:

split

dataframe

r

I have a data frame with a numeric ID variable that identifies the Primary, Secondary and Ultimate Sampling Units from a multistage sampling scheme. I want to split the original ID variable into three new variables, one for each sampling unit:

Example:

> df[1:2,]
ID Var  var1  var2  var3  var4   var5
501901     9  SP.1     1     W  12.10
501901     9  SP.1     2     W  17.68

What I want:

> df[1:2,]
ID1  ID2  ID3  var1  var2  var3  var4   var5
5    01   901     9  SP.1     1     W  12.10
5    01   901     9  SP.1     2     W  17.68

I know there are functions in R to split character strings, but I could not find equivalent facilities for numbers.

Thank you,

Juan

jrs-x asked Mar 19 '13 11:03


3 Answers

You could, for example, use substring:

df <- data.frame(ID = c(501901, 501902))

splitted <- t(sapply(df$ID, function(x) substring(x, first=c(1,2,4), last=c(1,3,6))))
cbind(df, splitted)
#      ID 1  2   3
#1 501901 5 01 901
#2 501902 5 01 902
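
Since substr() is vectorised over its first argument, the same idea can be sketched without sapply(). This is a minimal sketch, assuming every ID has exactly six digits laid out as 1 + 2 + 3; the names id_chr, ID1, ID2 and ID3 are chosen here for illustration only.

```r
# Example data mirroring the IDs used above.
df <- data.frame(ID = c(501901, 501902))

# Convert to character once; scientific = FALSE guards against
# round values being written in exponent notation.
id_chr <- format(df$ID, scientific = FALSE, trim = TRUE)

# Cut fixed positions out of every ID in one vectorised call each.
df$ID1 <- substr(id_chr, 1, 1)
df$ID2 <- substr(id_chr, 2, 3)
df$ID3 <- substr(id_chr, 4, 6)
df
```

The new columns are character vectors, so a leading zero such as the one in "01" is kept.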

EDi answered Oct 20 '22 23:10


Yet another alternative is to re-read the first column using read.fwf and specify the widths:

cbind(read.fwf(file = textConnection(as.character(df[, 1])), 
               widths = c(1, 2, 3), colClasses = "character", 
               col.names = c("ID1", "ID2", "ID3")), 
      df[-1])
#   ID1 ID2 ID3 var1 var2 var3 var4  var5
# 1   5  01 901    9 SP.1    1    W 12.10
# 2   5  01 901    9 SP.1    2    W 17.68

One advantage here is being able to set the resulting column names in a convenient manner, and ensure that the columns are characters, thus retaining any leading zeroes that might be present.
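
To illustrate the point about leading zeroes, here is a small sketch (the variable names are made up for this example) comparing the character piece with its numeric conversion:

```r
ids <- c(501901, 501902)

# Digits 2-3 as text: the leading zero is part of the string.
id2_chr <- substr(as.character(ids), 2, 3)

# Converting back to numeric drops the leading zero.
id2_num <- as.numeric(id2_chr)

id2_chr  # "01" "01"
id2_num  # 1 1
```

If the pieces are identifiers rather than quantities, keeping them as character is usually the safer choice.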

A5C1D2H2I1M1N2O1R2T1 answered Oct 20 '22 21:10


This should work:

# paste() converts the numeric ID column to character before gsub() inserts the separators.
df <- cbind(do.call(rbind, strsplit(gsub('(.)(..)(...)', '\\1 \\2 \\3', paste(df[,1])),' ')), df[,-1])

Or with substr():

df <- cbind(ID1=substr(df[, 1],1,1), ID2=substr(df[, 1],2,3), ID3=substr(df[, 1],4,6), df[, -1])
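
The approaches above all go through character strings. A different technique, not taken from any of the answers, is to stay numeric and use integer division (%/%) and modulo (%%). The sketch below assumes every ID is a six-digit whole number laid out as 1 + 2 + 3 digits; the variable names are illustrative.

```r
ids <- c(501901, 501902)

id1 <- ids %/% 100000          # first digit
id2 <- (ids %/% 1000) %% 100   # digits 2-3, as a number
id3 <- ids %% 1000             # last three digits, as a number

# Numeric pieces lose leading zeroes; sprintf() can pad them back if needed.
id2_padded <- sprintf("%02d", as.integer(id2))
id3_padded <- sprintf("%03d", as.integer(id3))
```

This avoids any string conversion, at the cost of hard-coding the digit layout in the divisors.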

Rcoster answered Oct 20 '22 21:10