I have a data frame of this type:
string1,string2,value1
string3,string1,value2
string3,string5,value3
...
...
I would like to convert the strings to unique integers:
1,2,value1
3,1,value2
3,5,value3
...
...
I am trying with the c() operator, which converts a string to a unique integer. The problem is how to handle the two columns of the data frame. How can I do this?
If you want to assign numbers to the strings, rather than removing the text 'string', you can use a factor with known levels, then coerce to numeric.
d <- read.csv(header=TRUE, file=textConnection("a,b,c
string1,string2,value1
string3,string1,value2
string3,string5,value3"))
l <- unique(c(as.character(d$a), as.character(d$b)))
d1 <- data.frame(a = as.numeric(factor(d$a, levels = l)),
                 b = as.numeric(factor(d$b, levels = l)),
                 c = d$c)
> d1
a b c
1 1 3 value1
2 2 1 value2
3 2 4 value3
Note that the numeric values chosen do not agree with the numerals in the strings, but each string is given a unique number.
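If the integers should instead agree with the numerals in the strings (so that string5 becomes 5, as in the output the question asks for), one option is to strip everything that is not a digit and convert what remains. This is only a minimal sketch, assuming every string contains exactly one run of digits as in the sample data; the helper name to_num is made up for illustration.

```r
# Same sample data as above; stringsAsFactors = FALSE keeps the columns as character.
d <- read.csv(text = "a,b,c
string1,string2,value1
string3,string1,value2
string3,string5,value3", stringsAsFactors = FALSE)

# Hypothetical helper: drop every non-digit character, then convert to integer.
to_num <- function(x) as.integer(gsub("[^0-9]", "", x))

# Apply the helper to the two string columns and keep column c unchanged.
d2 <- data.frame(a = to_num(d$a), b = to_num(d$b), c = d$c)
d2
```

Unlike the factor approach, this breaks down if the strings carry no digits or if two different strings share the same digits, so it only fits data shaped like the example.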
Here's a simple solution using match:
df <- read.csv(text="string1,string2,value1
string3,string1,value2
string3,string5,value3", header = FALSE)
cbind(sapply(df[-3], match, unique(unlist(df[-3]))), df[3])
V1 V2 V3
1 1 3 value1
2 2 1 value2
3 2 4 value3
How it works: The values of both columns are matched against a vector of the unique values found in these columns. match returns the position of each value in that vector, and that position serves as its integer code.
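The intermediate steps can be inspected one at a time. The following is a small sketch on the same sample data; the variable names keys and ids are only for illustration.

```r
df <- read.csv(text = "string1,string2,value1
string3,string1,value2
string3,string5,value3", header = FALSE, stringsAsFactors = FALSE)

# Pool both string columns and keep each distinct value once,
# in order of first appearance (column V1 first, then V2).
keys <- unique(unlist(df[-3], use.names = FALSE))
keys

# match() returns, for each element of V1, its position in keys.
ids <- match(df$V1, keys)
ids
```

Because the codes are positions in keys, they depend on the order in which the strings first appear, not on any numeral inside the strings.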