
Joining factor levels of two columns

Tags: dataframe, r

I have two columns containing the same type of data (strings).

I want to combine the factor levels of the two columns, i.e. given:

col1   col2
Bob    John
Tom    Bob
Frank  Jane
Jim    Bob
Tom    Bob
...    ... (and so on)

Now col1 has 4 levels (Bob, Tom, Frank, Jim) and col2 has 3 levels (John, Jane, Bob).

But I want both columns to have all the factor levels (Bob, Tom, Frank, Jim, John, Jane), so that I can later replace each of the names with a unique id. The final output would be:

col1   col2
1      5
2      1
3      6
4      1
2      1

That is, Bob -> 1, Tom -> 2, etc., in both columns.

Any ideas :) ?

edit: Thanks all for the wonderful answers! You are all awesome as far as I know :)

abcde123483 asked Jan 31 '11 19:01



2 Answers

# Example data from the question
x <- structure(
  list(
    col1 = structure(c(1L, 4L, 2L, 3L, 4L),
                     .Label = c("Bob", "Frank", "Jim", "Tom"),
                     class = "factor"),
    col2 = structure(c(3L, 1L, 2L, 1L, 1L),
                     .Label = c("Bob", "Jane", "John"),
                     class = "factor")
  ),
  .Names = c("col1", "col2"),
  class = "data.frame",
  row.names = c(NA, -5L)
)

Make a simple union of factor names:

both <- union(levels(x$col1), levels(x$col2))

And re-create the two factors with the combined levels:

x$col1 <- factor(x$col1, levels=both)
x$col2 <- factor(x$col2, levels=both)
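
As a quick check of the step above, here is a minimal self-contained sketch. It rebuilds the question's data with data.frame() instead of the structure() call (an equivalent construction assumed for brevity) and confirms that both columns end up with the same levels while the displayed names stay the same:

```r
# Rebuild the example data (same values as in the question).
x <- data.frame(
  col1 = factor(c("Bob", "Tom", "Frank", "Jim", "Tom")),
  col2 = factor(c("John", "Bob", "Jane", "Bob", "Bob"))
)

both <- union(levels(x$col1), levels(x$col2))
x$col1 <- factor(x$col1, levels = both)
x$col2 <- factor(x$col2, levels = both)

# Both columns now carry the same level set, in the same order.
identical(levels(x$col1), levels(x$col2))

# The names themselves are unchanged; only the set of levels grew.
as.character(x$col2)
```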

Edit: added an example of converting the factors to numeric values.

You could simply convert the factors to their underlying integer codes, e.g.:

as.numeric(x$col1)

Or a simpler, nicer solution in one step, based on @Gavin Simpson's hint:

data.matrix(x)
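
Note that the codes produced this way follow the order of the combined levels, which starts from each factor's own (typically alphabetical) level order: Bob = 1, Frank = 2, Jim = 3, Tom = 4, Jane = 5, John = 6. If the exact numbering from the question matters (Bob = 1, Tom = 2, and so on), one possible approach is to order the levels by first appearance instead. This is only a sketch under that assumption; the names first_seen and ids are made up for illustration:

```r
x <- data.frame(
  col1 = c("Bob", "Tom", "Frank", "Jim", "Tom"),
  col2 = c("John", "Bob", "Jane", "Bob", "Bob"),
  stringsAsFactors = FALSE
)

# Levels in order of first appearance: col1 top to bottom,
# then any names that only occur in col2.
first_seen <- unique(c(x$col1, x$col2))

# Convert each column to a factor over those levels and take the integer codes.
ids <- data.frame(
  col1 = as.integer(factor(x$col1, levels = first_seen)),
  col2 = as.integer(factor(x$col2, levels = first_seen))
)
ids
```

For this data, ids has col1 = 1, 2, 3, 4, 2 and col2 = 5, 1, 6, 1, 1, which matches the table in the question.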
daroczig answered Nov 23 '22 05:11


You want the factors to include all the unique names from both columns.

col1 <- factor(c("Bob", "Tom", "Frank", "Jim", "Tom"))
col2 <- factor(c("John", "Bob", "Jane", "Bob", "Bob"))
mynames <- unique(c(levels(col1), levels(col2)))
fcol1 <- factor(col1, levels = mynames)
fcol2 <- factor(col2, levels = mynames)

EDIT: this is a little nicer if you replace the third line with:

mynames <- union(levels(col1), levels(col2))
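
If there are more than two columns, the same idea can probably be generalized rather than repeated per column. A minimal sketch, assuming every column of the data frame is a factor (df and all_levels are illustrative names, not from the answer above):

```r
df <- data.frame(
  col1 = factor(c("Bob", "Tom", "Frank", "Jim", "Tom")),
  col2 = factor(c("John", "Bob", "Jane", "Bob", "Bob"))
)

# Union of the levels of every column (Reduce folds union over the list).
all_levels <- Reduce(union, lapply(df, levels))

# Re-create each column as a factor over the shared level set;
# assigning into df[] keeps the data.frame structure.
df[] <- lapply(df, factor, levels = all_levels)

# Every column should now report the same number of levels.
sapply(df, nlevels)
```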
J. Win. answered Nov 23 '22 05:11