
Transform table to one-hot-encoding of single column value

I have a table with two columns:

+---------+--------+
| keyword | color  |
+---------+--------+
| foo     | red    |
| bar     | yellow |
| fobar   | red    |
| baz     | blue   |
| bazbaz  | green  |
+---------+--------+

I need to do some kind of one-hot encoding and transform the table in PostgreSQL into:

+---------+-----+--------+-------+------+
| keyword | red | yellow | green | blue |
+---------+-----+--------+-------+------+
| foo     |   1 |      0 |     0 |    0 |
| bar     |   0 |      1 |     0 |    0 |
| fobar   |   1 |      0 |     0 |    0 |
| baz     |   0 |      0 |     0 |    1 |
| bazbaz  |   0 |      0 |     1 |    0 |
+---------+-----+--------+-------+------+

Is it possible to do this with SQL only? Any tips on how to get started?

Ernest asked Aug 10 '17 18:08




1 Answer

If I understand correctly, you need conditional aggregation:

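select keyword,
       -- count() ignores NULLs, so each column counts only the rows with that color
       count(case when color = 'red' then 1 end) as red,
       count(case when color = 'yellow' then 1 end) as yellow,
       count(case when color = 'green' then 1 end) as green,
       count(case when color = 'blue' then 1 end) as blue
       -- add one line per remaining color
from t
group by keyword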
Oto Shavadze answered Sep 23 '22 09:09
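
The conditional-aggregation query needs one column per color, so the list has to be written out by hand or generated. Below is a minimal sketch of generating it from the distinct values in the color column and running it against the sample data from the question. It is not part of the answer above. It assumes Python's built-in sqlite3 module as a stand-in for PostgreSQL (the generated SQL is plain standard SQL), keeps the table name t from the answer, and the helper names quote_ident, quote_literal, build_one_hot_query and one_hot are hypothetical.

```python
import sqlite3

# Sample rows copied from the question.
ROWS = [
    ("foo", "red"),
    ("bar", "yellow"),
    ("fobar", "red"),
    ("baz", "blue"),
    ("bazbaz", "green"),
]


def quote_ident(name):
    """Quote a value for use as a SQL identifier (doubles embedded quotes)."""
    return '"' + name.replace('"', '""') + '"'


def quote_literal(value):
    """Quote a value for use as a SQL string literal (doubles embedded quotes)."""
    return "'" + value.replace("'", "''") + "'"


def build_one_hot_query(colors, table="t"):
    """Build the conditional-aggregation query with one column per color."""
    columns = ",\n       ".join(
        f"count(case when color = {quote_literal(c)} then 1 end) as {quote_ident(c)}"
        for c in colors
    )
    return (
        f"select keyword,\n       {columns}\n"
        f"from {quote_ident(table)}\ngroup by keyword"
    )


def one_hot(conn, table="t"):
    """Return (column names, rows) for the one-hot encoded table."""
    # Step 1: discover the categories; sorted so the column order is stable.
    colors = [row[0] for row in conn.execute(
        f"select distinct color from {quote_ident(table)} order by color")]
    # Step 2: run the generated query.
    cur = conn.execute(build_one_hot_query(colors, table))
    header = [d[0] for d in cur.description]
    return header, cur.fetchall()


# In-memory database standing in for the real PostgreSQL table.
conn = sqlite3.connect(":memory:")
conn.execute("create table t (keyword text, color text)")
conn.executemany("insert into t values (?, ?)", ROWS)

header, rows = one_hot(conn)
print(header)
for row in sorted(rows):
    print(row)
```

If a keyword can appear on several rows with the same color, count(...) returns a number greater than 1 for that cell. Replacing it with max(case when color = '...' then 1 else 0 end) keeps every value at 0 or 1.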