Pivot categorical values into boolean columns SQL

Tags: sql, tsql, sql-server-2012, data-manipulation, pivot

I'm looking to 'flatten' my dataset in order to facilitate data mining. Each categorical column should be changed to multiple Boolean columns. I have a column with categorical values, e.g.:

 ID    col1
  1     A
  2     B
  3     A

I'm looking for a way to pivot this table, with an aggregate function telling me whether each ID has value A or B:

Result:

 ID    col1A    col1B
  1     1        0
  2     0        1
  3     1        0

I tried using PIVOT but have no idea which aggregate function to use inside it.

I also looked for answers in SF but couldn't find any.

I'm using MS-SQL 2012.

Any help would be appreciated! Omri

EDIT:

The number of categories in col1 is unknown, so the solution must be dynamic. Thanks :)

asked Aug 21 '12 05:08

Omri374

1 Answer

Try this:

select ID,
         col1A=(case when col1='A' then 1 else 0 end),
         col1B=(case when col1='B' then 1 else 0 end)
  from <table>

If one ID can have both A and B and you want a single row per ID in the output, aggregate with MAX and group by ID:

 select ID,
         col1A=max(case when col1='A' then 1 else 0 end),
         col1B=max(case when col1='B' then 1 else 0 end)
  from <table> 
  group by id
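
To see the grouped query in action, here is a minimal self-contained sketch. The table variable @Sample and its rows are made up for illustration and stand in for <table>; ID 3 deliberately has both A and B.

```sql
-- Hypothetical sample data; @Sample stands in for <table>.
DECLARE @Sample TABLE (ID INT, col1 CHAR(1));

INSERT INTO @Sample (ID, col1)
VALUES (1, 'A'), (2, 'B'), (3, 'A'), (3, 'B');

-- MAX over the 0/1 flags collapses several rows per ID into one row:
-- the flag is 1 if at least one row for that ID has the category.
SELECT ID,
       col1A = MAX(CASE WHEN col1 = 'A' THEN 1 ELSE 0 END),
       col1B = MAX(CASE WHEN col1 = 'B' THEN 1 ELSE 0 END)
FROM @Sample
GROUP BY ID
ORDER BY ID;
```

With this data, IDs 1 and 2 each get a single 1 in their own column, and ID 3 gets 1 in both columns.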

EDIT

As per your comment: if you do not know the number of options for col1, you can use a dynamic PIVOT:

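-- Build a comma-separated, bracket-quoted list of the distinct col1 values, e.g. [A],[B]
DECLARE @cols AS NVARCHAR(MAX),
    @query  AS NVARCHAR(MAX)

select @cols = STUFF((SELECT distinct ',' + QUOTENAME(col1) 
                    from <table> 
            FOR XML PATH(''), TYPE
            ).value('.', 'NVARCHAR(MAX)') 
        ,1,1,'')

-- Pivot on col1; the output columns are named after the values (A, B, ...).
-- count() returns the number of matching rows, so a value can exceed 1
-- if an ID has the same category more than once.
set @query = 'SELECT id, ' + @cols + ' from <table> 
            pivot 
            (
                count([col1])
                for col1 in (' + @cols + ')
            ) p '
print(@query)     -- show the generated statement for inspection
execute(@query)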
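
The dynamic PIVOT names its columns after the raw values (A, B) and returns counts rather than strict 0/1 flags. If you want the col1A / col1B naming and Boolean values from the question, one possible variant is to generate the MAX(CASE ...) expressions dynamically instead. This is only a sketch under assumptions: the temp table #Sample and its rows are hypothetical stand-ins for <table>, and category values are assumed to be short enough for QUOTENAME (it returns NULL for input longer than 128 characters).

```sql
-- Hypothetical sample data; #Sample stands in for <table>.
-- A temp table is used because a table variable is not visible inside dynamic SQL.
CREATE TABLE #Sample (ID INT, col1 NVARCHAR(50));
INSERT INTO #Sample (ID, col1) VALUES (1, N'A'), (2, N'B'), (3, N'A');

DECLARE @cols  NVARCHAR(MAX),
        @query NVARCHAR(MAX);

-- One "MAX(CASE WHEN col1 = N'<value>' THEN 1 ELSE 0 END) AS [col1<value>]"
-- expression per distinct category. QUOTENAME(col1, '''') wraps the value in
-- single quotes and escapes any embedded quotes.
SELECT @cols = STUFF((
        SELECT ', MAX(CASE WHEN col1 = N' + QUOTENAME(col1, '''')
             + ' THEN 1 ELSE 0 END) AS ' + QUOTENAME(N'col1' + col1)
        FROM (SELECT DISTINCT col1 FROM #Sample WHERE col1 IS NOT NULL) AS c
        ORDER BY col1
        FOR XML PATH(''), TYPE
    ).value('.', 'NVARCHAR(MAX)'), 1, 2, '');   -- strip the leading ", "

SET @query = N'SELECT ID, ' + @cols + N' FROM #Sample GROUP BY ID ORDER BY ID;';

PRINT @query;               -- inspect the generated statement
EXEC sp_executesql @query;

DROP TABLE #Sample;
```

With the sample rows above this returns the columns ID, col1A, col1B with the same 0/1 values as the result table in the question.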

answered Oct 14 '22 01:10

Joe G Joseph
