Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Calculated Column in SQL Server

I have my data in table as:

id  Author_ID   Research_Area       Category_ID  Paper_Count   Paper_Year   Rank  
---------------------------------------------------------------------------------
1   677         feature extraction  8            1             2005         1
2   677         image annotation    11           1             2005         2
3   677         probabilistic model 12           1             2005         3
4   677         semantic            19           1             2007         1
5   677         feature extraction  8            1             2009         1
6   677         image annotation    11           1             2011         1  
7   677         semantic            19           1             2012         1  
8   677         video sequence      5            2             2013         1  
9   1359        adversary model     1            2             2005         1
10  1359        ensemble method     14           2             2005         2
11  1359        image represent     11           2             2005         3
12  1359        adversary model     1            7             2006         1
13  1359        concurrency control 17           5             2006         2
14  1359        information system  12           2             2006         3  
15  ...         
16  ...  

Whereas I want to have an output of query as:

id  Author_ID   Category_ID  Paper_Count   Category_Prob   Paper_Year   Rank  
---------------------------------------------------------------------------------
1   677         8            1             0.333           2005         1
2   677         11           1             0.333           2005         2
3   677         12           1             0.333           2005         3
4   677         19           1             1.0             2007         1
5   677         8            1             1.0             2009         1
6   677         11           1             1.0             2011         1  
7   677         19           1             1.0             2012         1  
8   677         5            2             1.0             2013         1  
9   1359        1            2             0.333           2005         1
10  1359        14           2             0.333           2005         2
11  1359        11           2             0.333           2005         3
12  1359        1            7             0.5             2006         1
13  1359        17           5             0.357           2006         2
14  1359        12           2             0.142           2006         3  
15  ...         
16  ...  

Whereas Category_Prob is a calculated column which is calculated in two steps as:

Step First, we have to have a SUM of Paper_Count in each Paper_Year for instance i.e. Paper_Year = 2005 and Author_ID = 677, the SUM(Paper_Count) = 3

Step Second, then for each Category_ID, we have to divide Paper_Count with value of SUM(Paper_Count) in that Paper_Year which will be 1/3 i.e. 0.333 and so on...

Moreover, I have tried this query:

SELECT 
    Author_ID, Abstract_Category, Paper_Count,
    [Category_Prob] = Paper_Count / SUM(Paper_Count),
    Paper_Year, Rank
FROM 
    Author_Areas
GROUP BY 
    Author_ID, Abstract_Category, Paper_Year, Paper_Count, Rank
ORDER BY 
    Author_ID, Paper_Year

But it returns just 1 in the column Category_Prob for all of the rows in the table.

like image 874
maliks Avatar asked Apr 18 '16 08:04

maliks


People also ask

What is calculated column in SQL Server?

A computed column is a virtual column that is not physically stored in the table, unless the column is marked PERSISTED. A computed column expression can use data from other columns to calculate a value for the column to which it belongs.

How do I create a calculated column in SQL?

Go to your database, right click on tables, select “New Table” option. Create all columns that you require and to mark any column as computed, select that column and go to column Properties window and write your formula for computed column.

What are calculated fields in SQL?

A calculated field is a field that uses existing database fields and applies additional logic — it allows you to create new data from your existing data. A calculated field either: performs some calculation on database fields to create a value that is not directly stored in the database or.


1 Answers

The problem with your query is that you are not grouping by Paper_Year, but also by Author_ID, Abstract_Category, Paper_Count, Rank. Hence SUM(Paper_Count) is equal to Paper_Count for each group.

You can use SUM OVER for this:

SELECT      id, Author_ID, Abstract_Category [Category_ID],  
            Paper_Count, 
            Paper_Count * 1.0 / SUM(Paper_Count)  
            OVER (PARTITION BY Author_ID, Paper_Year) AS [Category_Prob],
            Paper_Year, Rank
FROM        Author_Areas
ORDER BY    Author_ID, Paper_Year

Note: You have to multiply by 1.0 so as to avoid integer division. Note 2: Perhaps you have to add Author_ID field in the PARTITION BY clause as well, if your actual requirement is to group by author, year.

like image 156
Giorgos Betsos Avatar answered Sep 21 '22 13:09

Giorgos Betsos