
Aggregate hstore column in PostgreSQL

I have a table like this:

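                         Table "public.statistics"
 Column |  Type   |                        Modifiers
--------+---------+---------------------------------------------------------
 id     | integer | not null default nextval('statistics_id_seq'::regclass)
 goals  | hstore  |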

Sample rows:

|id    |goals                  |
|30059 |"3"=>"123"             |
|27333 |"3"=>"200", "5"=>"10"  |

How can I aggregate all values by key across the hstore column?

I want a query like this:

select sum(goals) from statistics

to return:

|goals                 |
|"3"=>"323", "5"=>"10" |
asked Oct 22 '12 21:10 by Timothy Klim



2 Answers

Building on Laurence's answer, here's a pure SQL way to aggregate the summed key/value pairs into a new hstore using array_agg and the hstore(text[], text[]) constructor.

http://sqlfiddle.com/#!1/9f1fb/17

SELECT hstore(array_agg(hs_key), array_agg(hs_value::text))
FROM (
  SELECT
    s.hs_key, sum(s.hs_value::integer)
  FROM (
    SELECT (each(goals)).* FROM statistics
  ) as s(hs_key, hs_value)
  GROUP BY hs_key
) x(hs_key,hs_value)

I've also replaced to_number with a simple cast to integer and simplified the key/value iteration.
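If you want something closer to the `sum(goals)` call from the question, one option is to wrap the same each/sum/rebuild idea in a custom aggregate. This is only a sketch: the names `hstore_add` and `hstore_sum` are made up for illustration, it assumes the hstore extension is installed, that every value casts cleanly to integer, and PostgreSQL 9.2 or later for named parameters in SQL functions.

```sql
-- Hypothetical helper: merge two hstores, summing integer values per key.
CREATE FUNCTION hstore_add(a hstore, b hstore) RETURNS hstore AS $$
  SELECT coalesce(
           hstore(array_agg(kv.key), array_agg(kv.total::text)),
           ''::hstore)              -- both inputs empty: return an empty hstore
  FROM (
    SELECT e.key, sum(e.value::integer) AS total
    FROM (
      SELECT * FROM each(a)         -- expand the running total into rows
      UNION ALL
      SELECT * FROM each(b)         -- expand the next row's hstore
    ) AS e
    GROUP BY e.key
  ) AS kv
$$ LANGUAGE sql IMMUTABLE;

-- Hypothetical aggregate: folds hstore_add over all input rows,
-- starting from an empty hstore.
CREATE AGGREGATE hstore_sum(hstore) (
  SFUNC    = hstore_add,
  STYPE    = hstore,
  INITCOND = ''
);

SELECT hstore_sum(goals) AS goals FROM statistics;
```

The aggregate rebuilds the running hstore once per input row, so on a large table the single set-based query above is probably the cheaper choice; the aggregate mainly buys a shorter call and the ability to combine it with GROUP BY.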

answered Oct 07 '22 11:10 by Craig Ringer


There might be ways to do this that avoid the numeric conversion, but this should get the job done:

SELECT
  key, sum(to_number(value, '999999999999'))
FROM (
  SELECT (each(goals)).key, (each(goals)).value FROM public.statistics
) AS s
GROUP BY
  key

http://sqlfiddle.com/#!1/eb745/10/0

Needing a function for this is a strong hint that Postgres doesn't want to bend this way, but here is one that folds the per-key sums back into a single hstore:

create table test (id int, goals hstore);

Insert Into Test(id, goals) Values (30059, '3=>123');
Insert Into Test(id, goals) Values (27333, '3=>200,5=>10');

Create Function hagg() returns hstore As 
'Declare ret hstore := ('''' :: hstore); i hstore; c cursor for Select hstore(key, (x.Value::varchar)) From (Select key, Sum((s.value::int)) as Value From (Select (each(goals)).* From Test) as s Group By key) as x; BEGIN Open c; Loop Fetch c into i; Exit When Not FOUND; ret := i || ret; END LOOP; return ret; END' Language 'plpgsql';

I couldn't get SQL Fiddle to accept a multi-line function body; in real Postgres you should be able to $$-quote this and break it up a bit.

http://sqlfiddle.com/#!1/e2ea7/1/0
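For reference, here is roughly what the same function looks like with dollar quoting and one statement per line. It is a sketch of the one-liner above, not a tested replacement: the logic is unchanged, and the only addition is an explicit CLOSE of the cursor. It assumes the `test` table created above.

```sql
CREATE FUNCTION hagg() RETURNS hstore AS $$
DECLARE
  ret hstore := ''::hstore;   -- accumulator, starts as an empty hstore
  i   hstore;                 -- one key => summed value pair per fetch
  c CURSOR FOR
    SELECT hstore(key, x.value::varchar)
    FROM (
      SELECT key, sum(s.value::int) AS value
      FROM (SELECT (each(goals)).* FROM test) AS s
      GROUP BY key
    ) AS x;
BEGIN
  OPEN c;
  LOOP
    FETCH c INTO i;
    EXIT WHEN NOT FOUND;
    ret := i || ret;          -- concatenate the pair into the result
  END LOOP;
  CLOSE c;                    -- added here; the one-liner leaves the cursor open until the function returns
  RETURN ret;
END
$$ LANGUAGE plpgsql;

SELECT hagg();
```

Dollar quoting also removes the need to double the single quotes around the empty-string literal, which is why `''::hstore` appears here instead of `''''::hstore`.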

answered Oct 07 '22 11:10 by Laurence