Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Timestream - Pivot data per dimensions

TimeStream stores data with a key:value approach.

Is there any simple way to pivot the data based on the dimensions to get for instance the max of all available measures in a column that is named as the measure name ?

Let take an example, with following dataset :

| time                              | instance_id   | measure_name          | measure_value::double     | measure_value::bigint     |
|-------------------------------    |-------------  |--------------------   |-----------------------    |-----------------------    |
| 2019-12-04 19:00:00.000000000     | A             | cpu_utilization       | 35                        | null                      |
| 2019-12-04 19:00:01.000000000     | A             | cpu_utilization       | 38.2                      | null                      |
| 2019-12-04 19:00:02.000000000     | B             | cpu_utilization       | 45.3                      | null                      |
| 2019-12-04 19:00:00.000000000     | A             | memory_utilization    | 54.9                      | null                      |
| 2019-12-04 19:00:01.000000000     | A             | memory_utilization    | 42.6                      | null                      |
| 2019-12-04 19:00:02.000000000     | B             | memory_utilization    | 33.3                      | null                      |

We want to construct a generic request that would provide the following result without needing any code transformation :

| instance_id   | cpu_utilization (max)     | memory_utilization (max)  |
|-------------  |-----------------------    |-------------------------- |
| A             | 38.2                      | 54.9                      |
| B             | 45.3                      | 33.3                      |

Of course, I know that by getting the following (see next table), with few code in my favorite language it is very easy to make the pivot, but I was wondering if this is possible on a native way.

| instance_id   | measure_name          | max(measure_value)    |
|-------------  |--------------------   |--------------------   |
| A             | cpu_utilization       | 38.2                  |
| B             | cpu_utilization       | 45.3                  |
| A             | memory_utilization    | 54.9                  |
| B             | memory_utilization    | 33.3                  |

Thank you

like image 219
Toto Avatar asked Dec 03 '20 10:12

Toto


People also ask

What is dimension in AWS Timestream?

Dimension represents the metadata attributes of the time series. For example, the name and Availability Zone of an EC2 instance or the name of the manufacturer of a wind turbine are dimensions. For constraints on dimension names, see Naming Constraints. Type: String. Length Constraints: Minimum length of 1.

Is Timestream NoSQL?

Being a type of NoSQL database, Timestream has its own type of data model distinct from both traditional SQL data models, and many other NoSQL data models. Timestream is considered a schema-less database as there is no enforced schema.

What is Amazon Timestream based on?

Amazon Timestream automatically scales your writes, storage, and query capacity based on usage. You can set the data retention policy for each table and choose to store data in an in-memory store or magnetic store. For detailed pricing, see the pricing page.


1 Answers

Do a GROUP BY. Use case expressions to do conditional aggregation:

select instance_id, 
       max(case when measure_name = 'cpu_utilization' then measure_value end),
       max(case when measure_name = 'memory_utilization' then measure_value end)
from tablename
group by instance_id
like image 92
jarlh Avatar answered Nov 03 '22 02:11

jarlh