Where can I find usage statistics in Redshift?

First of all, thank you for your help!

I want to find out which tables in the database are most heavily used, i.e. the number of users that query each table, the number of times it was queried, the resources consumed by users per table, the total time the tables were queried, and any other useful data. For now I would limit the analysis to 9 specific tables. I tried using stl_scan and pg_user with the following two queries:

SELECT
    s.perm_table_name           AS table_name,
    count(*)                    AS qty_query,
    count(DISTINCT s.userid)    AS qty_users
FROM stl_scan s
JOIN pg_user b
    ON s.userid = b.usesysid
JOIN temp_mone_tables tmt
    ON tmt.table_id = s.tbl AND tmt.table = s.perm_table_name
WHERE s.userid > 1
GROUP BY 1
ORDER BY 1;

SELECT
    b.usename                                       AS user_name,
    count(*)                                        AS qty_scans,
    count(DISTINCT s.tbl)                           AS qty_tables,
    count(DISTINCT trunc(starttime))                AS qty_days
FROM stl_scan s
JOIN pg_user b
    ON s.userid = b.usesysid
JOIN temp_mone_tables tmt
    ON tmt.table_id = s.tbl AND tmt.table = s.perm_table_name
WHERE s.userid > 1
GROUP BY 1
ORDER BY 1;

temp_mone_tables is a temporary table that contains the ID and name of the tables I'm interested in.

With these queries I'm able to get some information, but I need more details. Surprisingly, there's not much information online about this kind of statistics.

Again, thank you all in advance!

asked May 04 '18 14:05

Nambu14

2 Answers

Nice work! You are on the right track using the stl_scan table. I'm not clear what further details you're looking for.

For detailed metrics on resource usage you may want to use the SVL_QUERY_METRICS_SUMMARY view. Note that this data is summarized by query, not by table, because a query is the primary way resources are utilized.
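As a rough illustration of what the view exposes, here is a minimal sketch that lists the most CPU-intensive queries. It assumes the documented column names of SVL_QUERY_METRICS_SUMMARY (query_cpu_time, query_blocks_read, query_execution_time, scan_row_count); the userid > 1 filter is borrowed from the question, and the LIMIT of 20 is arbitrary.

```sql
-- Sketch only: top queries by CPU time, one row per query.
-- Rows with no recorded CPU time are skipped so they do not sort first.
SELECT
    query,
    userid,
    query_cpu_time,          -- CPU time used by the query, in seconds
    query_blocks_read,       -- number of 1 MB blocks read
    query_execution_time,    -- elapsed execution time, in seconds
    scan_row_count           -- rows processed in scan steps
FROM svl_query_metrics_summary
WHERE userid > 1
  AND query_cpu_time IS NOT NULL
ORDER BY query_cpu_time DESC
LIMIT 20;
```

Keep in mind that very short queries may not have metrics recorded, so the view is not guaranteed to contain every query that appears in stl_scan.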

Generally, have a look at the admin queries (and views) in our Redshift Utils library on GitHub, particularly v_get_tbl_scan_frequency.sql.

answered Oct 16 '22 20:10

Joe Harris

Thanks to Joe Harris' answer, I was able to add a lot of information to my previous query. Joining svl_query_metrics_summary to stl_scan gives you important data about resource consumption, and this information can be extended by joining them to the many views listed in Joe's answer.

For me the solution begins with the following query:

SELECT *
FROM stl_scan ss
JOIN pg_user pu
    ON ss.userid = pu.usesysid
JOIN svl_query_metrics_summary sqms
    ON ss.query = sqms.query
JOIN temp_mone_tables tmt
    ON tmt.table_id = ss.tbl AND tmt.table = ss.perm_table_name

The query gives you a lot of data that can be summarized in multiple ways as needed.
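One possible summary, per table, could look like the sketch below. It is not a definitive implementation and rests on a few assumptions: stl_scan holds several rows per query (one per slice and scan step), so the query/table pairs are de-duplicated first to avoid inflating the sums; temp_mone_tables is joined on table_id only; and the metric columns (query_cpu_time, query_blocks_read, query_execution_time) are the documented ones from svl_query_metrics_summary.

```sql
-- Sketch only: per-table usage and resource totals.
WITH table_queries AS (
    -- One row per (query, table), so each query's metrics are counted once per table
    SELECT DISTINCT
        ss.query,
        ss.userid,
        ss.tbl,
        TRIM(ss.perm_table_name) AS table_name  -- perm_table_name is space-padded
    FROM stl_scan ss
    JOIN temp_mone_tables tmt
        ON tmt.table_id = ss.tbl
    WHERE ss.userid > 1
)
SELECT
    tq.table_name,
    COUNT(DISTINCT tq.query)        AS qty_queries,
    COUNT(DISTINCT tq.userid)       AS qty_users,
    SUM(sqms.query_cpu_time)        AS total_cpu_seconds,
    SUM(sqms.query_blocks_read)     AS total_blocks_read,
    SUM(sqms.query_execution_time)  AS total_execution_seconds
FROM table_queries tq
LEFT JOIN svl_query_metrics_summary sqms
    ON sqms.query = tq.query        -- LEFT JOIN keeps queries without recorded metrics
GROUP BY tq.table_name
ORDER BY qty_queries DESC;
```

Because the metrics are recorded per query, a query that scans several of the tables contributes its full CPU time, blocks read, and execution time to each of them, so the totals are an upper bound per table rather than an exact attribution.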

Remember that temp_mone_tables is a temporary table that contains the table ID and name of the tables I'm interested in.

answered Oct 16 '22 22:10

Nambu14

Tags: sql, database-administration, amazon-redshift, usage-statistics
