I also need help with paging when using UNION ALL across multiple tables:
How do I implement optimized paging when combining multiple tables with UNION ALL and returning only a specific number of rows?
declare @startRow int
declare @PageCount int
set @startRow = 0
set @PageCount = 20
-- @dateFrom and @dateTo are assumed to be declared elsewhere
set rowcount @PageCount
select RowNumber, col1, col2
from
(
-- number the rows in an inner query so the outer WHERE can filter on RowNumber
select Row_Number() OVER(Order by col1) as RowNumber, col1, col2
from
(
select col1, col2 from table1 where datetimeCol between @dateFrom and @dateTo
union all
select col1, col2 from table2 where datetimeCol between @dateFrom and @dateTo
union all
select col1, col2 from table3 where datetimeCol between @dateFrom and @dateTo
union all
select col1, col2 from table4 where datetimeCol between @dateFrom and @dateTo
union all
select col1, col2 from table5 where datetimeCol between @dateFrom and @dateTo
) as tmpTable
) as numbered
where RowNumber > @startRow
order by RowNumber
Tables 3, 4, and 5 have a huge number of rows (millions), whereas tables 1 and 2 may have only a few thousand rows.
If startRow is 0, I expect only rows 1 to 20 (from table1). I get the correct result, but there is a high overhead on the remaining tables because SQL Server reads all of the data and then filters it.
The longer the interval between @dateFrom and @dateTo, the slower my query becomes, even though I am retrieving only a few rows from the overall result set.
Please help me implement a simple but better approach with similar logic. :(
Use UNION ALL instead of UNION whenever possible. UNION ALL is faster because it does not remove duplicate rows from the result. If there are few rows (say 1,000), there is almost no performance difference between UNION and UNION ALL. With more rows, the difference becomes noticeable.
The UNION ALL operator combines the results of two or more queries into a single result set that includes all the rows from all of the queries. In simple terms, it concatenates two or more row sets and keeps duplicates, whereas UNION removes them.
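To make the difference concrete, here is a minimal sketch using inline VALUES lists instead of real tables. The aliases a, b and the column name v are made up for illustration only.

```sql
-- UNION removes duplicate rows: the value 2 appears in both inputs
-- but only once in the result, so 3 rows come back (1, 2, 3).
select v from (values (1), (2)) as a(v)
union
select v from (values (2), (3)) as b(v);

-- UNION ALL simply concatenates the inputs: 4 rows come back
-- (1, 2, 2, 3), in no guaranteed order unless ORDER BY is added.
select v from (values (1), (2)) as a(v)
union all
select v from (values (2), (3)) as b(v);
```

Because UNION ALL skips the duplicate-elimination step, it is the natural choice when the inputs are already known to be distinct or when duplicates are acceptable.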
Consider using the OFFSET ... FETCH clause (available starting with SQL Server 2012):
declare @startRow int
declare @PageCount int
set @startRow = 0
set @PageCount = 20
-- @dateFrom and @dateTo are assumed to be declared elsewhere
select col1, col2
from
(
select col1, col2 from table1 where datetimeCol between @dateFrom and @dateTo
union all
select col1, col2 from table2 where datetimeCol between @dateFrom and @dateTo
union all
select col1, col2 from table3 where datetimeCol between @dateFrom and @dateTo
union all
select col1, col2 from table4 where datetimeCol between @dateFrom and @dateTo
union all
select col1, col2 from table5 where datetimeCol between @dateFrom and @dateTo
) as tmpTable
order by col1
-- skip @startRow rows, then return the next @PageCount rows
offset @startRow rows
fetch next @PageCount rows only
I also want to mention why this query always takes O(n*log(n)) time.
To execute this query, the database needs to:
If the performance of this query is still poor and you want to improve it, try to:
There may be an issue with your database design, since you have 5 similar tables. Besides that, you could materialize your UNION ALL query into a permanent table or a temp #-table with appropriate indexes on it, and then paginate over the materialized data set with ROW_NUMBER().
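The materialization idea could look roughly like the sketch below. It is only an illustration under several assumptions: the names #allRows and ix_allRows_col1 are hypothetical, the column types (int, varchar(100)) are guesses because the question does not state them, and the date range is sample data.

```sql
-- Sample parameters (assumed values, not from the question).
declare @dateFrom datetime = '20240101';
declare @dateTo   datetime = '20240131';
declare @startRow int = 0;
declare @PageCount int = 20;

-- Hypothetical temp table; column types are assumptions.
create table #allRows (col1 int not null, col2 varchar(100) null);

-- Materialize the UNION ALL result once for the chosen date range.
insert into #allRows (col1, col2)
select col1, col2 from table1 where datetimeCol between @dateFrom and @dateTo
union all
select col1, col2 from table2 where datetimeCol between @dateFrom and @dateTo
union all
select col1, col2 from table3 where datetimeCol between @dateFrom and @dateTo
union all
select col1, col2 from table4 where datetimeCol between @dateFrom and @dateTo
union all
select col1, col2 from table5 where datetimeCol between @dateFrom and @dateTo;

-- Index on the paging key so the rows are stored in col1 order.
create clustered index ix_allRows_col1 on #allRows (col1);

-- Page over the materialized rows with ROW_NUMBER().
select RowNumber, col1, col2
from
(
    select row_number() over (order by col1) as RowNumber, col1, col2
    from #allRows
) as numbered
where RowNumber > @startRow
  and RowNumber <= @startRow + @PageCount
order by RowNumber;
```

The expensive part (reading the five tables) then happens once per date range, and each further page only reruns the final select with a different @startRow. Note that a local temp table is visible only to the session that created it, so this fits best when the pages are requested over the same connection; otherwise a permanent table would be needed, as the answer suggests.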