My team has read-only access to a remote PostgreSQL database - which we query remotely from our local SQL Server instance using a linked server connection. We cannot modify the remote Server's databases in any way.
We currently make the remote server do the heavy lifting by running some fairly large ad-hoc SQL scripts, using EXECUTE (@strSQL) AT [RemoteServer], where @strSQL is some complex PostgreSQL SELECT statement. We're trying to capture some of these often-used queries as VIEWs, which works for the most part:
CREATE VIEW [dbo].[vwRemoteData] AS
SELECT * FROM OPENQUERY([RemoteServer], 'SELECT ... <PostgreSQL logic here>')
But this fails in cases where the query string is longer than 8000 characters.
The character string that starts with 'SELECT ... ' is too long. Maximum length is 8000.
Failed Attempts: Normally, a workaround for >8000 character strings is to DECLARE a variable, cast to nvarchar(max), and/or use string concatenation - but none of these things appear to be allowed within an OPENQUERY call, (and VIEWs do not support DECLARE or EXECUTE ... AT [RemoteServer]).
I've also tried getting rid of the call to EXECUTE ... AT or OPENQUERY by using the 4-part-identifier syntax, but this requires converting PostgreSQL logic to TSQL, which isn't always straightforward, and the performance seems much worse (hours versus minutes) - presumably because SQL is trying to stream remote table rows over the wire and perform some of the joins / filters locally.
As a workaround, we currently manually EXECUTE the query results INTO a new SQL Table, then work off that table, but then we have a manual data-synchronization issue on our hands, which we'd like to avoid.
Context: The intended usage is ultimately to point Power BI to some of these large VIEWs and have it refresh a corresponding dataset each night. We're trying to avoid the extra step of having to "materialize" the remote query results to our own table in SQL, in part because it's large (~60 million rows), but also because that would seem to require introducing some sort of daily ETL solution beyond the current SQL + Power BI setup. Power BI doesn't have direct connectivity to the remote Postgres server, only to our local SQL server.
Question: Can anyone think of a workaround that would let us capture this business logic as a view or similar query-able entity that Power BI can refresh a dataset from? Would you approach this problem a different way entirely?
I don't know if it works for you, but you can use EXECUTE AT in the views.
Something like this:
CREATE OR ALTER VIEW PostGres AS
select *
from openquery(LOCALHOST,'
set nocount ON;
declare @sql nvarchar(max) = ''select N'''''' + replicate(cast(''A'' as nvarchar(max)), 444) + ''''''''
exec(@sql) at POSTGRESERVER with result sets((A nvarchar(max)))')
This is one potential solution, since OPENQUERY can run any code, you can build the actual query inside it. The important parts is that you:
SET NOCOUNT ON as first statementwith result sets qualifierOf course it might still fail if remote server doesn't like long strings inside EXEC.
Then, you can actually store the query string inside PGSQL and use whatever measures you need to execute it dynamically.
This is somewhat of a hack, it would be much easier to just create a view on PostGre Side of things.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With