My goal is to read data from Cloud SQL Postgres to BigQuery via a Cloud Data Fusion pipeline.
For this, I set up a Cloud Data Fusion instance and assigned the following two permissions to the service account: (see https://cloud.google.com/data-fusion/docs/how-to/create-instance#setting_up_permissions)
As a next step I connected myself to the Cloud Data Fusion Instance, and navigated to Wrangler -> Add Connection -> Database -> Google Cloud SQL for PostgreSQL
.
As the driver I uploaded the postgres-socket-factory-1.0.13-jar-with-dependencies.jar
which I downloaded here: https://github.com/GoogleCloudPlatform/cloud-sql-jdbc-socket-factory
For the driver configuration, I set:
For the database connection, I set:
<PROJECT_NAME>:<REGION>:<INSTANCE_CONNECTION_NAME>
jdbc:postgresql://google/<DATABASE_NAME>?cloudSqlInstance=<INSTANCE_CONNECTION_NAME>&socketFactory=com.google.cloud.sql.postgres.SocketFactory
After clicking on Test Connection, I receive the org.postgresql.Driver error message.
Create a private Cloud SQL instance in the same VPC network as your Cloud Data Fusion instance. After the instance is created, go to the Cloud SQL Instances page and expand the Instance details. In the Connect to this instance section, copy the IP and the connection name.
Cloud SQL for PostgreSQL is a fully-managed database service that helps you set up, maintain, manage, and administer your PostgreSQL relational databases on Google Cloud Platform.
Most likely you need a driver and the JDBC Socket Factory to get it to work properly. You should bundle the postgres connector with the socket factory into a uber-jar and see if that solves the problem.
Here is a quick (untested) pom that should accomplish this:
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>com.example</groupId>
<artifactId>postgres-with-cloud-sql-socket-factory</artifactId>
<packaging>pom</packaging>
<version>0.0.1</version>
<dependencies>
<dependency>
<groupId>org.postgresql</groupId>
<artifactId>postgresql</artifactId>
<version>42.2.5</version>
</dependency>
<dependency>
<groupId>com.google.cloud.sql</groupId>
<artifactId>postgres-socket-factory</artifactId>
<version>1.0.13</version>
</dependency>
</dependencies>
<build>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-assembly-plugin</artifactId>
<version>2.4.1</version>
<configuration>
<!-- get all project dependencies -->
<descriptorRefs>
<descriptorRef>jar-with-dependencies</descriptorRef>
</descriptorRefs>
</configuration>
<executions>
<execution>
<id>make-assembly</id>
<!-- bind to the packaging phase -->
<phase>package</phase>
<goals>
<goal>single</goal>
</goals>
</execution>
</executions>
</plugin>
</plugins>
</build>
</project>
@Sebastian Auberger,
We had the same issue, basically Data Fusion creates a Service Account which tries to create a IAP Tunnel to the Cloud SQL Instance. We could get through this issue by giving the role "Cloud SQL Client".
Give it a shot and let us know!.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With