I have an HDInsight Hadoop cluster (Linux, deployed separately) in an Azure VNet, with client IPs restricted by an NSG.
Azure SQL firewall has an option called "Allow access to Azure services", which allows Data Factory to access Azure SQL.
An NSG on a VNet has no such option: you have to either specify an IP address range or use a tag (Internet, VirtualNetwork, AzureLoadBalancer). I thought AzureLoadBalancer would solve the issue, but it does not; HDInsight is still unreachable from Azure Data Factory.
I tried to find Data Factory port ranges, unsuccessfully.
Is there a way to access a secured HDInsight Linux cluster from Azure Data Factory?
With Azure Data Factory V2 this scenario is supported. It requires deploying a self-hosted integration runtime (IR) in the VNet of the HDInsight cluster. The self-hosted IR lets the Data Factory service dispatch processing requests to a compute service, such as HDInsight, inside a virtual network. See also the Azure Data Factory documentation on the self-hosted integration runtime.
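As a rough illustration of how the pieces fit together, here is a minimal sketch in Python that builds the JSON definition of an HDInsight linked service routed through a self-hosted IR via the `connectVia` property. The helper `hdinsight_linked_service` and every name and value passed to it (cluster URI, credentials, storage linked service, IR name) are hypothetical placeholders, not values from the question; check the payload against the current Data Factory documentation before using it.

```python
import json


def hdinsight_linked_service(name, cluster_uri, user_name, password,
                             storage_linked_service, ir_name):
    """Build a Data Factory V2 linked service definition for HDInsight.

    Hypothetical helper for illustration only. The "connectVia" block
    points the linked service at a self-hosted integration runtime, so
    requests are dispatched from inside the cluster's VNet.
    """
    return {
        "name": name,
        "properties": {
            "type": "HDInsight",
            "typeProperties": {
                "clusterUri": cluster_uri,
                "userName": user_name,
                "password": {"type": "SecureString", "value": password},
                # Storage account used by the cluster, defined as its own
                # linked service in the same data factory.
                "linkedServiceName": {
                    "referenceName": storage_linked_service,
                    "type": "LinkedServiceReference",
                },
            },
            # Route through the self-hosted IR deployed in the VNet.
            "connectVia": {
                "referenceName": ir_name,
                "type": "IntegrationRuntimeReference",
            },
        },
    }


if __name__ == "__main__":
    # All values below are placeholders (assumptions), not real resources.
    definition = hdinsight_linked_service(
        name="HDInsightLinkedService",
        cluster_uri="https://<clustername>.azurehdinsight.net/",
        user_name="<username>",
        password="<password>",
        storage_linked_service="AzureStorageLinkedService",
        ir_name="SelfHostedIR",
    )
    print(json.dumps(definition, indent=2))
```

The printed JSON can then be used as the linked service definition in the data factory. In practice the password would normally come from a secret store rather than being inlined.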