
Azure Data Factory pipelines are failing when no files available in the source

Currently we load data from an on-premise Hadoop server to SQL DW via ADF Staged Copy and the Data Management Gateway (DMG) on-premise server. We noticed that ADF pipelines fail when there are no files in the Hadoop on-premise source location. We do not expect our upstreams to send files every day, so having zero files at the source is a valid scenario for us.

Do you have a solution for this kind of scenario?

The error message is given below:

Failed execution Copy activity encountered a user error: ErrorCode=UserErrorFileNotFound,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=Cannot find the 'HDFS' file. ,Source=Microsoft.DataTransfer.ClientLibrary,''Type=System.Net.WebException,Message=The remote server returned an error: (404) Not Found.,Source=System,'.

Thanks, Aravind

asked Mar 10 '17 by Aravind


2 Answers

This requirement can be solved by using the ADF v2 Get Metadata activity to check for file existence, and then skipping the Copy activity if the file or folder does not exist:

https://docs.microsoft.com/en-us/azure/data-factory/control-flow-get-metadata-activity
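A minimal sketch of such a pipeline follows. The activity and dataset names here (`CheckSourceFile`, `HdfsSourceDataset`, `CopyToSqlDw`) are illustrative placeholders, not from the original post; the Copy activity's own source/sink settings are omitted for brevity:

```json
{
  "name": "CopyIfFileExists",
  "properties": {
    "activities": [
      {
        "name": "CheckSourceFile",
        "type": "GetMetadata",
        "typeProperties": {
          "dataset": { "referenceName": "HdfsSourceDataset", "type": "DatasetReference" },
          "fieldList": [ "exists" ]
        }
      },
      {
        "name": "CopyIfExists",
        "type": "IfCondition",
        "dependsOn": [
          { "activity": "CheckSourceFile", "dependencyConditions": [ "Succeeded" ] }
        ],
        "typeProperties": {
          "expression": {
            "value": "@activity('CheckSourceFile').output.exists",
            "type": "Expression"
          },
          "ifTrueActivities": [
            { "name": "CopyToSqlDw", "type": "Copy" }
          ]
        }
      }
    ]
  }
}
```

Requesting the `exists` field in `fieldList` means Get Metadata returns `true`/`false` instead of failing when the item is missing, and the If Condition only runs the copy when the output is `true`.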

answered Sep 28 '22 by Jason Horner


You can change the File path type to Wildcard, add the name of the file, and append a "*" at the end of the name (or at any other position that suits your naming pattern).

[Screenshot: file path type options in ADF]

This is a simple way to stop the Pipeline failing when there is no file.
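In the Copy activity's JSON, the same wildcard setting appears under the source's store settings. A sketch for an HDFS source, with an illustrative file pattern (`daily_extract_*` is a placeholder, not from the original post):

```json
{
  "name": "CopyWithWildcard",
  "type": "Copy",
  "typeProperties": {
    "source": {
      "type": "HdfsSource",
      "storeSettings": {
        "type": "HdfsReadSettings",
        "recursive": true,
        "wildcardFolderPath": "incoming/data",
        "wildcardFileName": "daily_extract_*"
      }
    }
  }
}
```

With a wildcard path, an empty folder simply matches zero files, so the activity completes without the 404 "file not found" failure that a fixed file path produces.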

answered Sep 28 '22 by Fernando Hidalgo