Currently, we load data from an on-premises Hadoop server into SQL DW via an ADF staged copy using the Data Management Gateway (DMG) on-premises server. We noticed that the ADF pipelines fail when there are no files in the on-premises Hadoop source location. We do not expect our upstreams to send files every day, so having zero files at that location is a valid scenario for us.
Do you have a solution for this kind of scenario?
The error message is given below:
Failed execution Copy activity encountered a user error: ErrorCode=UserErrorFileNotFound,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=Cannot find the 'HDFS' file. ,Source=Microsoft.DataTransfer.ClientLibrary,''Type=System.Net.WebException,Message=The remote server returned an error: (404) Not Found.,Source=System,'.
Thanks, Aravind
This requirement can be met with the ADF v2 Get Metadata activity: check whether the file or folder exists, and skip the Copy activity when it does not:
https://docs.microsoft.com/en-us/azure/data-factory/control-flow-get-metadata-activity
You can change the File path type to Wildcard, then enter the file name and add a "*" at the end of the name, or wherever else suits your naming pattern.
This is a simple way to stop the pipeline from failing when there is no file; a sketch of the approach follows.
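As a rough illustration only, here is a minimal ADF v2 pipeline sketch that chains a Get Metadata activity (requesting the exists field) into an If Condition, so the Copy activity runs only when the source is present. The names HdfsSourceDataset, SqlDwSinkDataset, StagingBlobLinkedService, and the activity names are hypothetical placeholders; substitute your own datasets and staging linked service.

```json
{
  "name": "CopyWhenFilesExist",
  "properties": {
    "activities": [
      {
        "name": "CheckSourceFile",
        "type": "GetMetadata",
        "typeProperties": {
          "dataset": {
            "referenceName": "HdfsSourceDataset",
            "type": "DatasetReference"
          },
          "fieldList": [ "exists" ]
        }
      },
      {
        "name": "CopyIfFileExists",
        "type": "IfCondition",
        "dependsOn": [
          {
            "activity": "CheckSourceFile",
            "dependencyConditions": [ "Succeeded" ]
          }
        ],
        "typeProperties": {
          "expression": {
            "value": "@activity('CheckSourceFile').output.exists",
            "type": "Expression"
          },
          "ifTrueActivities": [
            {
              "name": "CopyHdfsToSqlDw",
              "type": "Copy",
              "inputs": [
                { "referenceName": "HdfsSourceDataset", "type": "DatasetReference" }
              ],
              "outputs": [
                { "referenceName": "SqlDwSinkDataset", "type": "DatasetReference" }
              ],
              "typeProperties": {
                "source": { "type": "HdfsSource", "recursive": true },
                "sink": { "type": "SqlDWSink" },
                "enableStaging": true,
                "stagingSettings": {
                  "linkedServiceName": {
                    "referenceName": "StagingBlobLinkedService",
                    "type": "LinkedServiceReference"
                  }
                }
              }
            }
          ]
        }
      }
    ]
  }
}
```

When the expression evaluates to false, the If Condition simply completes without running the copy, so the pipeline succeeds on days when the upstream sends nothing. If you need to match a wildcard pattern rather than a single known file, one common variant is to request the childItems field from Get Metadata on the folder and use an expression such as @greater(length(activity('CheckSourceFile').output.childItems), 0) in the If Condition.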