Azure Data Factory | incremental data load from SFTP to Blob

I created a run-once Data Factory (V2) pipeline to load files (.lta.gz) from an SFTP server into an Azure blob in order to get the historical data, and it worked beautifully. Every day several new files appear on the SFTP server (which cannot be manipulated or deleted), so I want to create an incremental load pipeline that checks daily for new files and, if there are any, copies them.

Does anyone have any tips on how to achieve this?

asked Oct 24 '25 at 19:10 by stenph

1 Answer

Thanks for using Data Factory!

To incrementally load newly generated files on SFTP server, you can leverage the GetMetadata activity to retrieve the LastModifiedDate property: https://learn.microsoft.com/en-us/azure/data-factory/control-flow-get-metadata-activity

Essentially, you author a pipeline containing the following activities (a rough script-form sketch of the same logic follows the list):

  • GetMetadata (return the list of files under the given folder)
  • ForEach (iterate over each file)
  • GetMetadata (return lastModifiedTime for the current file)
  • IfCondition (compare lastModifiedTime with the trigger's WindowStartTime)
  • Copy (copy the file from source to destination)
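
For illustration only, here is a minimal sketch of the same incremental logic written as a standalone Python script rather than as a Data Factory pipeline definition. It assumes the paramiko and azure-storage-blob packages; the host, credentials, folder, connection string, and container names are placeholders, not values from the original question.

```python
from datetime import datetime, timezone

import paramiko
from azure.storage.blob import BlobServiceClient

# Placeholder settings (assumptions, not from the original post).
SFTP_HOST = "sftp.example.com"
SFTP_USER = "user"
SFTP_PASSWORD = "password"
REMOTE_DIR = "/export"
BLOB_CONN_STR = "<storage-account-connection-string>"
CONTAINER = "historical-data"

# Equivalent of the trigger's WindowStartTime: only files modified
# on or after this moment are copied (e.g. the previous run time).
window_start = datetime(2025, 10, 24, tzinfo=timezone.utc)

# Connect to the SFTP server.
transport = paramiko.Transport((SFTP_HOST, 22))
transport.connect(username=SFTP_USER, password=SFTP_PASSWORD)
sftp = paramiko.SFTPClient.from_transport(transport)

blob_service = BlobServiceClient.from_connection_string(BLOB_CONN_STR)

try:
    # GetMetadata: list the files under the given folder.
    for entry in sftp.listdir_attr(REMOTE_DIR):
        # GetMetadata: read the file's last-modified time.
        modified = datetime.fromtimestamp(entry.st_mtime, tz=timezone.utc)
        # IfCondition: compare lastModifiedTime with the window start.
        if modified >= window_start:
            # Copy: stream the new file from SFTP into the blob container.
            blob_client = blob_service.get_blob_client(CONTAINER, entry.filename)
            with sftp.open(f"{REMOTE_DIR}/{entry.filename}", "rb") as remote_file:
                blob_client.upload_blob(remote_file, overwrite=True)
finally:
    sftp.close()
    transport.close()
    blob_service.close()
```

In the actual pipeline, the window_start comparison would typically use the trigger's WindowStartTime passed in as a pipeline parameter.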

Have fun building data integration flows using Data Factory!

answered Oct 26 '25 at 09:10 by ShirleyWang-MSFT

