We have been using Azure DevOps for a while now, and we have a very large suite in one repository, with an extensive YAML pipeline. It has all kinds of parallel jobs, and we have multiple hosted agents available to run the jobs in parallel. To speed up builds, I'm applying all kinds of optimizations (like caching NuGet packages). However, because of the size of our repository, each pipeline job spends about two and a half minutes on the checkout task, pulling the source onto the hosted agent, before any real task even starts.
We probably added some large, unnecessary files to the repository at the beginning of the project, which bloated it somewhat. I've found some documentation on how to remove large files from a repository, but it is quite vague. Is this a proper way to improve checkout time? If so, can anyone give me a detailed description of how to remove unwanted files from a Git repository and push the result to Azure DevOps?
If there are any other things I can do to improve checkout speed (apart from using private agents), I'm open to ideas.
The checkout behaviour can be customized with the checkout keyword. In particular, it is possible to specify fetchDepth (which defaults to no limit) to do a shallow fetch, which can improve performance.
From the Azure DevOps docs on shallow fetch:
If your repository is large, this option might make your build pipeline more efficient. Your repository might be large if it has been in use for a long time and has sizeable history. It also might be large if you added and later deleted large files.
YAML pipeline example:
steps:
- checkout: self
  clean: true
  fetchDepth: 1  # fetch only one commit
  path: PutMyCodeHere
See the Azure DevOps documentation for how to specify fetchDepth in YAML pipelines.
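As for the other half of your question: yes, rewriting history to remove the unwanted files is a legitimate way to shrink the repository, and git filter-repo is the tool commonly recommended for this (it supersedes git filter-branch and the BFG for this kind of job). Below is a minimal sketch; the 10M size cutoff, the <org>/<project>/<repo> placeholders, and the repo-clean directory name are illustrative, so substitute your own values.

# Always run filter-repo against a fresh clone, so a mistake costs nothing.
git clone https://dev.azure.com/<org>/<project>/_git/<repo> repo-clean
cd repo-clean

# First find out what is actually big; the report is written
# to .git/filter-repo/analysis/.
git filter-repo --analyze

# Rewrite all history, dropping every blob larger than 10 MB.
# (Use --path together with --invert-paths to target specific files instead.)
git filter-repo --strip-blobs-bigger-than 10M

# filter-repo removes the 'origin' remote as a safety measure,
# so re-add it and force-push the rewritten history.
git remote add origin https://dev.azure.com/<org>/<project>/_git/<repo>
git push origin --force --all
git push origin --force --tags

Two caveats: every commit hash changes, so the whole team must re-clone (or hard-reset) afterwards; and the force-push requires the "Force push" permission on the repository and will be rejected on branches protected by branch policies. Also note that the size Azure DevOps reports may not drop immediately, since the unreferenced objects are only reclaimed when the server garbage-collects them.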