I have an ItemGroup, and want to process all its items in parallel (using a custom task or an .exe).
I'd like that parallelism to work in conjunction with MSBuild's /maxCpuCount param, since otherwise I might end up over-parallelizing. However, /maxCpuCount only works for building different projects, not items (see code below).
How can I process items from an ItemGroup in parallel? Is there a way to author a custom task to work in parallel in conjunction with MSBuild's Parallel support?
<Project ToolsVersion="4.0" xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
  <Target Name="Build" >
    <!-- Runs only once - I guess MSBuild detects it's the same project -->
    <!--<MSBuild Projects="$(MSBuildProjectFullPath);$(MSBuildProjectFullPath)" Targets="Wait3000" BuildInParallel="true" />-->
    <!-- Runs in parallel! Note that b.targets is a copy of the original a.targets -->
    <MSBuild Projects="$(MSBuildProjectFullPath);b.targets" Targets="Wait3000" BuildInParallel="true" />
    <!-- Runs sequentially -->
    <ItemGroup>
      <Waits Include="3000;2000"/>
    </ItemGroup>
    <Wait DurationMs="%(Waits.Identity)" />
  </Target>
  <Target Name="Wait3000">
    <Wait DurationMs="3000" />
  </Target>
  <UsingTask TaskName="Wait" TaskFactory="CodeTaskFactory" AssemblyFile="$(MSBuildToolsPath)\Microsoft.Build.Tasks.v4.0.dll" >
    <ParameterGroup>
      <DurationMs ParameterType="System.Int32" Required="true" />
    </ParameterGroup>
    <Task>
      <Code Type="Fragment" Language="cs">
        Log.LogMessage(string.Format("{0:HH\\:mm\\:ss\\:fff} Start DurationMs={1}", DateTime.Now, DurationMs), MessageImportance.High);
        System.Threading.Thread.Sleep(DurationMs);
        Log.LogMessage(string.Format("{0:HH\\:mm\\:ss\\:fff} End DurationMs={1}", DateTime.Now, DurationMs), MessageImportance.High);
      </Code>
    </Task>
  </UsingTask>
</Project>
I know this is old, but if you get a few minutes, revisit your attempt to use the MSBuild task. Using the Properties and/or AdditionalProperties reserved item metadata elements* will resolve the issue you described in your code sample ("Runs only once - I guess MSBuild detects it's the same project").
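To make that concrete, here is a minimal sketch of the "same project twice" call from the question, repaired with Properties metadata. My understanding is that MSBuild treats a project path plus its set of global properties as one build request, so two entries with different properties are built separately. The item name SameProject, the property name Instance, and the Announce target are made up for illustration.

```xml
<Project ToolsVersion="4.0" xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
  <Target Name="Build">
    <ItemGroup>
      <!-- Same project path twice. "Instance" is a hypothetical property whose
           only job is to make the two sets of global properties differ. -->
      <SameProject Include="$(MSBuildProjectFullPath)">
        <Properties>Instance=1</Properties>
      </SameProject>
      <SameProject Include="$(MSBuildProjectFullPath)">
        <Properties>Instance=2</Properties>
      </SameProject>
    </ItemGroup>
    <MSBuild Projects="@(SameProject)" Targets="Announce" BuildInParallel="true" />
  </Target>

  <!-- Stand-in workload that uses only the built-in Message task. Swap in the
       Wait task from the question to see the timestamps overlap. -->
  <Target Name="Announce">
    <Message Text="Running instance $(Instance)" Importance="high" />
  </Target>
</Project>
```

Without the differing Properties metadata, both entries describe the same request and the target runs once, which matches what you saw.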
The MSBuild file below processes items from an ItemGroup in parallel via MSBuild's parallel support (including /maxCpuCount). It does not use BuildTargetsInParallel from the MSBuild Extension Pack, nor any other custom or inline task.
<Project ToolsVersion="4.0" xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
  <Target Name="Build" >
    <ItemGroup>
      <Waits Include="3000;2000"/>
    </ItemGroup>
    <ItemGroup>
      <!-- Batching over Waits creates one ProjectItems entry per wait value,
           each carrying its own WaitMs global property. -->
      <ProjectItems Include="$(MSBuildProjectFullPath)">
        <Properties>WaitMs=%(Waits.Identity)</Properties>
      </ProjectItems>
    </ItemGroup>
    <MSBuild Projects="@(ProjectItems)" Targets="WaitSpecifiedMs" BuildInParallel="true" />
  </Target>
  <Target Name="WaitSpecifiedMs">
    <!-- Wait is the inline task from the question; its UsingTask element
         must also be present in this file. -->
    <Wait DurationMs="$(WaitMs)" />
  </Target>
</Project>
* Well-hidden under "Properties Metadata" on the MSBuild Task reference page.
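For completeness, a hedged sketch of the AdditionalProperties flavor. As far as I can tell from the task reference, Properties metadata replaces whatever is passed in the task's Properties attribute for that item, while AdditionalProperties is merged with it. The Label property and the ShowSettings target are hypothetical names used only for this illustration.

```xml
<Project ToolsVersion="4.0" xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
  <Target Name="Build">
    <ItemGroup>
      <Waits Include="3000;2000" />
    </ItemGroup>
    <ItemGroup>
      <!-- One ProjectItems entry per Waits item, as in the file above. -->
      <ProjectItems Include="$(MSBuildProjectFullPath)">
        <AdditionalProperties>WaitMs=%(Waits.Identity)</AdditionalProperties>
      </ProjectItems>
    </ItemGroup>
    <!-- "Label" is a hypothetical shared property. AdditionalProperties is
         combined with it, so each child build sees both Label and WaitMs. -->
    <MSBuild Projects="@(ProjectItems)" Targets="ShowSettings"
             Properties="Label=demo" BuildInParallel="true" />
  </Target>

  <!-- Stand-in workload using the built-in Message task. -->
  <Target Name="ShowSettings">
    <Message Text="Label=$(Label) WaitMs=$(WaitMs)" Importance="high" />
  </Target>
</Project>
```

Either way, BuildInParallel only has a visible effect when MSBuild is allowed more than one node, for example by running msbuild a.targets /maxCpuCount:2 (assuming the file is saved as a.targets, as in the question).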