
Diagnosing TFS Build Hanging after 'Copy Files to Drop Location' step

Tags:

tfs

tfsbuild

I need some advice on how to diagnose a hanging build. It’s only been happening in the last week or two, and I have good reason to suspect it’s caused by something I’ve done recently rather than being a coincidence.

Setup

  • TFS 2013
  • 4 machine setup - 2 app tiers (in process of deprecating one of them), 1 sql server, 1 build server running 2 agents.
  • Build Controller is running on 2nd app tier along with the Job Agent
  • 1st App tier is serving the website (although that machine will soon be shut down and everything will be moved to the 2nd app tier, as the machine is getting old)

Symptoms

  • No executed build ever gets marked as done (it doesn’t appear to matter which build process template is used). The last step always seems to be the same one: “Copy Files to Drop Location” / “Workspace and Copy Files to Drop Location” / “Copy Binaries to drop, Reset the environment” (it is named differently in each build template)
  • The files appear to be getting dropped successfully in the build drop folder
  • Looking at Task Manager, it appears that all the build processes on the build server have exited (only TFSBuildServiceHost
  • Builds show their normal steps/logging while executing
  • Primary app tier has related warnings in the event logs (see warnings below)

Recent Changes

  • Installed Xamarin Android/iOS on the build server
  • Installed a few custom-built plugins for Job Agent, Message Queue, and Web Services (been using them for years, just had them disabled the last few weeks due to an app tier migration)
  • Installed Tiago’s Task Board Enhancer (again been using this for a long time, just had it disabled recently)
  • About a month ago we added the 2nd app tier and moved the sql off to another machine

What I’ve Tried

  • Rebooting both App tiers and build server
  • Uninstalling Xamarin (although I suspect some parts are still floating around as the Bonjour service appears to still be installed)
  • Removing the custom plugins
  • Turned logging diagnostics right up on one of the builds – nothing particularly of interest seems to turn up
  • Ran the Best Practice Analyzer (nothing too unusual shows up)
  • Multiple build process templates (defaulttemplate, defaulttemplate.11.1, tfvctemplate.12.xaml)
  • Multiple build definitions
  • Checked the event logs of both AppTiers and Build server

Warnings

The Team Foundation service host request monitor has detected the following condition:
  Date (UTC): 3/02/2014 12:54:06 a.m.
  Machine: CODEBASE
  Application Domain: /LM/W3SVC/1/ROOT/tfs-1-130357641583538280
  Assembly: Microsoft.TeamFoundation.Framework.Server, Version=12.0.0.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a; v4.0.30319
  Service Host: 0dc282b5-59a8-4941-b541-a4f7d314cd0f
  Process Details:
    Process Name: w3wp
    Process Id: 2508
    Thread Id: 2504

  Detailed Message: A request for service host XXXX has been executing for 37 seconds, exceeding the warning threshold of 30.
  Request details:
    Request Context Details
    Url: /tfs/XXXX/XXXX/_api/_build/stop?__v=4
    Method: ApiBuild.stop
    Parameters: uri = vstfs:///Build/Build/34064
    User Agent: Mozilla/5.0 (Windows NT 6.2; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.1700.102 Safari/537.36
    Unique Id: 00000000-0000-0000-0000-000000000000

The Team Foundation service host request monitor has detected the following condition:
  Date (UTC): 30/01/2014 11:10:01 p.m.
  Machine: CODEBASE
  Application Domain: /LM/W3SVC/1/ROOT/tfs-1-130355232548668648
  Assembly: Microsoft.TeamFoundation.Framework.Server, Version=12.0.0.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a; v4.0.30319
  Service Host: 0dc282b5-59a8-4941-b541-a4f7d314cd0f
  Process Details:
    Process Name: w3wp
    Process Id: 70320
    Thread Id: 14540

  Detailed Message: A request for service host XXXX has been executing for 37 seconds, exceeding the warning threshold of 30.
  Request details:
    Request Context Details
    Url: /tfs/XXXX/Build/v4.0/BuildService.asmx
    Method: StopBuilds
    Parameters: uris[0] = vstfs:///Build/Build/34051 uris = Count = 1
    User Agent: Team Foundation (devenv.exe, 12.0.21005.1, Premium, SKU:16)
    Unique Id: 4d2d3213-fd41-4c4d-8ab0-b87619c96a42

The Team Foundation service host request monitor has detected the following condition:
  Date (UTC): 31/01/2014 3:14:17 a.m.
  Machine: CODEBASE
  Application Domain: /LM/W3SVC/1/ROOT/tfs-1-130355232548668648
  Assembly: Microsoft.TeamFoundation.Framework.Server, Version=12.0.0.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a; v4.0.30319
  Service Host:
  Process Details:
    Process Name: w3wp
    Process Id: 70320
    Thread Id: 14540

  Detailed Message: There are no active requests for service host XXXX that exceed the warning threshold of 30.
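When there are many of these warnings, it can help to pull out just the fields that matter (how long the request ran, which endpoint, which build). The following is a minimal Python sketch that assumes the warning text has been copied out of the event log as a plain string. The function name `summarize_warning` and the regular expressions are illustrative assumptions based only on the message format shown here; they are not part of any TFS tooling.

```python
import re

# Patterns derived from the "Detailed Message" text of the warnings quoted
# in this question. They are assumptions about the format, not a TFS API.
SLOW_REQUEST = re.compile(
    r"has been executing for (?P<elapsed>\d+) seconds, "
    r"exceeding the warning threshold of (?P<threshold>\d+)\."
)
URL = re.compile(r"Url: (?P<url>\S+)")
METHOD = re.compile(r"Method: (?P<method>\S+)")
BUILD_URI = re.compile(r"vstfs:///Build/Build/(?P<build_id>\d+)")


def summarize_warning(message):
    """Return the key fields of one slow-request warning, or None if the
    message is not a slow-request warning."""
    slow = SLOW_REQUEST.search(message)
    if slow is None:
        return None
    url = URL.search(message)
    method = METHOD.search(message)
    return {
        "elapsed_seconds": int(slow.group("elapsed")),
        "threshold_seconds": int(slow.group("threshold")),
        "url": url.group("url") if url else None,
        "method": method.group("method") if method else None,
        # Every build URI mentioned in the request parameters.
        "build_ids": [int(b) for b in BUILD_URI.findall(message)],
    }


# Sample text taken from the second warning in this question.
sample = (
    "A request for service host XXXX has been executing for 37 seconds, "
    "exceeding the warning threshold of 30. Request details: "
    "Request Context Details Url: /tfs/XXXX/Build/v4.0/BuildService.asmx "
    "Method: StopBuilds Parameters: uris[0] = vstfs:///Build/Build/34051 "
    "uris = Count = 1"
)
print(summarize_warning(sample))
```

In the warnings shown here, both slow requests are stop requests (`ApiBuild.stop` and `StopBuilds`), so they may be a consequence of manually stopping the hung builds rather than the cause of the hang.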

A quick Google search suggests raising the timeout in the TFS registry (http://xavierdilipkumar.com/post/2013/07/04/TFS-event-7005-and-7006-warning.aspx). I've tried that and it doesn't appear to change anything.

Betty asked Feb 03 '14 22:02


1 Answer

Can you look in the TFS build service logs at:

Event Viewer -> Applications and Services Logs -> Microsoft -> Team Foundation Server -> Build-Services -> Operational

These timeouts generally relate to permissions. You should look for TF215106 access denied events. Although the files appear to be there, do they all have the current date, or do some have different (older) dates? Also, are there any alerts/steps happening when the file drop occurs?
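To check the file dates without eyeballing the folder, something like the following Python sketch could be run against the drop location. It is only an illustration: the function name `find_stale_files` is made up here, and you would pass in your own drop folder path and the build start time as a Unix timestamp.

```python
import os
import time


def find_stale_files(drop_folder, build_start_epoch):
    """Return (path, mtime) pairs for files whose last-modified time is
    earlier than the given build start time, oldest first."""
    stale = []
    for root, _dirs, files in os.walk(drop_folder):
        for name in files:
            path = os.path.join(root, name)
            mtime = os.path.getmtime(path)
            if mtime < build_start_epoch:
                stale.append((path, mtime))
    return sorted(stale, key=lambda item: item[1])


def format_report(stale):
    """Render the result as readable lines (local time)."""
    return [
        "{}  {}".format(
            time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(mtime)), path
        )
        for path, mtime in stale
    ]
```

An older timestamp is a hint rather than proof of a problem, since copied files (referenced assemblies, content files) usually keep their original modification time. Freshly compiled binaries with old dates are the ones worth a closer look.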

Other than that it could be timing out because one of the dependencies is being used by another service.
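One rough way to probe for files that another process is holding is to try opening each one for read/write access. The Python sketch below is an assumption-laden heuristic, not a TFS feature: the name `find_locked_files` is invented for illustration, and it relies on Windows refusing the open when another process holds the file without sharing write access. Read-only files and permission problems will also show up in the result, so treat it as a list of candidates to investigate.

```python
import os


def find_locked_files(folder):
    """Return (path, error) pairs for files that cannot be opened for
    read/write access."""
    locked = []
    for root, _dirs, files in os.walk(folder):
        for name in files:
            path = os.path.join(root, name)
            try:
                # "r+b" requests read/write access without truncating the
                # file; nothing is written and it is closed immediately.
                with open(path, "r+b"):
                    pass
            except OSError as exc:
                locked.append((path, str(exc)))
    return locked
```

Running this against the drop folder and the agent's working directory while a build is stuck on the copy step may show which file, if any, is being held.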

Bozman answered Oct 05 '22 22:10