Add RunTestsInLoop tool to reproduce CI test hangs #13462
Co-authored-by: davidfowl <[email protected]>
|
@copilot now run DistributedApplicationTests in a loop |
Ran The test hung during DCP object creation, with the stack trace showing it was stuck in Output showed the test was stuck at |
|
@copilot When a test fails because of a timeout, look at the trx file produced and analyze the logs. |
Co-authored-by: davidfowl <[email protected]>
Added trx file and log analysis in commit a25ab67. The tool now:
Note: When tests time out, the trx file may not be written because the process is killed. In those cases, the tool still identifies the hanging test from the console progress output.
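To illustrate the kind of trx analysis described above, here is a minimal Python sketch that lists every non-passing result in a trx file. It is not the tool's actual implementation. It assumes the standard Visual Studio TRX XML layout (`UnitTestResult` elements with `testName` and `outcome` attributes); the function name `summarize_trx` and the sample test names are hypothetical.

```python
import xml.etree.ElementTree as ET

# XML namespace used by Visual Studio TRX result files.
TRX_NS = {"t": "http://microsoft.com/schemas/VisualStudio/TeamTest/2010"}

# Hypothetical sample data; real trx files contain many more elements.
SAMPLE_TRX = """<TestRun xmlns="http://microsoft.com/schemas/VisualStudio/TeamTest/2010">
  <Results>
    <UnitTestResult testName="SampleTests.Passes" outcome="Passed" />
    <UnitTestResult testName="SampleTests.HangsOnStartup" outcome="Failed">
      <Output>
        <ErrorInfo>
          <Message>The operation timed out.</Message>
        </ErrorInfo>
      </Output>
    </UnitTestResult>
  </Results>
</TestRun>"""


def summarize_trx(trx_text):
    """Return (test name, outcome, error message) for every non-passing result."""
    root = ET.fromstring(trx_text)
    problems = []
    for result in root.iterfind(".//t:UnitTestResult", TRX_NS):
        outcome = result.get("outcome", "Unknown")
        if outcome == "Passed":
            continue
        # The error message is optional; fall back to an empty string.
        message = result.findtext(
            "t:Output/t:ErrorInfo/t:Message", default="", namespaces=TRX_NS
        )
        problems.append((result.get("testName"), outcome, message.strip()))
    return problems


if __name__ == "__main__":
    for name, outcome, message in summarize_trx(SAMPLE_TRX):
        print(f"{outcome}: {name}: {message}")
```

As the note says, this only helps when the test process exits normally. If the process is killed on a hang, there may be no trx file to parse.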
|
@copilot Now re-run the tests |
Re-ran The test hung in The hang is highly reproducible - it occurred on the first iteration both times the tests were run. |
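For readers unfamiliar with the approach, the idea of running a test command in a loop until it hangs can be sketched in a few lines of Python. This is a rough illustration, not the RunTestsInLoop tool itself. The function name `run_in_loop`, its parameters, and the return shape are all assumptions made for the example.

```python
import subprocess
import sys


def run_in_loop(command, iterations, timeout_seconds):
    """Run `command` repeatedly and stop at the first hang or failure.

    Returns a (status, iteration) tuple where status is "timeout",
    "failed", or "passed". Iterations are 1-based.
    """
    for iteration in range(1, iterations + 1):
        try:
            completed = subprocess.run(
                command,
                capture_output=True,
                text=True,
                timeout=timeout_seconds,
            )
        except subprocess.TimeoutExpired:
            # subprocess.run kills the direct child before raising.
            # Grandchild processes are not cleaned up by this sketch.
            return ("timeout", iteration)
        if completed.returncode != 0:
            return ("failed", iteration)
    return ("passed", iterations)


if __name__ == "__main__":
    # Stand-in command; a real run would pass the test invocation instead.
    print(run_in_loop([sys.executable, "-c", "pass"], iterations=2, timeout_seconds=30))
```

A hang that reproduces on the first iteration, as reported here, would come back as `("timeout", 1)`. In practice the command would be the project's test invocation, for example a `dotnet test` command line.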
|
🚀 Dogfood this PR with:
curl -fsSL https://raw.githubusercontent.com/dotnet/aspire/main/eng/scripts/get-aspire-cli-pr.sh | bash -s -- 13462
Or
iex "& { $(irm https://raw.githubusercontent.com/dotnet/aspire/main/eng/scripts/get-aspire-cli-pr.ps1) } 13462"
|
@copilot Re-run the test and capture all of the logs and analyze them |
Ran test with Hanging test: Root cause: Call stack:
Log sequence before failure:
The test is hanging during DCP resource creation in |
|
@copilot Run the tests and commit the trx to source control for analysis |
Co-authored-by: davidfowl <[email protected]>
Committed sample test output in d298189. Note: No trx file is generated because the tests hang indefinitely on DCP operations and never complete. The test framework can only write trx files when tests finish running. The committed file
Key finding: The first test |
|
@copilot We need to update the github action to make sure docker is running |
Co-authored-by: davidfowl <[email protected]>
Added Docker verification step to the copilot-setup-steps workflow in 74fd096. The workflow now runs |
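A Docker availability check of the kind discussed here could look roughly like the Python sketch below. The command the workflow actually runs is not shown in this conversation, so `docker info` is an assumption (it exits non-zero when the daemon is unreachable). The function name `docker_available` is hypothetical.

```python
import subprocess
import sys


def docker_available(command=("docker", "info"), timeout_seconds=15):
    """Return True if the probe command exits with code 0.

    The default probe, `docker info`, is an assumption for this sketch;
    it fails when the Docker daemon cannot be reached.
    """
    try:
        completed = subprocess.run(
            list(command),
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_seconds,
        )
    except (OSError, subprocess.TimeoutExpired):
        # OSError covers a missing or non-executable binary.
        return False
    return completed.returncode == 0


if __name__ == "__main__":
    if not docker_available():
        print("Docker is not available; container-based tests are likely to hang or fail.")
        sys.exit(1)
    print("Docker is available.")
```

Failing fast on a check like this makes a missing container runtime show up as a clear setup error instead of a test timeout.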
|
@copilot we need docker in docker |
Co-authored-by: davidfowl <[email protected]>
The tool successfully reproduces the CI hang. The tests hang during DCP operations due to Polly.Timeout.TimeoutRejectedException: the DCP operations don't complete within 20 seconds. This was caused by Docker not being available in the Copilot agent container environment. Added a DinD (Docker-in-Docker) service to enable container operations.