27 changes: 27 additions & 0 deletions .github/workflows/copilot-setup-steps.yml
@@ -11,11 +11,38 @@ jobs:
    permissions:
      contents: read

    # Enable Docker-in-Docker for container-based tests
    services:
      dind:
        image: docker:dind
        options: --privileged
        ports:
          - 2375:2375
        env:
          DOCKER_TLS_CERTDIR: ""

    env:
      DOCKER_HOST: tcp://localhost:2375

    # You can define any steps you want, and they will run before the agent starts.
    # If you do not check out your code, Copilot will do this for you.
    steps:
      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

      - name: Wait for Docker to be ready
        run: |
          for i in {1..30}; do
            if docker info >/dev/null 2>&1; then
              echo "Docker is ready"
              docker info
              exit 0
            fi
            echo "Waiting for Docker... ($i/30)"
            sleep 2
          done
          echo "Docker failed to start"
          exit 1

      - name: Build solution
        # a full build is too slow; also do not fail on errors, continue so that
        # copilot can attempt to recover
198 changes: 198 additions & 0 deletions tools/RunTestsInLoop/README.md
@@ -0,0 +1,198 @@
# RunTestsInLoop

A utility to run tests repeatedly in a loop to help reproduce intermittent failures or hangs that occur in CI.

## Purpose

This tool is designed to help developers:

- **Reproduce CI hangs** by running tests multiple times until a hang occurs
- **Identify flaky tests** by tracking pass/fail statistics across many runs
- **Stress test specific tests** to ensure stability before unquarantining

## Prerequisites

- .NET SDK 10+ (installed via `./restore.sh` or `./restore.cmd`)
- The repository should be built first (`./build.sh` or `./build.cmd`)

## Usage

```bash
# Show help
dotnet run --project tools/RunTestsInLoop -- --help

# Run all tests in Aspire.Hosting.Tests 10 times with a 5-minute timeout per run
dotnet run --project tools/RunTestsInLoop -- --project tests/Aspire.Hosting.Tests --iterations 10 --timeout 5

# Run a specific test class 20 times
dotnet run --project tools/RunTestsInLoop -- --project tests/Aspire.Hosting.Tests --class "Aspire.Hosting.Tests.DistributedApplicationTests" --iterations 20

# Run a specific test method 50 times
dotnet run --project tools/RunTestsInLoop -- --project tests/Aspire.Hosting.Tests --method "RegisteredLifecycleHookIsExecutedWhenRunAsynchronously" --iterations 50

# Run with verbose output and continue on failure
dotnet run --project tools/RunTestsInLoop -- --project tests/Aspire.Hosting.Tests --iterations 5 --verbose --stop-on-failure false

# Skip building (if already built)
dotnet run --project tools/RunTestsInLoop -- --project tests/Aspire.Hosting.Tests --iterations 10 --no-build
```

## Options

| Option | Short | Description | Default |
|--------|-------|-------------|---------|
| `--project` | `-p` | Path to test project (required) | - |
| `--iterations` | `-i` | Number of test runs | 10 |
| `--timeout` | `-t` | Timeout per run in minutes (0 = no timeout) | 10 |
| `--method` | `-m` | Filter by test method name (short name, uses wildcard prefix) | - |
| `--class` | `-c` | Filter by fully-qualified test class name | - |
| `--namespace` | `-n` | Filter by namespace | - |
| `--verbose` | `-v` | Show detailed test output | false |
| `--stop-on-failure` | `-s` | Stop after first failure/timeout | true |
| `--extra-args` | `-e` | Additional dotnet test arguments | - |
| `--no-build` | - | Skip building the project | false |

**Note on filters:**
- `--method`: Accepts just the method name (e.g., `RegisteredLifecycleHookIsExecutedWhenRunAsynchronously`). The tool adds a `*.` prefix to match any class.
- `--class`: Requires the fully-qualified class name (e.g., `Aspire.Hosting.Tests.DistributedApplicationTests`)
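
As a rough illustration of the `--method` rule above, the sketch below prepends `*.` to a short method name. The function name `build_method_filter` is hypothetical and not taken from the tool's source, and how the resulting pattern is handed to the test runner is deliberately left out.

```bash
# Hypothetical helper: mirrors the documented behaviour of --method,
# which prepends "*." so the short name can match in any class.
build_method_filter() {
  printf '*.%s\n' "$1"
}

build_method_filter "RegisteredLifecycleHookIsExecutedWhenRunAsynchronously"
```

By contrast, a `--class` value would be passed through unchanged, since it is already fully qualified.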

## Example Output

```
╔══════════════════════════════════════════════════════════════╗
║ Test Loop Runner for Aspire ║
╚══════════════════════════════════════════════════════════════╝

Project: tests/Aspire.Hosting.Tests/Aspire.Hosting.Tests.csproj
Iterations: 10
Timeout: 5 minutes
Class: Aspire.Hosting.Tests.DistributedApplicationTests
Stop on failure: true

Building test project...
Build succeeded.

╔══════════════════════════════════════════════════════════════╗
║ Starting Test Loop ║
╚══════════════════════════════════════════════════════════════╝

┌──────────────────────────────────────────────────────────────┐
│ Iteration 1/10 │
└──────────────────────────────────────────────────────────────┘
✅ PASSED in 45.2s

┌──────────────────────────────────────────────────────────────┐
│ Iteration 2/10 │
└──────────────────────────────────────────────────────────────┘
⏱️ TIMEOUT after 5.0 minutes!
Last 50 lines of output:
...

Stopping due to timeout (--stop-on-failure is enabled)

╔══════════════════════════════════════════════════════════════╗
║ Final Results ║
╚══════════════════════════════════════════════════════════════╝
┌────────────────────────────────────────┐
│ Statistics │
├────────────────────────────────────────┤
│ Passed: 1 │
│ Failed: 0 │
│ Timed out: 1 │
│ Total: 2 │
├────────────────────────────────────────┤
│ Avg time: 172.6s │
│ Min time: 45.2s │
│ Max time: 300.0s │
├────────────────────────────────────────┤
│ Success rate: 50.0% │
└────────────────────────────────────────┘

💡 Tip: If you found a flaky test, consider quarantining it:
dotnet run --project tools/QuarantineTools -- -q -i <issue-url> <Namespace.Class.Method>
```
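
The figures in the statistics box are plain arithmetic over the recorded run durations and pass counts. The sketch below shows that arithmetic with made-up durations (not output from the tool); the helper names `summarize_durations` and `success_rate` are hypothetical, and `awk` is assumed to be available.

```bash
# Hypothetical helper: summarise run durations (in seconds) passed as
# arguments. Prints average, minimum and maximum with one decimal place.
summarize_durations() {
  printf '%s\n' "$@" | awk '
    { v = $1 + 0; sum += v }
    NR == 1 || v < min { min = v }
    NR == 1 || v > max { max = v }
    END { printf "avg=%.1f min=%.1f max=%.1f\n", sum / NR, min, max }'
}

# Hypothetical helper: passed runs as a percentage of all runs.
# Assumes total is greater than zero.
success_rate() {
  awk -v p="$1" -v t="$2" 'BEGIN { printf "%.1f%%\n", 100 * p / t }'
}

summarize_durations 40.0 50.0 60.0   # prints: avg=50.0 min=40.0 max=60.0
success_rate 2 3                     # prints: 66.7%
```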

## Use Cases

### Reproducing CI hangs

```bash
# Run the problematic tests many times to trigger the hang
dotnet run --project tools/RunTestsInLoop -- \
  --project tests/Aspire.Hosting.Tests \
  --class "Aspire.Hosting.Tests.DistributedApplicationTests" \
  --iterations 100 \
  --timeout 5
```

### Stress testing a specific test

```bash
# Run a specific test 1000 times to ensure stability
dotnet run --project tools/RunTestsInLoop -- \
  --project tests/Aspire.Hosting.Tests \
  --method "RegisteredLifecycleHookIsExecutedWhenRunAsynchronously" \
  --iterations 1000 \
  --stop-on-failure false
```

### Finding timeout issues

```bash
# Run with a short timeout to catch slow tests
dotnet run --project tools/RunTestsInLoop -- \
  --project tests/Aspire.Hosting.Tests \
  --iterations 20 \
  --timeout 2
```

## How it works

1. **Resolves the project path** relative to the repository root
2. **Builds the test project** (unless `--no-build` is specified)
3. **Runs tests in a loop** for the specified number of iterations with trx reporting enabled
4. **Applies timeout** to each run and kills hung processes
5. **Analyzes results** on failure or timeout:
   - Parses trx files to identify passed/failed/in-progress tests
   - Identifies the likely hanging test from console output
   - Shows paths to log files for further investigation
6. **Tracks statistics** including pass/fail/timeout counts and timing
7. **Reports results** with a summary and success rate
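
The loop, timeout, and counting steps above can be approximated in a few lines of shell. This is only a sketch of the idea, not the tool's implementation: the function name `run_in_loop` and its argument order are hypothetical, the timeout is given in seconds rather than minutes, and it assumes the GNU coreutils `timeout` command, which exits with status 124 when the limit is reached.

```bash
# Hypothetical sketch, not the tool's code: run a command repeatedly with a
# per-run time limit and count the outcomes. The body runs in a subshell
# so its variables do not leak into the caller.
run_in_loop() (
  iterations="$1"; timeout_secs="$2"; stop_on_failure="$3"
  shift 3
  passed=0; failed=0; timed_out=0; i=1
  while [ "$i" -le "$iterations" ]; do
    status=0
    timeout "$timeout_secs" "$@" >/dev/null 2>&1 || status=$?
    if [ "$status" -eq 0 ]; then
      passed=$((passed + 1))
    elif [ "$status" -eq 124 ]; then
      timed_out=$((timed_out + 1))
    else
      failed=$((failed + 1))
    fi
    # Mirror --stop-on-failure: leave the loop after the first bad run.
    if [ "$status" -ne 0 ] && [ "$stop_on_failure" = "true" ]; then
      break
    fi
    i=$((i + 1))
  done
  echo "passed=$passed failed=$failed timed_out=$timed_out"
)

# Stand-in commands replace a real test run here.
run_in_loop 3 5 true sleep 0    # every run succeeds
run_in_loop 3 1 true sleep 3    # first run exceeds the 1-second limit
```

The sketch leaves out trx analysis and log collection, and relies on `timeout` to terminate the child rather than doing its own process-tree cleanup.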

## Test Result Analysis

When a test times out or fails, the tool automatically:

- **Identifies the hanging test** by parsing the test runner's progress output
- **Analyzes trx files** (if available) to show:
  - Count of passed, failed, and in-progress tests
  - Names and error messages of failed tests
  - Names of tests that were still running when killed
- **Shows log file locations** for manual inspection

Example output on timeout:
```
⏱️ TIMEOUT after 5.0 minutes!

⏳ Likely hanging test: Aspire.Hosting.Tests.DistributedApplicationTests.VerifyContainerCreateFile
Progress: [+1/x3/?0]

📋 Analyzing test results:
No .trx files found (test may have been killed before writing results)

📁 Log files (1):
artifacts/testresults/loop-runner/Aspire.Hosting.Tests_net8.0_x64.log
```
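
When a trx file does get written, the per-outcome counts can be approximated by hand. The sketch below tallies the `outcome` attribute of `UnitTestResult` elements with `grep`; the helper name `count_trx_outcomes` is hypothetical, the sample file is fabricated for illustration, and the approach assumes each element's attributes sit on a single line. A real XML parser would be more robust, and nothing here is claimed to match how the tool itself parses trx files.

```bash
# Hypothetical helper: tally outcome="..." values found on
# <UnitTestResult> elements in a .trx file, printed as Outcome=count.
count_trx_outcomes() {
  grep -o '<UnitTestResult[^>]*outcome="[A-Za-z]*"' "$1" \
    | sed 's/.*outcome="\([A-Za-z]*\)"/\1/' \
    | sort | uniq -c | awk '{ print $2 "=" $1 }'
}

# Fabricated sample, only to demonstrate the helper.
sample="$(mktemp)"
cat > "$sample" <<'EOF'
<TestRun>
  <Results>
    <UnitTestResult testName="A" outcome="Passed" />
    <UnitTestResult testName="B" outcome="Failed" />
    <UnitTestResult testName="C" outcome="Passed" />
  </Results>
</TestRun>
EOF
count_trx_outcomes "$sample"
rm -f "$sample"
```

For the sample above this prints `Failed=1` followed by `Passed=2`.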

## Notes

- Tests are run with `--filter-not-trait "quarantined=true" --filter-not-trait "outerloop=true"` to exclude quarantined and outerloop tests
- Test results are written to `artifacts/testresults/loop-runner/`
- The tool uses the repository's `dotnet.sh`/`dotnet.cmd` wrapper to ensure the correct SDK is used
- When a timeout occurs, the entire process tree is killed to clean up orphaned processes
- Statistics are printed every 5 iterations and at the end

## See Also

- [QuarantineTools](../QuarantineTools/README.md) - For quarantining flaky tests
- [Test README](../../tests/README.md) - For general test running information