Add gcov-based test pruning with coverage cache#1283
Add gcov-based test pruning with coverage cache#1283sbryngelson wants to merge 1 commit intoMFlowCode:masterfrom
Conversation
Claude Code ReviewHead SHA: 1ed87c9 Changed files:
Summary
Findings1. Master-push dep-change detection gap — The if: github.event_name == 'pull_request'
id: dep-checkThe (github.event_name == 'push' &&
needs.file-changes.outputs.cases_py == 'true')If a PR that adds a 2. for key, value in cache.items():
if key == "_meta":
continue
if isinstance(value, dict):
cache[key] = sorted(value.keys()) # mutates value while iterating keys
return cacheMutating values (not keys) during Python 3 dict iteration is defined behavior and safe, but changing 3. Conservative "no-sim-coverage" include may be broader than intended — if changed_sim and not any(f.startswith("src/simulation/") for f in test_file_set):
to_run.append(case)
continueThe rationale ("cache build likely failed for this test") is sound — every MFC test runs the full pre/sim/post pipeline, so a test with zero simulation coverage indicates a failed gcov run. However, the unit test 4. The macro files in 5. Dep-change grep pattern false-positive risk — grep -qP '^\+\s*(use[\s,]+\w|#:include\s|include\s+['"'"'"])'The pattern Minor / improvement opportunities
Correctness verdictThe core coverage logic ( The master-push dep-change gap (#1 above) is the only finding that could cause a committed stale cache to persist silently across merges. All other findings are low-severity. |
Claude Code ReviewHead SHA: e696936 Changed files
Summary
FindingsBug — Duplicate
|
Claude Code ReviewHead SHA: Changed files
Summary
Findings1. Security: fork PR can run untrusted code on the self-hosted Phoenix runnerFile: The If forked PRs can trigger this path (i.e., the workflow runs in a Mitigation options:
2. Correctness:
|
Claude Code ReviewHead SHA: Changed files
Summary
Findings1. Bug — Duplicate inFile: use m_helper_basic !< Functions to compare floating point numbers
use m_helper_basic ← duplicate added by this PRThis is a stray duplicate 2. Security — job executes fork PR code on a self-hosted runnerFile: The job:
For Mitigations to consider:
If Phoenix runners are already restricted to repo members only (runner group ACLs), this risk is reduced but the workflow-level check is still good practice. 3. Logic gap — Root changes skip all testsFile: The unit test at def test_cmakelists_does_not_trigger_all(self):
assert should_run_all_tests({"CMakeLists.txt"}) is FalseFiles under Consider adding 4. Minor — used before conditional initializationFile:
set(FYPP_GCOV_OPTS )before the compiler-detection block would make the intent clearer and prevent any surprise if the variable scope changes. Improvement opportunities
|
- File-level gcov coverage cache maps 555 test UUIDs to exercised .fpp source files (gzip JSON, 11KB, committed to repo) - --only-changes flag prunes tests by intersecting PR-changed files against coverage cache; --build-coverage-cache builds the cache - New rebuild-cache CI job runs on Phoenix via SLURM when cases.py or Fortran dependency graph changes (on both PRs and master pushes) - Dep-change detection greps PR/push diffs for added use/include statements that would invalidate the coverage cache - Conservative fallbacks: missing cache runs all, missing sim coverage includes test, ALWAYS_RUN_ALL files trigger full suite - Remove continue-on-error from github CI job (fixes auto-cancellation) - TEMP: duplicate use in m_bubbles.fpp + remove CMakeLists.txt from ALWAYS_RUN_ALL to test the full cache rebuild pipeline in CI - 53 unit tests cover core coverage logic Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
499a347 to
8d58b93
Compare
Claude Code ReviewHead SHA: 8d58b93 Changed files
Summary
Findings1. Bug: Duplicate
|
Summary
Adds coverage-based test selection that skips tests unaffected by a PR's changes, reducing CI test time by ~80% for typical PRs.
.fppfiles (11KB gzip JSON)--only-changesflag: Intersects changed files against the cache to run only affected testscases.pyor Fortran dependency graph changes; also viaworkflow_dispatchALWAYS_RUN_ALL)GCOV_PREFIXdirs (32 workers); Phase 2 collects coverage via batched gcov calls withThreadPoolExecutorCI integration
rebuild-cacheruns?cases.pyuse/includecases.py[skip ci])workflow_dispatch[skip ci])Files changed
toolchain/mfc/test/coverage.py— Core coverage logic (~480 lines)toolchain/mfc/test/test_coverage_unit.py— 53 unit teststoolchain/mfc/test/test.py—--only-changesand--build-coverage-cacheintegration.github/workflows/test.yml—rebuild-cachejob, artifact upload/download, dep-change detection.github/workflows/phoenix/rebuild-cache.sh— SLURM script for cache rebuild--only-changesflag added to test invocationsTest plan
cases.pychanges (verified in prior PR)use/includeadditions (verified in prior PR)Replaces #1273.
🤖 Generated with Claude Code