Automerge Guardrails for AI-Generated Pull Requests

Hook
Your repo's new "auto-fix" bot takes lint issues, asks an LLM for a patch, opens a PR, and merges once tests pass. Last night it rewrote an Express route to "simplify auth" by removing an isAdmin check. The tests did not cover that path, the PR title looked legitimate, and the bot merged it. Attackers noticed the newly public endpoint minutes later.
The Problem Deep Dive
AI-generated PRs are appealing for vibe-coding chores but come with risks:
- Missing context: the LLM may delete guard clauses or add insecure defaults.
- Test blind spots: a green CI run does not mean the change is safe.
- No reviewers: bots merge instantly, with no human eyes on the diff.
- Poor explainability: commit messages rarely justify the changes.
Technical Solutions
Quick Patch: Require Human Sign-Off
Even for bot PRs, enforce CODEOWNERS review. But we can do better.
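A minimal CODEOWNERS entry that routes sensitive paths to humans (the team names are illustrative):

```
# Security team must review any change touching these paths
auth/**       @org/security-team
acl.go        @org/security-team
policies/**   @org/security-team
```

Pair this with branch protection's "Require review from Code Owners" setting; without it, CODEOWNERS is only a suggestion and the bot can still merge.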
Durable Fix: Policy-Driven Automerge
- Semantic diff filters. Parse the AST diff and flag security-sensitive changes (auth, crypto, IAM). Example with `semgrep` in diff-aware mode (run with `--baseline-commit origin/main` so only findings introduced by the PR are reported):

```yaml
rules:
  - id: auth-guard-removed
    message: "Route handler is missing an isAdmin guard"
    severity: ERROR
    languages: [javascript]
    patterns:
      - pattern: |
          app.$METHOD($ROUTE, ($REQ, $RES) => { ... })
      - pattern-not: |
          app.$METHOD($ROUTE, ($REQ, $RES) => {
            ...
            if (isAdmin($REQ.user)) { ... }
            ...
          })
```

Fail automerge if the rule triggers.
- Static analysis. Run `semgrep`, `bandit`, and `eslint --max-warnings=0` on every PR. Block merges on new findings.
- Test coverage gates. Require patch coverage above 80%. Tools like `nyc` and `pytest --cov-report=xml`, plus `diff-cover`, evaluate only the touched lines.
- Threat-modeling tags. Label files (e.g., `auth/*`, `acl.go`) as critical; automerge is disabled for them.
- Signature policy. The bot's PR description must include the LLM prompt and a response summary, stored as an artifact for audit.
- Playground stage. Apply the patch to a staging branch and run high-level integration tests (smoke, security) before merging to main.
GitHub Actions snippet:

```yaml
- name: Check critical file changes
  run: |
    if git diff --name-only origin/main...HEAD | grep -E 'auth|acl|policy'; then
      echo "Critical change, automerge disabled" && exit 1
    fi
```
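The patch-coverage gate can live in the same workflow. A sketch assuming an earlier test step already wrote `coverage.xml`; the `diff-cover` flags shown are its real CLI options, and the threshold mirrors the 80% target above:

```yaml
- name: Enforce patch coverage
  run: |
    pip install diff-cover
    diff-cover coverage.xml --compare-branch=origin/main --fail-under=80
```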
Alprina Integration
Use Alprina's policy engine to fail bot PRs that lack approvals or coverage, or that touch sensitive modules.
Testing & Verification
- Unit test Semgrep policies with sample diffs.
- Write CI tests ensuring the automerge workflow respects CODEOWNERS.
- Chaos drills: intentionally push a malicious bot PR and verify the pipeline blocks it.
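For the first item, semgrep's built-in test harness works well: put an annotated fixture next to the rule file and run `semgrep --test`. A sketch against the `auth-guard-removed` rule (the route handlers are illustrative):

```javascript
// auth-guard-removed.js — fixture consumed by `semgrep --test`
// ruleid: auth-guard-removed
app.post("/admin/users/:id/delete", (req, res) => {
  deleteUser(req.params.id); // guard stripped: the rule must fire here
});

// ok: auth-guard-removed
app.get("/admin/roles", (req, res) => {
  if (isAdmin(req.user)) {
    res.json(listRoles());
  }
});
```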
Common Questions
Does this kill bot productivity? Only in risky areas. Let bots merge documentation, formatting, and dependency bumps while humans review security-sensitive code.
How should prompts be stored? Upload them to artifact storage (e.g., S3) and reference the artifact ID in the PR body.
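One way to wire this up, as a Python sketch: the key layout and helper names are assumptions of this example, and the S3 client is injected (e.g. `boto3.client("s3")`) so the key-building logic stays testable without AWS:

```python
import datetime
import hashlib
import json

def build_artifact_key(pr_number: int, prompt: str) -> str:
    """Deterministic S3 key for the audit record (layout is an assumption)."""
    digest = hashlib.sha256(prompt.encode("utf-8")).hexdigest()[:12]
    day = datetime.date.today().isoformat()
    return f"llm-audit/{day}/pr-{pr_number}-{digest}.json"

def store_prompt_artifact(s3, bucket: str, pr_number: int,
                          prompt: str, response_summary: str) -> str:
    """Upload prompt + response summary; return the key to cite in the PR body."""
    key = build_artifact_key(pr_number, prompt)
    body = json.dumps({"prompt": prompt, "response_summary": response_summary})
    # boto3's put_object takes Bucket/Key/Body keyword arguments.
    s3.put_object(Bucket=bucket, Key=key, Body=body.encode("utf-8"))
    return key
```

The returned key then goes into the PR body (for example a `Prompt-Artifact: <key>` line, a convention of this sketch) so auditors can pull the exact prompt later.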
What about other SCMs? The same idea applies to GitLab and Bitbucket: use merge request approvals plus custom pipelines.
Conclusion
AI bots can clean code, but they need chaperones. Layer semantic diff checks, coverage gates, and CODEOWNER reviews so vibe-coded patches never strip auth again.