Automerge Guardrails for AI-Generated Pull Requests

Hook
Your repo's new "auto-fix" bot takes lint issues, asks an LLM for a patch, opens a PR, and merges once tests pass. Last night it rewrote an Express route to "simplify auth" by removing an isAdmin check. The tests did not cover that path, the PR title looked legitimate, and the bot merged it. Attackers noticed the newly public endpoint minutes later.
The Problem Deep Dive
AI-generated PRs are appealing for vibe-coding chores but come with risks:
- Missing context: the LLM may delete guard clauses or add insecure defaults.
- Test blind spots: a green CI run does not mean the change is safe.
- No reviewers: bots merge instantly, with no human eyes on the diff.
- Poor explainability: commit messages rarely justify the changes.
Technical Solutions
Quick Patch: Require Human Sign-Off
Even for bot PRs, enforce CODEOWNERS review. But we can do better.
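A minimal CODEOWNERS entry that routes sensitive paths to humans (the team names are illustrative):

```
# Security team must review any change touching these paths
auth/**       @org/security-team
acl.go        @org/security-team
policies/**   @org/security-team
```

Pair this with branch protection's "Require review from Code Owners" setting; without it, CODEOWNERS is only a suggestion and the bot can still merge.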
Durable Fix: Policy-Driven Automerge
- Semantic diff filters. Parse the AST diff and flag security-sensitive changes (auth, crypto, IAM). Example with `semgrep` in diff-aware mode (run with `--baseline-commit origin/main` so only findings introduced by the PR are reported):

```yaml
rules:
  - id: auth-guard-removed
    message: "Route handler is missing an isAdmin guard"
    severity: ERROR
    languages: [javascript]
    patterns:
      - pattern: |
          app.$METHOD($ROUTE, ($REQ, $RES) => { ... })
      - pattern-not: |
          app.$METHOD($ROUTE, ($REQ, $RES) => {
            ...
            if (isAdmin($REQ.user)) { ... }
            ...
          })
```

Fail automerge if the rule triggers.
- Static analysis. Run `semgrep`, `bandit`, and `eslint --max-warnings=0` on every PR. Block merges on new findings.
- Test coverage gates. Require patch coverage above 80%. Tools like `nyc` and `pytest --cov-report=xml`, plus `diff-cover`, evaluate only the touched lines.
- Threat-modeling tags. Label files (e.g., `auth/*`, `acl.go`) as critical; automerge is disabled for them.
- Signature policy. The bot's PR description must include the LLM prompt and a response summary, stored as an artifact for audit.
- Playground stage. Apply the patch to a staging branch and run high-level integration tests (smoke, security) before merging to main.
GitHub Actions snippet:

```yaml
- name: Check critical file changes
  run: |
    if git diff --name-only origin/main...HEAD | grep -E 'auth|acl|policy'; then
      echo "Critical change, automerge disabled" && exit 1
    fi
```
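The patch-coverage gate can live in the same workflow. A sketch assuming an earlier test step already wrote `coverage.xml`; the `diff-cover` flags shown are its real CLI options, and the threshold mirrors the 80% target above:

```yaml
- name: Enforce patch coverage
  run: |
    pip install diff-cover
    diff-cover coverage.xml --compare-branch=origin/main --fail-under=80
```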
Alprina Integration
Use Alprina's policy engine to fail bot PRs that lack approvals or coverage, or that touch sensitive modules.
Testing & Verification
- Unit test Semgrep policies with sample diffs.
- Write CI tests ensuring the automerge workflow respects CODEOWNERS.
- Chaos drills: intentionally push a malicious bot PR and verify the pipeline blocks it.
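For the first item, semgrep's built-in test harness works well: put an annotated fixture next to the rule file and run `semgrep --test`. A sketch against the `auth-guard-removed` rule (the route handlers are illustrative):

```javascript
// auth-guard-removed.js — fixture consumed by `semgrep --test`
// ruleid: auth-guard-removed
app.post("/admin/users/:id/delete", (req, res) => {
  deleteUser(req.params.id); // guard stripped: the rule must fire here
});

// ok: auth-guard-removed
app.get("/admin/roles", (req, res) => {
  if (isAdmin(req.user)) {
    res.json(listRoles());
  }
});
```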
Common Questions
Does this kill bot productivity? Only in risky areas. Let bots merge documentation, formatting, and dependency bumps while humans review security-sensitive code.
How should prompts be stored? Upload them to artifact storage (e.g., S3) and reference the artifact ID in the PR body.
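One way to wire this up, as a Python sketch: the key layout and helper names are assumptions of this example, and the S3 client is injected (e.g. `boto3.client("s3")`) so the key-building logic stays testable without AWS:

```python
import datetime
import hashlib
import json

def build_artifact_key(pr_number: int, prompt: str) -> str:
    """Deterministic S3 key for the audit record (layout is an assumption)."""
    digest = hashlib.sha256(prompt.encode("utf-8")).hexdigest()[:12]
    day = datetime.date.today().isoformat()
    return f"llm-audit/{day}/pr-{pr_number}-{digest}.json"

def store_prompt_artifact(s3, bucket: str, pr_number: int,
                          prompt: str, response_summary: str) -> str:
    """Upload prompt + response summary; return the key to cite in the PR body."""
    key = build_artifact_key(pr_number, prompt)
    body = json.dumps({"prompt": prompt, "response_summary": response_summary})
    # boto3's put_object takes Bucket/Key/Body keyword arguments.
    s3.put_object(Bucket=bucket, Key=key, Body=body.encode("utf-8"))
    return key
```

The returned key then goes into the PR body (for example a `Prompt-Artifact: <key>` line, a convention of this sketch) so auditors can pull the exact prompt later.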
What about other SCMs? The same idea applies to GitLab and Bitbucket: use merge request approvals plus custom pipelines.
Conclusion
AI bots can clean code, but they need chaperones. Layer semantic diff checks, coverage gates, and CODEOWNER reviews so vibe-coded patches never strip auth again.