Fortifying the Inference Supply Chain



Hook - The Adapter That Started Dropping Discord Links
We ship a LoRA adapter every Friday to keep our support chatbot on-brand. One sprint, a developer grabbed a community adapter, tweaked it, and merged it straight into the S3 bucket that backs production. No checksums, no signatures, no staging soak. Within a day, users noticed the bot occasionally appending a Discord invite link when conversations mentioned billing. The adapter's author had embedded a "payload" vector triggered by certain tokens. Because our serving containers happily loaded any .safetensors file they were given, the malicious adapter went live, and we spent the weekend tracing prompts and purging caches. That incident hammered home that model supply chain risk looks a lot like package-manager supply chain risk, except that the payload runs inside our inference runtime with access to user prompts.
Problem Deep Dive - Layers to Worry About
Inference pipelines aren't just model weights. Each release typically includes:
- Base model + tokenizer. Often pulled from Hugging Face or an internal registry.
- Adapters/LoRAs. Smaller weight files created by fine-tuning.
- Runtime containers. Images with PyTorch, TensorRT, Triton, or vLLM.
- Serving config. YAML/JSON describing prompt templates, retrieval connectors, allowed egress domains.
- Glue code. Python/Rust microservices orchestrating RAG, caching, logging.
An attacker only needs to tamper with one layer to alter behavior. Flip bits in weights to leak secrets, poison adapters with triggers, add outbound HTTP calls in the runtime, or change RAG config to fetch context from an attacker-controlled URL.
Technical Solutions - Build a Model BOM and Enforce It
1. Require a Model Bill of Materials (MBOM)
Treat each release like firmware. Capture every artifact, checksum, and signature in a declarative file:
model: support-chat-lora
base_model: mistral-7b-instruct@sha256:abc123
adapter: s3://llm-artifacts/lora/billing-tone.v6.safetensors
runtime_image: quay.io/org/triton:23.08-slim
files:
  - path: adapter.safetensors
    sha256: 7c5e...
  - path: tokenizer.json
    sha256: e09b...
signing_key: kms/model-signing
Store MBOMs in git and require CI to verify them before promotion.
2. Sign Everything
- Weights/adapters: use safetensors + SHA256; sign the digest with Sigstore (cosign sign-blob).
- Containers: sign OCI images via cosign sign <image>.
- Configs: wrap with an in-toto layout so pipelines attest to producing them.
Deployment pipeline step (note: cosign 2.x keyless verification also requires a --certificate-identity; the identity below is illustrative):
cosign verify \
  --certificate-oidc-issuer https://accounts.google.com \
  --certificate-identity release-bot@org.example \
  quay.io/org/triton:23.08-slim
python verify_weights.py --bom model.mbom
Reject releases if verification fails.
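The core of verify_weights.py can stay small. A minimal sketch, assuming PyYAML and the MBOM layout from step 1 (field names are whatever your BOM schema uses):

import hashlib
import sys
import yaml  # PyYAML

def sha256_of(path):
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # Stream in 1 MiB chunks so multi-gigabyte weight files don't blow up memory
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def verify_bom(bom_path):
    with open(bom_path) as f:
        bom = yaml.safe_load(f)
    ok = True
    for entry in bom["files"]:
        actual = sha256_of(entry["path"])
        if actual != entry["sha256"]:
            print(f"digest mismatch for {entry['path']}: got {actual}")
            ok = False
    return ok

if __name__ == "__main__":
    sys.exit(0 if verify_bom(sys.argv[1]) else 1)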
3. Static and Dynamic Scanning for Weights
Static:
- Ensure safetensors metadata matches expectations (tensor names, shapes).
- Enforce an allowlist of tensor names so adversaries can't slip in extra parameters.
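Both static checks can run without loading any tensors: a safetensors file begins with an 8-byte little-endian length followed by a JSON header describing every tensor, so a scanner only needs to parse that header. A sketch, with an illustrative allowlist:

import json
import struct

# Illustrative allowlist: tensor name -> expected shape
EXPECTED_TENSORS = {
    "lora_A.weight": [8, 4096],
    "lora_B.weight": [4096, 8],
}

def read_safetensors_header(path):
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))  # little-endian u64 header size
        return json.loads(f.read(header_len))

def check_adapter(path):
    header = read_safetensors_header(path)
    tensors = {k: v for k, v in header.items() if k != "__metadata__"}
    extra = set(tensors) - set(EXPECTED_TENSORS)
    if extra:
        raise ValueError(f"unexpected tensors: {sorted(extra)}")
    for name, shape in EXPECTED_TENSORS.items():
        if tensors.get(name, {}).get("shape") != shape:
            raise ValueError(f"missing tensor or shape mismatch: {name}")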
Dynamic:
- Run staged prompts covering high-risk triggers (billing, password reset, secrets) and diff outputs against previous release. Alert on new links, URLs, or policy deviations.
- Measure embedding norms/entropy; big swings often signal poisoning.
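A cheap member of the diff family is flagging any URL that appears in the candidate release's output but not the previous one's. A sketch, assuming old_model and new_model are callables wrapping your serving clients:

import re

URL = re.compile(r"https?://\S+")

def new_links(old_reply, new_reply):
    # URLs present in the new release's answer but absent from the old one's
    return set(URL.findall(new_reply)) - set(URL.findall(old_reply))

def run_trigger_suite(prompts, old_model, new_model):
    alerts = []
    for prompt in prompts:
        links = new_links(old_model(prompt), new_model(prompt))
        if links:
            alerts.append((prompt, sorted(links)))
    return alerts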
4. Harden Runtime Containers
- Build containers via reproducible Bazel or Nix expressions.
- Drop Linux capabilities, run as non-root.
- Scan with trivy/grype; block critical CVEs (TensorRT, cuDNN, NCCL).
- Freeze base images; never pull latest.
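If the containers run on Kubernetes, the non-root and capability rules translate directly into a container securityContext; a sketch with illustrative values:

securityContext:
  runAsNonRoot: true
  runAsUser: 10001
  allowPrivilegeEscalation: false
  readOnlyRootFilesystem: true
  capabilities:
    drop: ["ALL"]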
5. Lock Down Retrieval and Egress
For RAG stacks, enforce an egress policy:
egress:
  allow:
    - https://vector-db.internal
    - https://docs.internal
  deny:
    - "*"
Any attempt to fetch context from unapproved domains fails fast.
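Network policy should be the hard boundary, but the RAG glue code can fail fast on its own. A minimal guard, assuming the allowlist mirrors the egress config above:

import urllib.parse

# Mirror of the egress allowlist above
ALLOWED_HOSTS = {"vector-db.internal", "docs.internal"}

def guarded_fetch(url, fetch):
    # fetch is your HTTP client callable, e.g. requests.get
    host = urllib.parse.urlsplit(url).hostname
    if host not in ALLOWED_HOSTS:
        raise PermissionError(f"egress to {host!r} is not on the allowlist")
    return fetch(url)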
6. Runtime Provenance + Drift Monitoring
Have the serving layer emit provenance headers (x-model-sha, x-bom-id). Store them with logs. Monitor for drift: if a pod serves responses with an unknown hash, pull it out of rotation.
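A sketch of the emitting side, assuming a FastAPI shim in front of the model server (the constants would be resolved from the verified MBOM at startup):

from fastapi import FastAPI, Request

app = FastAPI()
MODEL_SHA = "sha256:abc123"        # resolved from the verified MBOM at startup
BOM_ID = "support-chat-lora/v6"    # illustrative identifier

@app.middleware("http")
async def provenance_headers(request: Request, call_next):
    response = await call_next(request)
    response.headers["x-model-sha"] = MODEL_SHA
    response.headers["x-bom-id"] = BOM_ID
    return response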
Testing & Verification
- Unit tests for verify_weights.py using tampered files (extra tensors, wrong shapes).
- Integration tests in staging: deploy the new MBOM, run the contract test suite, compare responses.
- Chaos drills: attempt to deploy unsigned adapters; the pipeline must fail.
- Runtime monitors: a cron job queries /healthz to confirm x-model-sha matches the BOM.
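The tampered-file unit test can be as blunt as flipping bytes after the BOM is cut. A pytest sketch, assuming the verify_bom helper from the earlier sketch:

import hashlib
import yaml
from verify_weights import verify_bom  # helper sketched in step 1 (module name assumed)

def test_tampered_file_fails(tmp_path):
    artifact = tmp_path / "adapter.safetensors"
    artifact.write_bytes(b"original weights")
    digest = hashlib.sha256(artifact.read_bytes()).hexdigest()
    bom = tmp_path / "model.mbom"
    bom.write_text(yaml.safe_dump(
        {"files": [{"path": str(artifact), "sha256": digest}]}))
    artifact.write_bytes(b"tampered weights")  # tamper after the BOM was cut
    assert verify_bom(str(bom)) is False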
Common Questions
Do we need to sign community weights too? Yes: re-sign them yourself after verifying. Trust-but-verify does not apply; just verify.
How do we detect subtle prompt triggers? Build a regression harness with curated prompts covering sensitive intents. Use diff-based scoring (BLEU, toxicity, custom heuristics) to flag anomalies.
What about latency overhead? Signature checks add milliseconds during deploy, not at runtime. The safety margin is worth it.
Conclusion
Model supply chain security means knowing exactly what artifacts you promote and being able to prove it later. Build a Model BOM, sign weights and containers, scan adapters statically and dynamically, lock down egress, and monitor runtime hashes. Once those systems are in place, grabbing community adapters becomes a controlled experiment, not a weekend incident.