Fortifying the Inference Supply Chain



Hook - The Adapter That Started Dropping Discord Links
We ship a LoRA adapter every Friday to keep our support chatbot on-brand. One sprint, a developer grabbed a community adapter, tweaked it, and merged it straight into the S3 bucket that backs production. No checksums, no signatures, no staging soak. Within a day, users noticed the bot occasionally appending a Discord invite link when conversations mentioned billing. The adapter's author had embedded a "payload" vector triggered by certain tokens. Because our serving containers happily loaded any .safetensors file they were given, the malicious adapter went live, and we spent the weekend tracing prompts and purging caches. That incident hammered home that model supply chain risk looks a lot like package-manager supply chain risk, except that the payload runs inside our inference runtime with access to user prompts.
Problem Deep Dive - Layers to Worry About
Inference pipelines aren't just model weights. Each release typically includes:
- Base model + tokenizer. Often pulled from Hugging Face or an internal registry.
- Adapters/LoRAs. Smaller weight files created by fine-tuning.
- Runtime containers. Images with PyTorch, TensorRT, Triton, or vLLM.
- Serving config. YAML/JSON describing prompt templates, retrieval connectors, allowed egress domains.
- Glue code. Python/Rust microservices orchestrating RAG, caching, logging.
An attacker only needs to tamper with one layer to alter behavior. Flip bits in weights to leak secrets, poison adapters with triggers, add outbound HTTP calls in the runtime, or change RAG config to fetch context from an attacker-controlled URL.
Technical Solutions - Build a Model BOM and Enforce It
1. Require a Model Bill of Materials (MBOM)
Treat each release like firmware. Capture every artifact, checksum, and signature in a declarative file:
model: support-chat-lora
base_model: mistral-7b-instruct@sha256:abc123
adapter: s3://llm-artifacts/lora/billing-tone.v6.safetensors
runtime_image: quay.io/org/triton:23.08-slim
files:
  - path: adapter.safetensors
    sha256: 7c5e...
  - path: tokenizer.json
    sha256: e09b...
signing_key: kms/model-signing
Store MBOMs in git and require CI to verify them before promotion.
2. Sign Everything
- Weights/adapters: use safetensors + SHA256; sign the digest with Sigstore (cosign sign-blob).
- Containers: sign OCI images via cosign sign <image>.
- Configs: wrap with an in-toto layout so pipelines attest to producing them.
Deployment pipeline step (note: cosign 2.x keyless verification also requires a --certificate-identity; the identity below is illustrative):
cosign verify \
  --certificate-oidc-issuer https://accounts.google.com \
  --certificate-identity release-bot@org.example \
  quay.io/org/triton:23.08-slim
python verify_weights.py --bom model.mbom
Reject releases if verification fails.
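The core of verify_weights.py can stay small. A minimal sketch, assuming PyYAML and the MBOM layout from step 1 (field names are whatever your BOM schema uses):

import hashlib
import sys
import yaml  # PyYAML

def sha256_of(path):
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # Stream in 1 MiB chunks so multi-gigabyte weight files don't blow up memory
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def verify_bom(bom_path):
    with open(bom_path) as f:
        bom = yaml.safe_load(f)
    ok = True
    for entry in bom["files"]:
        actual = sha256_of(entry["path"])
        if actual != entry["sha256"]:
            print(f"digest mismatch for {entry['path']}: got {actual}")
            ok = False
    return ok

if __name__ == "__main__":
    sys.exit(0 if verify_bom(sys.argv[1]) else 1)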
3. Static and Dynamic Scanning for Weights
Static:
- Ensure safetensors metadata matches expectations (tensor names, shapes).
- Enforce an allowlist of tensor names so adversaries can't slip in extra parameters.
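Both static checks can run without loading any tensors: a safetensors file begins with an 8-byte little-endian length followed by a JSON header describing every tensor, so a scanner only needs to parse that header. A sketch, with an illustrative allowlist:

import json
import struct

# Illustrative allowlist: tensor name -> expected shape
EXPECTED_TENSORS = {
    "lora_A.weight": [8, 4096],
    "lora_B.weight": [4096, 8],
}

def read_safetensors_header(path):
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))  # little-endian u64 header size
        return json.loads(f.read(header_len))

def check_adapter(path):
    header = read_safetensors_header(path)
    tensors = {k: v for k, v in header.items() if k != "__metadata__"}
    extra = set(tensors) - set(EXPECTED_TENSORS)
    if extra:
        raise ValueError(f"unexpected tensors: {sorted(extra)}")
    for name, shape in EXPECTED_TENSORS.items():
        if tensors.get(name, {}).get("shape") != shape:
            raise ValueError(f"missing tensor or shape mismatch: {name}")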
Dynamic:
- Run staged prompts covering high-risk triggers (billing, password reset, secrets) and diff outputs against previous release. Alert on new links, URLs, or policy deviations.
- Measure embedding norms/entropy; big swings often signal poisoning.
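A cheap member of the diff family is flagging any URL that appears in the candidate release's output but not the previous one's. A sketch, assuming old_model and new_model are callables wrapping your serving clients:

import re

URL = re.compile(r"https?://\S+")

def new_links(old_reply, new_reply):
    # URLs present in the new release's answer but absent from the old one's
    return set(URL.findall(new_reply)) - set(URL.findall(old_reply))

def run_trigger_suite(prompts, old_model, new_model):
    alerts = []
    for prompt in prompts:
        links = new_links(old_model(prompt), new_model(prompt))
        if links:
            alerts.append((prompt, sorted(links)))
    return alerts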
4. Harden Runtime Containers
- Build containers via reproducible Bazel or Nix expressions.
- Drop Linux capabilities, run as non-root.
- Scan with trivy/grype; block critical CVEs (TensorRT, cuDNN, NCCL).
- Freeze base images; never pull latest.
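If the containers run on Kubernetes, the non-root and capability rules translate directly into a container securityContext; a sketch with illustrative values:

securityContext:
  runAsNonRoot: true
  runAsUser: 10001
  allowPrivilegeEscalation: false
  readOnlyRootFilesystem: true
  capabilities:
    drop: ["ALL"]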
5. Lock Down Retrieval and Egress
For RAG stacks, enforce an egress policy:
egress:
  allow:
    - https://vector-db.internal
    - https://docs.internal
  deny:
    - "*"
Any attempt to fetch context from unapproved domains fails fast.
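Network policy should be the hard boundary, but the RAG glue code can fail fast on its own. A minimal guard, assuming the allowlist mirrors the egress config above:

import urllib.parse

# Mirror of the egress allowlist above
ALLOWED_HOSTS = {"vector-db.internal", "docs.internal"}

def guarded_fetch(url, fetch):
    # fetch is your HTTP client callable, e.g. requests.get
    host = urllib.parse.urlsplit(url).hostname
    if host not in ALLOWED_HOSTS:
        raise PermissionError(f"egress to {host!r} is not on the allowlist")
    return fetch(url)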
6. Runtime Provenance + Drift Monitoring
Have the serving layer emit provenance headers (x-model-sha, x-bom-id). Store them with logs. Monitor for drift: if a pod serves responses with an unknown hash, pull it out of rotation.
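A sketch of the emitting side, assuming a FastAPI shim in front of the model server (the constants would be resolved from the verified MBOM at startup):

from fastapi import FastAPI, Request

app = FastAPI()
MODEL_SHA = "sha256:abc123"        # resolved from the verified MBOM at startup
BOM_ID = "support-chat-lora/v6"    # illustrative identifier

@app.middleware("http")
async def provenance_headers(request: Request, call_next):
    response = await call_next(request)
    response.headers["x-model-sha"] = MODEL_SHA
    response.headers["x-bom-id"] = BOM_ID
    return response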
Testing & Verification
- Unit tests for verify_weights.py using tampered files (extra tensors, wrong shapes).
- Integration tests in staging: deploy the new MBOM, run the contract test suite, compare responses.
- Chaos drills: attempt to deploy unsigned adapters; the pipeline must fail.
- Runtime monitors: a cron job queries /healthz to confirm x-model-sha matches the BOM.
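The tampered-file unit test can be as blunt as flipping bytes after the BOM is cut. A pytest sketch, assuming the verify_bom helper from the earlier sketch:

import hashlib
import yaml
from verify_weights import verify_bom  # helper sketched in step 1 (module name assumed)

def test_tampered_file_fails(tmp_path):
    artifact = tmp_path / "adapter.safetensors"
    artifact.write_bytes(b"original weights")
    digest = hashlib.sha256(artifact.read_bytes()).hexdigest()
    bom = tmp_path / "model.mbom"
    bom.write_text(yaml.safe_dump(
        {"files": [{"path": str(artifact), "sha256": digest}]}))
    artifact.write_bytes(b"tampered weights")  # tamper after the BOM was cut
    assert verify_bom(str(bom)) is False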
Common Questions
Do we need to sign community weights too? Yes: re-sign them yourself after verifying. Trust-but-verify does not apply; just verify.
How do we detect subtle prompt triggers? Build a regression harness with curated prompts covering sensitive intents. Use diff-based scoring (BLEU, toxicity, custom heuristics) to flag anomalies.
What about latency overhead? Signature checks add milliseconds during deploy, not at runtime. The safety margin is worth it.
Conclusion
Model supply chain security means knowing exactly what artifacts you promote and being able to prove it later. Build a Model BOM, sign weights and containers, scan adapters statically and dynamically, lock down egress, and monitor runtime hashes. Once those systems are in place, grabbing community adapters becomes a controlled experiment, not a weekend incident.