Browse all docs

Anvil Registry / Operations

LLM integration

Anvil can call an external LLM reviewer to summarize package risk evidence. This is optional. The deterministic policy engine still owns enforcement.

Use LLM review for reviewer context:

  • Explain why a lifecycle-script change looks suspicious.
  • Summarize risky file, dependency, metadata, and download-stat signals.
  • Suggest which evidence a human should inspect first.
  • Add quarantine-level context for messy packages.

Do not use LLM review as a permission engine. A model recommendation cannot make a package safe, cannot bypass policy, and cannot approve an override. That job still belongs to deterministic rules and audited humans, because apparently we would like the security product to contain security.

What runs where

The gateway and Admin service can enqueue LLM review jobs. The worker calls the provider and persists validated review output.

Component Role
Gateway Exposes POST /-/anvil/llm-review for admin-token-gated manual review requests.
Admin Lets reviewers queue LLM review from package detail pages and reads persisted review records.
CLI Wraps the route with anvil llm-review package@version.
Worker Calls the configured LLM endpoint and stores schema-valid review output.
Policy engine Treats high or critical LLM risk as quarantine context, not as an allow authority.

Provider credentials are needed only by the worker. Do not expose LLM_REVIEW_API_KEY to the gateway, Admin browser code, public docs routes, or NEXT_PUBLIC_* variables.

Enable review locally

The default Compose stack keeps LLM review disabled. To run the local mock provider:

pnpm smoke:llm-review

That smoke test starts the Compose stack with the llm-review profile, queues a forced review, waits for the worker, and verifies that Admin can read the persisted review.

For a manual local run:

LLM_REVIEW_ENABLED=true \
LLM_REVIEW_PROVIDER=mock \
LLM_REVIEW_MODEL=risk-reviewer \
LLM_REVIEW_ENDPOINT=http://llm-review-mock:8787/review \
docker compose -f infra/docker/docker-compose.yml --profile llm-review up -d --build llm-review-mock gateway worker admin

Then queue a review:

ANVIL_REGISTRY_URL=http://localhost:4873 \
ANVIL_ADMIN_TOKEN=local-dev-token \
  anvil llm-review is-number@7.0.0 --requested-by security-review --priority high

Inspect the result:

ANVIL_REGISTRY_URL=http://localhost:4873 anvil explain is-number@7.0.0

Environment variables

Variable Default Used by Purpose
LLM_REVIEW_ENABLED false Gateway, Admin, worker Enables queueing and worker review behavior.
LLM_REVIEW_PROVIDER unset Gateway, Admin, worker Provider label stored with review records.
LLM_REVIEW_MODEL unset Gateway, Admin, worker Model label sent to the provider and stored with review records.
LLM_REVIEW_ENDPOINT unset Gateway, Admin, worker HTTP endpoint for structured review calls. Required when enabled.
LLM_REVIEW_API_KEY unset Worker only Optional bearer token for the provider endpoint.
LLM_REVIEW_RUN_ON_UNKNOWN_PACKAGES false Worker Automatically run LLM review for unknown package analysis.
LLM_REVIEW_RUN_ON_QUARANTINE false Worker Automatically run LLM review for quarantined analysis results.
LLM_REVIEW_INCLUDE_PRIVATE_PACKAGES false Worker Allows private package metadata to be sent to the provider.

Private package metadata is excluded by default. Only enable LLM_REVIEW_INCLUDE_PRIVATE_PACKAGES=true when the provider, workspace policy, and package owners all allow that data flow. "The endpoint seemed friendly" is not a data-processing agreement.

Provider contract

Anvil posts JSON to LLM_REVIEW_ENDPOINT:

{
  "model": "risk-reviewer",
  "input": {
    "packageName": "example",
    "version": "1.2.3"
  },
  "instructions": "Return only JSON matching the Anvil LlmRiskReview schema. Do not recommend allow solely because evidence is inconclusive."
}

The exact input object includes package evidence collected by the worker, such as metadata, static findings, dependency changes, and available download signals.

The provider response must contain either the review object directly or a review object:

{
  "review": {
    "riskLevel": "high",
    "confidence": "medium",
    "summary": "Install script behavior needs manual review.",
    "suspectedRiskTypes": ["install_script_abuse"],
    "evidence": [
      {
        "signal": "NEW_INSTALL_SCRIPT",
        "explanation": "A patch release introduced a postinstall script.",
        "source": "package_json"
      }
    ],
    "recommendedAction": "quarantine"
  }
}

Accepted risk levels are low, medium, high, and critical. Accepted recommendations are allow, warn, quarantine, and block.

Anvil validates model-shaped output before storing it. Invalid JSON, malformed schema output, or non-2xx provider responses are treated as unavailable review context. The worker records that the review was unavailable; it does not turn model failure into a secret allow.

Manual review route

The gateway exposes:

POST /-/anvil/llm-review
Authorization: Bearer <admin-token>
Content-Type: application/json

Single target:

{
  "name": "example",
  "version": "1.2.3",
  "requestedBy": "security-review",
  "priority": "high"
}

Multiple targets:

{
  "targets": [
    { "name": "example", "version": "1.2.3" },
    { "name": "@scope/pkg", "version": "4.5.6" }
  ],
  "requestedBy": "security-review",
  "priority": "high"
}

The route validates, trims, deduplicates, and enqueues review jobs with runLlmReview: true. It records llm_review.enqueued audit events. It does not approve the package, bypass private-package controls, or skip static analysis.

CLI equivalent:

ANVIL_REGISTRY_URL=http://localhost:4873 \
ANVIL_ADMIN_TOKEN=local-dev-token \
  anvil llm-review example@1.2.3 --requested-by security-review --priority high

Hosted deployment

SST passes LLM review environment values into gateway, worker, and Admin. The LlmReviewApiKey secret is linked only to the worker.

Before deploying with review enabled, run:

pnpm --dir infra/sst test

The preflight catches partially enabled LLM review, such as setting LLM_REVIEW_ENABLED=true without a provider endpoint.

Troubleshooting

Symptom Check
ANVIL_LLM_REVIEW_DISABLED Confirm LLM_REVIEW_ENABLED=true is set for gateway, Admin, and worker.
ANVIL_LLM_REVIEW_INVALID Check the request body uses name and version, or a targets array with those fields.
Jobs enqueue but no review appears Check worker logs, LLM_REVIEW_ENDPOINT, and provider response shape.
Private package review is skipped This is the default. Set LLM_REVIEW_INCLUDE_PRIVATE_PACKAGES=true only after approving that data flow.
Provider returns text instead of JSON Make the provider return the schema object directly or inside review.

When in doubt, run pnpm smoke:llm-review against the local mock first. It removes provider drama from the equation, which is rude to drama but helpful to debugging.