Worker analysis
Anvil Registry queues expensive analysis outside the install request path. The worker unpacks tarballs, compares versions, and produces evidence that feeds deterministic policy decisions.
What the worker analyses
Manifest checks
- Package name, version, and description changes.
- `scripts` field: new or changed lifecycle scripts.
- `dependencies`, `devDependencies`, `peerDependencies`, `optionalDependencies` changes.
- New dependencies added in a patch version.
- `bin`, `files`, `repository`, `license`, and `maintainer` changes.
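The script and dependency checks above can be sketched as a diff between two manifests. This is a minimal illustration, not the worker's actual implementation: `ManifestSlice`, `ManifestChange`, and `diffManifest` are hypothetical names, and only `scripts` and `dependencies` are compared.

```typescript
// Hypothetical, minimal manifest shape; a real package.json carries more fields.
interface ManifestSlice {
  scripts?: Record<string, string>;
  dependencies?: Record<string, string>;
}

interface ManifestChange {
  kind: "script-added" | "script-changed" | "dependency-added";
  key: string;
}

// Own-property check, so keys such as "toString" are not mistaken for entries.
function hasKey(obj: Record<string, string>, key: string): boolean {
  return Object.prototype.hasOwnProperty.call(obj, key);
}

// List scripts that are new or changed and dependencies that are new in `next`.
function diffManifest(prev: ManifestSlice, next: ManifestSlice): ManifestChange[] {
  const changes: ManifestChange[] = [];
  const prevScripts = prev.scripts ?? {};
  const nextScripts = next.scripts ?? {};
  for (const key of Object.keys(nextScripts)) {
    if (!hasKey(prevScripts, key)) {
      changes.push({ kind: "script-added", key });
    } else if (prevScripts[key] !== nextScripts[key]) {
      changes.push({ kind: "script-changed", key });
    }
  }
  const prevDeps = prev.dependencies ?? {};
  for (const key of Object.keys(next.dependencies ?? {})) {
    if (!hasKey(prevDeps, key)) changes.push({ kind: "dependency-added", key });
  }
  return changes;
}
```

The same loop would extend naturally to `devDependencies`, `peerDependencies`, and `optionalDependencies`.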
Install script checks
Lifecycle scripts are the most common install-time risk surface:
- `preinstall`
- `install`
- `postinstall`
- `prepare`
- `prepublish`
- `prepublishOnly`
The worker flags new or changed scripts and scans their contents for suspicious patterns.
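As a rough sketch of that flag-then-scan step, the function below reports lifecycle hooks that are new or changed and then checks their bodies against a few patterns. The pattern list and the names `reviewLifecycleScripts` and `ScriptFinding` are assumptions for illustration; the worker's real rule set is not documented here.

```typescript
const LIFECYCLE_HOOKS = [
  "preinstall", "install", "postinstall", "prepare", "prepublish", "prepublishOnly",
];

// Illustrative patterns only (assumed); not the worker's actual rules.
const SUSPICIOUS_SCRIPT_PATTERNS: Array<{ id: string; pattern: RegExp }> = [
  { id: "pipe-to-shell", pattern: /\|\s*(sh|bash)\b/ },
  { id: "remote-fetch", pattern: /\b(curl|wget)\b/ },
];

interface ScriptFinding {
  hook: string;
  reason: string; // "new", "changed", or a pattern id
}

type ScriptMap = Record<string, string | undefined>;

function reviewLifecycleScripts(prev: ScriptMap, next: ScriptMap): ScriptFinding[] {
  const findings: ScriptFinding[] = [];
  for (const hook of LIFECYCLE_HOOKS) {
    const body = next[hook];
    // Skip hooks that are absent or identical to the previous version.
    if (body === undefined || body === prev[hook]) continue;
    findings.push({ hook, reason: prev[hook] === undefined ? "new" : "changed" });
    for (const { id, pattern } of SUSPICIOUS_SCRIPT_PATTERNS) {
      if (pattern.test(body)) findings.push({ hook, reason: id });
    }
  }
  return findings;
}
```

Scripts outside the lifecycle list (for example `test`) are ignored here because they do not run at install time.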
Code pattern checks
Static analysis of the unpacked tarball looks for:
- `child_process` usage (`exec`, `spawn`, `fork`).
- Direct `process.env` access.
- `fs` module usage in install paths.
- `http`/`https` or `fetch` calls in install paths.
- `net.connect` or `dns` lookups.
- `eval`, `Function`, or `setTimeout` with string arguments.
- `Buffer.from(base64)` decoding followed by execution.
- Shell piping (`| sh`, `| bash`).
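A line-by-line regex scan is one simple way to approximate a few of the checks listed above. The rules below are assumed examples covering only some of the list, and a regex scan is coarser than a parser-based analysis, so treat this as a sketch rather than the worker's method. `CODE_RULES`, `CodeSignal`, and `scanSource` are hypothetical names.

```typescript
interface PatternRule {
  id: string;
  pattern: RegExp;
}

// Illustrative rules (assumed). Regexes are coarse and will produce false positives.
const CODE_RULES: PatternRule[] = [
  {
    id: "child-process",
    pattern: /require\(\s*['"](node:)?child_process['"]\s*\)|from\s+['"](node:)?child_process['"]/,
  },
  { id: "env-access", pattern: /\bprocess\.env\b/ },
  { id: "dynamic-eval", pattern: /\beval\s*\(|\bnew\s+Function\s*\(/ },
  { id: "base64-decode", pattern: /Buffer\.from\([^)]*['"]base64['"]\s*\)/ },
];

interface CodeSignal {
  rule: string;
  file: string;
  line: number; // 1-based line number within the file
}

// Test every rule against every line and record where each rule matched.
function scanSource(file: string, source: string): CodeSignal[] {
  const signals: CodeSignal[] = [];
  source.split(/\r?\n/).forEach((text, index) => {
    for (const rule of CODE_RULES) {
      if (rule.pattern.test(text)) {
        signals.push({ rule: rule.id, file, line: index + 1 });
      }
    }
  });
  return signals;
}
```

Recording the file and line for each match is what lets a report point back to file references.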
File tree checks
- New binary or executable files.
- Unexpected size changes.
- Minified or obfuscated files.
- Encoded blobs (base64 strings in non-binary files).
- Hidden files (dot-prefixed).
- Unusual paths (traversal attempts, temp directories).
- Credential-looking files (`.npmrc`, `.ssh`, `.aws`, `.env`).
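The path-based checks in this list (hidden files, traversal attempts, credential-looking names, executables) can be sketched as a per-entry classifier. The `TreeEntry` shape and flag names are assumptions; size, minification, and encoded-blob checks need file contents or a previous version and are left out.

```typescript
interface TreeEntry {
  path: string; // path inside the tarball, "/"-separated
  mode: number; // POSIX permission bits from the tar header (file entries only)
}

const CREDENTIAL_NAMES = [".npmrc", ".ssh", ".aws", ".env"];

// Return the flags that apply to a single file entry.
function classifyEntry(entry: TreeEntry): string[] {
  const flags: string[] = [];
  const segments = entry.path.split("/");
  if (entry.path.startsWith("/") || segments.indexOf("..") !== -1) {
    flags.push("path-traversal");
  }
  // Dot-prefixed segment, excluding the "." and ".." navigation segments.
  if (segments.some((s) => s.length > 1 && s.startsWith(".") && s !== "..")) {
    flags.push("hidden");
  }
  if (segments.some((s) => CREDENTIAL_NAMES.indexOf(s) !== -1)) {
    flags.push("credential-like");
  }
  // Any of the owner, group, or other execute bits set.
  if ((entry.mode & 0o111) !== 0) {
    flags.push("executable");
  }
  return flags;
}
```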
Name-squatting checks
The worker compares low-adoption package names against the popular package index:
- Typo variants: missing character, extra character, transposed characters.
- Hyphen and underscore swaps.
- Pluralisation differences.
- Visual similarity (homoglyphs).
- Scope confusion (`@scope/pkg` vs `@scop/pkg`).
- Ecosystem confusion (package names that mimic well-known tools).
Algorithms used: Damerau-Levenshtein distance, Jaro-Winkler similarity, token normalisation.
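To illustrate the distance-based part, here is a sketch of the optimal string alignment variant of Damerau-Levenshtein (each substring is edited at most once), combined with a simple normalisation that treats hyphens, underscores, and dots alike. The threshold of 1 and the names `osaDistance`, `normaliseName`, and `findSquatTarget` are assumptions; Jaro-Winkler and homoglyph handling are not shown.

```typescript
// Optimal string alignment distance: insert, delete, substitute, or
// transpose two adjacent characters, each at cost 1.
function osaDistance(a: string, b: string): number {
  const rows = a.length + 1;
  const cols = b.length + 1;
  const d: number[][] = [];
  for (let i = 0; i < rows; i++) {
    const row = new Array<number>(cols).fill(0);
    row[0] = i;
    d.push(row);
  }
  for (let j = 0; j < cols; j++) d[0][j] = j;
  for (let i = 1; i < rows; i++) {
    for (let j = 1; j < cols; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      d[i][j] = Math.min(d[i - 1][j] + 1, d[i][j - 1] + 1, d[i - 1][j - 1] + cost);
      // Adjacent transposition, e.g. "sh" vs "hs".
      if (i > 1 && j > 1 && a[i - 1] === b[j - 2] && a[i - 2] === b[j - 1]) {
        d[i][j] = Math.min(d[i][j], d[i - 2][j - 2] + 1);
      }
    }
  }
  return d[a.length][b.length];
}

// Lower-case and map "_" and "." to "-" so separator swaps compare equal.
function normaliseName(pkg: string): string {
  return pkg.toLowerCase().replace(/[_.]/g, "-");
}

// Return the first popular name within maxDistance of the candidate, if any.
function findSquatTarget(candidate: string, popular: string[], maxDistance = 1): string | undefined {
  const normalised = normaliseName(candidate);
  for (const target of popular) {
    if (target === candidate) continue; // the popular package itself
    if (osaDistance(normalised, normaliseName(target)) <= maxDistance) return target;
  }
  return undefined;
}
```

For example, `findSquatTarget("lodahs", ["react", "lodash"])` returns `"lodash"`, because the two names differ by one transposition.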
Comparison strategy
By default, the worker compares the target version against the previous three versions. This surfaces:
- What changed.
- Whether the change fits the release type (patch, minor, major).
- Whether install-time behaviour changed when runtime behaviour should not have.
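A minimal sketch of the version selection and release-type classification follows. It assumes plain `MAJOR.MINOR.PATCH` versions without prerelease tags, and the rule in `fitsReleaseType` (a patch should not add install scripts or dependencies) is an assumed example of a fit check. All function names are hypothetical.

```typescript
type ReleaseType = "major" | "minor" | "patch" | "none";

// Parse "1.2.3" into numeric parts; prerelease and build tags are not handled.
function parseVersion(v: string): [number, number, number] {
  const m = /^(\d+)\.(\d+)\.(\d+)$/.exec(v);
  if (!m) throw new Error("unsupported version: " + v);
  return [Number(m[1]), Number(m[2]), Number(m[3])];
}

function compareVersions(a: string, b: string): number {
  const pa = parseVersion(a);
  const pb = parseVersion(b);
  for (let i = 0; i < 3; i++) {
    if (pa[i] !== pb[i]) return pa[i] - pb[i];
  }
  return 0;
}

// The `count` versions immediately below `target`, newest first.
function previousVersions(all: string[], target: string, count = 3): string[] {
  return all
    .filter((v) => compareVersions(v, target) < 0)
    .sort(compareVersions)
    .slice(-count)
    .reverse();
}

function releaseType(prev: string, next: string): ReleaseType {
  const [a1, b1, c1] = parseVersion(prev);
  const [a2, b2, c2] = parseVersion(next);
  if (a1 !== a2) return "major";
  if (b1 !== b2) return "minor";
  if (c1 !== c2) return "patch";
  return "none";
}

// Assumed rule for illustration: patch releases should not add scripts or dependencies.
function fitsReleaseType(type: ReleaseType, changeKinds: string[]): boolean {
  if (type !== "patch") return true;
  return !changeKinds.some((k) => k === "script-added" || k === "dependency-added");
}
```

Numeric comparison matters here: `1.10.0` sorts after `1.9.0`, which a plain string sort would get wrong.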
Analysis output
The worker produces a structured report stored in Postgres and linked to the package decision:
- Signal list with severity.
- File references where available.
- Diff summaries against previous versions.
- Name-squatting match results.
- Provenance status.
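One possible shape for such a report is sketched below. Every field name, the severity levels, and the provenance values are assumptions made for illustration; the actual Postgres schema is not described here.

```typescript
type Severity = "low" | "medium" | "high";

interface ReportSignal {
  id: string;
  severity: Severity;
  file?: string; // file reference, where available
}

// Hypothetical report shape mirroring the list above.
interface AnalysisReport {
  packageName: string;
  version: string;
  signals: ReportSignal[];
  diffSummaries: Array<{ againstVersion: string; summary: string }>;
  nameSquatting: Array<{ target: string; distance: number }>;
  provenance: "verified" | "unverified" | "missing";
}

const SEVERITY_ORDER: Severity[] = ["low", "medium", "high"];

// Highest severity among the report's signals, or undefined when there are none.
function highestSeverity(report: AnalysisReport): Severity | undefined {
  let best: Severity | undefined;
  for (const signal of report.signals) {
    if (best === undefined || SEVERITY_ORDER.indexOf(signal.severity) > SEVERITY_ORDER.indexOf(best)) {
      best = signal.severity;
    }
  }
  return best;
}

const exampleReport: AnalysisReport = {
  packageName: "example-pkg",
  version: "1.0.1",
  signals: [
    { id: "script-added", severity: "high", file: "package.json" },
    { id: "hidden-file", severity: "low", file: ".cache" },
  ],
  diffSummaries: [{ againstVersion: "1.0.0", summary: "postinstall script added" }],
  nameSquatting: [],
  provenance: "missing",
};
```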
LLM review context
When enabled, the worker can send structured evidence to an LLM reviewer. The model output is validated against a Zod schema and stored as review context. It never overrides the deterministic policy decision.
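The sketch below shows the validate-then-attach flow. To stay dependency-free, a hand-written type guard stands in for the Zod schema; in a real setup the schema's own parse step would do this work. The `ReviewContext` fields and the `"block"` decision value are assumptions.

```typescript
type Decision = "allow" | "quarantine" | "block";

interface ReviewContext {
  summary: string;
  concerns: string[];
}

interface ReviewedDecision {
  decision: Decision;
  review?: ReviewContext;
}

// Stand-in for schema validation: accept only the expected shape.
function parseReview(raw: unknown): ReviewContext | undefined {
  if (typeof raw !== "object" || raw === null) return undefined;
  const r = raw as Record<string, unknown>;
  if (typeof r.summary !== "string") return undefined;
  if (!Array.isArray(r.concerns) || !r.concerns.every((c) => typeof c === "string")) {
    return undefined;
  }
  return { summary: r.summary, concerns: r.concerns as string[] };
}

// Attach valid review output as context; the deterministic decision passes through unchanged.
function attachReview(decision: Decision, modelOutput: string): ReviewedDecision {
  let parsed: unknown;
  try {
    parsed = JSON.parse(modelOutput);
  } catch {
    return { decision }; // unparseable output is dropped
  }
  const review = parseReview(parsed);
  return review ? { decision, review } : { decision };
}
```

Because `attachReview` only copies known fields, a stray `decision` key in the model output has no effect on the stored result.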
Cache identity
Analysis reports are cached by:
- Package name.
- Version.
- Tarball integrity or hash.
- Analysis engine version.
- Policy version.
If any of these change, the cached report is not reused. This prevents a decision for one artifact from silently applying to a different artifact.
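As a sketch, the cache key can be derived from all five fields so that a change in any one of them yields a different key. The field names and the JSON-array encoding are assumptions; a real implementation might hash the result.

```typescript
// Hypothetical field names for the five identity components.
interface CacheIdentity {
  packageName: string;
  version: string;
  integrity: string; // tarball integrity string or hash
  engineVersion: string;
  policyVersion: string;
}

// Encode as a JSON array so field boundaries stay unambiguous,
// whatever characters the individual values contain.
function cacheKey(id: CacheIdentity): string {
  return JSON.stringify([
    id.packageName,
    id.version,
    id.integrity,
    id.engineVersion,
    id.policyVersion,
  ]);
}
```

A republished tarball with the same name and version but a different integrity value therefore misses the cache and is analysed again.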
Limitations
- Analysis is asynchronous. The gateway may allow or quarantine a package while analysis is pending.
- Very large tarballs may hit worker timeout limits.
- Private package metadata is excluded from LLM review by default.
- The worker does not execute install scripts. Static analysis only.