jq filter library

How Pix v1.4.0 routes every Vulnetix VDB call through a verified jq filter to keep LLM context small. Covers the library layout, verification workflow, and CLI quirks.

Pix v1.4.0 introduces a small library of jq filters at vulnetix/skills/_lib/jq/. Every skill that calls a VDB endpoint pipes the output through the matching filter so the LLM only sees the fields it actually needs.

Why

A single vulnetix vdb vuln CVE-2021-44228 -o json call returns ~4 MB (an array of 20 container views in CVE5 format). vulnetix vdb exploits CVE-2021-44228 -o json returns ~12 MB (an array of 10,000+ exploit records). Feeding either to the LLM raw blows the context budget. The filtered sizes below are tuned for decision retention: the filters keep full Vulnetix enrichment, full affected lists, full KEV directives, recommendedVersions, and so on, not just headlines.

Endpoint            | Raw    | Filtered | Reduction
vdb vuln            | 4.0 MB | 224 KB   | 94.5%
vdb exploits        | 12 MB  | 17 KB    | 99.9%
vdb fixes           | 87 KB  | 14 KB    | 84%
vdb sightings       | 294 KB | 4 KB     | 98.6%
vdb iocs get        | 70 KB  | 13 KB    | 81%
vdb packages search | 7 KB   | 4 KB     | 46%
vdb versions        | 8 KB   | 6 KB     | 29%
vdb kev list        | 1 KB   | 1 KB     | (already lean)

The vdb vuln filter aggregates cna.affected[] across all 20 container views, because each container scopes a different ecosystem (Amazon Linux entries in one, npm/maven log4j-core in another, etc.). For log4j the aggregated affected list has 639 entries. The filter keeps the first 200 and surfaces the total as affectedTotal, so the LLM can call vdb affected for the rest if needed.
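The aggregate-then-cap behaviour described above can be sketched on a toy input. This is a minimal illustration and not the contents of vuln.jq: the sample records, the cap of 2 (standing in for 200), and the assumption that each container view exposes its list at .containers.cna.affected are all hypothetical.

```shell
# Hypothetical miniature of the vdb vuln response: three container views,
# one of which has no affected list at all.
sample='[
  {"containers":{"cna":{"affected":[{"product":"pkg-a"},{"product":"pkg-b"}]}}},
  {"containers":{"cna":{"affected":[{"product":"pkg-c"}]}}},
  {"containers":{"cna":{}}}
]'

# Flatten every container's affected list, keep the first $cap entries,
# and report the untruncated count alongside them.
aggregated=$(printf '%s' "$sample" | jq -c --argjson cap 2 '
  [ .[] | (.containers.cna.affected // []) | .[] ] as $all
  | { affected: ($all | .[:$cap]), affectedTotal: ($all | length) }
')
printf '%s\n' "$aggregated"
```

Reporting affectedTotal next to the truncated list lets a consumer tell that entries were dropped without having to re-fetch the raw response.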

Library layout

One filter per VDB endpoint Pix consumes:

File                 | Endpoint(s)                                                     | Verification
vuln.jq              | vdb vuln                                                        | verified against CVE-2021-44228
vulns.jq             | vdb vulns                                                       | partial — re-verify on first use
fixes.jq             | vdb fixes                                                       | verified
exploits.jq          | vdb exploits                                                    | verified
sightings.jq         | vdb sightings                                                   | verified
iocs.jq              | vdb iocs get                                                    | verified
attack-techniques.jq | vdb attack-techniques get                                       | verified
kev.jq               | vdb kev list/get                                                | verified
packages.jq          | vdb packages search                                             | verified
versions.jq          | vdb versions                                                    | verified
purl.jq              | vdb purl                                                        | partial
remediation.jq       | vdb remediation plan -V v2                                      | partial
workarounds.jq       | vdb workarounds -V v2                                           | partial
triage.jq            | vdb triage                                                      | partial
scorecard.jq         | vdb scorecard -V v2                                             | partial
cwe.jq               | vdb cwe -V v2                                                   | partial
ai-list.jq           | vdb ai-discoveries, ai-in-wild, ai-malware, ai-assisted-exploits | partial

“Partial” filters are inferred from the vdb-api source struct definitions; they should be verified against the live response on first use and updated if the shape diverges.
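One way to spot a diverging shape is to compare the field names a filter expects with the keys the live record actually carries. The sketch below is an assumption-laden illustration: the record and the expected field list are invented for the example and are not taken from the vdb-api structs.

```shell
# Hypothetical record and field list; neither comes from the real vdb-api structs.
sample='{"cveId":"CVE-0000-0001","score":7}'
expected='["cveId","score","grade"]'

# Array subtraction leaves the expected fields that the live record lacks.
missing=$(printf '%s' "$sample" | jq -c --argjson expected "$expected" '$expected - keys')
printf 'fields missing from live response: %s\n' "$missing"
```

An empty array would suggest the inferred field names match the live record; anything else points at the fields to re-check in the filter.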

Standard skill invocation

vulnetix vdb vuln "$ARGUMENTS" -o json | jq -f "${CLAUDE_PLUGIN_ROOT}/skills/_lib/jq/vuln.jq"

CLI quirks observed in v2.7.x

A handful of vdb subcommands treat -o as an output filename rather than an output format. Use -o /dev/stdout to pipe their output:

  • vdb sightings <id>
  • vdb iocs get <id>
  • vdb attack-techniques get <id>
  • vdb kev list / vdb kev get
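A skill that calls several subcommands could centralise this quirk in a small helper. The function name vdb_output_arg is hypothetical, and the subcommand list simply mirrors the bullets above as observed in v2.7.x; it would need revisiting if the CLI changes.

```shell
# Hypothetical helper: print the value to pass to -o for a given vdb subcommand.
# Subcommands listed here treat -o as a filename, so /dev/stdout is used to
# keep their output on the pipe; everything else gets the usual "json" format.
vdb_output_arg() {
  case "$1" in
    sightings|"iocs get"|"attack-techniques get"|"kev list"|"kev get")
      printf '%s' /dev/stdout ;;
    *)
      printf '%s' json ;;
  esac
}

# Intended use (not executed here, requires the vulnetix CLI):
#   vulnetix vdb sightings "$id" -o "$(vdb_output_arg sightings)" \
#     | jq -f "${CLAUDE_PLUGIN_ROOT}/skills/_lib/jq/sightings.jq"
printf '%s\n' "$(vdb_output_arg "kev list")"
```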

Bare invocations of some subcommands return help text rather than data:

  • vdb iocs <id>: needs iocs get <id> or iocs list --cve-id <id>.
  • vdb attack-techniques <id>: needs attack-techniques get <id>.
  • vdb metrics <id>: metrics types is the only subcommand; per-CVE metrics live in vdb vuln’s containers.adp[0].x_* blocks (use vuln.jq).
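For orientation, the x_* blocks mentioned in the last bullet could be picked out of a single container view roughly as follows. This is only a sketch of the idea, not what vuln.jq does: the sample record is invented, and x_kev is a hypothetical block name used for illustration.

```shell
# Hypothetical single container view; only the x_-prefixed keys are wanted.
sample='{"containers":{"adp":[{"x_threatExposure":{"score":9},"x_kev":{"listed":true},"title":"ignored"}]}}'

# Keep only the keys of containers.adp[0] that start with "x_";
# fall back to an empty object when the adp container is absent.
metrics=$(printf '%s' "$sample" | jq -c '
  (.containers.adp[0] // {}) | with_entries(select(.key | startswith("x_")))
')
printf '%s\n' "$metrics"
```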

Verification workflow

When adding a new filter or updating an existing one:

  1. Read the response struct in /home/chris/GitHub/Vulnetix/vdb-api for field-name hints.
  2. Run the live CLI: vulnetix vdb <cmd> <args> -o json > sample.json. Wait one second between calls to avoid rate-limit retries.
  3. Inspect actual shape: jq -r 'if type == "array" then "array of \(length); first keys: \(.[0] | keys_unsorted)" else "object keys: \(keys_unsorted)" end' sample.json.
  4. Draft a filter that extracts only the fields a Pix skill cares about.
  5. cat sample.json | jq -f filter.jq | wc -c — confirm valid output and meaningful reduction.
  6. Save to _lib/jq/<cmd>.jq with a header comment listing fields, source-struct path, and the CVE / package fixture used.
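Steps 3 and 5 can be rehearsed offline before touching the live CLI. The sketch below uses an invented two-record sample and a throwaway filter that keeps only id; both are assumptions for illustration, and real filters would retain far more.

```shell
# Hypothetical stand-in for sample.json.
sample='[{"id":"CVE-0000-0001","description":"long text","refs":["a","b"]},{"id":"CVE-0000-0002","description":"more text","refs":[]}]'

# Step 3: report whether the response is an array or an object, plus its keys.
shape=$(printf '%s' "$sample" | jq -r 'if type == "array" then "array of \(length); first keys: \(.[0] | keys_unsorted)" else "object keys: \(keys_unsorted)" end')

# Step 5: compare byte counts before and after a draft filter.
# tr strips the padding some wc implementations add.
raw_bytes=$(printf '%s' "$sample" | wc -c | tr -d ' ')
filtered_bytes=$(printf '%s' "$sample" | jq -c '[.[] | {id}]' | wc -c | tr -d ' ')
reduction=$(( (raw_bytes - filtered_bytes) * 100 / raw_bytes ))

printf '%s\n' "$shape"
printf '%s -> %s bytes (%s%% smaller)\n' "$raw_bytes" "$filtered_bytes" "$reduction"
```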

Filter design discipline

  • Slice arrays (.[0:N]) for top-N projections.
  • Truncate prose with if length > N then .[:N-3] + "..." else . end.
  • Use // null or // "n/a" for missing fields so output is always valid JSON.
  • Prefer object output ({a, b, c}) over re-keying when the result is a single record.
  • Don’t hand-construct CVSS — Vulnetix’s containers.adp[0].x_threatExposure already provides a composite score with rule breakdown.
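The first four rules above can be combined in one small filter. This is a minimal sketch under stated assumptions: the record, its field names, the helper name clip, and the limits (20 characters, 2 references) are all hypothetical and chosen only to keep the example short.

```shell
# Hypothetical record; severity is deliberately missing.
sample='{"id":"CVE-0000-0001","description":"A very long description that keeps going","references":["r1","r2","r3","r4"]}'

filter='
# Truncate a string to $n characters, ending in "..." when it was cut.
def clip($n): if length > $n then .[:$n - 3] + "..." else . end;
{
  id,                                                 # object output, no re-keying
  description: (.description // "n/a" | clip(20)),    # default, then truncate prose
  severity: (.severity // null),                      # missing field stays valid JSON
  references: ((.references // [])[0:2])              # top-N slice
}'

projected=$(printf '%s' "$sample" | jq -c "$filter")
printf '%s\n' "$projected"
```

Applying the default before the truncation means clip always receives a string, even when the source field is absent.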