VulnerableCode: On-demand live evaluation of packages

Organization - AboutCode 

Michael Ehab Mikhail
GitHub: michaelehab
LinkedIn: @michaelehab16
Project: VulnerableCode
Official GSoC project page: Project Link
GSoC Proposal: Proposal Link

Overview

VulnerableCode traditionally relied on batch importers to fetch and store all advisories from a source at once. While effective for building complete databases, batch importers are slow and resource-heavy for developers who only need vulnerability data for a single package.

This project introduces live importers, a new class of importers that operate in a package-first mode. Instead of pulling all advisories, they run against a single PackageURL (PURL), returning only the advisories affecting that package. This makes vulnerability evaluation faster, more efficient, and more personalized, since the database is gradually filled with only the advisories that matter to each user.

To support this, I added:

- A new LIVE_IMPORTERS_REGISTRY that tracks the available live importers.
- A new API endpoint that accepts a PURL, enqueues the compatible live importer pipelines into a Redis queue, and executes them asynchronously via workers.
- Integration with VulnTotal and its browser extension, enabling users to evaluate packages in real time through a seamless interface.

This work bridges the gap between batch-first databases and package-first queries, improving VulnerableCode’s flexibility and enabling better integration with developer workflows.

Note

A PURL (Package URL) is a universal way to identify and locate software packages. More on PURL

Project Design and Architecture

The new live importers system builds on existing batch importers, while introducing a parallel registry and asynchronous execution model for package-first runs.

Importer Registries

- IMPORTERS_REGISTRY continues to hold batch importers (V1/V2).
- LIVE_IMPORTERS_REGISTRY holds live importers.

Each live importer:

- Inherits from its batch importer (when the logic can be reused), or directly from VulnerableCodeBaseImporterPipelineV2 when a separate implementation is needed.
- Declares a supported_types array that lists the compatible package ecosystems ("pypi", "npm", "maven", "generic", etc.).
- Implements a package-first collect_advisories() method, which restricts results to advisories relevant to the given PURL.
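
The three points above can be sketched in a few lines of Python. This is a minimal illustration, not the actual VulnerableCode code: the `Advisory` record, the `ExampleBatchImporter` and `ExampleLiveImporter` classes, their `pipeline_id` values, and the assumption that the registry is a dictionary keyed by pipeline ID are all hypothetical. Only the names `supported_types`, `collect_advisories()`, and `LIVE_IMPORTERS_REGISTRY` come from the project description.

```python
from dataclasses import dataclass
from typing import Iterator


@dataclass
class Advisory:
    """Hypothetical, simplified advisory record used only in this sketch."""
    advisory_id: str
    affected_purl: str  # versionless PURL of the affected package


class ExampleBatchImporter:
    """Stand-in for a batch importer: yields every advisory from its source."""
    pipeline_id = "example_importer_v2"

    def fetch_all(self):
        # A real importer would download and parse an advisory feed here.
        return [
            Advisory("EX-1", "pkg:pypi/django"),
            Advisory("EX-2", "pkg:pypi/flask"),
        ]

    def collect_advisories(self) -> Iterator[Advisory]:
        yield from self.fetch_all()


class ExampleLiveImporter(ExampleBatchImporter):
    """Package-first variant: reuses the batch logic, filtered to one PURL."""
    pipeline_id = "example_live_importer_v2"
    supported_types = ["pypi"]

    def __init__(self, purl: str):
        self.purl = purl

    def collect_advisories(self) -> Iterator[Advisory]:
        # Drop the version so "pkg:pypi/django@3.2" matches "pkg:pypi/django".
        package = self.purl.split("@", 1)[0]
        for advisory in super().collect_advisories():
            if advisory.affected_purl == package:
                yield advisory


# Assumed shape of the registry: pipeline ID mapped to importer class.
LIVE_IMPORTERS_REGISTRY = {ExampleLiveImporter.pipeline_id: ExampleLiveImporter}
```

Instantiating `ExampleLiveImporter("pkg:pypi/django@3.2")` and iterating over `collect_advisories()` yields only the Django advisory, while the parent batch importer would yield both.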

Live importer executions are asynchronous: once triggered, they are placed in a Redis-backed job queue and processed by dedicated workers. This prevents blocking the main API thread and allows multiple evaluations to run safely in parallel.
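
To illustrate the queue-and-worker pattern without requiring a Redis server, the sketch below swaps in Python's standard `queue.Queue` and threads for the Redis-backed queue and its workers. It is a stand-in for the idea, not the project's implementation; `run_pipeline`, `worker`, and `evaluate` are hypothetical names.

```python
import queue
import threading


def run_pipeline(pipeline_id, purl):
    """Placeholder for a live importer run; a real one would fetch advisories."""
    return f"{pipeline_id} finished for {purl}"


def worker(jobs, results):
    """Pull jobs until a None sentinel arrives, recording each result."""
    while True:
        job = jobs.get()
        if job is None:  # sentinel: no more work for this worker
            break
        pipeline_id, purl = job
        results[pipeline_id] = run_pipeline(pipeline_id, purl)


def evaluate(purl, pipeline_ids, worker_count=2):
    """Enqueue one job per pipeline and let the workers process them."""
    jobs = queue.Queue()
    results = {}
    workers = [
        threading.Thread(target=worker, args=(jobs, results))
        for _ in range(worker_count)
    ]
    for thread in workers:
        thread.start()
    for pipeline_id in pipeline_ids:
        jobs.put((pipeline_id, purl))  # returns immediately; the caller is not blocked
    for _ in workers:
        jobs.put(None)  # one sentinel per worker
    for thread in workers:
        thread.join()  # joined here only so the sketch can return the results
    return results
```

In the real system the request handler would return right after enqueueing, and the workers would run in separate processes.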

Figure: Class architecture of the importer registries, showing the relationship between `IMPORTERS_REGISTRY` and `LIVE_IMPORTERS_REGISTRY`.

API Endpoint

The new API endpoint is responsible for handling live evaluation requests.

Input:
- purl (required)
Execution:
- Checks LIVE_IMPORTERS_REGISTRY for importers whose supported_types include the type of the PURL.
- Enqueues the pipeline runs of these live importers in a live RQ (Redis Queue) queue.
- Returns the live run ID, information about the pipelines to run, and the status URL.
- The status URL shows the current state of a live evaluation run and of its individual pipeline runs.
Output:
- Once workers complete execution, the resulting advisories are imported into the database and exposed as JSON through the status endpoint.
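
The request flow above can be approximated as follows. This is a sketch under stated assumptions: the registry is simplified to a mapping from pipeline ID to supported types, the `enqueue` callable stands in for the Redis-backed queue, and the response keys, the pipeline IDs in the usage example, and the status URL path are hypothetical rather than the actual API contract.

```python
import uuid


def purl_type(purl: str) -> str:
    """Extract the package type from a PURL such as 'pkg:pypi/django@3.2'."""
    if not purl.startswith("pkg:"):
        raise ValueError(f"not a PURL: {purl!r}")
    remainder = purl[len("pkg:"):].lstrip("/")
    type_, separator, _ = remainder.partition("/")
    if not separator or not type_:
        raise ValueError(f"PURL has no type or name: {purl!r}")
    return type_.lower()


def start_live_evaluation(purl, registry, enqueue):
    """Select compatible live importers, enqueue them, and describe the run."""
    package_type = purl_type(purl)
    compatible = [
        pipeline_id
        for pipeline_id, supported_types in registry.items()
        if package_type in supported_types
    ]
    live_run_id = str(uuid.uuid4())
    for pipeline_id in compatible:
        enqueue(pipeline_id, purl)  # hand off to the queue, do not run inline
    return {
        "live_run_id": live_run_id,
        "pipelines": compatible,
        "status_url": f"/live-runs/{live_run_id}/status",  # hypothetical path
    }
```

For example, with a registry of `{"pypa_live": ["pypi"], "npm_live": ["npm"]}`, a request for `pkg:pypi/django@3.2` enqueues only `pypa_live`.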

Figure: Live pipeline run class and how it groups multiple PipelineRuns.
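
One way to picture this grouping is a parent object that derives its state from its child runs. The sketch below is illustrative only: the `Status` values, the field names, and the aggregation rule are assumptions, not the project's actual models.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(str, Enum):
    QUEUED = "queued"
    RUNNING = "running"
    SUCCESS = "success"
    FAILURE = "failure"


@dataclass
class PipelineRun:
    pipeline_id: str
    status: Status = Status.QUEUED


@dataclass
class LiveRun:
    """Groups the pipeline runs triggered by one live evaluation request."""
    purl: str
    pipeline_runs: list = field(default_factory=list)

    @property
    def status(self) -> Status:
        statuses = {run.status for run in self.pipeline_runs}
        if statuses <= {Status.QUEUED}:
            return Status.QUEUED  # nothing has started yet (or no runs at all)
        if statuses & {Status.QUEUED, Status.RUNNING}:
            return Status.RUNNING  # at least one run is still pending
        if Status.FAILURE in statuses:
            return Status.FAILURE  # everything finished, something failed
        return Status.SUCCESS

    def to_dict(self) -> dict:
        """Shape of a possible status response, ready for JSON encoding."""
        return {
            "purl": self.purl,
            "status": self.status.value,
            "runs": [
                {"pipeline_id": run.pipeline_id, "status": run.status.value}
                for run in self.pipeline_runs
            ],
        }
```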

Figure: Live importers API request flow, in which the endpoint selects compatible live importers and executes them in parallel.

Integration with VulnTotal

The new API was integrated into VulnTotal as an optional datasource:

- VulnTotal now checks the local environment for VCIO_HOST, VCIO_PORT, and ENABLE_LIVE_EVAL flags in .env.
- If enabled, VulnTotal queries VulnerableCode in package-first mode.
- This allows VulnTotal to use both its proprietary datasources and the user’s gradually built local database, improving coverage and personalization.
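
A configuration check along these lines could look like the sketch below. It assumes the `.env` file has already been loaded into the process environment, that ENABLE_LIVE_EVAL accepts common truthy strings, and that the local instance is reached over plain HTTP; the function name and the returned keys are hypothetical.

```python
import os

TRUTHY = {"1", "true", "yes", "on"}


def load_local_vulnerablecode_config(env=None):
    """Return connection settings for a local instance, or None if unset."""
    env = os.environ if env is None else env
    host = env.get("VCIO_HOST")
    port = env.get("VCIO_PORT")
    if not host or not port:
        return None  # the local datasource is optional, so skip it quietly
    live_eval = env.get("ENABLE_LIVE_EVAL", "").strip().lower() in TRUTHY
    return {
        "base_url": f"http://{host}:{port}",  # assumed scheme
        "live_eval": live_eval,
    }
```

Passing a plain dictionary as `env` keeps the function easy to test without touching the real environment.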

Integration with VulnTotal Browser Extension

The VulnTotal browser extension was updated to support live importers:

- Users can enable the “Local VulnerableCode” datasource and the live evaluation option.
- When enabled, package lookups are forwarded to the new API, retrieving advisories in real time.
- This reduces setup effort: developers can get live vulnerability checks directly in their browser, provided they have a local VulnerableCode instance.

Figure: Live evaluation demo, showing VulnTotal and its browser extension consuming the new live evaluation API.

Linked Pull Requests

Sr. no	Name	Link
1	Add Live Evaluation API endpoint and PyPa live pipeline importer	aboutcode-org/vulnerablecode#1969
2	Add Gitlab Live V2 Importer	aboutcode-org/vulnerablecode#1910
3	Add Curl Live Importer V2	aboutcode-org/vulnerablecode#1923
4	Add Elixir Security Live V2 Importer	aboutcode-org/vulnerablecode#1935
5	Add NPM Live Importer V2	aboutcode-org/vulnerablecode#1941
6	Add GitHub OSV Live V2 Importer Pipeline	aboutcode-org/vulnerablecode#1977
7	Add Postgres Live V2 Importer Pipeline	aboutcode-org/vulnerablecode#1982
8	Add PySec Live V2 Importer Pipeline	aboutcode-org/vulnerablecode#1983
9	Add Local VulnerableCode Datasource in VulnTotal and allow live evaluation	aboutcode-org/vulnerablecode#1985
10	Integrate Local VulnerableCode datasource and live evaluation	aboutcode-org/vulntotal-extension#17

Closing Thoughts

This project was an exciting step forward from my 2024 GSoC work. By moving from batch importers to package-first live importers, we enabled a faster, more personalized, and more flexible way of building vulnerability databases.

I especially enjoyed designing the registry + API architecture and integrating Redis queues and workers for asynchronous execution. This improved scalability, responsiveness, and fault tolerance, ensuring the API never blocks and multiple live evaluations can run in parallel. I also appreciated discussing it with mentors and integrating it seamlessly across VulnerableCode, VulnTotal, and the browser extension.

This work lays the foundation for even richer interactivity in the ecosystem and brings vulnerability evaluation closer to developers’ workflows.

I appreciated the weekly status calls and the feedback I received from my mentors and the amazing team, who were really helpful and supportive: Philippe Ombredanne, Ayan Sinha Mahapatra, Tushar Goel, and Keshav Priyadarshi.