feat: add backfill script (CM-1218) #4193
Signed-off-by: Umberto Sgueglia <usgueglia@contractor.linuxfoundation.org>
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Reviewed by Cursor Bugbot for commit 399b9ba. Configure here.
    inserted += batchInserted
    skipped += batchSkipped
    batches++
    lastId = ids[ids.length - 1]
Cursor skips newly critical packages
Medium Severity
The backfill advances lastId and only lists packages with p.id > afterId. Critical packages whose is_critical flips to true after that id was passed are never selected in that run, yet the loop still exits when no higher ids remain. Those packages stay without a stewardships row until the job is run again from the start.
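The failure mode described in this comment can be reproduced with a small in-memory sketch. Everything here is a hypothetical stand-in (the `Pkg` shape, `listMissing`, `runOnce`, and the `onBatch` hook are invented for illustration); only the `p.id > afterId` filter and the `lastId` advance come from the review comment. The real code queries packages-db and may differ.

```typescript
// Hypothetical in-memory model; the real code queries packages-db instead.
interface Pkg {
  id: number
  isCritical: boolean
}

// Mirrors the described selector: critical, no stewardship row, id > afterId.
function listMissing(pkgs: Pkg[], stewarded: Set<number>, afterId: number, limit: number): number[] {
  return pkgs
    .filter((p) => p.isCritical && p.id > afterId && !stewarded.has(p.id))
    .map((p) => p.id)
    .sort((a, b) => a - b)
    .slice(0, limit)
}

// One backfill run; returns how many rows it inserted.
function runOnce(
  pkgs: Pkg[],
  stewarded: Set<number>,
  batchSize: number,
  onBatch?: (lastId: number) => void,
): number {
  let lastId = 0
  let inserted = 0
  for (;;) {
    const ids = listMissing(pkgs, stewarded, lastId, batchSize)
    if (ids.length === 0) break // exits even if lower ids became critical meanwhile
    for (const id of ids) {
      stewarded.add(id)
      inserted++
    }
    lastId = ids[ids.length - 1]
    if (onBatch) onBatch(lastId)
  }
  return inserted
}

const lateBloomer: Pkg = { id: 2, isCritical: false } // becomes critical mid-run
const demoPkgs: Pkg[] = [
  { id: 1, isCritical: true },
  lateBloomer,
  { id: 3, isCritical: true },
  { id: 4, isCritical: true },
  { id: 5, isCritical: true },
]
const demoStewarded = new Set<number>()

// Simulate a concurrent flip of package 2 once the cursor has moved past it.
const firstRun = runOnce(demoPkgs, demoStewarded, 2, (lastId) => {
  if (lastId >= 3) lateBloomer.isCritical = true
})
const missedInFirstRun = !demoStewarded.has(2)

// A fresh run restarts the cursor at 0 and picks up the missed package.
const secondRun = runOnce(demoPkgs, demoStewarded, 2)
```

In this model the first run inserts four rows and leaves package 2 without one; the second run inserts exactly that row. This matches the comment's conclusion that a re-run from the start closes the gap.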
Pull request overview
Adds the initial “stewardship” backfill capability for OSSPREY Self Serve v1 by introducing DAL queries to find critical packages missing stewardship rows, plus a packages-worker script/runner to insert those rows in idempotent batches.
Changes:
- Add `@crowd/data-access-layer` stewardship DAL functions to (a) page through critical packages lacking stewardship rows and (b) batch-insert `unassigned`/`auto_imported` stewardships with `ON CONFLICT DO NOTHING`.
- Add a packages-worker backfill runner with cursor pagination + batch-level logging and a CLI entrypoint with SIGINT/SIGTERM graceful shutdown.
- Add `pnpm` scripts to run the stewardship backfill (including a `:local` variant mirroring existing patterns).
Reviewed changes
Copilot reviewed 5 out of 6 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| services/libs/data-access-layer/src/osspckgs/stewardships.ts | Adds cursor-paginated selector for missing stewardships + idempotent batch insert with criticality re-check. |
| services/libs/data-access-layer/src/osspckgs/index.ts | Re-exports stewardship DAL module from the osspckgs index. |
| services/libs/data-access-layer/src/index.ts | Re-exports stewardship DAL module from the package root. |
| services/apps/packages_worker/src/stewardship/runStewardshipBackfill.ts | Implements the batched backfill loop using the new DAL functions, with graceful-stop support. |
| services/apps/packages_worker/src/bin/stewardship-backfill.ts | Adds CLI entrypoint: connects to packages-db, validates batch size env var, handles shutdown signals. |
| services/apps/packages_worker/package.json | Adds backfill:stewardship and backfill:stewardship:local scripts. |


Summary
Adds stewardship tables and a backfill script to seed the initial state required by the OSSPREY Self Serve program (v1). In v1 the stewardship program is read-only: every critical package gets one unassigned row in the new stewardships table. Write flows (claim, assign, status transitions) land in v2. The backfill script is the one-time (and safely re-runnable) job that populates those rows for the ~358K currently-critical packages.
Changes
Migration V1781094067__stewardship-tables.sql — creates stewardships and five satellite tables (stewardship_stewards, stewardship_activity, stewardship_assessments, stewardship_findings, stewardship_remediation_actions). Only stewardships is populated in v1; the rest are schema-only.
services/libs/data-access-layer/src/osspckgs/stewardships.ts — two DAL query functions: listCriticalPackagesWithoutStewardship (cursor-paginated LEFT JOIN anti-join) and insertUnassignedStewardships (batch INSERT with ON CONFLICT DO NOTHING and is_critical re-check at insert time to guard against concurrent criticality flips).
packages_worker/src/stewardship/runStewardshipBackfill.ts — idempotent loop over DAL functions; cursor-based pagination by package.id; supports graceful shutdown via isStopping callback designed for future Temporal activity wiring.
packages_worker/src/bin/stewardship-backfill.ts — entry point; validates STEWARDSHIP_BACKFILL_BATCH_SIZE env var (fails fast on NaN/non-positive); SIGINT/SIGTERM handled gracefully.
package.json — adds backfill:stewardship and backfill:stewardship:local npm scripts (mirrors backfill:maven:local pattern).
backend/src/api/public/v1/packages/types.ts — extracts StewardshipStatus, Lifecycle, SeverityLevel, OpenVulns, Steward, StewardshipSummary into a shared types file; removes inline duplicates from batchGetStewardship.ts.
mockData.ts / openapi.yaml — adds stewardship block to MockPackageDetail; fixes steward → stewards field rename; adds openVulns to OpenAPI required fields; adds PackageDetail.stewardship schema.
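As a rough illustration of the entry-point behaviour listed above (fail-fast validation of `STEWARDSHIP_BACKFILL_BATCH_SIZE` and an `isStopping` callback checked by the loop), here is a minimal sketch. The helper names `parseBatchSize` and `processUntilStopped`, the default of 1000, and the integer requirement are assumptions made for this example; the PR only states that NaN and non-positive values are rejected.

```typescript
// Assumption: the fallback of 1000 and the integer check are illustrative only.
function parseBatchSize(raw: string | undefined, fallback: number = 1000): number {
  if (raw === undefined || raw.trim() === '') return fallback
  const parsed = Number(raw)
  if (!Number.isInteger(parsed) || parsed <= 0) {
    // Fail fast instead of silently running with a bad batch size.
    throw new Error(`STEWARDSHIP_BACKFILL_BATCH_SIZE must be a positive integer, got "${raw}"`)
  }
  return parsed
}

// The stop flag is checked between batches, so a batch that has started always finishes.
function processUntilStopped(
  batches: number[][],
  isStopping: () => boolean,
  handle: (batch: number[]) => void,
): number {
  let processed = 0
  for (const batch of batches) {
    if (isStopping()) break
    handle(batch)
    processed++
  }
  return processed
}

let stopRequested = false
const processedBatches = processUntilStopped(
  [[1, 2], [3, 4], [5, 6]],
  () => stopRequested,
  (batch) => {
    // Pretend a SIGINT/SIGTERM handler flipped the flag during the second batch.
    if (batch[0] === 3) stopRequested = true
  },
)
```

In a real Node.js entry point the flag would presumably be set from `process.on('SIGINT', ...)` and `process.on('SIGTERM', ...)` handlers. Passing the check in as a callback keeps the loop independent of how the stop request arrives, which fits the stated goal of later Temporal activity wiring.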
Type of change
JIRA ticket
ticket
Note
Medium Risk
Bulk inserts into production `stewardships` for ~358K packages; idempotent SQL limits duplicate risk, but operational mistakes or wrong DB/env could still affect package stewardship data.
Overview
Adds a one-time, re-runnable backfill that seeds `stewardships` rows (`unassigned`, `auto_imported`) for every critical package missing stewardship data, supporting OSSPREY Self Serve v1 read-only state.
The data-access layer gains cursor-paginated `listCriticalPackagesWithoutStewardship` and batch `insertUnassignedStewardships` with `ON CONFLICT DO NOTHING` and an `is_critical` re-check at insert time.
`packages_worker` wires `runStewardshipBackfill` (batch loop + optional `isStopping` for graceful shutdown) and a `stewardship-backfill` bin script with `STEWARDSHIP_BACKFILL_BATCH_SIZE` validation and SIGINT/SIGTERM handling.
`backfill:stewardship` and `backfill:stewardship:local` npm scripts mirror the existing maven backfill pattern.
Reviewed by Cursor Bugbot for commit 399b9ba. Bugbot is set up for automated code reviews on this repo.
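The idempotency claim can be illustrated with an in-memory model of the insert step. This is a simulation, not the PR's SQL: the `Map` stands in for the `stewardships` table, and the field names `status` and `origin`, the `insertUnassigned` helper, and the way `skipped` is counted are all assumptions for the example.

```typescript
// Hypothetical row shape; the real column names are not shown in this PR page.
interface StewardshipRow {
  packageId: number
  status: 'unassigned'
  origin: 'auto_imported'
}

function insertUnassigned(
  ids: number[],
  criticalIds: Set<number>,
  table: Map<number, StewardshipRow>,
): { inserted: number; skipped: number } {
  let inserted = 0
  for (const id of ids) {
    // Models the is_critical re-check at insert time.
    if (!criticalIds.has(id)) continue
    // Models ON CONFLICT DO NOTHING: an existing row is left untouched.
    if (table.has(id)) continue
    table.set(id, { packageId: id, status: 'unassigned', origin: 'auto_imported' })
    inserted++
  }
  return { inserted, skipped: ids.length - inserted }
}

const stewardshipTable = new Map<number, StewardshipRow>()
// Package 12 was selected as critical but lost the flag before the insert ran.
const stillCritical = new Set<number>([10, 11])

const firstInsert = insertUnassigned([10, 11, 12], stillCritical, stewardshipTable)
// Re-running the same batch changes nothing.
const repeatInsert = insertUnassigned([10, 11, 12], stillCritical, stewardshipTable)
```

Under these assumptions the first call inserts two rows and skips one, and the repeat call inserts none, which is the property that makes the backfill safe to re-run.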