Skip to content

Rework CBMC's solver-time-limit infrastructure#9019

Open
tautschnig wants to merge 3 commits into
diffblue:developfrom
tautschnig:solver-time-limit-foundation
Open

Rework CBMC's solver-time-limit infrastructure#9019
tautschnig wants to merge 3 commits into
diffblue:developfrom
tautschnig:solver-time-limit-foundation

Conversation

@tautschnig
Copy link
Copy Markdown
Collaborator

The pre-existing --solver-time-limit N (CLI flag never registered in CBMC's option spec; option-reader added in 38a370c for use by dependent projects via options.set_option(...)) was POSIX-only (POSIX alarm()/SIGALRM), MiniSat 2 only, with one-second granularity, and untested in CI. This PR makes it work, adds CI tests, and broadens back-end support, in three orthogonal commits:

  1. Wire up --solver-time-limit on the CLI; add IPASIR backend support; add CI tests.

    • Registers --solver-time-limit in OPT_SOLVER / HELP_SOLVER and parse_solver_options. cbmc, jbmc, goto-synthesizer and libcprover-cpp all pick it up.
    • Adds IPASIR set_time_limit_seconds via ipasir_set_terminate. Returns P_ERROR on interrupt instead of throwing analysis_exceptiont, matching the MiniSat 2 backend.
    • New regression test (regression/cbmc/solver-time-limit/) on the happy path.
    • New unit-test SCENARIOs (PHP(20) pigeonhole + 1-second deadline) in unit/solvers/sat/satcheck_minisat2.cpp and a new satcheck_ipasir.cpp (conditional on HAVE_IPASIR, exercised by the existing IPASIR/Riss CI job).
    • Documents the flag in doc/man/cbmc.1, jbmc.1, goto-synthesizer.1.
  2. API in milliseconds; portable MiniSat watchdog.

    • solver_resource_limitst::set_time_limit_milliseconds and propt::set_time_limit_milliseconds are now the primary virtual API. set_time_limit_seconds survives as a non-virtual seconds→ms helper, so external/programmatic callers keep working.
    • The MiniSat 2 back-end replaces alarm()/SIGALRM/solver_to_interrupt with a std::thread watchdog: cross-OS portable (incl. Windows, where the old code was a no-op with a warning), sub-millisecond granular, no global signal state, no nesting hazards.
  3. CaDiCaL terminator support.

    • satcheck_cadical_baset gets set_time_limit_milliseconds and a nested terminatort (a CaDiCaL::Terminator polled by the solver during decision steps). On interrupt, returns P_ERROR instead of throwing.

After this PR all three SAT back-ends with native interrupt infrastructure (MiniSat 2, IPASIR, CaDiCaL) honour --solver-time-limit and set_time_limit_milliseconds portably with sub-second granularity; back-ends without (Glucose, PicoSAT, Lingeling, …) inherit the default warning, unchanged.

The unit-test pigeonhole SCENARIOs verify both correct outcome (P_ERROR) and timely interrupt (the 200 ms test completes in ~207-208 ms locally, confirming sub-millisecond watchdog accuracy).

This is foundation for a follow-up PR (branch-pruning) that needs sub-second per-check budgets to bound pruning overhead.

  • Each commit message has a non-empty body, explaining why the change was made.
  • Methods or procedures I have added are documented, following the guidelines provided in CODING_STANDARD.md.
  • The feature or user visible behaviour I have added or modified has been documented in the User Guide in doc/cprover-manual/
  • Regression or unit tests are included, or existing tests cover the modified code (in this case I have detailed which ones those are in the commit message).
  • n/a My commit message includes data points confirming performance improvements (if claimed).
  • My PR is restricted to a single feature or bugfix.
  • n/a White-space or formatting changes outside the feature-related changed lines are in commits of their own.

@tautschnig tautschnig self-assigned this May 28, 2026
Copilot AI review requested due to automatic review settings May 28, 2026 11:18
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Reworks CBMC's --solver-time-limit infrastructure so the flag is actually registered on the CLI, works portably (including Windows), supports sub-second granularity (milliseconds), and is honoured by all three SAT back-ends with native interrupt mechanisms (MiniSat 2, IPASIR, CaDiCaL). Adds regression and unit tests, and updates man pages.

Changes:

  • Registers --solver-time-limit in the shared OPT_SOLVER/HELP_SOLVER macros and parse_solver_options, with man-page documentation for cbmc/jbmc/goto-synthesizer.
  • Switches the primary propt / solver_resource_limitst API from set_time_limit_seconds to set_time_limit_milliseconds (seconds helper retained as a non-virtual wrapper); replaces MiniSat's alarm/SIGALRM with a portable std::thread watchdog; adds IPASIR ipasir_set_terminate and CaDiCaL Terminator implementations that return P_ERROR on interrupt instead of throwing.
  • Adds a solver-time-limit regression test and PHP(20)/200 ms pigeonhole unit tests for all three back-ends (IPASIR test guarded by HAVE_IPASIR).

Reviewed changes

Copilot reviewed 20 out of 20 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
src/goto-checker/solver_factory.h Adds --solver-time-limit to option spec and help text
src/goto-checker/solver_factory.cpp Reads the new CLI flag into options
src/solvers/prop/solver_resource_limits.h Switches virtual API to milliseconds; keeps seconds helper
src/solvers/prop/prop.h Same API shift for propt; default still warns
src/solvers/prop/prop_conv_solver.h Forwards new ms-based override
src/solvers/sat/satcheck_minisat2.{h,cpp} Replaces SIGALRM watchdog with portable std::thread/condition_variable
src/solvers/sat/satcheck_ipasir.{h,cpp} Implements deadline via ipasir_set_terminate; returns P_ERROR instead of throwing
src/solvers/sat/satcheck_cadical.{h,cpp} Adds nested terminatort (CaDiCaL::Terminator); returns P_ERROR instead of throwing
unit/solvers/sat/satcheck_minisat2.cpp New PHP(20) timeout scenario
unit/solvers/sat/satcheck_ipasir.cpp New unit-test file with satisfiable + PHP(20) timeout scenarios
unit/solvers/sat/satcheck_cadical.cpp New PHP(20) timeout scenario
unit/Makefile Adds the new IPASIR unit-test source; excludes it when IPASIR is unset
regression/cbmc/solver-time-limit/{main.c,test.desc} New happy-path regression test for --solver-time-limit
doc/man/{cbmc,jbmc,goto-synthesizer}.1 Documents the new flag

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/solvers/sat/satcheck_minisat2.cpp Outdated
Comment on lines 259 to 270
lbool solver_result = solver->solveLimited(solver_assumptions);

if(old_handler != SIG_ERR)
{
alarm(0);
signal(SIGALRM, old_handler);
solver_to_interrupt = solver.get();
}

#else // _WIN32

if(time_limit_seconds != 0)
// Signal the watchdog to exit (if running) and join it.
if(watchdog.joinable())
{
log.warning() << "Time limit ignored (not supported on Win32 yet)"
<< messaget::eom;
{
std::lock_guard<std::mutex> lk(watchdog_mutex);
solve_done = true;
}
watchdog_cv.notify_one();
watchdog.join();
}
Comment on lines +32 to +35
void set_time_limit_seconds(uint32_t lim)
{
set_time_limit_milliseconds(lim * 1000);
}
@codecov
Copy link
Copy Markdown

codecov Bot commented May 28, 2026

Codecov Report

❌ Patch coverage is 98.38710% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 80.62%. Comparing base (01ebe42) to head (74a0f57).

Files with missing lines Patch % Lines
src/solvers/prop/prop.h 0.00% 1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #9019      +/-   ##
===========================================
+ Coverage    80.59%   80.62%   +0.02%     
===========================================
  Files         1711     1711              
  Lines       189454   189494      +40     
  Branches        73       73              
===========================================
+ Hits        152697   152775      +78     
+ Misses       36757    36719      -38     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

…pport; add CI tests

The `solver-time-limit` option has been read by `solver_factoryt`
since 38a370c (Jan 2019), but was never registered as a
command-line flag in `OPT_SOLVER` -- the original use case was
purely programmatic, via dependent projects calling
`options.set_option("solver-time-limit", N)` directly. That left
the option's plumbing untested in CI and the only backend that
honoured it (MiniSat 2) had no end-to-end coverage.

This change:

  * Registers `--solver-time-limit N` in `OPT_SOLVER` /
    `HELP_SOLVER` and wires the cmdline-to-options forwarding in
    `parse_solver_options`. `cbmc`, `jbmc`, `goto-synthesizer` and
    `libcprover-cpp` (which all use `parse_solver_options`) now
    accept the flag from the command line.

  * Documents the flag in `doc/man/cbmc.1`,
    `doc/man/goto-synthesizer.1` and `doc/man/jbmc.1`, so
    `scripts/check_help.sh` continues to pass.

  * Implements `set_time_limit_seconds` for the IPASIR back-end
    (`satcheck_ipasirt`). The implementation registers an
    `ipasir_set_terminate` callback that returns 1 once the
    per-solve deadline has passed; IPASIR solvers poll the
    callback during solving and abort cleanly when it does. On
    timeout the back-end returns `P_ERROR` (matching the MiniSat
    2 back-end's behaviour) instead of throwing
    `analysis_exceptiont`.

Tests:

  * `regression/cbmc/solver-time-limit/`: a `CORE` regression
    test that pins down the happy path -- the option is parsed
    and propagated through to the solver layer without breaking
    a trivial solve. Runs on every CI build.

  * `unit/solvers/sat/satcheck_minisat2.cpp`: extended with a
    pigeonhole-PHP(20) `SCENARIO` that builds the formula
    directly via `new_variable` / `lcnf`, sets a 1-second time
    limit, and asserts that `prop_solve` returns `P_ERROR`
    within five seconds. PHP is provably exponential for
    resolution-based SAT (Haken, 1985), so the timeout is
    reliably exceeded on any reasonable hardware.

  * `unit/solvers/sat/satcheck_ipasir.cpp`: new file, mirroring
    the satcheck_minisat2 test. Conditional on `HAVE_IPASIR`,
    so it only compiles in CI builds that link an IPASIR back-end
    (currently the `check-ubuntu-24_04-make-gcc` job, which uses
    Riss as the IPASIR provider).

This is the first piece of foundation work for a wider redesign of
CBMC's solver-timeout infrastructure (sub-second granularity,
cross-OS, watchdog-based interrupt). Subsequent commits will
build on a now-tested baseline.

Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
@tautschnig tautschnig force-pushed the solver-time-limit-foundation branch from fd8302f to 3cd2b8f Compare May 28, 2026 23:17
@tautschnig tautschnig requested a review from a team as a code owner May 28, 2026 23:17
tautschnig and others added 2 commits May 29, 2026 08:14
The solver-time-limit infrastructure used to be expressed in whole
seconds, was implemented for MiniSat 2 via POSIX `alarm()` +
`SIGALRM` (so it was POSIX-only and process-wide), and was
explicitly skipped on Windows with a "Time limit ignored (not
supported on Win32 yet)" warning. In practice this meant:

  * Sub-second budgets -- needed for fine-grained per-check
    timeouts inside symbolic execution -- could not be expressed.
  * Windows builds had no time-limit support at all.
  * The signal handler relied on a process-wide
    `solver_to_interrupt` pointer; nesting solve calls across
    threads or having two solvers active in the same process was
    fragile.

This change reworks the API and the MiniSat 2 implementation:

  * `solver_resource_limitst::set_time_limit_milliseconds` and
    `propt::set_time_limit_milliseconds` are now the primary
    virtual API. The default `propt` implementation logs a
    warning, mirroring the previous behaviour for back-ends with
    no native timeout support.

  * `set_time_limit_seconds` survives as a non-virtual helper
    that forwards to `set_time_limit_milliseconds(lim * 1000)`,
    so external callers (e.g. the CLI option-parser, dependent
    projects) keep working without source changes.

  * The MiniSat 2 back-end now implements the time limit using a
    `std::thread` watchdog: when the limit is non-zero, a
    watchdog thread waits on a `std::condition_variable` for
    either the solve to complete (signalled via a flag) or the
    deadline to elapse, and on timeout calls
    `solver->interrupt()` (which is just a flag set on the
    Minisat::Solver object and is safe to invoke from another
    thread). The watchdog is joined before `do_prop_solve`
    returns. Result: cross-OS, sub-millisecond granularity, no
    global signal state, no `SIGALRM` collision risk.

  * The IPASIR back-end's existing
    `set_time_limit_seconds` override is renamed/repurposed as
    `set_time_limit_milliseconds`; the deadline is computed in
    `std::chrono::milliseconds` instead of seconds. The
    callback already polled a `chrono::steady_clock` deadline,
    so this is essentially a stored-precision change.

  * The `solver_factoryt` call site in `solver_factory.cpp`
    keeps using the seconds helper (the CLI flag
    `--solver-time-limit` remains expressed in seconds for
    user familiarity); only the internal API is now in
    milliseconds.

Tests:

  * Existing pigeonhole(20) SCENARIOs in
    `unit/solvers/sat/satcheck_minisat2.cpp` and
    `unit/solvers/sat/satcheck_ipasir.cpp` now request a
    200-millisecond limit and verify
    (a) that `prop_solve` returns `P_ERROR`, and
    (b) that wall-clock elapsed time is well under five seconds
        (a generous slack for slow CI hardware and watchdog-join
        latency).
    Locally the satcheck_minisat2 test completes in ~207 ms,
    confirming sub-millisecond accuracy of the watchdog.

This is the second piece of the wider solver-timeout rework.
Subsequent work will add CaDiCaL terminator support and let the
branch-pruning code use sub-second per-call budgets directly.

Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
The CaDiCaL back-end was the last SAT back-end in CBMC for which
`set_time_limit_milliseconds` (and its `set_time_limit_seconds`
helper) silently fell through to `propt`'s default no-op + warning.
This change adds native time-limit support via CaDiCaL's
`Terminator` callback API:

  * `satcheck_cadical_baset` declares a nested `terminatort` class
    in the header (forward-declared to keep `cadical.hpp` out of
    the header).

  * `terminatort` is defined in the .cpp and inherits
    `CaDiCaL::Terminator`. It stores a deadline as a
    `steady_clock::time_point` and returns true once the deadline
    has passed.

  * `set_time_limit_milliseconds(uint32_t)` stores the limit on
    the solver instance.

  * `do_prop_solve` constructs / refreshes the deadline before
    calling `solver->solve()`, calls
    `solver->connect_terminator(...)` while solving, and calls
    `solver->disconnect_terminator()` afterwards. CaDiCaL polls
    the terminator at decision points during solving and aborts
    cleanly on a true return.

  * On interrupt, `do_prop_solve` now returns `propt::resultt
    ::P_ERROR` -- matching the MiniSat 2 and IPASIR back-ends --
    instead of throwing `analysis_exceptiont`. The
    `<util/exception_utils.h>` include is dropped as it is no
    longer needed.

A `set_time_limit_milliseconds(200)` SCENARIO is added to
`unit/solvers/sat/satcheck_cadical.cpp` mirroring the existing
MiniSat 2 / IPASIR timeout tests: a pigeonhole(20) CNF (provably
exponential for resolution-based SAT) is built directly via
`new_variable` / `lcnf`, and the solver is asserted to return
`P_ERROR` within five seconds. Locally the test completes in
~208 ms (200 ms deadline + ~8 ms terminator-polling latency),
confirming sub-millisecond accuracy of the CaDiCaL implementation.

After this commit, all three SAT back-ends with native interrupt
infrastructure (MiniSat 2 via std::thread watchdog, IPASIR via
`ipasir_set_terminate`, CaDiCaL via `connect_terminator`) honour
`--solver-time-limit` and `set_time_limit_milliseconds` with
sub-second granularity, are portable across operating systems,
and return `P_ERROR` on interrupt for callers to detect.

Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
@tautschnig tautschnig force-pushed the solver-time-limit-foundation branch from 3cd2b8f to 74a0f57 Compare May 29, 2026 08:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants