Rework CBMC's solver-time-limit infrastructure#9019
Conversation
There was a problem hiding this comment.
Pull request overview
Reworks CBMC's --solver-time-limit infrastructure so the flag is actually registered on the CLI, works portably (including Windows), supports sub-second granularity (milliseconds), and is honoured by all three SAT back-ends with native interrupt mechanisms (MiniSat 2, IPASIR, CaDiCaL). Adds regression and unit tests, and updates man pages.
Changes:
- Registers
--solver-time-limitin the sharedOPT_SOLVER/HELP_SOLVERmacros andparse_solver_options, with man-page documentation for cbmc/jbmc/goto-synthesizer. - Switches the primary
propt/solver_resource_limitstAPI fromset_time_limit_secondstoset_time_limit_milliseconds(seconds helper retained as a non-virtual wrapper); replaces MiniSat'salarm/SIGALRMwith a portablestd::threadwatchdog; adds IPASIRipasir_set_terminateand CaDiCaLTerminatorimplementations that returnP_ERRORon interrupt instead of throwing. - Adds a
solver-time-limitregression test and PHP(20)/200 ms pigeonhole unit tests for all three back-ends (IPASIR test guarded byHAVE_IPASIR).
Reviewed changes
Copilot reviewed 20 out of 20 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| src/goto-checker/solver_factory.h | Adds --solver-time-limit to option spec and help text |
| src/goto-checker/solver_factory.cpp | Reads the new CLI flag into options |
| src/solvers/prop/solver_resource_limits.h | Switches virtual API to milliseconds; keeps seconds helper |
| src/solvers/prop/prop.h | Same API shift for propt; default still warns |
| src/solvers/prop/prop_conv_solver.h | Forwards new ms-based override |
| src/solvers/sat/satcheck_minisat2.{h,cpp} | Replaces SIGALRM watchdog with portable std::thread/condition_variable |
| src/solvers/sat/satcheck_ipasir.{h,cpp} | Implements deadline via ipasir_set_terminate; returns P_ERROR instead of throwing |
| src/solvers/sat/satcheck_cadical.{h,cpp} | Adds nested terminatort (CaDiCaL::Terminator); returns P_ERROR instead of throwing |
| unit/solvers/sat/satcheck_minisat2.cpp | New PHP(20) timeout scenario |
| unit/solvers/sat/satcheck_ipasir.cpp | New unit-test file with satisfiable + PHP(20) timeout scenarios |
| unit/solvers/sat/satcheck_cadical.cpp | New PHP(20) timeout scenario |
| unit/Makefile | Adds the new IPASIR unit-test source; excludes it when IPASIR is unset |
| regression/cbmc/solver-time-limit/{main.c,test.desc} | New happy-path regression test for --solver-time-limit |
| doc/man/{cbmc,jbmc,goto-synthesizer}.1 | Documents the new flag |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| lbool solver_result = solver->solveLimited(solver_assumptions); | ||
|
|
||
| if(old_handler != SIG_ERR) | ||
| { | ||
| alarm(0); | ||
| signal(SIGALRM, old_handler); | ||
| solver_to_interrupt = solver.get(); | ||
| } | ||
|
|
||
| #else // _WIN32 | ||
|
|
||
| if(time_limit_seconds != 0) | ||
| // Signal the watchdog to exit (if running) and join it. | ||
| if(watchdog.joinable()) | ||
| { | ||
| log.warning() << "Time limit ignored (not supported on Win32 yet)" | ||
| << messaget::eom; | ||
| { | ||
| std::lock_guard<std::mutex> lk(watchdog_mutex); | ||
| solve_done = true; | ||
| } | ||
| watchdog_cv.notify_one(); | ||
| watchdog.join(); | ||
| } |
| void set_time_limit_seconds(uint32_t lim) | ||
| { | ||
| set_time_limit_milliseconds(lim * 1000); | ||
| } |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## develop #9019 +/- ##
===========================================
+ Coverage 80.59% 80.62% +0.02%
===========================================
Files 1711 1711
Lines 189454 189494 +40
Branches 73 73
===========================================
+ Hits 152697 152775 +78
+ Misses 36757 36719 -38 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
…pport; add CI tests The `solver-time-limit` option has been read by `solver_factoryt` since 38a370c (Jan 2019), but was never registered as a command-line flag in `OPT_SOLVER` -- the original use case was purely programmatic, via dependent projects calling `options.set_option("solver-time-limit", N)` directly. That left the option's plumbing untested in CI and the only backend that honoured it (MiniSat 2) had no end-to-end coverage. This change: * Registers `--solver-time-limit N` in `OPT_SOLVER` / `HELP_SOLVER` and wires the cmdline-to-options forwarding in `parse_solver_options`. `cbmc`, `jbmc`, `goto-synthesizer` and `libcprover-cpp` (which all use `parse_solver_options`) now accept the flag from the command line. * Documents the flag in `doc/man/cbmc.1`, `doc/man/goto-synthesizer.1` and `doc/man/jbmc.1`, so `scripts/check_help.sh` continues to pass. * Implements `set_time_limit_seconds` for the IPASIR back-end (`satcheck_ipasirt`). The implementation registers an `ipasir_set_terminate` callback that returns 1 once the per-solve deadline has passed; IPASIR solvers poll the callback during solving and abort cleanly when it does. On timeout the back-end returns `P_ERROR` (matching the MiniSat 2 back-end's behaviour) instead of throwing `analysis_exceptiont`. Tests: * `regression/cbmc/solver-time-limit/`: a `CORE` regression test that pins down the happy path -- the option is parsed and propagated through to the solver layer without breaking a trivial solve. Runs on every CI build. * `unit/solvers/sat/satcheck_minisat2.cpp`: extended with a pigeonhole-PHP(20) `SCENARIO` that builds the formula directly via `new_variable` / `lcnf`, sets a 1-second time limit, and asserts that `prop_solve` returns `P_ERROR` within five seconds. PHP is provably exponential for resolution-based SAT (Haken, 1985), so the timeout is reliably exceeded on any reasonable hardware. * `unit/solvers/sat/satcheck_ipasir.cpp`: new file, mirroring the satcheck_minisat2 test. Conditional on `HAVE_IPASIR`, so it only compiles in CI builds that link an IPASIR back-end (currently the `check-ubuntu-24_04-make-gcc` job, which uses Riss as the IPASIR provider). This is the first piece of foundation work for a wider redesign of CBMC's solver-timeout infrastructure (sub-second granularity, cross-OS, watchdog-based interrupt). Subsequent commits will build on a now-tested baseline. Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
fd8302f to
3cd2b8f
Compare
The solver-time-limit infrastructure used to be expressed in whole
seconds, was implemented for MiniSat 2 via POSIX `alarm()` +
`SIGALRM` (so it was POSIX-only and process-wide), and was
explicitly skipped on Windows with a "Time limit ignored (not
supported on Win32 yet)" warning. In practice this meant:
* Sub-second budgets -- needed for fine-grained per-check
timeouts inside symbolic execution -- could not be expressed.
* Windows builds had no time-limit support at all.
* The signal handler relied on a process-wide
`solver_to_interrupt` pointer; nesting solve calls across
threads or having two solvers active in the same process was
fragile.
This change reworks the API and the MiniSat 2 implementation:
* `solver_resource_limitst::set_time_limit_milliseconds` and
`propt::set_time_limit_milliseconds` are now the primary
virtual API. The default `propt` implementation logs a
warning, mirroring the previous behaviour for back-ends with
no native timeout support.
* `set_time_limit_seconds` survives as a non-virtual helper
that forwards to `set_time_limit_milliseconds(lim * 1000)`,
so external callers (e.g. the CLI option-parser, dependent
projects) keep working without source changes.
* The MiniSat 2 back-end now implements the time limit using a
`std::thread` watchdog: when the limit is non-zero, a
watchdog thread waits on a `std::condition_variable` for
either the solve to complete (signalled via a flag) or the
deadline to elapse, and on timeout calls
`solver->interrupt()` (which is just a flag set on the
Minisat::Solver object and is safe to invoke from another
thread). The watchdog is joined before `do_prop_solve`
returns. Result: cross-OS, sub-millisecond granularity, no
global signal state, no `SIGALRM` collision risk.
* The IPASIR back-end's existing
`set_time_limit_seconds` override is renamed/repurposed as
`set_time_limit_milliseconds`; the deadline is computed in
`std::chrono::milliseconds` instead of seconds. The
callback already polled a `chrono::steady_clock` deadline,
so this is essentially a stored-precision change.
* The `solver_factoryt` call site in `solver_factory.cpp`
keeps using the seconds helper (the CLI flag
`--solver-time-limit` remains expressed in seconds for
user familiarity); only the internal API is now in
milliseconds.
Tests:
* Existing pigeonhole(20) SCENARIOs in
`unit/solvers/sat/satcheck_minisat2.cpp` and
`unit/solvers/sat/satcheck_ipasir.cpp` now request a
200-millisecond limit and verify
(a) that `prop_solve` returns `P_ERROR`, and
(b) that wall-clock elapsed time is well under five seconds
(a generous slack for slow CI hardware and watchdog-join
latency).
Locally the satcheck_minisat2 test completes in ~207 ms,
confirming sub-millisecond accuracy of the watchdog.
This is the second piece of the wider solver-timeout rework.
Subsequent work will add CaDiCaL terminator support and let the
branch-pruning code use sub-second per-call budgets directly.
Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
The CaDiCaL back-end was the last SAT back-end in CBMC for which
`set_time_limit_milliseconds` (and its `set_time_limit_seconds`
helper) silently fell through to `propt`'s default no-op + warning.
This change adds native time-limit support via CaDiCaL's
`Terminator` callback API:
* `satcheck_cadical_baset` declares a nested `terminatort` class
in the header (forward-declared to keep `cadical.hpp` out of
the header).
* `terminatort` is defined in the .cpp and inherits
`CaDiCaL::Terminator`. It stores a deadline as a
`steady_clock::time_point` and returns true once the deadline
has passed.
* `set_time_limit_milliseconds(uint32_t)` stores the limit on
the solver instance.
* `do_prop_solve` constructs / refreshes the deadline before
calling `solver->solve()`, calls
`solver->connect_terminator(...)` while solving, and calls
`solver->disconnect_terminator()` afterwards. CaDiCaL polls
the terminator at decision points during solving and aborts
cleanly on a true return.
* On interrupt, `do_prop_solve` now returns `propt::resultt
::P_ERROR` -- matching the MiniSat 2 and IPASIR back-ends --
instead of throwing `analysis_exceptiont`. The
`<util/exception_utils.h>` include is dropped as it is no
longer needed.
A `set_time_limit_milliseconds(200)` SCENARIO is added to
`unit/solvers/sat/satcheck_cadical.cpp` mirroring the existing
MiniSat 2 / IPASIR timeout tests: a pigeonhole(20) CNF (provably
exponential for resolution-based SAT) is built directly via
`new_variable` / `lcnf`, and the solver is asserted to return
`P_ERROR` within five seconds. Locally the test completes in
~208 ms (200 ms deadline + ~8 ms terminator-polling latency),
confirming sub-millisecond accuracy of the CaDiCaL implementation.
After this commit, all three SAT back-ends with native interrupt
infrastructure (MiniSat 2 via std::thread watchdog, IPASIR via
`ipasir_set_terminate`, CaDiCaL via `connect_terminator`) honour
`--solver-time-limit` and `set_time_limit_milliseconds` with
sub-second granularity, are portable across operating systems,
and return `P_ERROR` on interrupt for callers to detect.
Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
3cd2b8f to
74a0f57
Compare
The pre-existing
--solver-time-limit N(CLI flag never registered in CBMC's option spec; option-reader added in 38a370c for use by dependent projects viaoptions.set_option(...)) was POSIX-only (POSIXalarm()/SIGALRM), MiniSat 2 only, with one-second granularity, and untested in CI. This PR makes it work, adds CI tests, and broadens back-end support, in three orthogonal commits:Wire up
--solver-time-limiton the CLI; add IPASIR backend support; add CI tests.--solver-time-limitinOPT_SOLVER/HELP_SOLVERandparse_solver_options.cbmc,jbmc,goto-synthesizerandlibcprover-cppall pick it up.set_time_limit_secondsviaipasir_set_terminate. ReturnsP_ERRORon interrupt instead of throwinganalysis_exceptiont, matching the MiniSat 2 backend.regression/cbmc/solver-time-limit/) on the happy path.unit/solvers/sat/satcheck_minisat2.cppand a newsatcheck_ipasir.cpp(conditional onHAVE_IPASIR, exercised by the existing IPASIR/Riss CI job).doc/man/cbmc.1,jbmc.1,goto-synthesizer.1.API in milliseconds; portable MiniSat watchdog.
solver_resource_limitst::set_time_limit_millisecondsandpropt::set_time_limit_millisecondsare now the primary virtual API.set_time_limit_secondssurvives as a non-virtual seconds→ms helper, so external/programmatic callers keep working.alarm()/SIGALRM/solver_to_interruptwith astd::threadwatchdog: cross-OS portable (incl. Windows, where the old code was a no-op with a warning), sub-millisecond granular, no global signal state, no nesting hazards.CaDiCaL terminator support.
satcheck_cadical_basetgetsset_time_limit_millisecondsand a nestedterminatort(aCaDiCaL::Terminatorpolled by the solver during decision steps). On interrupt, returnsP_ERRORinstead of throwing.After this PR all three SAT back-ends with native interrupt infrastructure (MiniSat 2, IPASIR, CaDiCaL) honour
--solver-time-limitandset_time_limit_millisecondsportably with sub-second granularity; back-ends without (Glucose, PicoSAT, Lingeling, …) inherit the default warning, unchanged.The unit-test pigeonhole SCENARIOs verify both correct outcome (
P_ERROR) and timely interrupt (the 200 ms test completes in ~207-208 ms locally, confirming sub-millisecond watchdog accuracy).This is foundation for a follow-up PR (branch-pruning) that needs sub-second per-check budgets to bound pruning overhead.