Releases: pathsim/fastsim

v0.16.0 — native BVP solver, AlgebraicConstraint block, sparse implicit linear solver

@milanofthe milanofthe released this 22 Jun 09:03

Highlights

Native BVP1D block: scipy.solve_bvp rebuilt natively (Kierzenka–Shampine 4th-order Lobatto-IIIa/Simpson collocation + residual-based mesh refinement) with the Newton Jacobian from auto-differentiation of the traced fun/bc/icond. Matches scipy to 1e-7/1e-8 and is 80–340x faster (cold/warmstarted). Supports free parameters (eigenvalues, unknown fluxes) and interior/multipoint conditions at arbitrary ports (beyond scipy). Allocation-free hot path.
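To make the collocation scheme concrete, here is a minimal pure-Python sketch of the Simpson/Lobatto-IIIa residual on one mesh interval. It is an illustration, not fastsim's implementation: the test system (y1' = y2, y2' = -y1, exact solution sin/cos) and the helper names collocation_residual and max_residual are assumptions chosen for the example. A real solver drives this residual to zero with Newton over the unknown mesh values. Here it is only evaluated on the exact solution to show how quickly it shrinks as the mesh is refined.

```python
import math

def f(y):
    """Right-hand side of y1' = y2, y2' = -y1 (exact solution: sin, cos)."""
    return (y[1], -y[0])

def collocation_residual(y0, y1, h):
    """Simpson / Lobatto-IIIa residual on one mesh interval of width h."""
    f0, f1 = f(y0), f(y1)
    # Midpoint value of the cubic Hermite interpolant through both endpoints.
    y_mid = tuple(0.5 * (a + b) - 0.125 * h * (fb - fa)
                  for a, b, fa, fb in zip(y0, y1, f0, f1))
    f_mid = f(y_mid)
    # Simpson quadrature of y' = f(y) across the interval.
    return tuple(b - a - h / 6.0 * (fa + 4.0 * fm + fb)
                 for a, b, fa, fm, fb in zip(y0, y1, f0, f_mid, f1))

def max_residual(n_intervals, a=0.0, b=1.0):
    """Largest residual component when the exact solution is sampled on a uniform mesh."""
    h = (b - a) / n_intervals
    worst = 0.0
    for i in range(n_intervals):
        x0 = a + i * h
        x1 = x0 + h
        y0 = (math.sin(x0), math.cos(x0))
        y1 = (math.sin(x1), math.cos(x1))
        r = collocation_residual(y0, y1, h)
        worst = max(worst, max(abs(c) for c in r))
    return worst

coarse = max_residual(5)    # h = 0.2
fine = max_residual(10)     # h = 0.1
ratio = coarse / fine       # local residual scales like h**5
print(f"coarse={coarse:.3e} fine={fine:.3e} ratio={ratio:.1f}")
```

Halving the mesh width cuts the per-interval residual by roughly a factor of 2**5, which is the behaviour a residual-based mesh refinement step relies on when deciding where to add nodes.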

AlgebraicConstraint block — solves F(x, u) = 0 for x each evaluation (warmstarted Newton, AD Jacobian). The base primitive for instantaneous algebraic relations (chemical equilibrium, flash/VLE, steady-state operating points, implicit constitutive laws); a zeroed rate recovers the quasi-steady-state approximation.
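The warm-start idea can be sketched in a few lines of plain Python. This is not the AlgebraicConstraint API: the scalar law F(x, u) = x**3 + x - u, the hand-written derivative (standing in for the AD Jacobian) and the function names are all assumptions made for illustration. The sketch solves the same constraint for a slowly varying input twice, once always starting from zero and once starting from the previous solution.

```python
def residual(x, u):
    """Hypothetical implicit law F(x, u) = x**3 + x - u."""
    return x**3 + x - u

def d_residual_dx(x, u):
    """Hand-written dF/dx; stands in for an automatically derived Jacobian."""
    return 3.0 * x**2 + 1.0

def newton_solve(u, x_start, tol=1e-12, max_iter=50):
    """Return (root, iterations used) for F(x, u) = 0 starting from x_start."""
    x = x_start
    for iteration in range(max_iter):
        r = residual(x, u)
        if abs(r) < tol:
            return x, iteration
        x -= r / d_residual_dx(x, u)
    raise RuntimeError("Newton did not converge")

# A slowly varying input, as seen across consecutive evaluations.
inputs = [2.0 + 0.01 * k for k in range(20)]

cold_iters = 0
warm_iters = 0
x_prev = 0.0
for u in inputs:
    _, n_cold = newton_solve(u, 0.0)             # cold start every time
    x_prev, n_warm = newton_solve(u, x_prev)     # reuse the last solution
    cold_iters += n_cold
    warm_iters += n_warm

print("cold:", cold_iters, "warm:", warm_iters)
```

Because consecutive inputs are close, the previous root is already near the new one, so the warm-started solve needs fewer Newton iterations in total.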

Sparse implicit linear solver: LinearSolver now caches the sparse symbolic LU (pattern-keyed) and solves in place, shared across the implicit stage solvers, the DAE inner-Newton, and the BVP collocation. Measured speedups vs the previous implementation: DAE 2.2x, large banded-sparse stiff systems ~1.5x, small stiff 1.15–1.35x, with bit-identical results.

Other changes

  • BVP1D and all traced blocks now take inputs dynamically — n_inputs removed (no block declares an input count).
  • Method-of-lines stiff PDE benchmarks (Brusselator, heat) added to the suite to keep the sparse implicit path measured (mol_pde).
  • Hand-written block classes (BVP1D, AlgebraicConstraint, Scope, Spectrum) unified onto the central registry docstrings (pathsim format) and the standard info() introspection.

Full test suite: 397 Rust + 1365 Python tests passing.

v0.15.1 — tracer coverage + tape-lowering optimization

@milanofthe milanofthe released this 10 Jun 15:40

Patch over v0.15.0 (clippy lint gate fix only; runtime identical).

Tracer now covers array methods (x.sum()/dot()/clip()/...), extended ufuncs (radians, fmin/fmax, exp2, copysign, logaddexp, heaviside, expit), np.interp, constant factories as assignment targets (arange/linspace/eye/diag), extended indexing (constant fancy lists, negative steps, Ellipsis, newaxis), and mixed scalar/array ufunc dispatch. Python % and np.remainder now lower with correct floored-mod semantics. New tape-lowering pipeline (value-numbering canonicalization + chain fusion into Reduce/Dot kernels) cuts AD-Jacobian tapes ~38%; codegen output unchanged. Backed by a tracer coverage corpus and differential fuzzers (traced vs eager numpy).
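The floored-mod point matters because C's fmod truncates toward zero, while Python's % and np.remainder take the sign of the divisor. The sketch below shows the usual correction in plain Python. How fastsim actually lowers the operation is not stated in these notes, so floored_mod is a hypothetical helper written for illustration only.

```python
import math

def floored_mod(a, b):
    """Remainder with the sign of the divisor, built on truncating fmod."""
    r = math.fmod(a, b)            # same sign as the dividend (C semantics)
    if r != 0.0 and (r < 0.0) != (b < 0.0):
        r += b                     # shift into the divisor's sign range
    return r

cases = [(7.0, 3.0), (-7.0, 3.0), (7.0, -3.0), (-7.0, -3.0)]
for a, b in cases:
    print(a, b, "fmod:", math.fmod(a, b), "floored:", floored_mod(a, b), "%:", a % b)
```

The two conventions only differ when the operands have opposite signs, which is exactly the case a naive fmod lowering would get wrong.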

v0.15.0 — tracer coverage + tape-lowering optimization

@milanofthe milanofthe released this 10 Jun 15:23

Tracer now covers array methods (x.sum()/dot()/clip()/...), extended ufuncs (radians, fmin/fmax, exp2, copysign, logaddexp, heaviside, expit), np.interp, constant factories as assignment targets (arange/linspace/eye/diag), extended indexing (constant fancy lists, negative steps, Ellipsis, newaxis), and mixed scalar/array ufunc dispatch. Python % and np.remainder now lower with correct floored-mod semantics. New tape-lowering pipeline (value-numbering canonicalization + chain fusion into Reduce/Dot kernels) cuts AD-Jacobian tapes ~38%; codegen output unchanged. Backed by a tracer coverage corpus and differential fuzzers (traced vs eager numpy).

v0.14.0 — struct-only codegen (hierarchical + library + pure-discrete)

@milanofthe milanofthe released this 10 Jun 09:20

Struct API now honors structure (hierarchical: per-block blk_i_alg/blk_i_deriv) and layout (library: blocks.{h,c} + solver.{h,c}), and supports pure-discrete models (n_state==0). The plain API is removed; struct is the sole codegen path (reentrant, embeddable via get_signal/set_signal). FMU export intact.

v0.13.0

@milanofthe milanofthe released this 09 Jun 09:13

FMI 3.0 Model Exchange FMU export: turn any fastsim Simulation (or Subsystem) into a portable, self-contained source FMU.

FMU export (sim.to_fmu(...), block.to_fmu(...))

  • FMI 3.0 source FMU: emits modelDescription.xml plus C sources (fmi3.h, model.{c,h}, fmu.c, buildDescription.xml) packaged as a .fmu zip. Built on the struct-everything codegen path (reentrant model_t, no globals), so the FMU compiles to a single translation unit.
  • Continuous Model Exchange: states, derivatives, outputs and parameters map to FMI value references straight from the codegen ModelLayout (single source of truth for the variable map and the SIG_ enum). Verified against the native run via self-import.

Directional derivatives (fmi3GetDirectionalDerivative)

  • Analytic forward-mode AD lowered to C (model_jvp): a tangent pass parallel to the primal, mirroring the SSA autodiff rules. Covers the full Jacobian surface: knowns over states, inputs and parameters; unknowns over derivatives and outputs (∂y/∂x, ∂ẋ/∂u, ∂y/∂u, ∂ẋ/∂p).
  • Tangents for min/max reductions (subgradient select), fmod, and 1-D LUTs (segment slope); a fastsim_digamma C helper backs lgamma/tgamma derivatives.
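As a rough picture of what "a tangent pass parallel to the primal" means, here is a small dual-number sketch in Python. It is not the generated model_jvp C code: the Dual class, the dmax helper and the toy right-hand side are assumptions made for this example. Each operation carries a value and a tangent, and max propagates the tangent of whichever operand is selected (the subgradient-select rule mentioned above).

```python
import math
from dataclasses import dataclass

@dataclass(frozen=True)
class Dual:
    val: float   # primal value
    tan: float   # tangent (directional derivative along the seed)

    def __add__(self, other):
        return Dual(self.val + other.val, self.tan + other.tan)

    def __mul__(self, other):
        # Product rule applied alongside the primal multiply.
        return Dual(self.val * other.val,
                    self.tan * other.val + self.val * other.tan)

def sin(x):
    return Dual(math.sin(x.val), math.cos(x.val) * x.tan)

def dmax(a, b):
    # Subgradient select: keep the tangent of the operand that wins.
    return a if a.val >= b.val else b

def model(x, u):
    """Hypothetical right-hand side: x_dot = max(x * u, sin(x)) + u."""
    return dmax(x * u, sin(x)) + u

def jvp(x, u, dx, du):
    """Directional derivative of x_dot along the seed (dx, du)."""
    return model(Dual(x, dx), Dual(u, du)).tan

print(jvp(2.0, 3.0, 1.0, 0.0))   # seed on the state
print(jvp(2.0, 3.0, 0.0, 1.0))   # seed on the input
```

Seeding one known at a time recovers one Jacobian column per call, which is the shape of query fmi3GetDirectionalDerivative serves.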

Events

  • Full event interface: fmi3GetEventIndicators / fmi3CompletedIntegratorStep / fmi3UpdateDiscreteStates for zero-crossing, condition and periodic events. valuesOfContinuousStatesChanged is reported only for state-modifying effects.

Subsystem (open-system) export

  • A Subsystem's interface inputs become FMI input variables (set via fmi3SetFloat64), interface outputs become FMI outputs, internal block outputs become locals. Resolves interface ports through arbitrary nesting and fan-out. Parameters are now tunable.

Closed continuous systems behave exactly as before.

v0.12.0 — vectorized Dot/Reduce + wider tracer surface

@milanofthe milanofthe released this 09 Jun 05:30

Two performance & ergonomics additions on top of the consolidated SSA core. No Python API change; still a drop-in for pathsim.

Vectorized Dot/Reduce (~1.55x on the tape matvec).

  • The canonical 4-lane multiply-add dot/reduce now live in the op manifest (ssa::op), shared by the native F64Builder, the interpreter, and the flat tape — so all three agree bit-for-bit (previously the native and tape paths used different reduction orders).
  • The tape's Dot/Reduce gather their operands into a contiguous scratch and run the 4-lane kernel, killing the per-element libm fma() call on the portable build and breaking the FP-add dependency chain.
  • Bench jit_tape/matvec_dot: -36% (n=8), -35% (n=32), p<0.01.
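Why a shared reduction order matters: floating-point addition is not associative, so a sequential sum and a 4-lane sum of the same products can differ in the last bits. The Python sketch below shows the two orderings side by side. It is only an illustration of the idea, not the ssa::op kernel: it uses a separate multiply and add rather than a fused multiply-add, and the function names and the final lane-combination order are assumptions.

```python
import math

def dot_sequential(a, b):
    """One accumulator: every add depends on the previous one."""
    acc = 0.0
    for x, y in zip(a, b):
        acc += x * y
    return acc

def dot_4lane(a, b):
    """Four independent accumulators, combined pairwise at the end."""
    lanes = [0.0, 0.0, 0.0, 0.0]
    n4 = len(a) - len(a) % 4
    for i in range(0, n4, 4):
        for k in range(4):
            lanes[k] += a[i + k] * b[i + k]
    acc = (lanes[0] + lanes[1]) + (lanes[2] + lanes[3])
    for i in range(n4, len(a)):        # scalar tail for leftover elements
        acc += a[i] * b[i]
    return acc

a = [0.1 * (i + 1) for i in range(10)]
b = [1.0 / (i + 1) for i in range(10)]
reference = math.fsum(x * y for x, y in zip(a, b))

print(dot_sequential(a, b), dot_4lane(a, b), reference)
```

Both orderings are accurate to rounding error, but they are not guaranteed to match bit for bit, so every backend has to use the same kernel for interpret==tape comparisons to be exact.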

Wider tracer surface.

  • np.deg2rad / np.rad2deg / np.square / np.reciprocal and np.diff / np.cumsum now trace, evaluated bit-for-bit like numpy. They lower to compositions of existing ops (no new SSA op), so they get autodiff and C codegen for free — custom Python that uses them still compiles to a fused tape and to C.

Verified: Rust 372 lib + integration + differential fuzzer (interpret==tape bit-exact), native/codegen vector parity, clippy -D warnings, Python 607 passed / 302 subtests.

v0.11.0 — SSA graph as the single source of truth

@milanofthe milanofthe released this 08 Jun 21:05

Minor release consolidating the SSA-graph architecture. The SSA graph is now fastsim's physical single source of truth, and the module tree reflects it. No Python API change: still a drop-in replacement for pathsim, all tests green.

Highlights:

  • New ssa module — the symbolic-numeric core (pyo3-free, always compiled): the op graph, the op manifest, the canonical f64 semantics, the fast tape evaluator, the optimizer, autodiff, and the native/symbolic Builder. The block runtime closures, the IR, compile, and the C codegen all attach here.
  • New tracer module — the Python tracing frontend, one of several producers of an SSA graph (the misnamed jit module is gone).
  • Op manifest (ssa/op.rs) — the op vocabulary, canonical f64 semantics, flat-tape opcodes + mapping, and codegen C math-function names now live in one place; codegen is a thin backend.
  • Typed flat→structured slot seam (blockops::slot_kind), replacing string parsing in the IR decoder.
  • Block scheduler renamed to utils::schedule::Schedule, leaving exactly one Graph type in the crate (the SSA op graph).

Verified: Rust 372 lib + integration + differential fuzzer (2000 seeds, interpret==tape bit-exact), clippy -D warnings, Python 599 passed / 302 subtests, codegen C-compile matrix. Compiled vs interpreted event handling confirmed bit-identical.

v0.10.8 — SSA graph as single source of truth

@milanofthe milanofthe released this 08 Jun 19:42

Internal architecture refactor. No Python API change — still a drop-in replacement for pathsim; all tests green.

  • Extracted the SSA graph core into its own ssa module (graph, op manifest, tape, optimize, autodiff, build), pyo3-free and always compiled. The Python tracing frontend is now tracer (the misnamed jit module is gone).
  • New ssa/op.rs op manifest: the op vocabulary, canonical f64 semantics (apply_*), the flat-tape opcodes + Node→opcode mapping, and the codegen C math-function names all live in one place. codegen is now a thin backend.
  • Typed the flat→structured slot seam via blockops::slot_kind (no more string parsing of slot names in the IR decoder).
  • Renamed the block scheduler to utils::schedule::Schedule, so there is exactly one Graph type in the crate (the SSA op graph).

Verified: Rust 372 lib + integration + differential fuzzer (2000 seeds, interpret==tape bit-exact), clippy -D warnings, Python 599 passed / 302 subtests, codegen C-compile matrix.

v0.10.7

@milanofthe milanofthe released this 08 Jun 13:26

Fixes Simulation.compile() silently discarding the source simulation's solver choice.

CompiledSimulation

  • Solver inheritance: compile() now carries over the source simulation's solver, adaptive tolerances (tolerance_lte_abs/tolerance_lte_rel) and timestep dt, so a compiled run integrates the same problem with the same method. Previously it fell back to the default explicit RKBS32, which on a stiff model is stability-bound and took orders of magnitude more steps than the implicit solver the user selected (e.g. a stiff Van der Pol run ballooned to millions of micro-steps).
  • Adaptive gating: the compiled run loops now gate adaptive stepping by the solver's own adaptivity (adaptive && solver.is_adaptive), mirroring Simulation. A fixed-step solver combined with events no longer drives the event locator's step size down to dt_min (a runaway).
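The "stability-bound" behaviour behind the solver-inheritance fix can be reproduced with the simplest possible pair of methods. The sketch below is a stand-in, not fastsim code: it uses explicit and implicit Euler on the linear test equation y' = -lam * y instead of RKBS32 and the implicit solvers named above, and all names are chosen for the example. The explicit method is only stable for dt < 2/lam, so at a step size the implicit method handles comfortably it diverges.

```python
def explicit_euler(lam, dt, steps, y0=1.0):
    """y_new = y + dt * f(y); stable only while dt < 2 / lam."""
    y = y0
    for _ in range(steps):
        y = y + dt * (-lam * y)
    return y

def implicit_euler(lam, dt, steps, y0=1.0):
    """Solves y_new = y + dt * (-lam * y_new) in closed form each step."""
    y = y0
    for _ in range(steps):
        y = y / (1.0 + lam * dt)
    return y

lam = 1000.0     # stiff decay rate
dt = 0.01        # 5x above the explicit stability limit 2 / lam
steps = 100

y_explicit = explicit_euler(lam, dt, steps)
y_implicit = implicit_euler(lam, dt, steps)
print(f"explicit: {y_explicit:.3e}  implicit: {y_implicit:.3e}")
```

An adaptive explicit solver avoids the blow-up by shrinking its step below the stability limit, which is where the very large step counts on stiff models come from.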

Simulation

  • Added a solver getter returning the active solver's class name, so sim.compile().solver == sim.solver.

Override any inherited setting afterwards via set_solver / dt / log on the compiled object.

v0.10.6

@milanofthe milanofthe released this 08 Jun 12:47

Brings CompiledSimulation (the statically compiled run object from Simulation.compile()) to parity with Simulation.

CompiledSimulation

  • Logging: compile() logs a COMPILE summary, and each run() prints the same SOLVER setup line and interleaved TRANSIENT progress that a Simulation run does. The continuous run loop now drives the standalone solver's per-step take_step (mirroring integrate, numerics unchanged) so progress is reported per step.
  • API: a log toggle and set_time on the compiled object; detailed docstrings on every public method and property, in the established pybinding style.
  • Internals: the compiled runtime moved to its own module (src/compile/runtime.rs), leaving compile/mod.rs as the compiler.