
fix: surface silent task loss from auto-clear and disk failures #15

Open
kylesnowschwartz wants to merge 1 commit into tintinweb:master from kylesnowschwartz:fix/silent-task-loss


@kylesnowschwartz

Disclosure

This patch was AI-generated (Claude) at my request, motivated by a few silent-failure modes I noticed while using pi-tasks in real work — most painfully, completed-tool events for an auto-cleared task being swallowed and the LLM repeatedly trying to update task IDs that had vanished from its context.

QA level: light. I've verified the on-disk durability paths manually (corrupt file, deleted directory, missing-ID update) and the included unit tests pass (151 passed, plus biome check and tsc --noEmit clean). I have not end-to-end exercised the auto-clear → system-reminder pathway or the subagent-vanish-during-run pathway in a live session — those rely on session-lifecycle timing I didn't try to synthesize. Treat accordingly; happy to iterate on whatever you'd like tightened, simplified, or reverted.


Failure modes addressed

  • TaskStore.update() returned the same shape for both "task missing" and "task deleted", forcing callers to disambiguate by inspecting fields. The subagent listeners in index.ts ignored the return value entirely, so when a TaskExecute task was auto-cleared while its subagent was still running, the completion event was silently swallowed.
  • AutoClearManager.onTurnStart returned a boolean, hiding which task IDs had just been removed. The LLM had no way to know that task IDs it had seen earlier were auto-cleared, so subsequent TaskUpdate calls returned "Task #X not found" for tasks still visible in its context.
  • TaskStore.load() silently swallowed JSON parse errors despite a "start fresh" comment that did not match the code. No signal to the user, no signal to the host.
  • mkdirSync ran only in the constructor, so removing the tasks directory mid-stream caused acquireLock and save to throw ENOENT and lose the in-flight mutation.

Changes

  • task-store.ts: update() returns notFound: boolean; load() splits ENOENT (silent) from parse errors (calls onCorruptFile callback); ensureDir() runs inside acquireLock and save for auto-heal.
  • auto-clear.ts: onTurnStart returns { cleared, ids } so the host can tell the LLM exactly which tasks vanished.
  • index.ts: subagent listeners check notFound and surface a UI notice if work completed for a vanished task; tool_result handler drains pendingAutoClearedIds onto the next tool result as a system-reminder so the LLM stops trying to update vanished tasks; pendingAutoClearedIds is reset on session_switch alongside other session-scoped state.
  • 6 new tests covering file deletion, dir deletion, missing-id update, successful update return shape, deletion return shape, and corrupt-file preservation.
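
To make the new return shapes concrete, here is a minimal in-memory sketch of the idea, not the code from this PR. Only `update()`, `notFound`, `{ cleared, ids }`, and `pendingAutoClearedIds` come from the description above; the class names, the `"deleted"` status, the `deleted` field, and the reminder wording are assumptions made for illustration.

```typescript
// Hypothetical shapes for illustration; the real types in the PR may differ.
type TaskStatus = "pending" | "in_progress" | "completed";

interface Task {
  id: string;
  subject: string;
  status: TaskStatus;
}

interface UpdateResult {
  notFound: boolean; // true only when the ID is absent from the store
  deleted: boolean; // true when this call removed the task (assumed field)
  task?: Task; // the updated task, when one still exists
}

class SketchTaskStore {
  private tasks = new Map<string, Task>();

  add(task: Task): void {
    this.tasks.set(task.id, task);
  }

  // A missing ID and a deletion now produce distinguishable results.
  update(id: string, patch: { subject?: string; status?: TaskStatus | "deleted" }): UpdateResult {
    const existing = this.tasks.get(id);
    if (!existing) return { notFound: true, deleted: false };
    if (patch.status === "deleted") {
      this.tasks.delete(id);
      return { notFound: false, deleted: true };
    }
    const updated: Task = {
      ...existing,
      subject: patch.subject ?? existing.subject,
      status: patch.status ?? existing.status,
    };
    this.tasks.set(id, updated);
    return { notFound: false, deleted: false, task: updated };
  }

  // Auto-clear sketch: drop completed tasks and report exactly which IDs went away.
  autoClearCompleted(): { cleared: boolean; ids: string[] } {
    const ids: string[] = [];
    this.tasks.forEach((task, id) => {
      if (task.status === "completed") ids.push(id);
    });
    ids.forEach((id) => this.tasks.delete(id));
    return { cleared: ids.length > 0, ids };
  }
}

// Host-side queue: IDs wait here until the next tool result can carry a reminder.
class AutoClearReminder {
  private pendingAutoClearedIds: string[] = [];

  record(ids: string[]): void {
    this.pendingAutoClearedIds.push(...ids);
  }

  // Returns the reminder text once, then empties the queue.
  drain(): string | undefined {
    if (this.pendingAutoClearedIds.length === 0) return undefined;
    const ids = this.pendingAutoClearedIds.map((id) => `#${id}`).join(", ");
    this.pendingAutoClearedIds = [];
    return `<system-reminder>Tasks ${ids} were auto-cleared and can no longer be updated.</system-reminder>`;
  }

  // Called on session_switch so stale IDs never leak into a new session.
  reset(): void {
    this.pendingAutoClearedIds = [];
  }
}
```

A subagent listener built on this shape would check `result.notFound` after reporting completion and show a UI notice instead of dropping the event, while the tool_result handler would append whatever `drain()` returns to the next tool result.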

What I actually verified

| Check | Result |
| --- | --- |
| biome check src/ test/ | clean |
| tsc --noEmit | clean |
| vitest run | 151 passed (5 files) |
| Manual: TaskUpdate of nonexistent ID | surfaces "not found" rather than swallowing |
| Manual: rm -rf .pi/tasks mid-session, then TaskUpdate | dir auto-recreated, mutation persisted |
| Manual: corrupt tasks-*.json mid-session | in-memory state preserved, onCorruptFile warning toast appeared in pi UI, next save healed the file |
| Auto-clear pendingAutoClearedIds system-reminder path | not exercised live (logic and unit tests only) |
| Subagent listener notFound UI notice | not exercised live (logic and unit tests only) |
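
For anyone who wants to repeat the two manual durability checks (deleted directory, corrupt file) outside a live session, the following is a rough sketch under stated assumptions. It is not the PR's `TaskStore`: `SketchFileStore`, `reproduceManualChecks`, and the file name are hypothetical, and locking is omitted. Only `ensureDir()`, `load()`, `save()`, and `onCorruptFile` are names taken from the change list.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical task shape; the real store tracks more fields.
interface Task {
  id: string;
  subject: string;
}

type CorruptFileHandler = (filePath: string, error: unknown) => void;

class SketchFileStore {
  private tasks: Task[] = [];
  private readonly filePath: string;
  private readonly onCorruptFile: CorruptFileHandler | undefined;

  constructor(filePath: string, onCorruptFile?: CorruptFileHandler) {
    this.filePath = filePath;
    this.onCorruptFile = onCorruptFile;
    this.load();
  }

  // Recreate the directory on every write path, not just at construction.
  private ensureDir(): void {
    fs.mkdirSync(path.dirname(this.filePath), { recursive: true });
  }

  load(): void {
    let raw: string;
    try {
      raw = fs.readFileSync(this.filePath, "utf8");
    } catch (err) {
      // A missing file is expected on first run: stay silent.
      if ((err as { code?: string }).code === "ENOENT") return;
      throw err;
    }
    try {
      this.tasks = JSON.parse(raw) as Task[];
    } catch (err) {
      // Corrupt file: keep the in-memory state and tell the host.
      this.onCorruptFile?.(this.filePath, err);
    }
  }

  save(): void {
    this.ensureDir();
    fs.writeFileSync(this.filePath, JSON.stringify(this.tasks, null, 2));
  }

  add(task: Task): void {
    this.tasks.push(task);
    this.save();
  }

  list(): readonly Task[] {
    return this.tasks;
  }
}

// Replays the manual checks against a throwaway temp directory.
function reproduceManualChecks(): {
  dirRecreated: boolean;
  warnings: number;
  tasksKept: number;
  healed: boolean;
} {
  const root = fs.mkdtempSync(path.join(os.tmpdir(), "pi-tasks-sketch-"));
  const dir = path.join(root, "tasks");
  const file = path.join(dir, "tasks-demo.json");
  const warnings: string[] = [];
  const store = new SketchFileStore(file, (p) => warnings.push(p));

  store.add({ id: "1", subject: "first" });
  fs.rmSync(dir, { recursive: true, force: true }); // simulate rm -rf mid-session
  store.add({ id: "2", subject: "second" });
  const dirRecreated = fs.existsSync(file);

  fs.writeFileSync(file, "{ not valid json"); // simulate a corrupt file
  store.load();
  const tasksKept = store.list().length;

  store.save(); // the next save overwrites the corrupt file
  const healed = (JSON.parse(fs.readFileSync(file, "utf8")) as Task[]).length === 2;

  fs.rmSync(root, { recursive: true, force: true });
  return { dirRecreated, warnings: warnings.length, tasksKept, healed };
}
```

Calling `reproduceManualChecks()` exercises the same sequence as the manual rows in the table: delete the directory, mutate, corrupt the file, reload, then save again.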

Diffstat: 5 files changed, +248, -30.

pe200012 added a commit to pe200012/pi-tasks that referenced this pull request May 16, 2026
Adapt the reminder cadence refactor to preserve PR tintinweb#15 auto-cleared task ID reminders and missing-task hardening.