[DPE-10450] Skip Patroni REST API call in member_inactive when snap is down #1793
Open
taurus-forever wants to merge 2 commits into
…s down
Guard member_inactive with is_patroni_running() (mirroring member_started) so that when the Patroni snap service is not active we return True immediately instead of spending ~60s retrying the Patroni REST API.
Assisted-by: Claude:claude-4.8-opus
update-status could crash with FileNotFoundError when it fired before a replica finished bootstrapping: the cluster is already initialised (so _can_run_on_update_status passes) and the unit IP is in members_ips, but the PostgreSQL data directory has not been created yet. member_inactive returns True because Patroni is not running, so os.listdir was called on a non-existent path.
Return early (no action) when the data directory does not exist yet, since the member has not been initialised and there is no frozen process to recover.
Assisted-by: Claude:claude-4.8-opus
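The early return described in this commit could look roughly like the sketch below. It is not the charm's actual handler: the function name `list_data_dir_if_initialised` and its parameters are hypothetical, and the real code path may check other conditions first. It only illustrates testing for the directory before calling `os.listdir`.

```python
import os
from typing import List, Optional


def list_data_dir_if_initialised(
    data_dir: str, member_inactive: bool
) -> Optional[List[str]]:
    """Return the data directory listing, or None when no action is needed.

    Hypothetical helper: `data_dir` stands in for the PostgreSQL data path
    and `member_inactive` for the result of the Patroni member check.
    """
    if not member_inactive:
        # Member is active, nothing to recover.
        return None
    if not os.path.isdir(data_dir):
        # Replica has not been bootstrapped yet: calling os.listdir here
        # would raise FileNotFoundError, so return early instead.
        return None
    return os.listdir(data_dir)
```

A `None` result means "no action"; a list is handed to whatever recovery logic inspects the directory contents.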
dragomirp
approved these changes
Jun 23, 2026
marceloneppel
approved these changes
Jun 23, 2026
Issue
The update-status hook costs one extra minute (for nothing) on server restart if update-status comes before start. See juju/juju#22688.
It also affects all cluster recovery cases, because update-status is an unpredictable (effectively random) but real event.
Example:
Solution
Guard member_inactive with is_patroni_running() (mirroring member_started) so that when the Patroni snap service is not active we return True immediately instead of spending ~60s retrying the Patroni REST API.
Assisted-by: Claude:claude-4.8-opus
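The guard described in the Solution can be sketched as follows. This is a minimal illustration, not the charm's code: `member_inactive` and `is_patroni_running` are named in this PR, but here they are modelled as a plain function taking injected callables, and `fetch_member_state` plus the set of "active" states are assumptions made for the example.

```python
from typing import Callable


def member_inactive(
    is_patroni_running: Callable[[], bool],
    fetch_member_state: Callable[[], str],
) -> bool:
    """Return True when the local member is not in an active state.

    Both callables are hypothetical stand-ins: the first for the snap
    service check, the second for the (retrying) Patroni REST API call.
    """
    if not is_patroni_running():
        # Snap service is down, so the REST API cannot answer: skip the
        # retries and report the member as inactive immediately.
        return True
    try:
        state = fetch_member_state()
    except ConnectionError:
        # API unreachable even though the service is reported as active.
        return True
    # Assumed set of "active" states, for illustration only.
    return state not in ("running", "starting", "restarting")
```

With the guard in place, the slow path is only reached when the service is reported as active, so the ~60s of retries is avoided whenever the snap is down.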
Checklist