[live-migration] adds the paths for enabling save/restore #2709
Host-side primitives to snapshot in-flight container state on the
source and re-attach to it on the destination, without disturbing
the existing create paths.
- cow: add MigrationState (vsock stdio ports + WaitForProcess call
id) and Process.MigrationState() accessor used by the save path.
Stubbed (zero value) on hcs.Process and jobcontainers.JobProcess.
- gcs:
  - Process records the stdio vsock ports allocated by gc.exec and
    exposes them via MigrationState; Close tolerates nil io channels
    for streams not opened on restore.
  - ExitCode tolerates hrNotFound from WaitForProcess (guest may
    have reaped the process before the restored host re-subscribes);
    Wait now routes through ExitCode.
  - Rename CloneContainer -> OpenContainer as the generic "attach to
    an already-running container" entry point.
  - Add Container.OpenProcessWithIO: restore counterpart of
    CreateProcess that re-listens on supplied vsock ports and
    re-subscribes to the exit notification.
  - Add GuestConnection.NextPort / SetNextPort to snapshot and seed
    the IO port allocator floor so restored processes don't collide
    with newly-allocated ones.
- cmd: add Attach, the destination counterpart of Command /
CommandContext that binds a Cmd to a caller-resolved process and
wires the IO relays (factored out of Start into startRelay).
- guest/bridge: reset Bridge.protVer to PvInvalid in ListenAndServe
so a fresh NegotiateProtocol after reconnect dispatches to the
PvInvalid handler instead of UnknownMessageHandler.
- vm/guestmanager: add Guest.OpenContainer, NextPort and SetNextPort
wrappers over the underlying GCS connection.
- vm/vmmanager: add UtilityVM.PropertiesV3 and migration.go with the
migration lifecycle wrappers (StartWithMigrationOptions,
Initialize/Start/Transfer/FinalizeLiveMigration,
MigrationNotifications).
- pkg/migration: add parse.go with protobuf -> HCS schema converters
for migration init options (memory transport, throttle params,
compression settings).
- Test and mock updates (cmd, gcs, hcs, jobcontainers, bridge,
controller/process mocks) for the new MigrationState contract,
Attach, and bridge reconnect behavior.
Signed-off-by: Harsh Rawat <harshrawat@microsoft.com>
		return nil
	}
	return &hcsschema.MigrationInitializeOptions{
		MemoryTransport: memoryTransportFromProto(p.MemoryTransport),
Should we return an error if the memory transport is anything other than TCP, instead of setting a nil value?
func (b *Bridge) ListenAndServe(bridgeIn io.ReadCloser, bridgeOut io.WriteCloser) error {
	// Each new connection starts unnegotiated, so the next NegotiateProtocol
	// request dispatches to the PvInvalid handler registered in AssignHandlers.
	b.protVer = prot.PvInvalid
Is this change meant to make live migration work with a newer shim and an older GCS?
No, this is to enable reconnect after the migration. During Connect, the call fails if the protocol version is anything other than the initial value, i.e. PvInvalid. On a successful connect, the correct protocol version is then set.
This change resets the protocol version each time the bridge is recreated, so that Connect works properly.
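The reconnect behavior described in this reply can be sketched as a small state machine. This is an illustration only: the `bridge` type, its methods, and the version constants are hypothetical and far simpler than the real guest bridge.

```go
package main

import "fmt"

// protocolVersion and its values are hypothetical stand-ins for the
// bridge protocol version enum.
type protocolVersion uint32

const (
	pvInvalid protocolVersion = 0
	pvV4      protocolVersion = 4
)

// bridge models only the single field that matters for reconnect.
type bridge struct {
	protVer protocolVersion
}

// listenAndServe models the start of a new connection: every connection
// begins unnegotiated, whatever the previous connection settled on.
func (b *bridge) listenAndServe() {
	b.protVer = pvInvalid
}

// negotiate accepts a request only while the bridge is unnegotiated,
// mirroring the rule that Connect fails unless the version is still invalid.
func (b *bridge) negotiate(requested protocolVersion) error {
	if b.protVer != pvInvalid {
		return fmt.Errorf("protocol already negotiated (version %d)", b.protVer)
	}
	b.protVer = requested
	return nil
}

func main() {
	b := &bridge{}

	b.listenAndServe()
	fmt.Println("first connect:", b.negotiate(pvV4))

	// Without a reset, a second negotiation on the same state is rejected.
	fmt.Println("renegotiate without reset:", b.negotiate(pvV4))

	// After migration the bridge is recreated, which resets the version.
	b.listenAndServe()
	fmt.Println("connect after reset:", b.negotiate(pvV4))
}
```

The point of the sketch is that the reset belongs at the start of serving a connection rather than at teardown, so it runs even if the previous connection ended abruptly.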
	return props, nil
}

// PropertiesV3 returns the properties of the utility VM from HCS using the V2
No, this uses the HCS V2 API only. PropertiesV2 in this package uses the same API but is not generic enough, so we are introducing PropertiesV3, which is fully generic and can be used for any query the HCS V2 API supports.
Summary
Introduces the host-side primitives needed to snapshot in-flight container state on the source and re-attach to it on the destination, without disturbing the existing create paths.
cow: add MigrationState struct (stdin/stdout/stderr vsock ports + outstanding WaitForProcess bridge call id) and a Process.MigrationState() accessor used by the save path. Stubbed (zero value) on hcs.Process and jobcontainers.JobProcess, since neither uses vsock or a GCS bridge.
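internal/gcs:
  - Process records the stdio vsock ports allocated by gc.exec and exposes them via MigrationState; Close tolerates nil io channels for streams not opened on restore.
  - ExitCode tolerates hrNotFound from WaitForProcess (the guest may have reaped the process before the restored host re-subscribes); Wait now routes through ExitCode.
  - Rename CloneContainer -> OpenContainer as the generic "attach to an already-running container" entry point.
  - Add Container.OpenProcessWithIO, the restore counterpart of CreateProcess that re-listens on supplied vsock ports and re-subscribes to the exit notification.
  - Add GuestConnection.NextPort / SetNextPort to snapshot and seed the IO port allocator floor so restored processes don't collide with newly allocated ones.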
internal/cmd: add cmd.Attach, the destination-side counterpart of Command/CommandContext that binds a Cmd to a caller-resolved process and wires the IO relays. Relay wiring factored out of Start into startRelay. Tests cover Attach lifetime and IO flow.
internal/guest/bridge: reset Bridge.protVer to PvInvalid at the top of ListenAndServe so a fresh NegotiateProtocol after a reconnect dispatches to the PvInvalid-registered handler instead of falling through to UnknownMessageHandler. Covered by new TestBridge_ListenAndServeResetsProtocolVersion.
internal/vm/guestmanager: add Guest.OpenContainer, NextPort and SetNextPort wrappers over the underlying GCS connection.
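internal/vm/vmmanager: add UtilityVM.PropertiesV3 and migration.go with the migration lifecycle wrappers (StartWithMigrationOptions, Initialize/Start/Transfer/FinalizeLiveMigration, MigrationNotifications).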
pkg/migration: add parse.go with protobuf -> HCS schema converters for migration initialization options (memory transport, throttle parameters, compression settings).
Test and mock updates across cow, cmd, gcs, hcs, jobcontainers, and bridge packages (including regenerated internal/controller/process/mocks/mock_cow.go) to satisfy the new MigrationState contract and exercise Attach / bridge reconnect behavior.
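As an illustration of why the port allocator floor is snapshotted and seeded, here is a minimal sketch assuming a simple monotonically increasing allocator. The `portAllocator` type, its `Allocate` method, and the base port 1000 are hypothetical; only the NextPort / SetNextPort names come from this PR, and the real GuestConnection methods may differ in signature and behavior.

```go
package main

import (
	"fmt"
	"sync"
)

// portAllocator is a hypothetical stand-in for the IO port allocator
// inside the guest connection.
type portAllocator struct {
	mu   sync.Mutex
	next uint32
}

// Allocate hands out the next free port and advances the floor.
func (a *portAllocator) Allocate() uint32 {
	a.mu.Lock()
	defer a.mu.Unlock()
	p := a.next
	a.next++
	return p
}

// NextPort reports the floor so the save path can record it.
func (a *portAllocator) NextPort() uint32 {
	a.mu.Lock()
	defer a.mu.Unlock()
	return a.next
}

// SetNextPort seeds the floor on the destination before any new allocation.
func (a *portAllocator) SetNextPort(p uint32) {
	a.mu.Lock()
	defer a.mu.Unlock()
	a.next = p
}

func main() {
	const base = 1000 // hypothetical starting port

	// Source: a process gets three stdio ports, then the floor is saved.
	src := &portAllocator{next: base}
	stdin, stdout, stderr := src.Allocate(), src.Allocate(), src.Allocate()
	floor := src.NextPort()

	// Destination: a fresh allocator would start at base again and reuse
	// the restored ports, so it is seeded with the saved floor first.
	dst := &portAllocator{next: base}
	dst.SetNextPort(floor)
	fresh := dst.Allocate()

	fmt.Println("restored ports:", stdin, stdout, stderr)
	fmt.Println("first new port on destination:", fresh)
}
```

In this sketch the restored process keeps listening on its original three ports, and any process created after the restore starts at the saved floor, so the two sets cannot overlap.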