[IMPROVEMENT] Implement Network Reconnection for Enhancing Replica Rebuilding Resilience #9626

## Is your improvement request related to a feature? Please describe (👍 if you like this request)

Replica rebuilding can fail because of network problems, for example TCP connections dropped while the node is under high CPU load. When this happens, the rebuild must start over. Longhorn can skip data blocks that already exist on the destination by comparing source and destination checksums, but restarting is still inefficient.

## Describe the solution you'd like

To improve efficiency, Longhorn could try to reconnect after a connection drop and resume the rebuild instead of restarting it. If the number of reconnection attempts exceeds a maximum, Longhorn can abort the rebuild, because the node's network is too unstable to continue.
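The reconnect-and-resume idea could look roughly like the sketch below. It is language-neutral pseudologic written in Python (Longhorn itself is written in Go) and is not based on Longhorn's actual code: `rebuild_with_reconnect`, `send_block`, `reconnect`, `max_reconnects`, and `RebuildAborted` are all hypothetical names chosen for illustration. It assumes the transfer can be tracked per block and that a dropped connection surfaces as a `ConnectionError`.

```python
from typing import Callable, Sequence


class RebuildAborted(Exception):
    """Raised when the reconnection budget is exhausted (hypothetical)."""


def rebuild_with_reconnect(
    blocks: Sequence[bytes],
    send_block: Callable[[int, bytes], None],
    reconnect: Callable[[], None],
    max_reconnects: int = 3,
) -> int:
    """Send every block, resuming after a dropped connection.

    Returns the number of reconnections that were needed.
    """
    next_index = 0   # first block that has not been sent successfully
    reconnects = 0   # total reconnections so far, not consecutive ones
    while next_index < len(blocks):
        try:
            send_block(next_index, blocks[next_index])
            next_index += 1  # progress is kept across later failures
        except ConnectionError:
            if reconnects >= max_reconnects:
                # Network is considered too unstable; give up.
                raise RebuildAborted(
                    f"gave up after {reconnects} reconnects "
                    f"at block {next_index}"
                )
            reconnects += 1
            reconnect()  # assumption: re-establishes the transport
    return reconnects
```

The key design point is that `next_index` is only advanced after a successful send, so a reconnect retries the failed block rather than starting from block 0. Whether the limit should count total or consecutive failures is an open design choice; this sketch counts the total.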

## Describe alternatives you've considered



## Additional context

The improvement is inspired by https://github.com/longhorn/longhorn/issues/8745
