XL e2e nightly job fails due to insufficient AWS capacity for `g6e.48xlarge` instances in `us-east-2`

## Overview

Our XL e2e nightly job failed last night because AWS had no `g6e.48xlarge` capacity available in `us-east-2` when the job tried to launch its instance:

<img width="1285" alt="Image" src="https://github.com/user-attachments/assets/a3adefe2-108b-426b-8af8-a562b3c33b7e" />

## Recommended Solution

We should keep our existing workflow logic, which launches a `g6e.48xlarge` instance in our desired subnet in `us-east-2`. However, if that launch fails due to capacity constraints, the workflow should fall back to another subnet.

We appear to have three subnets available to us within the `us-east-2` region, so the fallback can try the remaining two subnets we have not touched.
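
The fallback described above could look roughly like the sketch below. It is a minimal illustration, not our actual workflow code: `launch_with_subnet_fallback`, `error_code`, and the `launch` callable are hypothetical names, and `launch` stands in for whatever the workflow uses to start the instance (for example, a wrapper around boto3's `run_instances` with a `SubnetId`). It assumes the launch failure surfaces as an exception carrying the EC2 error code `InsufficientInstanceCapacity` in the same shape botocore's `ClientError` uses.

```python
from typing import Callable, Optional, Sequence, TypeVar

T = TypeVar("T")

# EC2 error code returned when AWS lacks capacity for the requested instance type.
CAPACITY_ERROR_CODE = "InsufficientInstanceCapacity"


def error_code(exc: Exception) -> Optional[str]:
    """Return the AWS-style error code carried by an exception, if any.

    botocore's ClientError exposes it as exc.response["Error"]["Code"];
    any exception with the same shape is handled the same way.
    """
    response = getattr(exc, "response", None)
    if isinstance(response, dict):
        return response.get("Error", {}).get("Code")
    return None


def launch_with_subnet_fallback(
    launch: Callable[[str], T],
    subnet_ids: Sequence[str],
) -> T:
    """Try each subnet in order and return the first successful launch.

    `launch` is a hypothetical callable that starts the instance in the
    given subnet and raises on failure. Only capacity errors trigger a
    fallback; any other error is re-raised immediately.
    """
    if not subnet_ids:
        raise ValueError("subnet_ids must not be empty")

    last_error: Optional[Exception] = None
    for subnet_id in subnet_ids:
        try:
            return launch(subnet_id)
        except Exception as exc:
            if error_code(exc) != CAPACITY_ERROR_CODE:
                raise  # not a capacity problem, so do not mask it
            last_error = exc  # remember it and move on to the next subnet

    # Every subnet reported insufficient capacity.
    assert last_error is not None
    raise last_error
```

Listing the preferred subnet first preserves the current behaviour whenever capacity is available, and re-raising non-capacity errors keeps unrelated failures (permissions, quota, bad parameters) visible instead of silently retrying them elsewhere.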

In the worst case, we can create a follow-up issue to investigate trying other regions as additional fallbacks. We should keep in mind that pricing tends to differ between regions.
