instance type g5.xlarge is not supported by cortex #2432

Open
dreamflasher opened this issue Mar 4, 2022 · 2 comments
Labels: bug (Something isn't working)

Comments

@dreamflasher

Version

0.42

Description

error: cluster.yaml: node_groups: index 0: instance_type: instance type g5.xlarge is not supported by cortex

Configuration

# cluster.yaml

# cluster name
cluster_name: cortex-sr

# AWS region
region: us-west-2

# list of cluster node groups;
node_groups:
  - name: ng-gpu
    instance_type: g5.2xlarge
    min_instances: 1
    max_instances: 5
    instance_volume_size: 50
    instance_volume_type: gp3
    spot: true
  # ...

# API load balancer type [nlb | elb]
api_load_balancer_type: nlb

# API load balancer scheme [internet-facing | internal]
api_load_balancer_scheme: internet-facing

# operator load balancer scheme [internet-facing | internal]
# note: if using "internal", you must configure VPC Peering to connect your CLI to your cluster operator
operator_load_balancer_scheme: internet-facing

# restrict access to APIs by cidr blocks/ip address ranges
api_load_balancer_cidr_white_list: [0.0.0.0/0]

# restrict access to the Operator by cidr blocks/ip address ranges
operator_load_balancer_cidr_white_list: [0.0.0.0/0]

# list of IAM policies to attach to your Cortex APIs
iam_policy_arns: ["arn:aws:iam::aws:policy/AmazonS3FullAccess"]

# instance type for prometheus (use an instance with more memory for clusters exceeding 300 nodes or 300 pods)
prometheus_instance_type: "t3.medium"

Steps to reproduce

  1. Start a cluster whose node group uses a g5 instance type (see the configuration above)

Expected behavior

Cluster starts

Actual behavior

error: cluster.yaml: node_groups: index 0: instance_type: instance type g5.xlarge is not supported by cortex

Suggested solution

Please support g5

dreamflasher added the bug label Mar 4, 2022
@deliahu
Member

deliahu commented Mar 10, 2022

Unfortunately, we are still waiting for a new release of the AWS CNI that supports g5 instances. Once that is released, it should be straightforward to add g5 support to Cortex.

@oborchers

@deliahu: Is there any update on this one?
