Creating the AWS EC2 Environment - A Step-by-Step Guide¶
Below is a step-by-step guide for creating an AWS environment suitable for hosting a K3s cluster (or other applications). These steps focus on AWS infrastructure only: VPC, subnet, security group, SSH key pair, EC2 instances (with Elastic IP addresses), and an EFS file system.
Note
- All commands below assume you are running in the AWS Cloud Shell with us-east-2 as the desired region. You can easily change the default region.
- For simplicity, we’ll create only one public subnet. You can expand to more subnets (private, HA, etc.) if needed.
- We’ll assign Elastic IP (EIP) addresses to each EC2 instance so their public IPs persist across stops/restarts.
- If you wish to add multiple control-plane/management nodes in the future for High Availability, you may need to incorporate a load balancer and point your domain’s DNS A record to that load balancer address rather than a single node’s IP.
1. Set Environment Variables¶
Open AWS Cloud Shell and set a few variables to make commands easier to run. Adjust the region, CIDR blocks, and other information if desired.
export REGION="us-east-2"
export CIDR_VPC="10.0.0.0/24"
export CIDR_SUBNET="10.0.0.0/24"
export SSH_KEY_NAME="MVAI-SSH-Key"
export SG_NAME="MVAIsg"
export VPC_NAME="MemVergeAI-VPC"
export SUBNET_NAME="MemVergeAI-Subnet"
export RT_NAME="MemVergeAI-RouteTable"
export IG_NAME="MemVergeAI-IGW"
export FILE_SYSTEM_NAME="MemVergeAI-EFS"
Explanation of Variables:
- REGION="us-east-2"
- The AWS region where all resources (VPC, EC2 instances, EFS, etc.) will be created.
- This guide assumes Ohio (us-east-2).
- CIDR_VPC="10.0.0.0/24"
- The IP address range for your new VPC.
- A /16 block supports up to 65,536 IP addresses (enough for production use cases).
- A /24 block supports up to 256 IP addresses (roughly 251 usable), which is enough for most small PoC use cases.
- Adjust this if you need a larger or smaller network.
- In most cases, there is no extra cost for having a larger VPC CIDR block than you actually use.
- CIDR_SUBNET="10.0.0.0/24"
- The IP address range for a subnet within your VPC.
- A /24 block provides up to 256 IP addresses; after AWS overhead, roughly 251 are usable.
- This subnet will be configured as a public subnet.
- SSH_KEY_NAME="MVAI-SSH-Key"
- The name of the SSH key pair you will create and use to securely connect to your EC2 instances.
- SG_NAME="MVAIsg"
- The name of the AWS Security Group you will create to manage inbound/outbound traffic rules.
- This guide opens ports for SSH, HTTP, HTTPS, and any additional cluster ports.
- VPC_NAME="MemVergeAI-VPC"
- A human-readable name for the VPC resource.
- Helps you identify this particular VPC in the AWS Console.
- SUBNET_NAME="MemVergeAI-Subnet"
- A name tag for your subnet resource.
- Useful for distinguishing it from other subnets in the same region.
- RT_NAME="MemVergeAI-RouteTable"
- The name for the Route Table associated with the above subnet.
- This table will define routes (e.g., a default route to an Internet Gateway).
- IG_NAME="MemVergeAI-IGW"
- A name tag for the Internet Gateway resource.
- Internet Gateways provide outbound public Internet connectivity for public subnets in your VPC.
- FILE_SYSTEM_NAME="MemVergeAI-EFS"
- A name tag for your Amazon EFS file system.
- By default, EFS scales automatically but can be labeled with a custom name for ease of identification.
You can confirm your default region with:
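For example, one way to check is the AWS CLI's own configuration lookup:
aws configure get region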
If this command returns no output, it's likely your region is defined by the $AWS_REGION and $AWS_DEFAULT_REGION environment variables. To verify your region setting, you can use the following command:
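For example, a minimal check that prints both variables (either one may be set):
echo "AWS_REGION=$AWS_REGION"
echo "AWS_DEFAULT_REGION=$AWS_DEFAULT_REGION"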
To find out more information, use aws configure list.
To explicitly set a region, use aws configure set region <your_region>
2. Create a New VPC¶
A Virtual Private Cloud (VPC) provides network isolation for your cluster. Creating a dedicated VPC helps avoid conflicts with existing infrastructure and gives you full control over subnets, routing, and security.
-
Create the VPC:
VPC_ID=$(aws ec2 create-vpc \
--cidr-block $CIDR_VPC \
--tag-specifications "ResourceType=vpc,Tags=[{Key=Name,Value=$VPC_NAME}]" \
--query 'Vpc.VpcId' \
--output text)
echo "Created VPC with ID: $VPC_ID"
Example:
-
(Optional) Enable DNS support in the VPC if it is not already enabled by default:
To check whether DNS support and DNS hostnames are already enabled on your VPC, you can use the following AWS CLI commands in Cloud Shell. These will display each attribute for the VPC in question:
# Replace $VPC_ID with your actual VPC ID if it’s not stored in a variable
# 1. Check DNS support
aws ec2 describe-vpc-attribute \
--vpc-id $VPC_ID \
--attribute enableDnsSupport
# 2. Check DNS hostnames
aws ec2 describe-vpc-attribute \
--vpc-id $VPC_ID \
--attribute enableDnsHostnames
Example Output:
- If either Value is false, you can enable it by running:
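For example, the following AWS CLI calls set each attribute (each attribute must be modified in a separate call):
aws ec2 modify-vpc-attribute \
--vpc-id $VPC_ID \
--enable-dns-support "{\"Value\":true}"
aws ec2 modify-vpc-attribute \
--vpc-id $VPC_ID \
--enable-dns-hostnames "{\"Value\":true}"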
3. Create a Public Subnet¶
Your subnet designates a range of IP addresses within the VPC. A public subnet has a route to the Internet, required for downloading packages, patches, and for any external traffic.
Create one subnet in the new VPC:
SUBNET_ID=$(aws ec2 create-subnet \
--vpc-id $VPC_ID \
--cidr-block $CIDR_SUBNET \
--availability-zone "${REGION}a" \
--tag-specifications "ResourceType=subnet,Tags=[{Key=Name,Value=$SUBNET_NAME}]" \
--query 'Subnet.SubnetId' \
--output text)
echo "Created Subnet with ID: $SUBNET_ID"
Example output:
4. Create and Attach an Internet Gateway¶
An Internet Gateway (IGW) enables your VPC to communicate with the Internet. By attaching an IGW, traffic can flow from your subnet to the Internet (and vice versa), allowing downloads, updates, and inbound connections.
-
Create the Internet Gateway (IGW):
IGW_ID=$(aws ec2 create-internet-gateway \
--tag-specifications "ResourceType=internet-gateway,Tags=[{Key=Name,Value=$IG_NAME}]" \
--query 'InternetGateway.InternetGatewayId' \
--output text)
echo "Created Internet Gateway with ID: $IGW_ID"
Example output:
-
Attach the IGW to your VPC:
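A typical command, using the IDs captured above:
aws ec2 attach-internet-gateway \
--internet-gateway-id $IGW_ID \
--vpc-id $VPC_ID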
5. Create and Associate a Routing Table¶
A routing table contains rules that determine where traffic is directed. By creating a route that sends 0.0.0.0/0 (i.e., Internet-bound traffic) to the IGW, you ensure that resources in your subnet can reach the outside world.
-
Create the route table:
RT_ID=$(aws ec2 create-route-table \
--vpc-id $VPC_ID \
--tag-specifications "ResourceType=route-table,Tags=[{Key=Name,Value=$RT_NAME}]" \
--query 'RouteTable.RouteTableId' \
--output text)
echo "Created Route Table with ID: $RT_ID"
Example output:
-
Create a default route that sends Internet-bound traffic to the IGW:
aws ec2 create-route \
--route-table-id $RT_ID \
--destination-cidr-block 0.0.0.0/0 \
--gateway-id $IGW_ID
Expected Output:
-
Associate the subnet with the route table (making it a public subnet):
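A typical command, using the IDs captured above:
aws ec2 associate-route-table \
--route-table-id $RT_ID \
--subnet-id $SUBNET_ID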
Example output:
6. Create a Security Group¶
A security group acts as a virtual firewall. By defining inbound and outbound rules, you control access to your instances. This guide opens the ports required for SSH, HTTP, HTTPS, EFS, and the internal ports needed by K3s, Grafana, and Prometheus.
Next, create a security group that allows:
- SSH (22) from anywhere (you can restrict to specific IPs for better security).
- HTTP (80) from anywhere.
- HTTPS (443) from anywhere.
- NFS (2049) for EFS mounting (from within the same VPC).
- Internal cluster traffic (e.g., K3s on port 6443, etc.) from within the same security group.
Create the security group¶
SG_ID=$(aws ec2 create-security-group \
--group-name $SG_NAME \
--vpc-id $VPC_ID \
--query 'GroupId' \
--output text)
echo "Created Security Group with ID: $SG_ID"
Example output:
Authorize inbound rules¶
- SSH (22) from anywhere:
aws ec2 authorize-security-group-ingress \
--group-id $SG_ID \
--protocol tcp \
--port 22 \
--cidr 0.0.0.0/0
- HTTP (80) from anywhere:
aws ec2 authorize-security-group-ingress \
--group-id $SG_ID \
--protocol tcp \
--port 80 \
--cidr 0.0.0.0/0
- HTTPS (443) from anywhere:
aws ec2 authorize-security-group-ingress \
--group-id $SG_ID \
--protocol tcp \
--port 443 \
--cidr 0.0.0.0/0
- Internal communication on ports that K3s/Grafana/Prometheus might need (e.g., 6443, 3000, 9090, 9093):
# NFS/EFS allow from same VPC CIDR or the same security group
aws ec2 authorize-security-group-ingress \
--group-id $SG_ID \
--protocol tcp \
--port 2049 \
--cidr $CIDR_VPC
# Kubernetes on 6443 (K3s/K8s API):
aws ec2 authorize-security-group-ingress \
--group-id $SG_ID \
--protocol tcp \
--port 6443 \
--source-group $SG_ID
# Grafana on 3000:
aws ec2 authorize-security-group-ingress \
--group-id $SG_ID \
--protocol tcp \
--port 3000 \
--source-group $SG_ID
# Prometheus on 9090:
aws ec2 authorize-security-group-ingress \
--group-id $SG_ID \
--protocol tcp \
--port 9090 \
--source-group $SG_ID
# Kube Controller Manager on 10257:
aws ec2 authorize-security-group-ingress \
--group-id $SG_ID \
--protocol tcp \
--port 10257 \
--source-group $SG_ID
# Kubelet API on 10250:
aws ec2 authorize-security-group-ingress \
--group-id $SG_ID \
--protocol tcp \
--port 10250 \
--source-group $SG_ID
# Kube Scheduler on 10259:
aws ec2 authorize-security-group-ingress \
--group-id $SG_ID \
--protocol tcp \
--port 10259 \
--source-group $SG_ID
# Etcd on Port Range 2379-2380
aws ec2 authorize-security-group-ingress \
--group-id $SG_ID \
--protocol tcp \
--port 2379-2380 \
--source-group $SG_ID
# NodePort on Port Range 30000-32767
aws ec2 authorize-security-group-ingress \
--group-id $SG_ID \
--protocol tcp \
--port 30000-32767 \
--source-group $SG_ID
# ICMP Echo/Ping
aws ec2 authorize-security-group-ingress \
--group-id $SG_ID \
--protocol icmp \
--port -1 \
--cidr $CIDR_VPC
# DNS on port 53 (CoreDNS) for TCP and UDP protocols:
aws ec2 authorize-security-group-ingress \
--group-id $SG_ID \
--protocol tcp \
--port 53 \
--source-group $SG_ID
aws ec2 authorize-security-group-ingress \
--group-id $SG_ID \
--protocol udp \
--port 53 \
--source-group $SG_ID
7. Create a New SSH Key Pair¶
You need a secure way to log into the instances. Creating a dedicated SSH key pair ensures unique credentials and minimizes the risk of unauthorized access.
Generate a new key pair to SSH into your instances:
# Create the .pem SSH key pair file
aws ec2 create-key-pair \
--key-name $SSH_KEY_NAME \
--query 'KeyMaterial' \
--output text > ${SSH_KEY_NAME}.pem
# Change permissions of the .pem file
chmod 400 ${SSH_KEY_NAME}.pem
# Confirm the full .pem file name
echo ${SSH_KEY_NAME}.pem
# Show the full path to the .pem file
ls -d $PWD/${SSH_KEY_NAME}.pem
- This command creates a file named MVAI-SSH-Key.pem in Cloud Shell.
- IMPORTANT: Keep this .pem file secure. You will use it to SSH into the EC2 instances.
7.1 Download the SSH Key Pair (.pem file)¶
The SSH Key Pair file (e.g., MVAI-SSH-Key.pem) currently resides on the CloudShell virtual machine. Here’s how you can download that .pem file to your local system:
- Open AWS CloudShell in your web browser.
-
Confirm the key file exists by running:
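For example:
ls -l ${SSH_KEY_NAME}.pem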
You should see your .pem file in the current directory (e.g., MVAI-SSH-Key.pem).
-
Use CloudShell’s Download Option:
- In the CloudShell console, click on the three-dots menu or “Actions” in the top-right corner.
- Select “Download file” (the exact wording may vary slightly).
- When prompted, enter the full path to your .pem file. For example, if you see it in your home directory, you might type:
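For example (the CloudShell home directory is typically /home/cloudshell-user):
/home/cloudshell-user/MVAI-SSH-Key.pem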
- Choose where to save the file on your local computer.
-
Set Permissions Locally (Recommended)
Once the file is on your local laptop:
- On macOS/Linux:
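For example (adjust the path to wherever you saved the file):
chmod 400 ~/Downloads/MVAI-SSH-Key.pem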
- On Windows (OpenSSH in PowerShell) you can similarly set file permissions or use Windows file properties to restrict access.
-
Verify by listing the file or attempting to connect to your EC2 instance:
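For example (the instance IP is a placeholder for a node you launch later in this guide):
ls -l ~/Downloads/MVAI-SSH-Key.pem
ssh -i ~/Downloads/MVAI-SSH-Key.pem ubuntu@<instance-public-ip>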
8. Launch EC2 Instances¶
8.1 Launch the Management/Control Plane Node¶
An Amazon Machine Image (AMI) contains the operating system and any pre-configured software. We recommend Ubuntu 22.04 LTS.
-
Find an Ubuntu 22.04 AMI in us-east-2. You can do:
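One way to look up the latest Canonical-owned Ubuntu 22.04 (Jammy) AMI and store its ID in $AMI_ID (owner ID 099720109477 is Canonical; the name filter assumes Canonical's standard naming scheme):
AMI_ID=$(aws ec2 describe-images \
--owners 099720109477 \
--filters "Name=name,Values=ubuntu/images/hvm-ssd/ubuntu-jammy-22.04-amd64-server-*" "Name=state,Values=available" \
--query 'sort_by(Images,&CreationDate)[-1].ImageId' \
--output text)
echo "Ubuntu 22.04 AMI ID: $AMI_ID"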
-
Verify the Root Device Name. Different AMIs sometimes have different root device names (e.g., /dev/sda1, /dev/xvda), so you may want to confirm the correct root device for your AMI. You can run:
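For example:
aws ec2 describe-images \
--image-ids $AMI_ID \
--query 'Images[0].RootDeviceName' \
--output text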
Example:
This confirms the Ubuntu 22.04 image uses /dev/sda1 for the OS boot disk name.
-
Launch the m5.xlarge instance
The management (control plane) node runs K3s server components and orchestrates the cluster. We recommend using an m5.xlarge, or larger, instance type for balanced CPU and memory resources. This command provisions the host and assigns a 60GiB GP3 OS boot disk:
MGMT_INSTANCE_ID=$(aws ec2 run-instances \
--image-id $AMI_ID \
--instance-type m5.xlarge \
--key-name $SSH_KEY_NAME \
--security-group-ids $SG_ID \
--subnet-id $SUBNET_ID \
--tag-specifications "ResourceType=instance,Tags=[{Key=Name,Value=MemVergeAI-Management01}]" \
--block-device-mappings "[
  {
    \"DeviceName\": \"/dev/sda1\",
    \"Ebs\": {
      \"VolumeSize\": 60,
      \"VolumeType\": \"gp3\"
    }
  }
]" \
--query 'Instances[0].InstanceId' \
--output text)
echo "Management instance ID: $MGMT_INSTANCE_ID"
Example
8.2 Launch the GPU Worker Node¶
Launch the g5.2xlarge instance:
If your applications rely on GPU compute, a g5.2xlarge instance provides NVIDIA A10 GPU resources. It can handle ML workloads and other GPU-accelerated tasks.
WORKER_INSTANCE_ID=$(aws ec2 run-instances \
--image-id $AMI_ID \
--instance-type g5.2xlarge \
--key-name $SSH_KEY_NAME \
--security-group-ids $SG_ID \
--subnet-id $SUBNET_ID \
--tag-specifications 'ResourceType=instance,Tags=[{Key=Name,Value=MemVergeAI-GPU-Worker01}]' \
--block-device-mappings "[
{
\"DeviceName\": \"/dev/sda1\",
\"Ebs\": {
\"VolumeSize\": 60,
\"VolumeType\": \"gp3\"
}
}
]" \
--query 'Instances[0].InstanceId' \
--output text)
echo "GPU worker instance ID: $WORKER_INSTANCE_ID"
Example
8.3 Wait for the Instances to Start¶
Wait a few moments for these instances to transition to a running state. Use the following command to check and monitor their status:
aws ec2 describe-instances \
--filters "Name=vpc-id,Values=$VPC_ID" \
--query "Reservations[].Instances[].{ID:InstanceId,Name:Tags[?Key=='Name']|[0].Value,State:State.Name,PublicIP:PublicIpAddress,PrivateIP:PrivateIpAddress}" \
--output table
Example:
----------------------------------------------------------------------------------------------
| DescribeInstances |
+----------------------+---------------------------+-------------+----------------+----------+
| ID | Name | PrivateIP | PublicIP | State |
+----------------------+---------------------------+-------------+----------------+----------+
| i-0770b293b7b6383e0 | MemVergeAI-Management01 | 10.0.0.156 | None. | running |
| i-02a83e5064fccd806 | MemVergeAI-GPU-Worker01 | 10.0.0.9 | None. | running |
+----------------------+---------------------------+-------------+----------------+----------+
Continue once all instances are in the running state.
9. Allocate and Assign Elastic IPs (Optional)¶
By default, AWS gives instances a dynamic public IP. Elastic IPs (EIPs) remain the same even if you stop and start the instance, ensuring consistent DNS mappings and preventing certificate issues with Let’s Encrypt.
Because we want static public IPs that persist through instance stop/start cycles, allocate an Elastic IP for each instance.
-
Management Node EIP:
MGMT_EIP_ALLOC_ID=$(aws ec2 allocate-address --query 'AllocationId' --output text)
echo "Management EIP Allocation ID: $MGMT_EIP_ALLOC_ID"
aws ec2 associate-address \
--instance-id $MGMT_INSTANCE_ID \
--allocation-id $MGMT_EIP_ALLOC_ID
Example Output
-
You can retrieve the public IP by:
MGMT_PUBLIC_IP=$(aws ec2 describe-addresses \
--allocation-ids $MGMT_EIP_ALLOC_ID \
--query 'Addresses[0].PublicIp' \
--output text)
echo "Management Node Public IP: $MGMT_PUBLIC_IP"
Example:
-
Worker Node EIP:
WORKER_EIP_ALLOC_ID=$(aws ec2 allocate-address --query 'AllocationId' --output text)
echo "Worker EIP Allocation ID: $WORKER_EIP_ALLOC_ID"
aws ec2 associate-address \
--instance-id $WORKER_INSTANCE_ID \
--allocation-id $WORKER_EIP_ALLOC_ID
Example output:
Now both instances have static public IPs that remain consistent across reboots.
Here is another way to confirm both EC2 instances now have a 'PublicIP':
aws ec2 describe-instances \
--filters "Name=vpc-id,Values=$VPC_ID" \
--query "Reservations[].Instances[].{ID:InstanceId,Name:Tags[?Key=='Name']|[0].Value,State:State.Name,PublicIP:PublicIpAddress,PrivateIP:PrivateIpAddress}" \
--output table
Example:
----------------------------------------------------------------------------------------------
| DescribeInstances |
+----------------------+---------------------------+-------------+----------------+----------+
| ID | Name | PrivateIP | PublicIP | State |
+----------------------+---------------------------+-------------+----------------+----------+
| i-0770b293b7b6383e0 | MemVergeAI-Management01 | 10.0.0.156 | 3.20.192.186 | running |
| i-02a83e5064fccd806 | MemVergeAI-GPU-Worker01 | 10.0.0.9 | 3.128.242.144 | running |
+----------------------+---------------------------+-------------+----------------+----------+
10. Renaming AWS EC2 Hostnames (Optional)¶
The default hostnames created by AWS are not intuitive for the MemVerge.ai cluster. You can rename your AWS EC2 instances to more intuitive hostnames such as mvai-mgmt and mvai-nvgpu01, which makes the cluster easier to manage.
-
Update the hostname on each instance
SSH into each EC2 instance and run the following commands:
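For example:
sudo hostnamectl set-hostname new-hostname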
Replace "new-hostname" with your desired hostname (e.g., mvai-mgmt, mvai-nvgpu01).
-
Update /etc/hosts file
Edit the /etc/hosts file and add a line with the new hostname below the default 127.0.0.1 localhost line:
-
Update DNS settings (Optional)
If you're using Amazon Route 53 or another DNS service, update the DNS records to reflect the new hostnames.
-
Reboot the host:
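For example:
sudo reboot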
-
When the system boots, verify the new hostname is correct:
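For example:
hostnamectl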
10.1 Updating /etc/hosts on All Nodes¶
To ensure proper communication between nodes in your cluster, you must add the hostnames and IP addresses of all nodes to the /etc/hosts file on each system. This step is crucial when not using DNS for hostname resolution. If you use DNS, this step is not required; ensure your DNS entries are correct.
-
Gather the private IP addresses and hostnames of all nodes in your cluster using ip a.
-
SSH into each node (management and worker nodes). The default user for Ubuntu Linux is ubuntu:
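For example (substitute each node's public IP or DNS name):
ssh -i MVAI-SSH-Key.pem ubuntu@<node-public-ip>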
-
On each node, edit the /etc/hosts file:
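For example, using any editor available on the node:
sudo vi /etc/hosts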
-
Add entries for all nodes in your cluster. The format is one <private-ip> <hostname> pair per line. For example, add these lines:
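A sketch using the example private IPs shown earlier in this guide and the hostnames chosen in Step 10 (replace them with your own values):
10.0.0.156   mvai-mgmt
10.0.0.9     mvai-nvgpu01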
Add an entry for each node in your cluster, including the node you're currently editing.
-
Save the file and exit the editor.
-
Repeat steps 2-5 for each node in your cluster.
-
Verify the changes by pinging other nodes using their hostnames:
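For example, from the management node (replace the hostname with your own):
ping -c 3 mvai-nvgpu01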
Ensure that each node can ping all other nodes using their hostnames.
By adding these entries to /etc/hosts on all systems, you ensure that each node can resolve the hostnames of other nodes in the cluster. This is crucial for Kubernetes and other cluster components to communicate properly.
Remember to update the /etc/hosts file on all nodes whenever you add or remove nodes from your cluster. While this manual process works well for smaller, static clusters, using DNS is generally preferred for larger or more dynamic environments.
11. Create an EFS File System¶
To have a persistent shared file system that can be mounted by all cluster nodes, create a new AWS Elastic File System (EFS). Amazon EFS is a scalable, elastic file system that multiple instances can access simultaneously. You can use it for shared storage among your cluster nodes to store snapshots and other data that needs to be accessible by the user workloads.
-
Create the file system:
FILE_SYSTEM_ID=$(aws efs create-file-system \
--performance-mode generalPurpose \
--throughput-mode bursting \
--encrypted \
--tags Key=Name,Value=$FILE_SYSTEM_NAME \
--query 'FileSystemId' \
--output text)
echo "Created EFS with ID: $FILE_SYSTEM_ID"
Example:
-
Create a Mount Target in the same subnet:
aws efs create-mount-target \
--file-system-id $FILE_SYSTEM_ID \
--subnet-id $SUBNET_ID \
--security-groups $SG_ID
Example output:
{
    "OwnerId": "669102733081",
    "MountTargetId": "fsmt-06bfc187a478c98e4",
    "FileSystemId": "fs-06089fdf3a7751a5f",
    "SubnetId": "subnet-01f24fb72235228ed",
    "LifeCycleState": "creating",
    "IpAddress": "10.0.0.190",
    "NetworkInterfaceId": "eni-098e4d7ff9692af99",
    "AvailabilityZoneId": "use2-az1",
    "AvailabilityZoneName": "us-east-2a",
    "VpcId": "vpc-01bdeafcc0ce883e5"
}
-
Obtain the DNS Name of the EFS File system. This is required by the Management and Worker nodes to mount it later.
EFS_DNSNAME="${FILE_SYSTEM_ID}.efs.${REGION}.amazonaws.com"
echo "The EFS DNS Name is: $EFS_DNSNAME"
Example:
12. Mounting the EFS Volume on Management and GPU Worker Nodes¶
After creating your EFS file system and mount target, you can mount it on both instances (the management node and the GPU worker node) so they share the same persistent storage. You will need to SSH to each Management and Worker node to perform these actions.
NOTE
If you are using a web Cloud Shell environment, click '+' to get a new terminal. This avoids losing any shell environment variables created during the installation process so far, which you will need for future steps.
-
Install NFS Utilities
On Ubuntu 22.04, the EFS mount requires nfs-common:
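For example:
sudo apt-get update
sudo apt-get install -y nfs-common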
-
Create a Mount Directory
Create a local mount point (e.g., /mnt/efs) on each node:
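For example:
sudo mkdir -p /mnt/efs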
-
Determine the EFS Mount Endpoint
3.1. Using EFS DNS Name
By default, Amazon EFS provides a DNS name in the format <file-system-id>.efs.<region>.amazonaws.com. For instance, if your $FILE_SYSTEM_ID is fs-06089fdf3a7751a5f and your $REGION is us-east-2, the EFS endpoint would be fs-06089fdf3a7751a5f.efs.us-east-2.amazonaws.com.
3.2. Optional: Using the Mount Target IP
As shown in your creation output, the IpAddress might be 10.0.0.190. You can mount using that IP directly, but it’s generally better to rely on the DNS name for high availability and automatic failover between Availability Zones.
-
Mount the EFS File System
Use the following command example to mount EFS on each node. Replace the DNS with the one displayed in the previous step:
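A typical mount command, using the AWS-recommended NFS options and the example DNS name from this guide (substitute your own):
sudo mount -t nfs4 -o nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2,noresvport \
fs-06089fdf3a7751a5f.efs.us-east-2.amazonaws.com:/ /mnt/efs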
Replace:
- fs-06089fdf3a7751a5f.efs.us-east-2.amazonaws.com with your actual EFS DNS endpoint.
- /mnt/efs with the directory you wish to mount on, if different.
Tip: Confirm the mount is successful:
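For example:
df -hT | grep nfs4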
You should see an entry similar to:
fs-06089fdf3a7751a5f.efs.us-east-2.amazonaws.com:/ nfs4 … /mnt/efs
-
Persist the Mount in /etc/fstab
To ensure the EFS file system automatically remounts after reboot or instance stop/start, add an /etc/fstab entry on each node:
echo "fs-06089fdf3a7751a5f.efs.us-east-2.amazonaws.com:/ /mnt/efs nfs4 defaults,_netdev 0 0" | sudo tee -a /etc/fstab
- _netdev ensures the system knows this mount requires a network connection before mounting.
- You can add additional options (e.g., rsize=1048576, wsize=1048576) if needed, but the above defaults typically suffice.
Once added, test the fstab entry by unmounting and remounting:
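For example:
sudo umount /mnt/efs
sudo mount -a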
If successful, EFS should remount without errors. Use df -hT | grep efs to confirm the file system is mounted.
Remember to repeat this process on all Management and Worker nodes before proceeding!
13. (Optional) Create a Load Balancer for the Management Node(s)¶
When running multiple control-plane (management) nodes, you want a single, stable endpoint for client or API access. An AWS Network Load Balancer (NLB) is well-suited for load-balancing TCP traffic—such as the Kubernetes API on port 6443. Alternatively, if you plan to expose HTTP/HTTPS services directly from the control plane, you might prefer an Application Load Balancer (ALB). The instructions below use an NLB for simplicity.
13.1 Create a Network Load Balancer for the Kubernetes API¶
If you do not require a highly available control/management plane setup, skip this step and proceed to Step 14.
A Network Load Balancer operates at Layer 4 (TCP). It passes traffic quickly and efficiently to multiple backend instances (in this case, your management nodes). This setup also helps facilitate a High Availability environment by ensuring traffic is directed only to healthy nodes.
-
Create the NLB:
LB_ARN=$(aws elbv2 create-load-balancer \
--name MemVergeAI-NLB \
--type network \
--scheme internet-facing \
--subnets $SUBNET_ID \
--query 'LoadBalancers[0].LoadBalancerArn' \
--output text)
echo "Created NLB with ARN: $LB_ARN"
- This places the load balancer in the public subnet you created earlier ($SUBNET_ID).
- We use internet-facing so it can receive traffic from external clients.
Example output:
-
Create a Target Group:
TG_ARN=$(aws elbv2 create-target-group \
--name MemVergeAI-NLB-Targets \
--protocol TCP \
--port 6443 \
--vpc-id $VPC_ID \
--query 'TargetGroups[0].TargetGroupArn' \
--output text)
echo "Created Target Group with ARN: $TG_ARN"
- Here we specify TCP on port 6443, which is the typical port for the K3s (and Kubernetes) API.
- Adjust the port if your management node listens on a different one.
Example output:
-
Register Your Existing Management Node:
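For example, using the instance ID captured in Step 8:
aws elbv2 register-targets \
--target-group-arn $TG_ARN \
--targets Id=$MGMT_INSTANCE_ID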
- This tells the NLB to forward incoming traffic on port 6443 to the management node’s instance ID on the same port.
-
Create a Listener:
aws elbv2 create-listener \
--load-balancer-arn $LB_ARN \
--protocol TCP \
--port 6443 \
--default-actions Type=forward,TargetGroupArn=$TG_ARN
- The listener watches for traffic on port 6443 on the load balancer and forwards it to your target group ($TG_ARN).
Once created, the NLB will have its own DNS name. You can retrieve it by running:
LB_DNS_NAME=$(aws elbv2 describe-load-balancers \
--load-balancer-arns $LB_ARN \
--query 'LoadBalancers[0].DNSName' \
--output text)
echo "NLB DNS Name: $LB_DNS_NAME"
13.2 Update Your DNS to Point to the Load Balancer¶
Rather than pointing your DNS A record to a single management node's IP, you can point it to the NLB. This way, if you add or remove management nodes, the DNS stays the same, and the load balancer handles routing.
-
Create or Update an A Record in Route53 (if using Route53):
# Example assumes your hosted zone is in $HOSTED_ZONE_ID
# and you want "ai.example.com" to resolve to the load balancer.
# "Z26RNL4JYFTOTI" is an example ALB/NLB hosted zone ID for us-east-2;
# look up the correct value for your region and load balancer type.
aws route53 change-resource-record-sets \
--hosted-zone-id $HOSTED_ZONE_ID \
--change-batch '{
  "Changes": [{
    "Action": "UPSERT",
    "ResourceRecordSet": {
      "Name": "ai.example.com.",
      "Type": "A",
      "AliasTarget": {
        "HostedZoneId": "Z26RNL4JYFTOTI",
        "DNSName": "'"$LB_DNS_NAME"'",
        "EvaluateTargetHealth": false
      }
    }
  }]
}'
- Note that Alias records for ALB/NLB require the correct HostedZoneId for the AWS region (e.g., us-east-2). Check AWS Documentation for the correct “ALB/NLB Hosted Zone ID.”
- Alternatively, if you cannot use Alias records, you can create a CNAME record that points to the NLB's DNS name. However, an Alias record is usually recommended.
-
Using a 3rd Party DNS Provider:
- Go to your DNS provider’s dashboard.
- Create or edit an A record for ai.example.com to reference the NLB’s DNS name.
- If the provider doesn’t allow an ALIAS or ANAME record, you might need a CNAME that points to the NLB’s domain (e.g., xxx.elb.amazonaws.com).
- Wait for DNS propagation before testing.
13.3 Adding Another Management Host in the Future¶
As your environment grows or you need High Availability, you might spin up a second or third management node. Each new node should be included in the same NLB target group so traffic to port 6443 is distributed among all management nodes.
- Launch another management EC2 instance as you did before (e.g., m5.xlarge in the same VPC/subnet).
-
Register the new instance with the existing target group:
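For example (replace the placeholder with the new instance's ID):
aws elbv2 register-targets \
--target-group-arn $TG_ARN \
--targets Id=<new-management-instance-id>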
-
Validate health checks:
- The NLB runs health checks against the specified port (6443 by default). Make sure the new node is healthy and able to serve K3s traffic.
- You can check the health status via:
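For example:
aws elbv2 describe-target-health \
--target-group-arn $TG_ARN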
-
Scale as needed: You can add more management nodes the same way. The NLB automatically starts routing traffic once they pass health checks.
IMPORTANT
For a true High-Availability K3s setup, ensure your additional management nodes are configured properly at the K3s layer. Simply adding more control-plane instances behind the load balancer is only half the equation; each new node must be joined as a server in K3s (not just as a worker). Follow K3s’s official documentation for the correct HA setup procedure.
14. Handling HTTP and HTTPS with an Application Load Balancer and Traefik (for Kubernetes)¶
When you install K3s with the default configuration, it typically includes the Traefik ingress controller. This component listens on ports 80 and 443 inside the cluster for incoming HTTP and HTTPS connections. By default, K3s might expose Traefik via a Service of type LoadBalancer or NodePort, depending on your configuration. You will install K3s next, so we will create a HTTP/HTTPS load balancer once Kubernetes is operational. For now, no further action is needed.
15. Update Your DNS Records¶
If you did not create a load balancer and assign a DNS entry in Step 13, follow this step. Otherwise, continue to Step 16.
If you own a domain and want to serve HTTPS requests to your cluster without warnings, you’ll need to add a DNS A record pointing to the Elastic IP (or load balancer) of the Management Node(s). Below is an example approach:
-
Using Route53:
If your domain is hosted in Route53, create (or update) an A record in the relevant Hosted Zone:
# This example assumes you have a Hosted Zone ID in the $HOSTED_ZONE_ID variable
# and want to point 'ai.example.com' to your management node's EIP.
aws route53 change-resource-record-sets \
--hosted-zone-id $HOSTED_ZONE_ID \
--change-batch '{
  "Changes": [{
    "Action": "UPSERT",
    "ResourceRecordSet": {
      "Name": "ai.example.com.",
      "Type": "A",
      "TTL": 300,
      "ResourceRecords": [{"Value": "'"$MGMT_PUBLIC_IP"'"}]
    }
  }]
}'
-
Using Another DNS Provider:
- Log into your DNS provider’s dashboard.
- Create an A record for, e.g., ai.example.com and set its value to your management node’s public IP.
- Wait for DNS propagation (can take a few minutes up to an hour).
Once the DNS record is in place, you can connect to your management node using ai.example.com (or whichever hostname you chose).
16. Verify Connectivity¶
After a few minutes, you should be able to:
-
SSH into the management node using:
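For example, from Cloud Shell, where the key file and the $MGMT_PUBLIC_IP variable from Step 9 are available:
ssh -i ${SSH_KEY_NAME}.pem ubuntu@$MGMT_PUBLIC_IP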
or if DNS is set:
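For example, with the ai.example.com record used earlier as a stand-in for your own domain:
ssh -i ${SSH_KEY_NAME}.pem ubuntu@ai.example.com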
-
Ping the worker node from the management node (and vice versa) using private IPs:
# On the management node, do:
ping <worker-private-IP>
# On the worker node, do:
ping <management-private-IP>
They should communicate freely within the VPC.
-
Access the Internet from your EC2 instances (e.g., curl https://www.google.com).
17. Update the Operating System¶
-
(Recommended) OS Patches: SSH into each instance and run:
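For example:
sudo apt update && sudo apt -y upgrade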
Do not run a major-release distro upgrade (e.g., from 22.04 to 22.10).
Conclusion¶
Congratulations! Your AWS environment is now ready to deploy Kubernetes and MemVerge.ai. Proceed to the next steps in this Installation Guide to continue the installation process.