AWS GPU Spot Request
How to create an AWS Spot Request for a VM with an NVIDIA T4 GPU
0:00:00.000
1. Setting up AWS Spot Instance Request
(0:08:30.000)
Initiated process to create a GPU-enabled VM on AWS
Accessed AWS console and navigated to spot instance request page
Began configuring spot instance request parameters
Selected Ubuntu Server 22.04 as the Amazon Machine Image (AMI)
0:00:00.000
1.1. Initiating AWS Spot Instance Request
(0:02:57.960)
Accessed AWS console and logged in
Navigated to spot instance request page
Discussed previous attempt resulting in unintended fleet association
Began manual configuration of launch parameters
0:02:57.960
1.2. Selecting AMI and Creating Key Pair
(0:02:42.040)
Selected Ubuntu Server 22.04 as the Amazon Machine Image (AMI)
Created a new key pair for SSH access
Downloaded and saved the key pair file for future use
Returned to spot fleet request page to continue configuration
0:05:40.000
1.3. Configuring Target Capacity and Maintenance Options
(0:02:50.000)
Set target capacity to 1 for a single instance
Discussed options for maintaining target capacity
Explored terminate, stop, and hibernate options
Chose 'stop' option as best for GPU instance development and testing
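
The capacity and interruption choices above can be sketched as a Spot Fleet request configuration. This is a minimal sketch, not the exact request made in the video: the key names follow the `SpotFleetRequestConfig` structure accepted by boto3's `request_spot_fleet`, while the AMI ID, key pair name, role ARN, and instance types are hypothetical placeholders. It assumes the request type is `maintain`, which is the type that goes with the 'stop' interruption behavior.

```python
import json


def build_spot_fleet_config(ami_id, key_name, instance_types, fleet_role_arn):
    """Return a SpotFleetRequestConfig-style dict for a single maintained instance."""
    return {
        "IamFleetRole": fleet_role_arn,
        # One instance only, as chosen in the walkthrough.
        "TargetCapacity": 1,
        # 'maintain' asks the fleet to keep the target capacity over time.
        "Type": "maintain",
        # Stop (rather than terminate or hibernate) when capacity is reclaimed.
        "InstanceInterruptionBehavior": "stop",
        # One launch specification per candidate instance type.
        "LaunchSpecifications": [
            {"ImageId": ami_id, "KeyName": key_name, "InstanceType": itype}
            for itype in instance_types
        ],
    }


if __name__ == "__main__":
    # All values below are placeholders, not taken from the video.
    config = build_spot_fleet_config(
        ami_id="ami-EXAMPLE",
        key_name="example-key-pair",
        instance_types=["g5.xlarge", "g6.xlarge"],
        fleet_role_arn="arn:aws:iam::123456789012:role/example-spot-fleet-role",
    )
    print(json.dumps(config, indent=2))
```

With boto3 installed and credentials configured, a dict like this could be passed as `boto3.client("ec2").request_spot_fleet(SpotFleetRequestConfig=config)`.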
0:08:30.000
2. Configuring Instance Details and Security
(0:08:30.000)
Selected network settings and availability zones
Chose appropriate GPU instance types (G5 and G6)
Configured security groups for the instance
Associated an Elastic IP address with the instance
0:08:30.000
2.1. Network Configuration and Instance Type Selection
(0:02:49.880)
Navigated to network settings
Encountered difficulties finding specific GPU instance types
Identified G5 and G6 instance types as suitable options
Selected instances based on spot price considerations
0:11:19.880
2.2. Finalizing Instance Selection
(0:02:50.120)
Refined instance pool selection to 12 options
Deleted unnecessary instance types
Prepared to launch the selected instance
Encountered issues with instance allocation
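
Narrowing the G5/G6 candidates by spot price, as described above, could be scripted roughly as follows. This is a sketch under assumptions: the records are expected in the shape of the `SpotPriceHistory` entries returned by boto3's `describe_spot_price_history` (`InstanceType`, `AvailabilityZone`, `SpotPrice` as a string, `Timestamp`), and the prices in the demo are invented for illustration only. The default limit of 12 mirrors the pool size mentioned in the outline.

```python
def cheapest_pools(price_history, families=("g5", "g6"), limit=12):
    """Return up to `limit` (type, zone, price) tuples, cheapest first."""
    latest = {}
    for record in price_history:
        itype = record["InstanceType"]
        # The family is the part before the dot, e.g. "g5" in "g5.xlarge".
        if itype.split(".")[0] not in families:
            continue
        key = (itype, record["AvailabilityZone"])
        # Keep only the most recent price for each (type, zone) pool.
        if key not in latest or record["Timestamp"] > latest[key]["Timestamp"]:
            latest[key] = record
    ranked = sorted(latest.values(), key=lambda r: float(r["SpotPrice"]))
    return [
        (r["InstanceType"], r["AvailabilityZone"], float(r["SpotPrice"]))
        for r in ranked[:limit]
    ]


if __name__ == "__main__":
    # Made-up sample records; real prices vary by region and time.
    sample = [
        {"InstanceType": "g5.xlarge", "AvailabilityZone": "zone-a",
         "SpotPrice": "0.50", "Timestamp": "2024-01-01T00:00:00"},
        {"InstanceType": "g6.xlarge", "AvailabilityZone": "zone-b",
         "SpotPrice": "0.30", "Timestamp": "2024-01-01T00:00:00"},
        {"InstanceType": "m5.large", "AvailabilityZone": "zone-a",
         "SpotPrice": "0.05", "Timestamp": "2024-01-01T00:00:00"},
    ]
    for pool in cheapest_pools(sample):
        print(pool)
```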
0:14:10.000
2.3. Associating Elastic IP
(0:02:50.000)
Associated an Elastic IP address with the instance
Verified the public IP address assignment
Attempted to ping the instance (unsuccessful)
Recognized need for security group configuration
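
The Elastic IP step above was done in the console; a scripted equivalent might look like the sketch below. It assumes an EC2 client object with boto3's `allocate_address` and `associate_address` methods; the helper name `attach_elastic_ip` and the instance ID in the comment are hypothetical.

```python
def attach_elastic_ip(ec2, instance_id):
    """Allocate a VPC Elastic IP, associate it with the instance, return the IP."""
    allocation = ec2.allocate_address(Domain="vpc")
    ec2.associate_address(
        AllocationId=allocation["AllocationId"],
        InstanceId=instance_id,
    )
    return allocation["PublicIp"]


# Possible usage, assuming boto3 is installed and credentials are configured:
#   import boto3
#   public_ip = attach_elastic_ip(boto3.client("ec2"), "i-EXAMPLE")
```

As in the walkthrough, a successful association does not make the instance reachable by itself: ping and SSH still depend on the security group's inbound rules.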
0:17:00.000
3. Security Groups, Network Connectivity, and SSH Keys
(0:08:30.000)
Created and configured security groups
Troubleshot network connectivity issues
Successfully established ping connection to instance
Prepared for SSH connection to the instance
0:17:00.000
3.1. Security Group Configuration
(0:02:50.000)
Created a new security group for the instance
Added necessary inbound rules to allow SSH and ping
Associated the security group with the instance
Verified security group assignment
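
The inbound rules for SSH and ping can be expressed as data. The sketch below builds a list in the `IpPermissions` shape used by boto3's `authorize_security_group_ingress`; the helper name and the example CIDR (a documentation-only range) are assumptions, and the outline does not say which source range was actually allowed.

```python
import ipaddress


def ssh_and_ping_rules(cidr):
    """Return inbound rules allowing SSH (TCP 22) and ICMP echo requests from `cidr`."""
    # Raises ValueError if `cidr` is not a valid network such as "203.0.113.0/24".
    network = str(ipaddress.ip_network(cidr))
    return [
        # SSH: TCP port 22 only.
        {"IpProtocol": "tcp", "FromPort": 22, "ToPort": 22,
         "IpRanges": [{"CidrIp": network}]},
        # Ping: for ICMP, FromPort is the ICMP type (8 = echo request)
        # and ToPort is the ICMP code (-1 = any code).
        {"IpProtocol": "icmp", "FromPort": 8, "ToPort": -1,
         "IpRanges": [{"CidrIp": network}]},
    ]


if __name__ == "__main__":
    for rule in ssh_and_ping_rules("203.0.113.0/24"):
        print(rule)
```

A list like this could be passed as the `IpPermissions` argument together with the security group's `GroupId`. Restricting the CIDR to a known address range is generally safer than opening port 22 to `0.0.0.0/0`.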
0:19:50.000
3.2. Network Connectivity Troubleshooting
(0:02:50.000)
Reattempted to ping the instance
Successfully established ping connection
Prepared for SSH connection setup
Addressed known_hosts file configuration
0:22:40.000
3.3. SSH Key Configuration
(0:02:50.000)
Removed outdated entries from known_hosts file
Imported the new SSH key for the instance
Configured SSH profile for easy access
Prepared for initial SSH connection
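
The outline does not name the SSH client used for the profile. For a plain OpenSSH setup, the same three steps (forgetting the stale host key, pointing at the new key file, and saving a profile) might be sketched as below. The alias, IP address, and key path are hypothetical; `ubuntu` is the default login user on Ubuntu AMIs, and `ssh-keygen -R` removes a host's entries from `known_hosts`.

```python
def forget_host_command(host):
    """Return the ssh-keygen command that drops `host` from known_hosts."""
    return ["ssh-keygen", "-R", host]


def ssh_config_entry(alias, host, key_path, user="ubuntu"):
    """Return an OpenSSH config block so that `ssh <alias>` reaches the instance."""
    lines = [
        f"Host {alias}",
        f"    HostName {host}",
        f"    User {user}",
        f"    IdentityFile {key_path}",
        # Offer only this key instead of every key the agent holds.
        "    IdentitiesOnly yes",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder values, not taken from the video.
    print(" ".join(forget_host_command("203.0.113.10")))
    print(ssh_config_entry("gpu-dev", "203.0.113.10", "~/.ssh/example-key-pair.pem"))
```

The printed block can be appended to `~/.ssh/config`. OpenSSH refuses private keys with permissive modes, so the downloaded key file typically needs `chmod 400` first.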
0:25:30.000
4. SSH Connection and Instance Exploration
(0:06:03.240)
Successfully established SSH connection to the instance
Explored instance specifications and resource usage
Configured remote development environment
Concluded setup process and prepared for handover
0:25:30.000
4.1. Initial SSH Connection and Resource Inspection
(0:02:01.080)
Successfully connected to the instance via SSH
Examined available memory and CPU resources
Investigated disk space usage and allocation
Noted discrepancies in reported vs. actual disk space
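
The checks above can be repeated from a short script run on the instance. This sketch uses only the Python standard library and reads `/proc/meminfo` when it exists (Linux only). The outline does not say what caused the disk-space discrepancy; one possible contributor, offered here purely as an assumption, is the difference between decimal gigabytes (GB) and binary gibibytes (GiB), so the report prints both.

```python
import os
import shutil


def disk_report(path="/"):
    """Return total and free space for `path` in decimal GB and binary GiB."""
    usage = shutil.disk_usage(path)
    return {
        "total_gb": usage.total / 1e9,
        "total_gib": usage.total / 2**30,
        "free_gib": usage.free / 2**30,
    }


def mem_total_kib(meminfo_text):
    """Extract the MemTotal value (in KiB) from /proc/meminfo-style text."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            return int(line.split()[1])
    return None


if __name__ == "__main__":
    print("CPUs:", os.cpu_count())
    print("Disk:", disk_report("/"))
    if os.path.exists("/proc/meminfo"):
        with open("/proc/meminfo") as handle:
            print("MemTotal KiB:", mem_total_kib(handle.read()))
```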
0:27:31.080
4.2. Remote Development Environment Setup
(0:02:01.080)
Configured initial directory for SSH sessions
Set up profile for quick access to the instance
Tested double-click login functionality
Prepared for handover to next phase of development
0:29:32.160
4.3. Conclusion and Handover
(0:02:01.080)
Confirmed successful setup of GPU-enabled AWS instance
Verified remote access and development environment
Prepared to transfer control to Bruce for next steps
Concluded the setup process and video recording