Manual Job Migration
Running jobs can be moved manually at any time using the CLI or the web interface, for example to change the compute platform or the pay type.
Procedure
- Log in to the OpCenter.
- If you are using the web interface, enter your credentials on the landing page.
- If you are using the web CLI, you are already logged in.
- If you are using a remote terminal or a terminal session on the OpCenter server, enter the following command:
float login -u <username> -p <password> -a <OpCenter_ip_address>
where <username> and <password> are login credentials, and <OpCenter_ip_address> is the OpCenter's private IP address if you are within your organization's virtual private cloud (VPC), or its public IP address if you are outside the VPC.
Note: The OpCenter IP address is cached, so you can omit it when you log back in. After the cache expires, the IP address defaults to localhost. If you get a "connection refused" error, retry with the -a <OpCenter_ip_address> option included.
- Migrate a running job to a new virtual machine instance.
Note: The vmPolicy option associated with a running job can be modified. If the new policy is incompatible with the old policy (not considering priceLimit), then job migration is automatically triggered, so, for example, a job running on a Spot Instance migrates immediately to an On-demand Instance.
- CLI: To migrate a job to a specific VM type, enter the following command:
float migrate -t <instance_type> -j <job_id>
To migrate to an instance whose capacities you define, enter the following command:
float migrate --cpu <minCPU>:<maxCPU> --mem <minMem>:<maxMem> -j <job_id>
where <minCPU>:<maxCPU> is the allowable range for the number of virtual CPUs, <minMem>:<maxMem> is the allowable range for the memory size in GB, and <job_id> is the job identifier. The upper bounds for the number of virtual CPUs and the memory size can be omitted, in which case only the lower bounds apply.
- Web interface: Go to the Jobs screen and locate your job by ID or Name. Under the Actions column, click on the Migrate Job icon. Fill out the fields in the pop-up screen and then click on Migrate.
- Check whether the job migrates successfully.
- CLI: Run the float squeue command repeatedly until the job state changes from "Floating" to "Executing."
- Web interface: On the Jobs screen, click on the Refresh button until the job state changes from "Floating" to "Executing."
- View a record of any migration events.
- CLI: Use the float log cat job.events -j <job_id> command.
- Web interface: On the Jobs screen, click on your job, and then go to the Attachments tab. Click on the Preview icon next to the job.events log.
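The "keep checking float squeue" step can be scripted. The following is a minimal sketch, not an official tool: the helper names job_state and wait_for_executing are hypothetical, and the script assumes that float squeue prints one line per job containing both the job ID and its state (the exact output layout may differ in your release).

```shell
#!/bin/sh
# Sketch only: job_state and wait_for_executing are hypothetical helpers.
# Assumption: `float squeue` prints one line per job that contains the
# job ID and the job state ("Floating", "Executing", ...).

# Print the state found on the first squeue line that mentions the job ID.
job_state() (
  line="$(float squeue | grep -F -- "$1" | head -n 1)"
  case "$line" in
    *Executing*) echo "Executing" ;;
    *Floating*)  echo "Floating" ;;
    *)           echo "Unknown" ;;
  esac
)

# Poll until the job reports "Executing".
# Arguments: <job_id> [attempts, default 60] [seconds between polls, default 10]
# Returns 0 once the job is "Executing", 1 if the attempts run out.
wait_for_executing() (
  job_id="$1"
  attempts="${2:-60}"
  delay="${3:-10}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if [ "$(job_state "$job_id")" = "Executing" ]; then
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  return 1
)
```

After sourcing the functions, a migration followed by a wait might look like this: float migrate -t <instance_type> -j <job_id> && wait_for_executing <job_id>. Matching on the whole line is deliberately loose; if a job name happens to contain a state keyword, narrow the match to the state column of your squeue output.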
Example of manual migration:
float migrate -f -t t3a.large -j Xx0r4CYE7X6MRmivjoITf
float log cat job.events -j Xx0r4CYE7X6MRmivjoITf
2023-04-19T14:42:13.334: Ready to migrate with spec job: lWpXdspWZcWESBMMy9Nbm, instType: t3a.large, CPU: 0:0, Memory: 0:0, zone: , payType:
2023-04-19T14:42:13.334: Attempt to find instance type for spec InstType:t3a.large,CPU:2 ~ 0,Memory:4 ~ 0,Zone:us-east-1b,CPUVendor:AuthenticAMD,priceLimit:0,priceLimitPerc:0
2023-04-19T14:42:13.389: Determined instance params: Zone:us-east-1b,InstType:t3a.large,CPU:2,Memory:8
2023-04-19T14:42:13.389: Ready to migrate with instance type: t3a.large, cpu: 2, memory: 8, zone: us-east-1b, last instance type: t3a.medium(Spot)
2023-04-19T14:42:13.389: Ready to checkpoint host i-0d7877d1978a4db4c
2023-04-19T14:42:14.759: Checkpointed host i-0d7877d1978a4db4c, result: [container: c63784ae91f26cc4b2f8d1980f84ce14bc17ae5b8478953c4a2cb06c61e3a8b3, checkpoint file: ], duration 1.369989165s
2023-04-19T14:42:27.401: Detached volume vol-0fc557b252c9a497c from host i-0d7877d1978a4db4c
2023-04-19T14:42:33.819: Detached volume vol-0ea186ddf37c64396 from host i-0d7877d1978a4db4c
2023-04-19T14:42:40.212: Detached volume vol-05bf0027cef28aa4f from host i-0d7877d1978a4db4c
2023-04-19T14:42:40.212: Ready to create new host to recover
2023-04-19T14:42:46.999: Created instance i-06107ed1970156e8d at us-east-1b, waiting for it to initialize
2023-04-19T14:45:02.014: Mounted vol-0fc557b252c9a497c:/mnt/float-data to i-06107ed1970156e8d
2023-04-19T14:45:02.014: Mounted vol-0ea186ddf37c64396:/mnt/float-image to i-06107ed1970156e8d
2023-04-19T14:45:02.014: Mounted vol-05bf0027cef28aa4f:/data to i-06107ed1970156e8d
2023-04-19T14:45:02.015: Created new host: i-06107ed1970156e8d(Spot)
2023-04-19T14:45:02.248: Got 1 containers on host i-06107ed1970156e8d
2023-04-19T14:45:02.248: Ready to recover {ID:c63784ae91f2,Checkpointed:true,Running:false} on host i-06107ed1970156e8d
2023-04-19T14:45:02.264: Job floated to instance i-06107ed1970156e8d (2 CPU/8 GB) (Spot)
2023-04-19T14:45:02.893: Migrated to new VM: i-06107ed1970156e8d
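To see how long a migration took, the timestamps in job.events can be subtracted. The sketch below is illustrative only: migration_elapsed is a hypothetical helper, and it assumes that each event starts on its own line with a timestamp in the form shown in the example, that the input covers a single migration, and that all events fall on the same calendar day.

```shell
#!/bin/sh
# Sketch only: migration_elapsed is a hypothetical helper.
# Reads job.events text on stdin and prints the seconds elapsed between
# the first and the last timestamped line.
# Assumptions: one event per line, timestamps like 2023-04-19T14:42:13.334:,
# a single migration in the input, and no rollover past midnight.
migration_elapsed() {
  LC_ALL=C awk '
    /^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]T/ {
      split($1, dt, "T")        # dt[2] is the time part, e.g. 14:42:13.334:
      split(dt[2], hms, ":")    # hours, minutes, seconds (with fraction)
      t = hms[1] * 3600 + hms[2] * 60 + hms[3]
      if (!seen) { first = t; seen = 1 }
      last = t
    }
    END { if (seen) printf "%.3f\n", last - first }
  '
}
```

A possible invocation is float log cat job.events -j <job_id> | migration_elapsed. For the example log, the first event is stamped 14:42:13.334 and the last 14:45:02.893, so the helper would print 169.559, that is, just under three minutes from the migration request to the job running on the new VM.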