CLI Command Reference

Use float CLI commands to interact with the OpCenter; for example, to submit and manage jobs.

Version

The MMCloud CLI commands described here are consistent with the OpCenter release shown in the table.

MMCloud CLI                       OpCenter                          Release Date
FLOAT_v3.0.0-69ce0c9-Imperia.bin  FLOAT_v3.0.0-69ce0c9-Imperia.bin  2024-07-31T08:34:11Z

Usage

Use float in the following format.

float [global_flags] [command] [subcommand] [options]

Global Flags

Use global flags with any float command or subcommand.

Flag                              Usage                                        Definition
-a, --address ip_address          Connect to OpCenter server                   IP address of OpCenter (default: localhost or last OpCenter IP address used)
-F, --format json | yaml | table  Specify format for output                    Output format (default: yaml)
-h, --help                        Display help                                 Help for MMCloud CLI
--logLevel log_level              Specify log level                            Log level (default: info)
-p, --password password           Log in to OpCenter                           Login password
--scroll                          Enable scroll mode for multiple page output  Enable navigation (up, down, left, right) for displays that span multiple pages
-u, --username user_name          Log in to OpCenter                           Login username
-v, --verbose on | off            Turn verbose mode on or off                  Verbose mode setting (default: off)
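
The placement of global flags in the usage format can be sketched with a small dry-run helper. The function below only prints a float command line and does not run it. The function name float_cmdline and the variables FLOAT_ADDRESS and FLOAT_FORMAT are hypothetical names used for this illustration; the float CLI itself does not read them.

```shell
# Dry-run sketch: print a float command line with global flags first,
# following: float [global_flags] [command] [subcommand] [options]
# FLOAT_ADDRESS and FLOAT_FORMAT are hypothetical variables for this sketch.
float_cmdline() {
    line="float"
    if [ -n "${FLOAT_ADDRESS:-}" ]; then
        line="$line -a $FLOAT_ADDRESS"
    fi
    if [ -n "${FLOAT_FORMAT:-}" ]; then
        line="$line -F $FLOAT_FORMAT"
    fi
    # "$*" joins the remaining arguments (command, subcommand, options) with spaces.
    printf '%s %s\n' "$line" "$*"
}

FLOAT_ADDRESS=192.168.0.1
FLOAT_FORMAT=json
float_cmdline config get sessionTimeout
```

With the sample values shown, the helper prints a command that places -a and -F ahead of the config command.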

Categories

The MMCloud CLI commands are grouped into categories as shown in the table. An alphabetical listing of the commands follows.

Category                        Commands
Job Management                  list, show, submit, cancel, migrate, modify, suspend, resume, snapshot, rerun, promote, hosts
Job Status Monitoring           log, ps, top, df
Authentication & Authorization  login, logout, secret
User Management                 user, group, quota-policy
OpCenter Core Services          image, gateway, template, config, license, report, storage
OpCenter Management Operations  status, version, release, restart
Other                           completion

float cancel (scancel)

Use the float cancel (or float scancel) command to cancel a job. The command has no subcommands. Use the command with the -h flag to list the options.

Subcommands Usage Option Option Definition
  Cancel job --filter filter Filter to select jobs if --job not specified. A simple filter is [attribute][operator][value]. Use --filter multiple times to apply multiple filters combined with the "and" operator. Create complex filters by combining simple filters using parentheses and operators (and, or). Use "?" to match a single character and "*" to match multiple characters.

Values are strings, datetimes, or numbers. Operators are:
  • = for strings, datetimes, or numbers
  • >, <, >=, <= for datetimes or numbers
Attributes include:
  • id
  • name
  • user
  • status
  • duration
  • start
  • lastUpdate
  • update
  • end
  • tags
  • envs
  • exitCode
  • output
  • cpu
  • memory
  • dataVolumes
  • publishes
  • imageID
  • cost
  • category

Examples:
  • --filter status=Executing --filter timeRange=2010-10-22~
  • --filter "tags=kind:training,project:finance-*"
  • --filter "((user=admin) and (status=Executing or status=Completed))"
    -f, --force Automatically answer "yes" at confirmation prompt
    -j, --job job_id Job to cancel if --filter not specified

Example

$ float cancel -j ctZLDo7OFG4BuJ8ytiTem
Warning: Are you sure you want to cancel this job? 

ID: ctZLDo7OFG4BuJ8ytiTem 
Name:   python-c5d.large 

All the related resources will be released. (yes/No): y
Request to cancel ctZLDo7OFG4BuJ8ytiTem has been submitted
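
As a minimal sketch of how a complex filter is assembled from simple filters, the helper below wraps each simple filter in parentheses and joins them with "and" or "or". The function name join_filters is hypothetical and is not part of the float CLI. The final line prints the resulting float cancel command rather than running it, so no job is cancelled.

```shell
# join_filters OPERATOR FILTER...
# Wrap each simple filter in parentheses and join them with OPERATOR
# ("and" or "or"), producing one complex filter string.
join_filters() {
    op=$1
    shift
    case $op in
        and|or) ;;
        *) echo "operator must be 'and' or 'or'" >&2; return 1 ;;
    esac
    out=""
    for f in "$@"; do
        if [ -z "$out" ]; then
            out="($f)"
        else
            out="$out $op ($f)"
        fi
    done
    printf '(%s)\n' "$out"
}

filter=$(join_filters and "user=admin" "status=Executing or status=Completed")
# Print the command instead of running it, so this sketch cancels nothing.
printf 'float cancel --filter "%s"\n' "$filter"
```

The generated filter has the same shape as the parenthesized example in the option definition. Quote the whole filter so that the shell passes it to float as a single argument.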

float completion

Use the float completion command to generate an auto-completion script for use in the current shell or in every shell started subsequently. Use a subcommand with the -h flag to display help on how to use the auto-completion script.

Subcommands Usage Option Option Definition
bash Generate auto-completion script for bash --no-descriptions Disable completion descriptions
fish Generate auto-completion script for fish --no-descriptions Disable completion descriptions
powershell Generate auto-completion script for powershell --no-descriptions Disable completion descriptions
zsh Generate auto-completion script for zsh --no-descriptions Disable completion descriptions

float config

Use the float config command to view or change the OpCenter configuration. Use a subcommand with the -h flag to list the options.

Subcommands Usage Option Option Definition
cert Set server certificate and key (must supply both) -c, --cert /path/to/cert Path to certificate file
    -k, --key /path/to/key Path to key file
get Show runtime value of a single OpCenter configuration parameter config_parameter Configuration parameter to display, for example, sessionTimeout
ldap Set values for LDAP configuration parameters --addr ldap_server_address IPv4 address of LDAP server
    --adminGroup ldap_admin_group LDAP admin group
    --anonymous=true | false LDAP anonymous bind (default false)
    --base base_DN LDAP base Distinguished Name (DN)
    --bindDN bind_DN LDAP bind DN
    --bindPW bind_pw LDAP bind password
    --cert /path/to/cert Path to LDAP certificate file
    --conf /path/to/conf_file Path to LDAP configuration file
    --connTimeout duration Duration (format: HhMmSs) until LDAP connection times out (default 10s)
    --enable=true | false Enable LDAP (default true)
    --groupOU group_OU LDAP group Organizational Unit (OU) (default "Group")
    --key /path/to/key Path to LDAP key file
    --network tcp | udp LDAP connection protocol (default "tcp")
    --peopleOU people_OU LDAP people OU (default "People")
    --reset Reset all LDAP parameters to empty (null)
    --tls=true | false LDAP use tls (default true)
list, ls List configuration parameters for OpCenter (shows if a parameter can be changed and if a change requires a system restart) -s, --scope filter Condition(s) to filter the list of configuration parameters (default: list all parameters)
mset Set runtime values of multiple configuration parameters config_parm1=parm_value1 config_parm2=parm_value2 ... Array of configuration parameters and the values to set them to. If the word "default" is used for `parm_value`, the configuration parameter is reset to its default value.
set Set runtime value of a single configuration parameter config_parm parm_value Configuration parameter and the value to set it to. If the word "default" is used for `parm_value`, the configuration parameter is reset to its default value.
    --file value.txt Text file containing value for `config_parameter` (see example)

Examples

$ float config get log.level
info
$ float config set sessionTimeout 48h
sessionTimeout is set to 48h0m0s
$ float config set sessionTimeout default
sessionTimeout is set to 1h0m0s
$ float config ls --scope "session"
 +----------------+----------+----------+--------------+
 |      KEY       |  VALUE   | EDITABLE | NEED RESTART |
 +----------------+----------+----------+--------------+
 | sessionTTL     | 168h0m0s | Y        |              |
 | sessionTimeout | 1h0m0s   | Y        |              |
 +----------------+----------+----------+--------------+
$ echo -n 48h > session.txt
$ float config set sessionTimeout --file session.txt
The value of sessionTimeout is set to '48h0m0s'.
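
When several parameters change together, the mset argument list can be generated from a file. The following is a sketch only: the function name build_mset and the file layout (one config_parm=parm_value per line, with blank lines and lines starting with "#" skipped) are assumptions of this example and not features of the float CLI. The function prints the command and does not run it.

```shell
# build_mset FILE
# Read KEY=VALUE lines from FILE and print one "float config mset" command.
# Blank lines and lines starting with "#" are skipped (a convention of this
# sketch, not of the float CLI). Values containing spaces are not handled.
build_mset() {
    args=""
    while IFS= read -r line || [ -n "$line" ]; do
        case $line in
            ''|'#'*) continue ;;
        esac
        args="$args $line"
    done < "$1"
    printf 'float config mset%s\n' "$args"
}

# Example input: set one parameter and reset another to its default value.
params=$(mktemp)
printf '%s\n' '# session settings' 'sessionTimeout=48h' '' 'sessionTTL=default' > "$params"
build_mset "$params"
rm -f "$params"
```

Review the printed command before running it, because some parameters may require a system restart (see float config ls).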

float df

Use the float df command to display the file systems mounted by an executing job, including used and available disk space. The Linux command df must be installed in the container. This command has no subcommands. Use the command with the -h flag to list the options.

Subcommands Usage Option Option Definition
  Display mounted file systems and associated disk space. --args df_args Arguments to pass to df command
    -j, --jobId job_id Job to query

Example

$ float df --args "-h" -j WIY92p0jWyaCMP0CNQYjC
Filesystem                                                       Size  Used Avail Use% Mounted on
overlay                                                          6.0G  1.1G  5.0G  17% /
/dev/nvme3n1                                                      10G  105M  9.9G   2% /data
tmpfs                                                             64M     0   64M   0% /dev
172.31.81.17:/mnt/memverge/slurm/work/nzG6oM5DoLCysXAkoZFCA/app   50G   20G   31G  39% /mmce
/dev/nvme2n1                                                     6.0G  1.1G  5.0G  17% /etc/hosts
/dev/nvme0n1p2                                                    40G  7.5G   33G  19% /opt/aws
shm                                                               63M     0   63M   0% /dev/shm
devtmpfs                                                         1.7G     0  1.7G   0% /proc/key

float gateway

Use the float gateway command to create and manage a reverse proxy server. Use a subcommand with the -h flag to list the options.

Subcommands Usage Option Option Definition
connect Connect a running job to the gateway (reverse proxy) -g, --gateway gw_id ID of gateway to connect job to (format is g- followed by a fixed-length character string)
    -j, --job job_id ID of job to connect to gateway
    --targetPort port_number Port used to connect to job, for example, 8787 for RStudio. Include --targetPort multiple times to connect multiple ports from a single server to the gateway.
create Create a gateway --bandwidth bw Minimum gateway bandwidth in Mbps (default: 25). Only use with AliCloud.
    -c, --cpu min_cpu Minimum number of virtual CPUs for gateway (default: 0)
    -t, --instType instance_type VM instance type for gateway, for example, c5.xlarge for AWS. Do not combine with --cpu or --mem options.
    -m, --mem min_mem Minimum memory capacity in GB for gateway (default: 0)
    -n, --name gw_name Name to associate with gateway
    --noPublicIP Create gateway with private IP address only. Ensure that gateway is reachable from the hosts that need to connect.
    --portRange min:max Range of client-side ports opened on gateway (allowed range is between 1000 and 65535).
    --securityGroup sec_group AWS security group (or tag in GCP) applied to gateway. Include option multiple times to apply multiple security groups.
    -z, --zone availability_zone Availability zone in which to create gateway
destroy Destroy a gateway -g, --gateway gw_id ID of gateway to destroy (format is g- followed by a fixed-length character string)
    -f, --force Automatically answer "yes" at confirmation prompt
disconnect Disconnect a job from a gateway -g, --gateway gw_id ID of gateway to query (format is g- followed by a fixed-length character string)
    -j, --job job_id ID of job to disconnect from gateway
    --port port_num Server-side port to disconnect from gateway. If gateway connects to multiple ports on server, include --port for each port.
info Display information about a gateway (including IP address and connected jobs) -g, --gateway gw_id ID of gateway to query (format is g- followed by a fixed-length character string)
list List all running gateways (optionally include stopped gateways) -A, --showAll Option to list stopped as well as running gateways (overrides filters)
    -f, --filter filter Filter(s) to apply to list of gateways. Use option multiple times to apply multiple filters combined with "and" operator. See details in Working with OpCenter/Filters/Gateway Filters.
    -o, --orderBy attribute Attribute used to order gateway listing (prepend with "-" to reverse order). Supported values are (default: lastUpdate):
  • id
  • name
  • status
  • cpu
  • memory
  • ip
  • portStart
  • portEnd
  • start
  • usedPorts
  • cost
  • lastUpdate
modify Modify attributes of a gateway --addSecurityGroup sec_group AWS security group (or tag in GCP) added to gateway. Include option multiple times for multiple security groups.
    -g, --gateway gw_id ID of gateway to modify
    --rmSecurityGroup sec_group AWS security group (or tag in GCP) removed from gateway. Include option multiple times for multiple security groups.

Example

$ float gateway create -n NewGateway --securityGroup sg-0fbb6a83983183364 --portRange 10000:10500
id: g-wsfthf3z8zeb0cyoc6dsb
name: NewGateway
status: Creating
configuration: ""
IPAddress: ""
portRange: 10000-10500
startTime: "2024-01-01T16:33:20Z"
usedPorts: 0
cost: ""
clientJobs: {}
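
Connecting several ports of one job to a gateway means repeating --targetPort once per port. The sketch below builds such a command and prints it without running it. The function name gateway_connect_cmd is hypothetical, the IDs are sample values reused from examples in this reference, and port 8888 is an arbitrary second port chosen for illustration.

```shell
# gateway_connect_cmd GATEWAY_ID JOB_ID PORT...
# Print a "float gateway connect" command with one --targetPort per PORT.
gateway_connect_cmd() {
    gw=$1
    job=$2
    shift 2
    cmd="float gateway connect -g $gw -j $job"
    for port in "$@"; do
        case $port in
            # Reject empty values and anything that is not all digits.
            ''|*[!0-9]*) echo "invalid port: $port" >&2; return 1 ;;
        esac
        cmd="$cmd --targetPort $port"
    done
    printf '%s\n' "$cmd"
}

gateway_connect_cmd g-wsfthf3z8zeb0cyoc6dsb WIY92p0jWyaCMP0CNQYjC 8787 8888
```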

float group

Use the float group command to manage OpCenter user groups. Use a subcommand with the -h flag to list the options.

Subcommands Usage Option Option Definition
add, create Create a new group new_group Name of new group
    --admin user1, user2... User(s) given admin role in new group
    --gid GID Group ID for this group (default is next available gid)
    --user user1, user2... User(s) added to new group
add --ldap Associate an LDAP group with an LDAP directory ldap_group_name Name of group to associate with LDAP directory
delete, remove, rm Delete a group group_name | group_id Name or id of group to delete
info, show Display information about group including members group_name | group_id Name or id of group to query
list, ls List groups that current user belongs to  
update Update attributes of a group group_name | group_id Name or id of group to update
    --add user1, user2... User(s) added to group
    --admin Flag to indicate that any username listed after --add is given admin role in group and any username listed after --remove has admin role in group removed
    --gid GID New group ID for this group
    --name group_name New name for group
    --remove user1, user2... User(s) removed from group

Examples

$ float group add team --user tester
name: team
gid: 3
admins: ""
users: tester
$ float group update crops --admin --remove barley --add wheat
name: crops
gid: 2007
admins: wheat
users: wheat,barley
type: builtin

float hosts (sinfo)

Use the float hosts (or sinfo) command to show details and current status of current (or all) worker nodes. The command has no subcommands. Use the command with the -h flag to list the options.

Subcommands Usage Option Option Definition
  Show current (or all) worker node details including status -A, --all Clear all filters to show all jobs
    -f, --filter filter Filter(s) to apply to jobs whose associated hosts are displayed. Use option multiple times to apply multiple filters combined by "and" operator. See details in Working with OpCenter/Filters/Host Filters.
The default filter is:
-f "running=true or update<=1h"
Example: -f status=executing -f timeRange=2010-10-22~
    -o, --orderBy attribute Attribute used to order host listing (prepend with "-" to reverse order). Supported values are (default: start):
  • entity
  • status
  • instanceType
  • cpu
  • memory
  • payType
  • publicIP
  • start
  • end
    -w, --windowSize num_hosts Number of hosts displayed before header row is repeated (default: 512)

float image

Use the float image command to manage container images on OpCenter. Use a subcommand with the -h flag to list the options.

Subcommands Usage Option Option Definition
add Add image pull information to OpCenter image_name image_uri Name to associate with image in the AppLibrary and URI to pull image from repository. Do not use image_uri if --link specified.
    --import-all Import all tags for this image from repository
    --link file_link Link to access AWS S3 or AliCloud OSS file, for example, s3://bucket_name/file_path Do not use if image_uri specified.
    --token repo_access_token Token to pull image from private repository
    --user repo_access_user Username to pull image from private repository
cache Add (delete) image to (from) cache configured for OpCenter (NFS-mounted directory or S3 bucket) image_name Container image to add to or delete from cache
    -d, --delete If option included, delete image from cache
    -f, --force Overwrite existing cached version of image
    --tag tag_name Tag to select specific container image (default: "latest")
delete, rm, remove Delete container image(s) and one or more tags image1 image2... or image1:tag1 image2:tag2... Container image(s) to delete. If no tags are specified, then all tags are removed.
    -f, --force Automatically answer "yes" at confirmation prompt
    --tag tag_name Tag associated with image1 image2... to remove. If this is the only tag associated with images, images are removed.
list, ls Show all images available on OpCenter    
tags, tag Display tags associated with image and image status image_name Container image name
update Update information associated with image image_name Container image name
    --addtag image_tag1, image_tag2... Additional tags to associate with image
    --name image_name New name to identify image
    --token repo_access_token New token required to access private repository
    --user repo_access_user New username associated with token required to access private repository
upload Load image from local server image_name Name to identify image in App Library
    -i, --image local_name Name of image in local repository (cannot use if --path included)
    --path /path/to/image Path to where image is located (cannot use if --image included)

Examples

$ float image list
+------------+-------------------------------+--------+-------------+
|    NAME    |              URI              | TAGS   | ACCESS USER |
+------------+-------------------------------+--------+-------------+
| python     | docker.io/bitnami/python      | latest |             |
| r-base     | docker.io/rocker/r-base       | latest |             |
.....(edited)
+------------+-------------------------------+--------+-------------+
$ float image delete r-base
Deleting r-base
$ float image add r-base docker.io/rocker/r-base
name: r-base
uri: docker.io/rocker/r-base
owner: admin
tags:
    latest:
        status: Available
        locked: false
        lastUpdated: 2023-06-15T16:12:41.739947963Z
        size: Unknown

$ float image cache blast
Request to cache image blast (tag: latest) has been submitted
$ float image cache blast --delete
Request to clean image blast (tag: latest) has been submitted
$ float image tags blast
+--------------------------+--------+-----------+----------------------+
|           URI            |  TAG   |  STATUS   |     LAST UPDATED     |
+--------------------------+--------+-----------+----------------------+
| docker.io/memverge/blast | 2.14.0 | Available | 2023-06-26T05:15:02Z |
|                          | latest | Available | 2024-08-13T15:51:33Z |
+--------------------------+--------+-----------+----------------------+

$ float image upload test2 --path /tmp/hello-world-latest.tar
Start uploading image test2 (localhost/hello-world:latest) from /tmp/hello-world-latest.tar
Progress: |>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>| 100.00% Complete (ETA. 0s)
Uploaded image /tmp/hello-world-latest.tar, time spent: 0s

name: test2
uri: localhost/hello-world
owner: admin
tags:
    latest:
        status: Ready
        uri: file:///mnt/memverge/images/test2-latest.tar
        locked: false
        lastUpdated: 2023-04-17T19:26:38.584557866Z
        lastPushed: 2023-04-17T19:26:38.584557866Z
        size: Unknown
$ float image upload testinggzupload --path ./testinggzupload.tar.gz
Start uploading image testinggzupload (docker.io/library/testinggzupload:latest) from ./testinggzupload.tar.gz
Progress: |>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>| 100.00% Complete (ETA. 0s)
Uploaded image ./testinggzupload.tar.gz, time spent: 31m20s

name: testinggzupload
uri: docker.io/library/testinggzupload
owner: admin
tags:
    latest:
        status: Ready
        uri: s3://opcenter-bucket-594424d0-c760-11ee-8abd-128ad93e7ec9/images/ntltcroq2g.tar
        locked: false
        lastUpdated: 2024-04-11T15:39:29.686393328Z
        lastPushed: 2024-04-11T15:39:29.686393428Z
        size: 1.33 GB

Note

The user that queries the local repository for the IMAGE ID must be the same user that executes the float image upload command. In this example, the user is root.

$ sudo podman images
REPOSITORY                  TAG         IMAGE ID      CREATED        SIZE
quay.io/podman/hello        latest      39ae24b9cabf  3 days ago     1.7 MB

$ sudo /opt/memverge/bin/float image upload testimage --image 39ae24b9cabf
Found image quay.io/podman/hello:latest, using it as repository and tag
Using /bin/podman to save image quay.io/podman/hello:latest to testimage.tar
Start uploading image testimage (quay.io/podman/hello:latest) from testimage.tar
...[edited]

float license

Use the float license command to manage OpCenter licenses. Use a subcommand with the -h flag to list the options.

Subcommands Usage Option Option Definition
acquire Acquire license from MMCloud Portal -A, --account acct_name Username (email address) to access the MMCloud Portal
    -P, --password passwd Password to access the MMCloud Portal
info Display license information and status  

float list (squeue)

Use the float list (or squeue) command to show a filtered list of queued jobs. The command has no subcommands. Use the command with the -h flag to list the options.

Subcommands Usage Option Option Definition
  Show filtered list of queued jobs (default: list jobs from oldest to newest) -A, --all Clear all filters to show all jobs
    -f, --filter filter Filter(s) to apply to list of queued jobs. Use option multiple times to apply multiple filters combined with "and" operator. See details in Working with OpCenter/Filters/Job Filters.
Example: -f status=executing -f timeRange=2023-10-22~ -f tags=kind:training,project:finance-\*
    -o, --orderBy attribute Attribute used to order job listing (prepend with "-" to reverse order). Supported values are (default: start):
  • id
  • name
  • user
  • status
  • duration
  • start
  • cost
  • cpu
  • memory
    -w, --windowSize num_jobs Number of jobs displayed before header row is repeated (default 512)

Examples

$ float squeue -f "imageID=*blast*"
+-------+------------------+...+-------+-----------+----------+----------------------+------------+
|   ID  |       NAME       |...| USER  |  STATUS   | DURATION |     SUBMIT TIME      |    COST    |
+-------+------------------+...+-------+-----------+----------+----------------------+------------+
| m8t...| blast-c5.9xlarge |   | admin | Completed | 8h44m18s | 2024-02-14T04:07:38Z | 5.4851 USD |
| w70...| blast-c5.9xlarge |   | bean  | Completed | 9h3m16s  | 2024-02-14T04:22:45Z | 5.3502 USD |
+-------+------------------+...+-------+-----------+----------+----------------------+------------+
...[edited for clarity]
$ float squeue -f "imageID=*blast*" -f user=bean
+-------+------------------+...+-------+-----------+----------+----------------------+------------+
|   ID  |       NAME       |...| USER  |  STATUS   | DURATION |     SUBMIT TIME      |    COST    |
+-------+------------------+...+-------+-----------+----------+----------------------+------------+
| w70...| blast-c5.9xlarge |   | bean  | Completed | 9h3m16s  | 2024-02-14T04:22:45Z | 5.3502 USD |
+-------+------------------+...+-------+-----------+----------+----------------------+------------+
...[edited for clarity]
$ float squeue -o name
+--------+----------------------+------------------+-------+-----------+--------------+----------+------------+
| ID     |         NAME         |    WORKING HOST  | USER  |  STATUS   | SUBMIT TIME  | DURATION |    COST    |
+--------+----------------------+------------------+-------+-----------+--------------+----------+------------+
| SZd... | apple                | 54.163.154.116...| admin | Executing | ...19:11:47Z | 1h35m1s  | 0.1143 USD |
| btC... | banana               | 34.234.64.157... | admin | Executing | ...19:12:19Z | 1h34m30s | 0.0926 USD |
| hnP... | camel                |                  | admin | Completed | ...20:43:01Z | 3m28s    | 0.0031 USD |
| 5d1... | cherry               | 54.89.203.124... | admin | Executing | ...19:12:27Z | 1h34m22s | 0.0924 USD |
| grd... | python-t3a.xlarge    | 34.229.17.82...  | admin | Completed | ...20:43:48Z | 2m51s    | 0.0026 USD |
| nVD... | tidyverse-t3a.xlarge | 3.89.224.246...  | admin | Executing | ...19:14:39Z | 1h32m10s | 0.0903 USD |
+--------+----------------------+------------------+-------+-----------+--------------+----------+------------+
...[edited for clarity]
$ float squeue -f "imageID=*python:latest and status=Executing"
+-----------------------+-------------------------+------------------------------+--------+-----------+...
|          ID           |          NAME           |         WORKING HOST         |  USER  |  STATUS   |...
+-----------------------+-------------------------+------------------------------+--------+-----------+...
| w2763vpgwuy9cksihxbcd | python-d72rur-t3.medium | 44.220.82.26 (2Core4GB/Spot) | banana | Executing |...
| q6r4ome7wyywk33b5jo9g | testpy                  | 3.239.38.179 (2Core4GB/Spot) | bean   | Executing |...
+-----------------------+-------------------------+------------------------------+--------+-----------+...
...[edited for clarity]
$ float squeue -f "imageID=*python* or imageID=*tidyverse*"
+-----------------------+-------------------------+------------------------------+--------+-----------+...
|          ID           |          NAME           |         WORKING HOST         |  USER  |  STATUS   |...
+-----------------------+-------------------------+------------------------------+--------+-----------+...
| w2763vpgwuy9cksihxbcd | python-d72rur-t3.medium | 44.220.82.26 (2Core4GB/Spot) | banana | Executing |...
| q6r4ome7wyywk33b5jo9g | testpy                  | 3.239.38.179 (2Core4GB/Spot) | bean   | Executing |...
| bxulaob02n1d91cmymvxk | Rtest                   | 44.212.52.61 (2Core4GB/Spot) | bean   | Executing |...
+-----------------------+-------------------------+------------------------------+--------+-----------+...
...[edited for clarity]
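
The --orderBy and --filter options can be combined in one listing. The following sketch prints (and does not run) a float list command: it checks the attribute against the supported values listed above, prepends "-" when reverse order is requested, and emits one -f per filter so that the filters are combined with "and". The function name list_cmd and its "normal"/"reverse" argument are hypothetical. The exact spelling of the reversed attribute on the command line (-o -cost) is derived from the option description and has not been verified against the CLI.

```shell
# list_cmd ATTRIBUTE ORDER [FILTER...]
# Print a "float list" command ordered by ATTRIBUTE. ORDER is "normal" or
# "reverse"; "reverse" prepends "-" to the attribute. Each FILTER becomes
# its own -f option, so the filters are combined with "and".
list_cmd() {
    attr=$1
    order=$2
    shift 2
    case $attr in
        id|name|user|status|duration|start|cost|cpu|memory) ;;
        *) echo "unsupported orderBy attribute: $attr" >&2; return 1 ;;
    esac
    if [ "$order" = "reverse" ]; then
        attr="-$attr"
    fi
    cmd="float list -o $attr"
    for f in "$@"; do
        cmd="$cmd -f \"$f\""
    done
    printf '%s\n' "$cmd"
}

list_cmd cost reverse "status=Completed" "user=bean"
```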

float log

Use the float log command to view and manage log files. Use a subcommand with the -h flag to list the options.

Subcommands Usage Option Option Definition
cat Write log file contents to standard output log_file Log file whose contents are displayed
    -i, --hid host_id Host whose logs are displayed (default: OpCenter server)
    -j, --job job_id Job whose logs are displayed (default: OpCenter)
download Download a zip file of selected logs associated with job -i, --include logs1, logs2... Logs included in zip file (default: "all"). Options are:
  • "all": all logs
  • "opCenter": OpCenter logs
  • "app": application logs
  • "host": host logs
  • "job": job logs
  • "ocStack": OpCenter stack trace
  • "agentStack": agent stack trace
  • "jobObject": job object
    -j, --job job_id Job whose log files are included
    --path /path/to/dir Path to save zip file (default "./")
list, ls List all logs associated with target -i, --hid host_id Host whose log files are listed (default: OpCenter server)
    -j, --job job_id Job whose log files are listed (default: OpCenter)
    -H, --readable Display log size in human-readable format
rm Remove all logs associated with target -i, --hid host_id Host whose log files are removed (default: OpCenter server)
    -j, --job job_id Job whose log files are removed (default: OpCenter)
tail Write last n lines of log file to standard output log_file Log file to display
    -f, --follow Display new lines as they are appended to log file
    -i, --hid host_id Host whose log file lines are displayed (default: OpCenter server)
    -j, --job job_id Job whose log file lines are displayed (default: OpCenter)
    -n, --num n Number of lines to display (default: 100)

Examples

$ float log tail --follow output -j XGiUDRto7kwofWBNPkiW5
Ready to prepare source data
Ready to download pbmc_1k_v3_fastqs from s3
Ready to download refdata-gex-GRCh38-2020-A from s3
Ready to run test
...[output edited]
$ float log ls -H
+---------------------+---------------+----------------------+
|      LOG NAME       | READABLE SIZE |   LAST UPDATE TIME   |
+---------------------+---------------+----------------------+
| opcenter.access_log | 7.40 MB       | 2024-02-14T19:02:02Z |
| opcenter.log        | 712.73 KB     | 2024-02-14T18:55:28Z |
| upgrade.log         | 1.92 KB       | 2024-02-11T03:54:27Z |
| messages            | 220.56 KB     | 2024-02-14T18:48:48Z |
+---------------------+---------------+----------------------+
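
To download only some of the logs for a job, pass the wanted log types to --include. The sketch below validates each type against the list in the download options and prints the resulting command without running it. The function name log_download_cmd is hypothetical, the job ID is a sample value reused from the examples, and joining the types with commas and no spaces is an assumption of this sketch.

```shell
# log_download_cmd JOB_ID DEST_DIR [KIND...]
# Print a "float log download" command that includes only the given log kinds.
# With no KIND arguments, the documented default "all" is used.
log_download_cmd() {
    job=$1
    dest=$2
    shift 2
    include=""
    for kind in "$@"; do
        case $kind in
            all|opCenter|app|host|job|ocStack|agentStack|jobObject) ;;
            *) echo "unknown log kind: $kind" >&2; return 1 ;;
        esac
        if [ -z "$include" ]; then
            include=$kind
        else
            include="$include,$kind"   # comma-joined list (assumed format)
        fi
    done
    printf 'float log download -j %s -i %s --path %s\n' "$job" "${include:-all}" "$dest"
}

log_download_cmd XGiUDRto7kwofWBNPkiW5 ./logs app job
```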

float login

Use the float login command, with valid username and password, to log in to OpCenter. The command has no subcommands. Use the command with the -h flag to list the options.

Subcommands Usage Option Option Definition
  Log in to OpCenter --info Display login status

Examples

$ float -u admin -p secret123 -a 192.168.0.1 login
$ float login --info
address: 192.168.0.1
username: admin
role: admin
type: builtin
$ float login
Username: admin
Password: 
Login Succeeded!

float logout

Use the float logout command to log the current user out of the OpCenter and invalidate the authorization token. The command has no subcommands. Use the command with the -h flag to list the options.

Example

$ float logout 
Logout Succeeded!
$ float login --info
Error: cannot find any login session (code: 2004)

float migrate

Use the float migrate command to move a job from one VM instance to another VM instance of the same type or of a different type. The command has no subcommands. Use the command with the -h flag to display the options.

Subcommands Usage Option Option Definition
  Migrate job to a new VM instance. If no options used, migrate to identical instance in same availability zone. -c, --cpu min_cpu:max_cpu Range of number of virtual CPUs to select from for new VM instance (can omit max_cpu).
    -e, --env Environment variables that apply when the migrated job resumes. Use format env_key=env_value
    -f, --force Automatically answer "yes" at confirmation prompt
    --gpu-count min_gpu:max_gpu Range of number of GPUs to select from for new VM instance
    --gpu-mem min_gpumem:max_gpumem Range of memory size (in GB) of GPUs to select from for new VM instance
    --gpu-vendor gpu_vendor GPU vendor name (for example, nvidia or amd)
    -t, --instType instance_type VM instance type to migrate to (e.g., c5.xlarge in AWS). Do not combine with --cpu or --mem options.
    -j, --job job_id Job to migrate
    -m, --mem min_mem:max_mem Range of memory size (in GB) to select from for new VM instance (can omit max_mem).
    -P, --payType spot | ondemand Pricing tier for VM instance (Spot or On-demand)
    --sync Block all terminal input until job has migrated
    --rerun Ignore snapshot and restart job from the beginning on new VM instance (job must be running)
    -z, --zone availability_zone Availability zone in which to execute job

Examples

$ float migrate -t c5.xlarge -j NDt428IsJtJZsB9WNUhGH
$ float migrate -j lL07E84pQQpYqCQ88xeIQ
entity:
  id: i-0cfbd3f1f82087dd5
  type: host
  name: 192.168.0.2
status: normal
instanceType: c5.xlarge
startTime: "2022-09-23T15:09:09Z"
downTime: ""
$ float migrate -f --sync --payType ondemand -j tqrGc4Z6g18nkphzTeaxM
tqrGc4Z6g18nkphzTeaxM is now migrating...
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

tqrGc4Z6g18nkphzTeaxM has been migrated to 44.212.92.162 (4Core16GB/OnDemand).
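
The --cpu and --mem options take a min:max range in which the maximum can be omitted. As a sketch, the helper below formats such a range and rejects a minimum that exceeds the maximum, then prints a float migrate command without running it. The function name range_arg is hypothetical, the job ID is a sample value reused from the examples, and writing an omitted maximum as the bare minimum (for example, 16) is an assumption of this sketch.

```shell
# range_arg MIN [MAX]
# Print "MIN:MAX", or just "MIN" when MAX is omitted (assumed form).
range_arg() {
    min=$1
    max=${2:-}
    if [ -z "$max" ]; then
        printf '%s\n' "$min"
    elif [ "$min" -gt "$max" ]; then
        echo "minimum $min exceeds maximum $max" >&2
        return 1
    else
        printf '%s:%s\n' "$min" "$max"
    fi
}

job=NDt428IsJtJZsB9WNUhGH   # sample job ID reused from the examples
cpu=$(range_arg 4 8)        # 4 to 8 virtual CPUs
mem=$(range_arg 16)         # at least 16 GB, no maximum given
printf 'float migrate -j %s -c %s -m %s\n' "$job" "$cpu" "$mem"
```

Do not combine these ranges with --instType, which selects a specific VM instance type instead.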

float modify

Use the float modify command to change a subset of attributes associated with a running job. The command has no subcommands. Use the command with the -h flag to list the options.

Subcommands Usage Option Option Definition
  Modify certain attributes associated with running job --addCustomTag tagName:tagValue Tag name and value to associate with running job. Separate multiple tags with "," or include option multiple times to add multiple tags.
    --addSecurityGroup sec_group Security group (or tag in GCP) added to VM instance for this job. Include option multiple times to add multiple security groups.
    --errPolicy err_policy The policy to use if the job fails. The allowed policies are the following.
  • reclaimAll: change job state to failed and reclaim all cloud resources
  • restart: retain VM and local storage volumes (file systems) and automatically restart job from the beginning on same VM
  • retainVolumes: reclaim VM and retain local storage volumes (file systems); user must cancel job or resume job on new VM using float resume -j job_id
    --force Automatically answer "yes" at confirmation prompt
    -j, --job job_id Job to apply changes to
    -M, --migratePolicy migrate_policy New migrate policy to apply. See float submit for format.
    --rmSecurityGroup sec_group Security group (or tag in GCP) to remove from VM instance for this job. Include option multiple times to remove multiple security groups.
    --snapshotInterval snapshot_interval New periodic snapshot interval for this job. Use "disable" or "0" to turn off periodic snapshots.
    -V, --vmPolicy vm_policy New VM creation policy to apply. See float submit for format.
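
The --snapshotInterval value is a duration. As a rough sketch, assuming durations are written as numbers followed by h, m, or s (the style of the 10m, 120s, and 5m0s values listed under float submit), a small POSIX shell helper can convert a value to seconds before it is passed to float modify, for example to compare it with the 10m minimum given under float submit. The function name to_seconds is hypothetical and is not part of the float CLI.

```sh
# Convert a duration such as 1h30m, 5m0s, or 120s to seconds.
# "0" and "disable" (both turn periodic snapshots off) map to 0.
# Returns non-zero for anything that is not in h/m/s order.
to_seconds() {
  rest=$1
  total=0
  case $rest in
    0|disable) echo 0; return 0 ;;
    '') return 1 ;;
  esac
  for pair in h:3600 m:60 s:1; do
    unit=${pair%%:*}
    factor=${pair##*:}
    case $rest in
      *"$unit"*)
        num=${rest%%"$unit"*}
        rest=${rest#*"$unit"}
        case $num in
          ''|*[!0-9]*) return 1 ;;
        esac
        # Strip leading zeros so the shell does not treat the number as octal.
        while [ "${num#0}" != "$num" ] && [ -n "${num#0}" ]; do
          num=${num#0}
        done
        total=$(( $total + $num * $factor ))
        ;;
    esac
  done
  # Anything left over means the value was not a valid duration.
  [ -z "$rest" ] || return 1
  echo "$total"
}

to_seconds 1h30m   # prints 5400
```

A caller could then reject values below 600 seconds (10m) before running float modify --snapshotInterval.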

Example

$ float modify --vmPolicy [SpotOnly=true] -j PF0bgCvlpdJkog0RCBZPg
Warning: Are you sure you want to modify PF0bgCvlpdJkog0RCBZPg?
New vmPolicy may trigger migration.(yes/No): yes
Successfully modified PF0bgCvlpdJkog0RCBZPg:  --vmPolicy [SpotOnly=true]
$ float modify -j y78iefqeo8k6ng2rpvduj --addCustomTag run-name=task1
Successfully modified y78iefqeo8k6ng2rpvduj:  --addCustomTag run-name=task1
$ float show -j y78iefqeo8k6ng2rpvduj
...
customTags:
    run-name=task1: ""
...[edited]
$ float modify -j y78iefqeo8k6ng2rpvduj --addCustomTag project=RNA-sequencing,dept=Research
Successfully modified y78iefqeo8k6ng2rpvduj:  --addCustomTag project=RNA-sequencing,dept=Research
$ float modify -j y78iefqeo8k6ng2rpvduj --addCustomTag PI=jones --addCustomTag funding=NHS
Successfully modified y78iefqeo8k6ng2rpvduj:  --addCustomTag PI=jones --addCustomTag funding=NHS
$ float show -j y78iefqeo8k6ng2rpvduj
...
customTags:
    PI=jones: ""
    funding=NHS: ""
    project=RNA-sequencing,dept=Research: ""
    run-name=task1: ""
...[edited]
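
In the float show output above, the tags passed as a single comma-separated argument are listed as one entry, while the tags passed with repeated --addCustomTag options are listed separately. A minimal sketch of a helper that always emits one --addCustomTag option per tag follows. It only prints the command instead of running it, the function name modify_tags_cmd is hypothetical, and each tag string is passed through unchanged.

```sh
# Print a float modify command with one --addCustomTag option per tag.
# Usage: modify_tags_cmd JOB_ID TAG [TAG...]
modify_tags_cmd() {
  job_id=$1
  shift
  cmd="float modify -j $job_id"
  for tag in "$@"; do
    cmd="$cmd --addCustomTag $tag"
  done
  printf '%s\n' "$cmd"
}

modify_tags_cmd y78iefqeo8k6ng2rpvduj project=RNA-sequencing dept=Research
```

Review the printed command, then run it as shown in the examples above.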

float promote (boost)

Use the float promote (or float boost) command as an admin user to move a job in the "Submitted" state to the front of the scheduling queue. The command has no subcommands. Use the command with the -h flag to list the options.

Subcommands Usage Option Option Definition
  Promote a job in the "Submitted" state to the front of the scheduling queue. Must be admin user. -j, --job job_id Job to promote

float ps

Use the float ps command to show the complete process tree of a running job (the Linux command ps must be installed in the container). The command has no subcommands. Use the command with the -h flag to list the options.

Subcommands Usage Option Option Definition
  Show the complete process tree of job --args podman_args Arguments passed to podman
    -j, --jobId job_id Job to query

float quota-policy

Use the float quota-policy command to define and manage quota (SurfZone) policies. Use a subcommand with the -h flag to list the options.

Subcommands Usage Option Option Definition
add, create Create quota policy policy_name Name to associate with the quota policy
    --limit budget_limit Maximum amount (in $) that can be spent in one month
    --action cancel | suspend Action taken when quota limit reached (default: cancel)
    --auto-resume=true | false Action taken on suspended job when quota replenished (default: true)
    --threshold threshold Percentage of quota (budget) consumed to trigger alert (default: 80)
info, show Display quota policy details policy_id ID that identifies quota policy
list List available quota policies    
update, modify Update parameters in existing quota policy policy_id ID that identifies quota policy
    --name policy_name New name to associate with the quota policy
    --limit budget_limit Updated maximum amount (in $) that can be spent in one month
    --action cancel | suspend Updated action taken when quota limit reached
    --autoresume true | false Updated action taken on suspended job when quota replenished
    --threshold threshold Updated percentage of quota (budget) consumed to trigger alert
delete, rm, remove Delete a quota policy policy_id ID to identify quota policy

Examples

$ float quota-policy list
- id: u2w02ntshv61mxc7muaz0
  name: fruit
  metric: Cost
  overageAction: cancel
  autoResume: false
  threshold: 80%
  limit: 37
- id: 2f4qaesir4wxz71mazstq
  name: legume
  metric: Cost
  overageAction: suspend
  autoResume: true
  threshold: 80%
  limit: 35
$ float quota-policy update 2f4qaesir4wxz71mazstq --limit 50
id: 2f4qaesir4wxz71mazstq
name: legume
metric: Cost
overageAction: suspend
autoResume: true
threshold: 80%
limit: 50
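
To make the relationship between --limit and --threshold concrete, the following sketch computes the spend at which an alert would be expected. It assumes the alert triggers once spending reaches limit × threshold / 100, which is one reading of "percentage of quota (budget) consumed to trigger alert"; how OpCenter rounds the value is not stated here. The function name alert_amount is hypothetical.

```sh
# Print the spend (in whole dollars, rounded down) at which the alert would fire.
# Usage: alert_amount LIMIT [THRESHOLD_PERCENT]
alert_amount() {
  limit=$1
  threshold=${2:-80}   # 80 is the documented default threshold
  echo $(( $limit * $threshold / 100 ))
}

alert_amount 50 80   # prints 40
```

Shell arithmetic is integer only, so a limit of 37 with the default threshold prints 29 rather than 29.6.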

float release

Use the float release command to manage the OpCenter software. Use a subcommand with the -h flag to list the options.

Subcommands Usage Option Option Definition
info Display information regarding new features and bug fixes in OpCenter release -r, --release version The release to display information about
list, ls List available OpCenter releases    
sync Sync CLI version with OpCenter version    
upgrade Upgrade OpCenter software --force Automatically answer "yes" at confirmation prompt
    -r, --release version The release to upgrade to (default: "latest"). Only the admin user can upgrade software. Cannot upgrade using web CLI console.
    --sync Wait for upgrade to complete and then sync the CLI to the new release

Examples

$ float release ls
+----------+--------------------------------------+----------------------+-----------+
| VERSION  |               RELEASE                |     RELEASE TIME     |   SIZE    |
+----------+--------------------------------------+----------------------+-----------+
| * v2.5.0 | FLOAT_v2.5.0-171088e-HalfMoonBay.bin | 2024-02-09T03:31:15Z | 220.65 MB |
| v2.4.1   | FLOAT_v2.4.1-0803674-Goa.bin         | 2024-01-09T18:29:33Z | 219.32 MB |
+----------+--------------------------------------+----------------------+-----------+
$ float release sync
downloading ...
Progress: |>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>| 100.00% Complete (ETA. 0s)
The float binary is synced up with opcenter 

Note

If you place the float binary in a directory that is writable only by the root user, use sudo float release sync.

float report

Use the float report command to download reports from OpCenter. Use a subcommand with the -h flag to list the options.

Subcommands Usage Option Option Definition
download Download OpCenter usage history report --path /path/to/dir Path to save report on local computer
get Generate and display usage report with filter(s) applied report_name Name of report, e.g., usage_report_by_job
    -A, --all Compile report from all usage data retained by this OpCenter. If -A not used, default filter applied (different for each report).
    -d, --date date_string Date used to filter reports (default: current report)
    --filter filter Filter(s) to apply to usage report. Use option multiple times to apply multiple filters. See Working with OpCenter/Filters/Job Filters for details. Example: timeRange=2010-10-22~
    -l, --limit num_records Maximum number of records to display (default: 0 which means unlimited)
    -o, --orderBy field Field to order by. Use 'cost' or 'jobs' (default). The column entitled 'JOB COUNT' is used to order entries when 'jobs' specified.
    -r, --refresh Force the refresh of report
ls, list List all available usage reports    

Examples

$ float report get usage_report_by_user -A -f status=Cancelled
+-----------+-----+-----------+------------+--------------+----------------+--------------------+...
| USER NAME | FEE | JOB COUNT | WALL TIME  | COMPUTE TIME | SPOT INSTANCES | ONDEMAND INSTANCES |...
+-----------+-----+-----------+------------+--------------+----------------+--------------------+...
| admin     |   0 |        69 | 617h57m42s | 615h42m7s    |             47 |                 11 |...
| apple     |   0 |         1 | 6h33m59s   | 6h53m31s     |              8 |                  0 |...
  ...
(edited)
$ float report download --path ./temp.gzip
Downloaded to ./temp.gzip
$ file temp.gzip
temp.gzip: gzip compressed data, original size modulo 2^32 14336

float rerun (requeue, resubmit)

Use the float rerun (or requeue or resubmit) command to re-submit a completed job, a canceled job, or a job that failed to complete. The command has no subcommands. Use the command with the -h flag to list the options.

Subcommands Usage Option Option Definition
  Re-submit completed or failed job -j, --job job_id Job to re-submit

float restart (reboot)

Use the float restart (or reboot) command to restart OpCenter (terminates all login sessions). The command has no subcommands. Use the command with the -h flag to list the options.

Subcommands Usage Option Option Definition
  Restart OpCenter -f, --force Automatically answer "yes" at confirmation prompt

Example

$ float list
+-----+---------------------+...+-----------+----------------------+----------+------------+
| ID  |        NAME         |...|  STATUS   |     SUBMIT TIME      | DURATION |    COST    |
+-----+---------------------+...+-----------+----------------------+----------+------------+
| g...| tidyverse-t3.medium |...| Executing | 2023-12-11T23:22:17Z | 20m37s   | 0.0066 USD |
+---------------------------+...+-----------+----------------------+----------+------------+
$ float restart
Warning: There are running jobs in server, do you want to restart forcibly?
Some job may fail if you do that!(yes/No): yes
Warning: Are you sure you want to restart OpCenter? All active sessions will be inactivated.(yes/No): yes
OpCenter is now restarting and will be online soon.
Pause briefly and then log back in
$ float list
+-----+---------------------+...+-----------+----------------------+----------+------------+
| ID  |        NAME         |...|  STATUS   |     SUBMIT TIME      | DURATION |    COST    |
+-----+---------------------+...+-----------+----------------------+----------+------------+
| g...| tidyverse-t3.medium |...| Executing | 2023-12-11T23:22:17Z | 21m42s   | 0.0072 USD |
+---------------------------+...+-----------+----------------------+----------+------------+
(edited for clarity)

float resume (recover, restore)

Use the float resume (or recover or restore) command to resume a suspended job (see float suspend command). The command has no subcommands. Use the command with the -h flag to list the options.

Subcommands Usage Option Option Definition
  Resume a suspended job. If no options are used, the job resumes on an instance identical to the original. -c, --cpu min_cpu:max_cpu Range of number of virtual CPUs to select from for new VM instance (can omit max_cpu).
    -t, --instType instance_type Instance type for new VM (e.g., c4.xlarge in AWS). Do not combine with --cpu or --mem options.
    -j, --job job_id Job to resume
    -m, --mem min_mem:max_mem Range of memory size (in GB) to select from for new VM instance (can omit max_mem).
    -P, --payType spot | ondemand Pricing tier for VM instance (Spot or On-demand).

float sbatch (submit)

The float sbatch command is the same as float submit.

float scancel (cancel)

The float scancel command is the same as float cancel.

float secret

Use the float secret command to interact with the secret manager. Use the command with the -h flag to list the options.

Subcommands Usage Option Option Definition
get Retrieve secret from the secret manager secret_name Name associated with secret
ls, list List all secrets created by current user    
set, put Insert {name, value} pair into secret manager database secret_name secret_value {name, value} pair to insert
    --file /path/to/file File containing secret_value. Use instead of specifying secret_value on command line.
    -f, --force Overwrite existing secret value
unset, rm, del, remove, delete Delete {name, value} pair from secret manager database secret_name Name associated with {secret_name, secret_value} pair to delete

Example

$ float secret set s3access 1234567
Set s3access successfully
$ float secret ls
+----------+
|   NAME   |
+----------+
| s3access |
+----------+
$ float secret unset s3access
unset secret s3access successfully
$ cat value.txt
secret123
$ float secret set usersecret --file ./value.txt
Set usersecret successfully
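
The --file option keeps the secret value off the command line. A minimal sketch of a wrapper that writes the value to a private temporary file, calls float secret set with --file, and then removes the file follows. The function name set_secret_from_value and the FLOAT variable (used here only to substitute echo for a dry run) are hypothetical. The sketch writes a trailing newline, as a file created with echo would have; whether the secret manager keeps or strips that newline is not stated in this reference.

```sh
# Store a secret through a temporary file instead of a command-line argument.
# Usage: set_secret_from_value SECRET_NAME SECRET_VALUE
set_secret_from_value() {
  name=$1
  value=$2
  file=$(mktemp) || return 1      # mktemp creates a file readable only by the owner
  printf '%s\n' "$value" > "$file"
  ${FLOAT:-float} secret set "$name" --file "$file"
  status=$?
  rm -f "$file"                   # remove the temporary copy of the secret
  return "$status"
}

FLOAT=echo   # dry run: print the float arguments instead of calling the CLI
set_secret_from_value usersecret secret123
```

Unset FLOAT (or set it to the path of the float binary) to run the command for real.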

float show

Use the float show command to display the status of a job and content of job scripts. The command has no subcommands. Use the command with the -h flag to list the options.

Subcommands Usage Option Option Definition
  Show current status of job or VM instance, or show contents of scripts used for job --containerInitScript Display contents of container init script
    -c, --content Display contents of job script
    -i, --hid host_id VM instance to query
    --hostInitScript Display contents of host init script
    --hostTerminateScript Display contents of host terminate script
    -j, --job job_id Job to query

Example

$ float show -j FOIW5Y16KZgJ6Tsd02QuS
id: ctZLDo7OFG4BuJ8ytiTem
name: python-c5d.large
workingHost: 52.7.123.178 (2Core4GB/Spot)
user: admin
imageID: docker.io/bitnami/python:latest
imageDigest: sha256:24c1d45bf41c396184bd9808b307c67267a809754cd176ac8d91cceb47d0f3ef
output: |-
    Getting image source signatures
    Copying blob sha256:1acb894a7ceb1ba5362fb85123b0248a064ed3195aaa24af74e8cb710ca1c5a4
    Copying config sha256:23bacce690702dac91557ef74ab312cb3db5a2b4bb54ada968d1352e7d9a110a
    Writing manifest to image destination
    Storing signatures
    Loaded image: docker.io/bitnami/python:latest
    First submit job ctZLDo7OFG4BuJ8ytiTem, call podman directly
    No cmd args provided, launch job directly
    4eda6b0d471fb31eecfd06b59e1d8c527310861009fa4baf967d834397934bac
status: Executing
.....[output edited]

$ float show -c -j S0JPnWd3a2hQmVmVWCMWc
#!/usr/bin/bash
LOG_PATH=$1
LOG_FILE=$LOG_PATH/output
touch $LOG_FILE
exec >$LOG_FILE 2>&1
echo "Congratulations! You have submitted your first job"
for(( c=1; c<3; c++))
do
        if [[ $(($c % 3)) == 1 ]]; then
                echo "Hello World!"
        else
                echo "Your next job will be more interesting" >&2
        fi
        sleep 20s
done
echo "Job complete"

float sinfo (hosts)

The float sinfo command is the same as float hosts.

float snapshot

Use the float snapshot command to display information about snapshots. Use a subcommand with the -h flag to list the options.

Subcommands Usage Option Option Definition
list List available snapshots -A, --all Show all snapshots associated with job
    -f, --filter filter Filter to apply to snapshots, for example: status=normal. The default filter is active=true. Use option multiple times to apply multiple filters combined with "and" operator.
    -j, --jobID job_id Job to query
show, info Display information about snapshot -s, --snapID snapshot_id ID of snapshot to query

Example

$ float snapshot show -s jobSnap-Kp89d7qzJjiuhNeaOUfsA
id: jobSnap-Kp89d7qzJjiuhNeaOUfsA
jobID: h3xg4jpC8ydcgkVWBKj4S
volumeSnapshots:
    - zone: us-east-1b
      volumeId: vol-02786458cebdeca43
      volumeSize: 6
      status: completed
      snapshotId: snap-08e5b6ed8b7834b96
      createTime: "2023-04-18T15:51:17Z"
      mountPoint: /mnt/float-data
      cost: 0.0000 USD
    - zone: us-east-1b
      volumeId: vol-06dd42bf6b091466a
      volumeSize: 6
      status: completed
      snapshotId: snap-0f2833cdb639fa4b5
      createTime: "2023-04-18T15:51:17Z"
      mountPoint: /mnt/float-image
      cost: 0.0000 USD
    - zone: us-east-1b
      volumeId: vol-023c2e91e34f37335
      volumeSize: 10
      status: completed
      snapshotId: snap-025a733db664e9054
      createTime: "2023-04-18T15:51:17Z"
      mountPoint: /data
      cost: 0.0000 USD
status: normal
createTime: 2023-04-18T15:51:17.434161661Z

float squeue (list)

The float squeue command is the same as float list.

float status

Use the float status command to show current status of the OpCenter. The command has no subcommands. Use the command with the -h flag to list the options.

Example

$ float status
id: ac300c37-b0da-4c7b-8a50-d3fc49f6ad2a
Server Status: normal
API Request Status: normal
License Status: valid
Init Time: "2024-02-09T15:36:39Z"
Up Time: "2024-02-11T03:54:28Z"
Current Time: "2024-02-15T15:57:02Z"

float storage

Use the float storage command to "register" (that is, pre-configure) storage services for use when submitting jobs. Use a subcommand with the -h flag to list the options.

Subcommands Usage Option Option Definition
delete, rm, remove Delete a registered storage service storageID Identifier associated with the storage service to delete
info, show Display information about a registered storage service storageID Identifier associated with the storage service to query
list Display filtered list of registered storage services -f, --filter filter Filter to apply to registered storage services. Use option multiple times to apply multiple filters combined with "and" operator. Simple filters consist of an attribute, operator, and value. Supported attributes include:
  • id
  • name
  • storageType
  • url
  • zone
  • path
  • createdTime
    -o, --orderBy attribute Attribute used to order listing (prepend with "-" to reverse order). Use any filter attribute (default: createdTime). Create nested ordering by connecting multiple attributes with ",".
register volume, add volume Register an existing volume as a storage service -n, --name storage_name Name to associate with this storage service
    --permision normal | public Storage service access permission
  • normal (default): storage accessible by all users in group (editable by admin and owner)
  • public: storage accessible by all users (editable by admin and owner)
    --mode ro | rw Access mode
  • ro: read only
  • rw (default): read and write
    --id volume_id Volume ID that identifies volume (for example, an identifier like vol-0da677a1f4967350a in AWS)
    -m, --mountPoint path Path to directory where volume is mounted by container
register nfs, add nfs Register NFS-exported directory as a storage service -n, --name storage_name Name to associate with this storage service
    --permision normal | public Storage service access permission
  • normal (default): storage accessible by all users in group (editable by admin and owner)
  • public: storage accessible by all users (editable by admin and owner)
    --mode ro | rw Access mode
  • ro: read only
  • rw (default): read and write
    -m, --mountPoint path Path to directory where volume is mounted by container
    --url nfs://nfs_server_ip/exported_dir IP address of NFS server and exported directory, for example 192.168.1.1/home
register lustre, add lustre Register Lustre file system as a storage service -n, --name storage_name Name to associate with this storage service
    --permision normal | public Storage service access permission
  • normal (default): storage accessible by all users in group (editable by admin and owner)
  • public: storage accessible by all users (editable by admin and owner)
    --mode ro | rw Access mode
  • ro: read only
  • rw (default): read and write
    -m, --mountPoint path Path to directory where volume is mounted by container
    --options string Mount options for lustre file system, for example, [opts=rw, noexec, abort_recov, recovery_time_hard=120s]
    --url lustre://dns-name/mountname/exported_dir DNS name of lustre server, file system's mount name, and exported directory, respectively. The mount name is the name you assign the file system when configuring FSx for Lustre on AWS. Naming the file system makes it easier to find.
register s3, add s3 Register S3 bucket as a storage service -n, --name storage_name Name to associate with this storage service
    --permision normal | public Storage service access permission
  • normal (default): storage accessible by all users in group (editable by admin and owner)
  • public: storage accessible by all users (editable by admin and owner)
    --mode ro | rw Access mode
  • ro: read only
  • rw (default): read and write
    -m, --mountPoint path Path to directory where volume is mounted by container
    --bucket s3://bucket_name/folder S3 bucket name and folder (optional) to use for storage service
    -c, --credential [accessKey=key_value, secret=secret_value,] Access key string and access secret value, respectively
    --endpoint s3_endpoint Endpoint for S3 bucket, for example, s3.us-east-1.amazonaws.com
register gs, add gs Register Google Cloud Storage as a storage service -n, --name storage_name Name to associate with this storage service
    --permision normal | public Storage service access permission
  • normal (default): storage accessible by all users in group (editable by admin and owner)
  • public: storage accessible by all users (editable by admin and owner)
    --mode ro | rw Access mode
  • ro: read only
  • rw (default): read and write
    -m, --mountPoint path Path to directory where volume is mounted by container
    --bucket gs://bucket_name/folder Google Cloud Storage bucket name and folder (optional) to use for storage service
    -c, --credential [accessKey=key_value, secret=secret_value,] Access key string and access secret value, respectively
    --endpoint gs_endpoint Endpoint for GCS bucket, for example, storage.googleapis.com
register Register storage service using dataVolume format (instead of register volume|nfs|lustre|s3|gs). --dataVolume [string] Parameters that define the storage service. Use the following format.
  • volume (EBS): [mode=rw]volume-id[:/mountpoint] or [mode=rw]volume://volume-id[:/mountpoint]
  • NFS: [mode=rw]nfs://172.31.2.120/path[:/mountpoint]
  • S3: [accesskey=xxx,secret=yyy,endpoint=zzz,mode=rw]s3://bucketname/path[:/mountpoint]
  • GCS: [accesskey=xxx,secret=yyy,endpoint=zzz,mode=rw]gs://bucketname/path[:/mountpoint]
  • Lustre: [opts=xxx,yyy,zzz,mode=rw]lustre://dnsname/mountname[:/mountpoint]
update volume Update an existing volume as a storage service storageID Identifier associated with this storage service
    -n, --name storage_name Name to associate with this storage service
    --permision normal | public Storage service access permission
  • normal (default): storage accessible by all users in group (editable by admin and owner)
  • public: storage accessible by all users (editable by admin and owner)
    --mode ro | rw Access mode
  • ro: read only
  • rw (default): read and write
    --id volume_id Volume ID that identifies volume (for example, an identifier like vol-0da677a1f4967350a in AWS)
    -m, --mountPoint path Path to directory where volume is mounted by container
update nfs Update NFS-exported directory as a storage service storageID Identifier associated with this storage service
    -n, --name storage_name Name to associate with this storage service
    --permision normal | public Storage service access permission
  • normal (default): storage accessible by all users in group (editable by admin and owner)
  • public: storage accessible by all users (editable by admin and owner)
    --mode ro | rw Access mode
  • ro: read only
  • rw (default): read and write
    -m, --mountPoint path Path to directory where volume is mounted by container
    --url nfs://nfs_server_ip/exported_dir IP address of NFS server and exported directory, for example 192.168.1.1/home
update lustre Update Lustre file system as a storage service storageID Identifier associated with this storage service
    -n, --name storage_name Name to associate with this storage service
    --permision normal | public Storage service access permission
  • normal (default): storage accessible by all users in group (editable by admin and owner)
  • public: storage accessible by all users (editable by admin and owner)
    --mode ro | rw Access mode
  • ro: read only
  • rw (default): read and write
    -m, --mountPoint path Path to directory where volume is mounted by container
    --options string Mount options for lustre file system, for example, [opts=rw, noexec, abort_recov, recovery_time_hard=120s]
    --url lustre://dns-name/mountname/exported_dir DNS name of lustre server, file system's mount name, and exported directory, respectively. The mount name is the name you assign the file system when configuring FSx for Lustre on AWS. Naming the file system makes it easier to find.
update s3 Update S3 bucket as a storage service storageID Identifier associated with this storage service
    -n, --name storage_name Name to associate with this storage service
    --permision normal | public Storage service access permission
  • normal (default): storage accessible by all users in group (editable by admin and owner)
  • public: storage accessible by all users (editable by admin and owner)
    --mode ro | rw Access mode
  • ro: read only
  • rw (default): read and write
    -m, --mountPoint path Path to directory where volume is mounted by container
    --bucket s3://bucket_name/folder S3 bucket name and folder (optional) to use for storage service
    -c, --credential [accessKey=key_value, secret=secret_value,] Access key string and access secret value, respectively
    --endpoint s3_endpoint Endpoint for S3 bucket, for example, s3.us-east-1.amazonaws.com
update gs Update Google Cloud Storage as a storage service storageID Identifier associated with this storage service
    -n, --name storage_name Name to associate with this storage service
    --permision normal | public Storage service access permission
  • normal (default): storage accessible by all users in group (editable by admin and owner)
  • public: storage accessible by all users (editable by admin and owner)
    --mode ro | rw Access mode
  • ro: read only
  • rw (default): read and write
    -m, --mountPoint path Path to directory where volume is mounted by container
    --bucket gs://bucket_name/folder Google Cloud Storage bucket name and folder (optional) to use for storage service
    -c, --credential [accessKey=key_value, secret=secret_value,] Access key string and access secret value, respectively
    --endpoint gs_endpoint Endpoint for GCS bucket, for example, storage.googleapis.com
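
The dataVolume format used by the register subcommand packs the access mode, location, and mount point into one string. As an illustration, the following sketch assembles the NFS form, [mode=rw]nfs://172.31.2.120/path[:/mountpoint], from its parts. The function name nfs_data_volume is hypothetical, and the sketch only prints the string; it does not call float.

```sh
# Print an NFS dataVolume string: [mode=MODE]nfs://SERVER/EXPORT_DIR:/MOUNT_POINT
# Usage: nfs_data_volume SERVER EXPORT_DIR MOUNT_POINT [MODE]
nfs_data_volume() {
  server=$1
  export_dir=$2
  mount_point=$3
  mode=${4:-rw}   # rw is the documented default access mode
  printf '[mode=%s]nfs://%s%s:%s\n' "$mode" "$server" "$export_dir" "$mount_point"
}

nfs_data_volume 172.31.2.120 /path /mountpoint
# prints [mode=rw]nfs://172.31.2.120/path:/mountpoint
```

The printed string is the value for float storage register --dataVolume. Quote it on the command line, because square brackets are pattern characters in most shells.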

float submit

Use the float submit (or float sbatch) command to submit a job. The command has no subcommands. Use the command with the -h flag to list the options.

Subcommands Usage Option Option Definition
  Submit job for execution --allowList instance_type Allowed instance type(s). Use format like "c5*" which allows any VM type in the c5 family. Include --allowList multiple times to add other VM instance types to the allow list. Default is [ ] which allows all types.
    --bandwidth bandwidth Bandwidth (in Mbps) required for workload (only use in AliCloud)
    -A, --cmdArgs command_string Commands that are executed immediately after container starts (can be used with or without a job script)
    -c, --cpu min_cpu:max_cpu Range of number of virtual CPUs to select from to run job (can omit max_cpu)
    --cpuVendor cpu_vendor_name CPU vendor of allowed VM instances. Allowed values are amd, intel, or " ". Default is " " which allows any vendor. Use with AWS only.
    --customTag tagName:tagValue Tag customized by each MMCloud subscriber, for example, to generate customized usage reports. Multiple custom tags can be applied by using --customTag multiple times in a single command.
    -D, --dataVolume [size=vol_size,throughput=rate]:/data_dir or
    vol_id:/container_mnt_pt/path/to/dir or
    nfs://ip_address/server_export_dir:/container_mnt_pt/path/to/dir or
    [accesskey=x,secret=y,mode=rw]s3://bucketname/path:/container_mnt_pt/path/to/dir or
    [opts=xx,yy]lustre://dnsname/mountname/path:/container_mnt_pt/path/to/dir or
    [accesskey=xxx,secret=yyy]jfss3://bucketname:/container_mnt_pt/path/to/dir
    Data volume mounted by container. Use option multiple times to attach multiple data volumes.
  • (New EBS volume) vol_size in GB, rate in Mbps (can omit), data_dir is data directory created
  • (Existing EBS volume) vol_id specifies existing EBS volume, /container_mnt_pt/path/to/dir is path to where EBS volume is mounted on container
  • (NFS mount) ip_address is IP address of NFS server, /server_export_dir is directory exported by NFS server, and /container_mnt_pt/path/to/dir is path to where NFS directory is mounted on container
  • (S3|GCS|OSS bucket) bucketname is S3 bucket (access credentials optional if action allowed by IAM role). Replace s3 with gs for GCP or oss for AliCloud.
  • (Lustre FS) dnsname is the DNS name for the Lustre server and mountname is the name assigned to the Lustre file system
  • (JuiceFS on S3) bucketname is the name of the S3 bucket used as the physical data store for the JuiceFS file system
    -d, --def definition_file Path to file if using definition file (yaml or json format) to provide input parameters to submit command
    --denyList instance_type Excluded instance type(s). Use format like "c5*" which excludes all VM types in the c5 family. Include --denyList multiple times to exclude other VM instance types. Default is [ ] which does not exclude any types.
    --dumpMode full | incremental Setting for snapshot type. Full means complete memory snapshot taken every time. Incremental means only incremental changes captured after initial snapshot. Default is full.
    -e, --env env_key=env_value Environment variable setting for the job. Include --env multiple times to add multiple variables.
    -E, --errPolicy err_policy Policy (AWS only) to use if the job fails. The choices are:
  • reclaimAll: reclaim virtual machine and all EBS volumes (default)
  • retainVolumes: reclaim virtual machine and retain all EBS volumes. User must manually resume or cancel job. Retained EBS volumes mounted on new virtual machine.
  • restart: retain virtual machine and all EBS volumes; resubmit the job on same virtual machine with same EBS volumes (job may run without stopping if the application does not limit the number of restarts)
    --extraContainerOpts Extra options passed directly to container (enclose in quotes)
    -f, --force Automatically answer "yes" at confirmation prompt
    --gateway gateway_id ID of gateway to connect job to. Replacing gateway_id with the word "auto" directs the OpCenter to select a gateway automatically.
    --gpu-count min_gpu:max_gpu Range of number of GPUs to select from to run job
    --gpu-disable Disallow the use of GPUs for this job
    --gpu-mem min_gpu_mem:max_gpu_mem Range of size of GPU memory to select from to run job
    --gpu-name GPU_NAME Specific GPU to use for this job, for example, m60, h100, or a100.
    --gpu-vendor GPU_VENDOR Allowed GPU vendor to use for this job, for example, nvidia or amd.
    -i, --image image_name | image_URI Image name or image URI to pull container image for job
    --imageVolSize image_vol_size Size of volume (in GB) to act as root volume for image (default: 6)
    --imageVolType image_vol_type (AWS only) Type of EBS volume to use for image volume, for example, gp2 or gp3.
    -t, --instType instance_type VM instance type, for example, c5.4xlarge for AWS. Overrides --cpu or --mem options.
    -j, --job job_script Job script to run workload. Specify a path to a local file, an S3 file, an OSS file, or an http | https file.
    -k, --keepDump Save dump log during job migration
    --mem min_mem:max_mem Range of memory size (in GB) to select from to run job (can omit max_mem)
    --metricsInterval metrics_int Time in seconds between queries to obtain container metrics (default: 10s)
    -M, --migratePolicy [migrate_policy] Policy to determine auto-migration behavior. Format is [option1=value1,option2=value2...]. The available options with their default values are (if no units are attached, the value is a percentage):
  • disable=false
  • evadeOOM=true
  • stepAuto=true
  • cpu.disable=true
  • cpu.upperBoundRatio=90
  • cpu.lowerBoundRatio=5
  • cpu.upperBoundDuration=120s
  • cpu.lowerBoundDuration=5m0s
  • cpu.step=50
  • cpu.limit=0 (no limit)
  • cpu.lowerLimit=0 (no limit)
  • mem.disable=true
  • mem.upperBoundRatio=90
  • mem.lowerBoundRatio=5
  • mem.upperBoundDuration=120s
  • mem.lowerBoundDuration=5m0s
  • mem.step=50
  • mem.limit=0 (no limit)
  • mem.lowerLimit=0 (no limit)
    --miTag [tag1:value1 tag2:value2...] Tag(s) to select virtual machine image. Use option multiple times to submit multiple tag,value pairs.
    -n, --name job_name Name to associate with job
    --noPublicIP No public IP address assigned to container host. Ensure that host can reach AWS services. AliCloud users must include --endpoint option in their job scripts.
    -o, --output /path/to/dir Folder to save stdout and stderr as files with names: stdout.autosave.$jobid and stderr.autosave.$jobid. Options for path are:
  • nfs://nfs_server_ip/remote_folder
  • file:///local_folder
  • /local_folder (local_folder must be mounted by worker node)
    --outputFlag If included (set to true), job status is included in job output folder.
    -P, --publish host_port:container_port Rule for publishing container port to container host port, for example, 8080:80. Include option multiple times to publish multiple ports.
    --rootVolSize root_vol_size Root volume size in GB to load base OS (default: 40)
    --securityGroup sec_group AWS security group (or tag in GCP) added to VM instance for this job. Use option multiple times to apply multiple security groups.
    --shmSize shm_size Size of /dev/shm in format nu where n is a number and u is b, k, m, or g for bytes, KiB, MiB or GiB, respectively (default: 64m)
    --snapLocation local | s3://bucketname Location to save snapshot image and metadata. Choices are:
  • local: volume automatically mounted by OpCenter (EBS for AWS, Persistent Disk for Google Cloud, Block Storage for AliCloud)
  • s3://bucketname: JuiceFS with bucket type dependent on available cloud object storage service. Replace s3 with gs in GCP. Replace s3 with oss in AliCloud.
    --storage id | name:/container_mnt_pt/path/to/dir Registered storage service included with job. Specify storage service by name or ID. Specify the path where the storage service is mounted in the container.
    -I, --snapshotInterval snap_int Time between periodic snapshots in format XhYmZs, where X, Y, and Z are numbers and h, m, and s denote hours, minutes, and seconds, respectively (for example, 10m0s). Default is 0 (disable). Minimum is 10m.
    -s, --subnet subnet_ID AWS subnet ID in which to execute the job. Default is that the OpCenter automatically selects the subnet.
    --tag tag_name Tag to select image version for job. Default is "latest" or only tag.
    --targetPort port_num Port to connect job to on gateway. Use option multiple times to connect multiple ports.
    -T, --template template_name Template in format name:tag to use for submitting this job. Default tag is only tag or "latest".
    -l, --timeLimit max_time Maximum time that a job is allowed to run. Use format XhYmZs, where X, Y, and Z are numbers and h, m, and s denote hours, minutes, and seconds, respectively. Default is unlimited (use 0).
    -V, --vmPolicy [vm_policy] VM creation policy. Format is [key1=value1,key2=value2...]. The allowed keys (and values) are:
  • spotFirst (true | false): Try to start spot instance. If none available, start on-demand instance.
  • spotOnly (true | false): Try to start spot instance. If none available, stop.
  • onDemand (true | false): Start on-demand instance
  • retryLimit (n): Number of cycles of attempts to start spot instance
  • retryInterval (ns): Time to wait between cycles of attempts
  • priceLimit (number in $/hour): Maximum price to bid for spot instance
  • optimize (true | false): Migrate job from on-demand instance to spot instance when spot instances are available.
  • mode (loose | strict | same): CPU compatibility check when migrating to new compute instance. Loose: same model and same or newer version. Strict: CPU metadata compatible with new instance. Same: same model and same version.
  • maxSpotReclaim (integer): When this number of spot reclaim events reached, job migrates to on-demand instance
The default VM creation policy is:
  • spotFirst=true
  • retryLimit=3
  • optimize=true
  • retryInterval=10m0s
  • priceLimit: no upper bound on price imposed
  • mode=loose
  • maxSpotReclaim=0 (no limit on number of spot reclaim events)
    --withRoot Run job with root privileges
    -z, --zone availability_zone Availability zone in which to execute job
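
The -M, --migratePolicy and -V, --vmPolicy options above both take a bracketed, comma-separated list of key=value pairs. The following Python sketch shows one way such a string could be assembled from a dictionary. The helper name format_policy is hypothetical, and the sketch assumes that the square brackets are passed literally and that booleans are written in lowercase, as in the default values listed above.

```python
def format_policy(options):
    """Render a dict of policy options as a bracketed key=value list."""
    parts = []
    for key, value in options.items():
        # Assumption: booleans are written in lowercase (true/false),
        # matching the defaults shown in the option tables.
        if isinstance(value, bool):
            value = "true" if value else "false"
        parts.append(f"{key}={value}")
    # Assumption: the surrounding square brackets are part of the value.
    return "[" + ",".join(parts) + "]"


# Enable CPU-based auto-migration with a lower trigger threshold.
migrate_policy = format_policy(
    {"cpu.disable": False, "cpu.upperBoundRatio": 80, "cpu.step": 100}
)
# Use spot instances only, retrying five times at 300-second intervals.
vm_policy = format_policy(
    {"spotOnly": True, "retryLimit": 5, "retryInterval": "300s"}
)

print("--migratePolicy", migrate_policy)
print("--vmPolicy", vm_policy)
```

When such a string is pasted into a shell command line, quoting it (for example, with single quotes) keeps the shell from treating the brackets as a glob pattern.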

Examples

$ float submit -i tidyverse -j run_genericr.sh -c 2 -m 4 --dataVolume [size=10]:/data
id: pw6nupnmbej0qedlvz3dx
name: tidyverse
user: admin
imageID: docker.io/rocker/tidyverse:latest
status: Submitted
submitTime: "2023-12-13T19:08:27Z"
duration: 0s
queueTime: 0s
cost: 0.0000 USD
inputArgs: -i tidyverse -c 2 -m 4 --dataVolume [size=10]:/data
cpu: 2
memGB: 4
vmPolicy:
    policy: spotFirst
    retryLimit: 3
    optimize: true
    retryInterval: 10m0s
migratePolicy:
    evadeOOM: true
    stepAuto: true
    cpu:
        upperBoundRatio: 90
        lowerBoundRatio: 5
        upperBoundDuration: 2m0s
        lowerBoundDuration: 5m0s
        step: 50
        disable: true
    mem:
        upperBoundRatio: 90
        lowerBoundRatio: 5
        upperBoundDuration: 2m0s
        lowerBoundDuration: 5m0s
        step: 50
        disable: true
actualEmission: 0.0000 g
baselineEmission: 0.0000 g

$ float submit -i docker.io/centos -j helloworld.sh -c 2 -m 4 --dataVolume [size=10]:/data --name hw -o nfs://172.31.81.17/mnt/memverge/shared

Log in to NFS server and check that output files are populated.

$ ls -l /mnt/memverge/shared/*
-rw-r--r-- 1 5001 5001   0 Feb 15 20:02 /mnt/memverge/shared/hw.08ogctot3ufqnyf9k3wg8.stderr.autosave
-rw-r--r-- 1 5001 5001 116 Feb 15 20:02 /mnt/memverge/shared/hw.08ogctot3ufqnyf9k3wg8.stdout.autosave
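
When jobs are submitted from a script rather than typed by hand, the argument list can be assembled programmatically. The Python sketch below builds, but does not run, a float submit command using only options that appear in this section (-i, -j, -c, -m, --name, -o, -l). The helper names build_submit_args and format_duration are hypothetical, and the sketch assumes that -l accepts a value in hours/minutes/seconds form such as 1h30m0s.

```python
import shlex


def format_duration(total_seconds):
    """Format a number of seconds as <hours>h<minutes>m<seconds>s."""
    hours, rest = divmod(total_seconds, 3600)
    minutes, seconds = divmod(rest, 60)
    return f"{hours}h{minutes}m{seconds}s"


def build_submit_args(image, job_script, cpu, mem_gb,
                      name=None, output=None, time_limit_s=None):
    """Return the argument list for a float submit call (not executed here)."""
    args = ["float", "submit", "-i", image, "-j", job_script,
            "-c", str(cpu), "-m", str(mem_gb)]
    if name:
        args += ["--name", name]
    if output:
        args += ["-o", output]
    if time_limit_s:
        # Assumption: -l takes the hours/minutes/seconds form built above.
        args += ["-l", format_duration(time_limit_s)]
    return args


args = build_submit_args(
    "docker.io/centos", "helloworld.sh", 2, 4,
    name="hw",
    output="nfs://172.31.81.17/mnt/memverge/shared",
    time_limit_s=5400,
)
# shlex.join quotes any argument that needs it for safe use in a shell.
print(shlex.join(args))
```

Keeping the arguments as a list makes it possible to pass them directly to subprocess.run without involving a shell, which avoids quoting problems with values that contain brackets or spaces.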

float suspend (hibernate)

Use the float suspend (or float hibernate) command to temporarily suspend a job. In-memory state and relevant files are saved so that the job can resume executing at a later time. The command has no subcommands. Use the command with the -h flag to list the options.

Subcommands Usage Option Option Definition
  Suspend a running job --cold With this option, storage resources are reclaimed, which may save cost, but the suspend operation (and the associated resume operation) is slower. Use when suspending for an extended period.
    -f, --force Automatically answer "yes" to confirmation prompts
    -j, --job job_id ID of job to suspend

float template

Use the float template command to use a template or to display information about templates. Use a subcommand with the -h flag to list the options.

Subcommands Usage Option Option Definition
delete, rm, remove Delete a template -f Force deletion of template
    -T, --template name:tag Template name and tag. If tag is omitted, default is "latest".
deploy Submit a job using a template -c, --cpu min_cpu:max_cpu Range of number of virtual CPUs to select from to run job (can omit max_cpu, in which case max_cpu is set to twice the value of min_cpu).
    --gateway gw_id ID of gateway to connect job to (if applicable)
    -t, --instType instance_type VM instance type (e.g., c5.xlarge for AWS). If used, this option overrides -c and -m options.
    -m, --mem min_mem:max_mem Range of memory size (in GB) to select from to run job (can omit max_mem, in which case max_mem is set to twice the value of min_mem).
    --noPublicIP No public IP address assigned to container host. Make sure host is reachable from within the VPC.
    --securityGroup sec_group AWS security group (or tag in GCP) applied to VM instance for this job. Include option multiple times to apply multiple security groups (or tags).
    --targetPort port_number Port used to connect to job, for example, 8787 for RStudio. Include --targetPort multiple times to connect multiple ports from a single server to the gateway.
    -T, --template name:tag Template (in format name:tag) to use for submitting this job. Default tag is only tag or "latest".
info, show Display information about a template -T, --template name:tag Template (in format name:tag) to query. Default tag is only tag or "latest".
list, ls Display available templates. Templates labeled "official" are from the MemVerge template repository.    
save Save submit string from an earlier job as a template -f, --force Force template save
    -j, --job job_id ID of job whose submit string is saved as template
    --overwrite Overwrite existing template
    -T, --template name:tag Name and tag to save template as
sync Sync OpCenter template library with the MemVerge template repository    
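
Several subcommands above take a template reference in the form name:tag, with the tag optional. The following Python sketch splits such a reference into its parts before it is passed to -T. The function name split_template_ref and the template name rstudio are hypothetical. Because the default tag ("latest" or the only tag) is resolved by the OpCenter, the sketch returns None for an omitted tag rather than guessing it.

```python
def split_template_ref(ref):
    """Split 'name:tag' into (name, tag); tag is None when omitted."""
    name, sep, tag = ref.partition(":")
    if not name:
        raise ValueError(f"template name missing in {ref!r}")
    # An omitted or empty tag is left for the OpCenter to resolve.
    return name, (tag if sep and tag else None)


print(split_template_ref("rstudio:v2"))
print(split_template_ref("rstudio"))
```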

float top

Use the float top command to show a sorted list of information (including CPU and memory utilization) about current jobs. The display is updated at regular intervals. Enter q to stop the display. The command has no subcommands. Use the command with the -h flag to list the options.

Subcommands Usage Option Option Definition
  Show utilization and other information about running jobs. Display updates continually. Enter q to exit. --interval top_int Time in seconds (3 or greater) between queries to obtain container metrics (default: 3s). Format is ns where n is an integer.
    -j, --job_Id job_id Job to query
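
The --interval value must be an integer number of seconds, 3 or greater, followed by s. A minimal Python sketch of a check that a wrapper script might apply before calling float top is shown below. The function name format_top_interval is hypothetical.

```python
def format_top_interval(seconds):
    """Return an --interval value such as '5s', rejecting values below 3."""
    if not isinstance(seconds, int) or seconds < 3:
        raise ValueError("interval must be an integer of 3 seconds or more")
    return f"{seconds}s"


print("--interval", format_top_interval(5))
```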

float user

Use the float user command to manage OpCenter users. Use a subcommand with the -h flag to list the options.

Subcommands Usage Option Option Definition
add, create Add a new user new_username Username for new user
    --create Create group if the group does not exist (default: false)
    --email email_address Email address for new user
    --gid gid Group ID to use for group. If value not provided, gid automatically assigned.
    -g, --groups group_name Group associated with new user (default: no group). Use option multiple times to include multiple groups.
    --ldap Associate username with LDAP directory (default: false)
    --passwd password Password for new user (default: "memverge")
    --quota-policy policy_name Name of quota (SurfZone) policy to associate with this user
    --uid uint32 User Identifier (uid) for new user. For regular user accounts, uid normally starts at 1000.
delete, remove, rm Delete a user user_name | user_id Username or uid of user to delete
    -f, --force Remove all active tokens belonging to user and delete user immediately
disable Disable a user's account without deleting it user_name | user_id Username or id of user to disable
enable Enable a user's account user_name | user_id Username or id of user to enable
info, show Display information about user user_name | user_id Username or id of user to query
list, ls List groups that current user belongs to  
passwd Reset user's password user_name | user_id Username or id of user to update
    --passwd new_password New password for user
update Update information associated with a user user_name | user_id Username or id of user to update
    --email email_address New email address for user
    --gid gid New group ID to use for group.
    -g, --group group_name New group associated with user
    --name user_name New username for user
    --passwd password New password for user
    --quota-policy policy_name New quota (SurfZone) policy to associate with this user
    --uid uint32 New uid for user

Examples

$ float user add testcase2 --passwd secret123 --groups crops
username: testcase2
uid: 5007
gid: 5007
role: normal
group: crops
email: ""
type: builtin
enabled: true
ownGroup: ""

$ float group update crops --add testcase2 --admin

$ float user info testcase2
username: testcase2
uid: 5007
gid: 5007
role: normal
group: crops
email: ""
type: builtin
enabled: true
ownGroup: crops

$ float user update testcase2 --group default
username: testcase2
uid: 5007
gid: 5007
role: normal
group: default,crops
email: ""
type: builtin
enabled: true
ownGroup: crops

$ float user info user-five
username: user-five
uid: 2012
gid: 2010
role: normal
group: group-five
email: ""
type: ldap
enabled: true
ownGroup: ""
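
The outputs above are flat lists of key: value pairs, with multiple groups joined by commas. The Python sketch below reads one such record into a dictionary and splits the group list. The function name parse_flat_output is hypothetical, and the parser handles only single-level output like the user records shown here, not nested output such as the vmPolicy block of float submit. For anything more complex, requesting JSON with the -F json global flag and using a JSON parser is likely to be more robust.

```python
def parse_flat_output(text):
    """Parse single-level 'key: value' lines into a dict of strings."""
    record = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # skip blank lines and anything that is not a pair
        key, _, value = line.partition(":")
        # Drop surrounding whitespace and the quotes around empty strings.
        record[key.strip()] = value.strip().strip('"')
    return record


# Sample copied from the float user update output above.
sample = """\
username: testcase2
uid: 5007
gid: 5007
role: normal
group: default,crops
email: ""
type: builtin
enabled: true
ownGroup: crops
"""

user = parse_flat_output(sample)
groups = user["group"].split(",")
print(user["username"], groups)
```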

float version

Use the float version command to display the version of the float CLI client and the version of the OpCenter it is connected to. The command has no subcommands. Use the command with the -h flag to list the options.

Example

$ float version
float version: v3.0.0-69ce0c9-Imperia
OpCenter version: v3.0.0-69ce0c9-Imperia