CLI Command Reference
Use float CLI commands to interact with the OpCenter; for example, to submit and manage jobs.
Version
The MMCloud CLI commands described here are consistent with the OpCenter release shown in the table.
MMCloud CLI | OpCenter Release | Date |
---|---|---|
FLOAT_v3.1.0-0293166-Jericho.bin | FLOAT_v3.1.0-0293166-Jericho.bin | 2025-04-01T03:22:50Z |
Usage
Use float in the following format.
float [global_flags] [command] [subcommand] [options]
Global Flags
Use global flags with any float command or subcommand.
Flag | Usage | Definition |
---|---|---|
-a, --address ip_address | Connect to OpCenter server | IP address of OpCenter (default: localhost or last OpCenter IP address used) |
-F, --format json | yaml | table | Specify format for output | Output format (default: yaml) |
-h, --help | Display help | Help for MMCloud CLI |
--logLevel log_level | Specify log level | Log level (default: info) |
-p, --password password | Log in to OpCenter | Login password |
--scroll | Enable scroll mode for multiple page output | Enable navigation (up, down, left, right) for displays that span multiple pages |
-u, --username user_name | Log in to OpCenter | Login username |
-v, --verbose on | off | Turn verbose mode on or off | Verbose mode setting (default: off) |
Categories
The MMCloud CLI commands are grouped into categories as shown in the table. An alphabetical listing of the commands follows.
Category | Commands |
---|---|
Job Management | list, show, submit, cancel, migrate, modify, suspend, resume, snapshot, rerun, promote, hosts, cluster, project, queue, ptop |
Job Status Monitoring | log, ps, top, df, mi |
Authentication & Authorization | login, logout, secret, nis, ldap |
User Management | user, group, quota-policy |
OpCenter Core Services | image, gateway, template, config, license, report, storage, smtp, node |
OpCenter Management Operations | status, version, release, restart |
Other | queues, completion |
float cancel (scancel)
Use the float cancel (or float scancel) command to cancel a job. The command has no subcommands. Use the command with the -h flag to list the options.
Subcommands | Usage | Option | Option Definition |
---|---|---|---|
 | Cancel job | --filter filter | Filter to select jobs if --job not specified. A simple filter is [attribute][operator][value]. Use --filter multiple times to apply multiple filters combined with the "and" operator. Create complex filters by combining simple filters using parentheses and the operators "and" and "or". Use "?" to match a single character and "*" to match multiple characters. Values are strings, datetimes, or numbers. |
 | | -f, --force | Automatically answer "yes" at confirmation prompt |
 | | -j, --job job_id | Job to cancel if --filter not specified |
Example
$ float cancel -j ctZLDo7OFG4BuJ8ytiTem
Warning: Are you sure you want to cancel this job?
ID: ctZLDo7OFG4BuJ8ytiTem
Name: python-c5d.large
All the related resources will be released. (yes/No): y
Request to cancel ctZLDo7OFG4BuJ8ytiTem has been submitted
float cluster
Use the float cluster command to manage High Performance Computing (HPC) clusters. Use a subcommand with the -h flag to list the options.
Subcommands | Usage | Option | Option Definition |
add (create, new) | Add new HPC cluster | -n cluster_name | Name of new HPC cluster |
-I, --appImage image1,image2,... | Container image(s) allowed for use in the cluster | ||
-A, --auth auth_provider_id | Identifier of authorization provider for cluster | ||
--clusterStatusCheckInterval duration | Interval (in minutes and seconds) between checks whether the cluster has the appropriate number of nodes | ||
-D, --dataVolume [size=vol_size,throughput=rate]:/data_dir or [mode=rw]vol_id:/container_mnt_pt/path or [mode=rw]nfs://ip_address/server_export_dir:/container_mnt_pt/path or [opts=xx,yy]lustre://dnsname/mountname/path:/container_mnt_pt/path | Data volume mounted by container. Use option multiple times to attach multiple data volumes. | ||
--defaultQueue queue_name | Name of default queue | ||
--dns dns_ip | IP address of domain name server | ||
-H, --headNode stringArray | Configuration of head node in this format: "[mi=xxx,count=yyy,instType=zzz]" | ||
-m, --instType inst1,inst2,... | Allowed instance type(s) for any cluster nodes | ||
--jobCollectDeviation duration | Tolerance (in minutes and seconds) within which two timestamps are considered to be the same | ||
--jobCollectInterval duration | Interval (in minutes and seconds) between collection of information on jobs | ||
-l, --loginNode stringArray | Configuration of login node in this format: "[mi=xxx,count=yyy,instType=zzz,noPublicIP=true|false]" | ||
--metricsCollectInterval duration | Duration (in minutes and seconds) between collection of information regarding the operation of the cluster | ||
-M, --mi machine_id | Identifier of machine image to use in cluster nodes | ||
--noPublicIp | Flag to indicate that no public IP required for any cluster node. Confirm cloud services are accessible from these instances. | ||
--nodeReadyTimeout duration | Maximum time to wait for a cluster node to initialize | ||
--nodeThrottlingSize node_num | Maximum number of concurrent nodes to create to run one job | ||
--precheck | Run pre-check before adding node | ||
-Q, --qos cpu=cpu_num,mem=mem_num | CPU limit and memory limit (in GB) that each user can use for their jobs (default: no limits) | ||
-s, --scheduler scheduler_type | Scheduler type (currently, only slurm supported) | ||
-G, --securityGroup SG1,SG2,... | Security group(s) (or network tags in GCP) to add to the instance for the job | ||
--setupScript script_name | Setup script for cluster | ||
--smtp_id | SMTP server ID | ||
--storage storage_config | Storage configuration | ||
-N, --subnet subnet_id | Specify subnet (AWS) or vSwitch (AliCloud) to use (default: system chooses one for you) | ||
--timezone | Time zone of cluster (default "Local") | ||
addNode (createNode, newNode) | Add new node, with specified parameters, to HPC cluster | -i, --id cluster_id | ID of HPC cluster |
-n, --name cluster_name | Name of HPC cluster | ||
--noPublicIp | Flag to indicate that no public IP required for this node. Confirm cloud services are accessible from this instance. | ||
-s, --nodeSpec node_spec | Node specification | ||
-q, --queue queue_name | If adding a compute node, name of queue associated with this node. | ||
-r, --role node_type | Type of node to add (default: login). Options are login or compute. | ||
delete (del, rm, remove) | Delete an existing HPC cluster (by name or by ID) | -i, --id cluster_id | ID of HPC cluster to delete (if using cluster ID to identify cluster) |
-n, --name cluster_name | Name of HPC cluster to delete (if using cluster name to identify cluster) | ||
deleteNode (delNode, rmNode, removeNode) | Delete an existing node (by name or by ID) | -H, --host host_id1,host_id2,... | ID(s) of node(s) to delete |
-i, --id cluster_id | ID of HPC cluster that node(s) is(are) part of | ||
-n, --name cluster_name | Name of HPC cluster that node(s) is(are) part of | ||
get | Get information on an existing HPC cluster (by name or by ID) | -i, --id cluster_id | ID of HPC cluster to query (if using cluster ID to identify cluster) |
-n, --name cluster_name | Name of HPC cluster to query (if using cluster name to identify cluster) | ||
ls (list) | List available HPC clusters | -f, --filter filter_definition | Filter applied to list of all HPC clusters |
-s, --orderBy order_definition | Criterion for sorting filtered list | ||
modify (update) | Modify an HPC cluster | -i, --id cluster_id | ID of HPC cluster to modify |
-n, --name cluster_name | Name of HPC cluster to modify | ||
-o, --overwrite | Overwrite existing fields (default: append to existing fields) | ||
-r, --reconfigure | Modify configuration, then reconfigure cluster immediately | ||
Any flag from float cluster add | Modify any flag used with float cluster add by including the flag, and its new value, with float cluster modify | ||
nodes (nodels, nls) | List available nodes in an HPC cluster | -f, --filter filter_definition | Filter applied to list of all nodes in the cluster |
-i, --id cluster_id | ID of HPC cluster that nodes are part of | ||
-n, --name cluster_name | Name of HPC cluster that nodes are part of | ||
-s, --orderBy order_definition | Criterion for sorting filtered list | ||
queues (queuels, qls) | List available queues in an HPC cluster | -f, --filter filter_definition | Filter applied to list of all queues in the cluster |
-i, --id cluster_id | ID of HPC cluster that queues are part of | ||
-n, --name cluster_name | Name of HPC cluster that queues are part of | ||
-s, --orderBy order_definition | Criterion for sorting filtered list | ||
start (run, launch) | Start an HPC cluster | -i, --id cluster_id | ID of HPC cluster to start |
-n, --name cluster_name | Name of HPC cluster to start | ||
stop (shutdown, suspend) | Stop an HPC cluster | -i, --id cluster_id | ID of HPC cluster to stop |
-n, --name cluster_name | Name of HPC cluster to stop | ||
reconfigure (refresh, apply) | Reconfigure an HPC cluster | -i, --id cluster_id | ID of HPC cluster to reconfigure |
-n, --name cluster_name | Name of HPC cluster to reconfigure |
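Example
The following invocations are an illustrative sketch built only from the flags documented above; the cluster name tinycluster and queue name normal are placeholders, additional flags (for example, head-node and login-node configuration) may be required in practice, and output is omitted.
$ float cluster ls
$ float cluster add -n tinycluster -s slurm --defaultQueue normal
$ float cluster nodes -n tinycluster
$ float cluster stop -n tinycluster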
float completion
Use the float completion command to generate an auto-completion script for use in the current shell or in every shell started subsequently. Use a subcommand with the -h flag to display help on how to use the auto-completion script.
Subcommands | Usage | Option | Option Definition |
---|---|---|---|
bash | Generate auto-completion script for bash | --no-descriptions | Disable completion descriptions |
fish | Generate auto-completion script for fish | --no-descriptions | Disable completion descriptions |
powershell | Generate auto-completion script for powershell | --no-descriptions | Disable completion descriptions |
zsh | Generate auto-completion script for zsh | --no-descriptions | Disable completion descriptions |
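Example
A minimal sketch of enabling completion for the current bash session; the file path is a placeholder and you may prefer your distribution's bash-completion directory.
$ float completion bash > ~/.float-completion.bash
$ source ~/.float-completion.bash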
float config
Use the float config command to view or change the OpCenter configuration. Use a subcommand with the -h flag to list the options.
Subcommands | Usage | Option | Option Definition |
cert | Set server certificate and key (must supply both) | -c, --cert /path/to/cert | Path to certificate file |
-k, --key /path/to/key | Path to key file | ||
get | Show runtime value of a single OpCenter configuration parameter | config_parameter | Configuration parameter to display, for example, sessionTimeout |
ldap | Set values for LDAP configuration parameters | --addr ldap_server_address | IPv4 address of LDAP server |
--adminGroup ldap_admin_group | LDAP admin group | ||
--anonymous=true | false | LDAP anonymous bind (default false) | ||
--base base_DN | LDAP base Distinguished Name (DN) | ||
--bindDN bind_DN | LDAP bind DN | ||
--bindPW bind_pw | LDAP bind password | ||
--cert /path/to/cert | Path to LDAP certificate file | ||
--conf /path/to/conf_file | Path to LDAP configuration file | ||
--connTimeout duration | Duration (format: HhMmSs) until LDAP connection times out (default 10s) | ||
--enable=true | false | Enable LDAP (default true) | ||
--id ldap_id | Pre-configured LDAP configuration ID | ||
--key /path/to/key | Path to LDAP key file | ||
--network tcp | udp | LDAP connection protocol (default "tcp") | ||
-W, --passwordStdin | Flag to prompt user for bind password | ||
--peopleOU people_OU | LDAP people OU (default "People") | ||
--reset | Reset all LDAP parameters to empty (null) | ||
--tls=true | false | Use tls for LDAP connection (default: true) | ||
list , ls | List configuration parameters for OpCenter (shows if a parameter can be changed and if a change requires a system restart) | -s, --scope filter | Condition(s) to filter the list of configuration parameters (default: list all parameters) |
mset | Set runtime values of multiple configuration parameters | config_parm1=parm_value1 config_parm2=parm_value2... | Array of configuration parameters and the values to set them to. If the word "default" is used for `parm_value`, the configuration parameter is reset to its default value. |
set | Set runtime value of a single configuration parameter | config_parm parm_value | Configuration parameter and the value to set it to. If the word "default" is used for `parm_value`, the configuration parameter is reset to its default value. |
--file value.txt | Text file containing value for `config_parameter` (see example) |
Examples
$ float config set sessionTimeout 48h
sessionTimeout is set to 48h0m0s
$ float config set sessionTimeout default
sessionTimeout is set to 1h0m0s
$ float config ls --scope "session"
+----------------+----------+----------+--------------+
| KEY | VALUE | EDITABLE | NEED RESTART |
+----------------+----------+----------+--------------+
| sessionTTL | 168h0m0s | Y | |
| sessionTimeout | 1h0m0s | Y | |
+----------------+----------+----------+--------------+
$ echo -n 48h > session.txt
$ float config set sessionTimeout --file session.txt
The value of sessionTimeout is set to '48h0m0s'.
float df
Use the float df command to display the file systems mounted by an executing job (includes used and available disk space). The Linux df command must be installed in the container. This command has no subcommands. Use the command with the -h flag to list the options.
Subcommands | Usage | Option | Option Definition |
---|---|---|---|
 | Display mounted file systems and associated disk space | --args df_args | Arguments to pass to the df command |
 | | -j, --jobId job_id | Job to query |
Example
$ float df --args "-h" -j WIY92p0jWyaCMP0CNQYjC
Filesystem Size Used Avail Use% Mounted on
overlay 6.0G 1.1G 5.0G 17% /
/dev/nvme3n1 10G 105M 9.9G 2% /data
tmpfs 64M 0 64M 0% /dev
172.31.81.17:/mnt/memverge/slurm/work/nzG6oM5DoLCysXAkoZFCA/app 50G 20G 31G 39% /mmce
/dev/nvme2n1 6.0G 1.1G 5.0G 17% /etc/hosts
/dev/nvme0n1p2 40G 7.5G 33G 19% /opt/aws
shm 63M 0 63M 0% /dev/shm
devtmpfs 1.7G 0 1.7G 0% /proc/key
float gateway
Use the float gateway command to create and manage a reverse proxy server. Use a subcommand with the -h flag to list the options.
Subcommands | Usage | Option | Option Definition |
connect | Connect a running job to the gateway (reverse proxy) | -g, --gateway gw_id | ID of gateway to connect job to (format is g- followed by a fixed-length character string) |
-j,--job job_id | ID of job to connect to gateway | ||
--targetPort port_number | Port used to connect to job, for example, 8787 for RStudio. Include --targetPort multiple times to connect multiple ports from a single server to the gateway. | ||
create | Create a gateway | --bandwidth bw | Minimum gateway bandwidth in Mbps (default: 25). Only use with AliCloud. |
-c,--cpu min_cpu | Minimum number of virtual CPUs for gateway (default: 0) | ||
-t,--instType instance_type | VM instance type for gateway, for example, c5.xlarge for AWS. Do not combine with --cpu or --mem options. | ||
-m,--mem min_mem | Minimum memory capacity in GB for gateway (default: 0) | ||
-n,--name gw_name | Name to associate with gateway | ||
--noPublicIP | Create gateway with private IP address only. Ensure that gateway is reachable from the hosts that need to connect. | ||
--portRange min:max | Range of client-side ports opened on gateway (allowed range is between 1000 and 65535). | ||
--securityGroup sec_group | AWS security group (or tag in GCP) applied to gateway. Include option multiple times to apply multiple security groups or use comma-separated list. | ||
--subnet subnet_id | Subnet (aws) or vswitch (ali) in which to create gateway. Include option multiple times to apply multiple subnets or use a comma-separated list. Default: system chooses for you. | ||
-z, --zone availability_zone | Availability zone in which to create gateway | ||
destroy | Destroy a gateway | -g, --gateway gw_id | ID of gateway to destroy (format is g- followed by a fixed-length character string) |
-f, --force | Automatically answer "yes" at confirmation prompt | ||
disconnect | Disconnect a job from a gateway | -g, --gateway gw_id | ID of gateway to disconnect job from (format is g- followed by a fixed-length character string) |
-j, --job job_id | ID of job to disconnect from gateway | ||
--port port_num | Server-side port to disconnect from gateway. If gateway connects to multiple ports on server, include --port for each port or use comma-separated list. | ||
info | Display information about a gateway (including IP address and connected jobs) | -g, --gateway gw_id | ID of gateway to query (format is g- followed by a fixed-length character string) |
list | List all running gateways (optionally include stopped gateways) | -A, --showAll | Option to list stopped as well as running gateways (overrides filters) |
-f, --filter filter | Filter(s) to apply to list of gateways. Use option multiple times to apply multiple filters combined with "and" operator. See details in Working with OpCenter/Filters/Gateway Filters. | ||
-o, --orderBy attribute | Attribute used to order gateway listing (prepend with "-" to reverse order). Supported values are (default: lastUpdate): | ||
modify | Modify attributes of a gateway | --addSecurityGroup sec_group | AWS security group (or tag in GCP) added to gateway. Include option multiple times for multiple security groups or use comma-separated list. |
-g, --gateway gw_id | ID of gateway to modify | ||
--rmSecurityGroup sec_group | AWS security group (or tag in GCP) removed from gateway. Include option multiple times for multiple security groups or use comma-separated list |
Example
$ float gateway create -n NewGateway --securityGroup sg-0fbb6a83983183364 --portRange 10000:10500
id: g-wsfthf3z8zeb0cyoc6dsb
name: NewGateway
status: Creating
configuration: ""
IPAddress: ""
portRange: 10000-10500
startTime: "2024-01-01T16:33:20Z"
usedPorts: 0
cost: ""
clientJobs: {}
float group
Use the float group command to manage OpCenter user groups. Use a subcommand with the -h flag to list the options.
Subcommands | Usage | Option | Option Definition |
add , create | Create a new group | new_group | Name of new group |
--admin user1,user2,... | User(s) given admin role in new group | ||
--gid GID | Group ID for this group (default is next available gid) | ||
--user user1,user2,... | User(s) added to new group | ||
add --ldap | Associate an LDAP group with an LDAP directory | ldap_group_name | Name of group to associate with LDAP directory |
delete , remove , rm | Delete a group | group_name | group_id | Name or id of group to delete |
info , show | Display information about group including members | group_name | group_id | Name or id of group to query |
list , ls | List groups that current user belongs to | ||
update | Update attributes of a group | group_name | group_id | Name or id of group to update |
--add user1, user2,... | User(s) added to group | ||
--admin | Flag to indicate that any username listed after --add is given admin role in group and any username listed after --remove has admin role in group removed | ||
--gid GID | New group ID for this group | ||
--name group_name | New name for group | ||
--remove user1, user2,... | User(s) removed from group |
Examples
$ float group update crops --admin --remove barley --add wheat
name: crops
gid: 2007
admins: wheat
users: wheat,barley
type: builtin
float hosts (sinfo)
Use the float hosts (or sinfo) command to show details and current status of current (or all) worker nodes. The command has no subcommands. Use the command with the -h flag to list the options.
Subcommands | Usage | Option | Option Definition |
---|---|---|---|
 | Show current (or all) worker node details including status | -A, --all | Clear all filters to show all hosts |
 | | -f, --filter filter | Filter(s) to apply to jobs whose associated hosts are displayed. Use option multiple times to apply multiple filters combined by "and" operator. See details in Working with OpCenter/Filters/Host Filters. The default filter is -f "running=true or update<=1h". Example: -f status=executing -f timeRange=2010-10-22~ |
 | | -o, --orderBy attribute | Attribute used to order host listing (prepend with "-" to reverse order). Supported values are (default: start): |
 | | -w, --windowSize num_hosts | Number of hosts displayed before header row is repeated (default: 512) |
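Example
An illustrative query built from the filter and ordering values documented above; output is omitted.
$ float hosts -f status=executing -o start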
float image
Use the float image command to manage container images on OpCenter. Use a subcommand with the -h flag to list the options.
Subcommands | Usage | Option | Option Definition |
add | Add image pull information to OpCenter | image_name image_uri | Name to associate with image in the AppLibrary and URI to pull image from repository. Do not use image_uri if --link specified. |
--cachePath /path/to/cache | Location to cache image from a running job. Only valid if combined with -j flag identifying a running job. Choose a name different from image name in AppLibrary. Path choices are | ||
--import-all | Import all tags for this image from repository | ||
-j, --job job_id | ID of running job to identify container image | ||
--link file_link | Link to access AWS S3 or AliCloud OSS file, for example, s3://bucket_name/file_path. Do not use if image_uri specified. | ||
--token repo_access_token | Token to pull image from private repository | ||
--user repo_access_user | Username to pull image from private repository | ||
cache | Add (delete) image to (from) cache configured for OpCenter (NFS-mounted directory or S3 bucket) | image_name | Container image to add to or delete from cache. If cachePath flag not included, image is cached at location specified by image.cachePath in the OpCenter configuration. |
--cachePath /path/to/cache | Path to image cache. Choices are | ||
-d, --delete | If option included, delete image from cache | ||
-f, --force | Overwrite existing cached version of image | ||
--imageSize image_size | Size of image to cache | ||
--tag tag_name | Tag to select specific container image (default: "latest") | ||
delete , rm , remove | Delete container image(s) and one or more tags | image1 image2... or image1:tag1 image2:tag2... | Container image(s) to delete. If no tags are specified, then all tags are removed. |
-f, --force | Automatically answer "yes" at confirmation prompt | ||
--tag tag_name | Tag associated with image1 image2... to remove. If this is the only tag associated with images, images are removed. | ||
list , ls | Show images available on OpCenter | -f, --filter filter_definition | Filter applied to all images, for example, -f "name=ubuntu". See Working with OpCenter/Filters for details. |
tags , tag | Display tags associated with image and image status | image_name | Container image name |
update | Update information associated with image | image_name | Container image name |
--addtag image_tag1,image_tag2... | Additional tags to associate with image | ||
--name image_name | New name to identify image | ||
--token repo_access_token | New token required to access private repository | ||
--user repo_access_user | New username associated with token required to access private repository | ||
upload | Load image from local server | image_name | Name to identify image in App Library |
-i, --image local_name | Name of image in local repository (cannot use if --path included) | ||
--path /path/to/image | Path to where image is located (cannot use if --image included) |
Examples
$ float image list
+------------+-------------------------------+--------+-------------+
| NAME | URI | TAGS | ACCESS USER |
+------------+-------------------------------+--------+-------------+
| python | docker.io/bitnami/python | latest | |
| r-base | docker.io/rocker/r-base | latest | |
.....(edited)
+------------+-------------------------------+--------+-------------+
$ float image add r-base docker.io/rocker/r-base
name: r-base
uri: docker.io/rocker/r-base
owner: admin
tags:
latest:
status: Available
locked: false
lastUpdated: 2023-06-15T16:12:41.739947963Z
size: Unknown
$ float image add ubuntu docker.io/ubuntu --cachePath snapshot:// -j dit41mw0127mh5b4ozzmo
Error: Resource already exists, image already exists (code: 1018)
$ float image add ubu3 docker.io/ubuntu --cachePath snapshot:// -j dit41mw0127mh5b4ozzmo
name: ubu3
uri: docker.io/library/ubuntu
owner: admin
tags:
latest:
status: Ready
uri: snapshot://snap-0c2f373876759e470
locked: false
lastUpdated: 2025-04-23T15:51:38.874421771Z
lastPushed: 2025-04-23T15:51:38.874421871Z
size: 6.00 GB
$ float image cache blast
Request to cache image blast (tag: latest) has been submitted
$ float image cache blast --delete
Request to clean image blast (tag: latest) has been submitted
$ float image cache postgres --cachePath snapshot://
Error: Resource not found (code: 1002)
$ float image add postgres docker.io/postgres
name: postgres
uri: docker.io/library/postgres
owner: admin
tags:
latest:
status: Available
locked: false
lastUpdated: 2025-04-23T16:04:12.482960127Z
size: Unknown
$ float image cache postgres --cachePath snapshot://
Request to cache image postgres (tag: latest) has been submitted
$ float image tags blast
+--------------------------+--------+-----------+----------------------+
| URI | TAG | STATUS | LAST UPDATED |
+--------------------------+--------+-----------+----------------------+
| docker.io/memverge/blast | 2.14.0 | Available | 2023-06-26T05:15:02Z |
| | latest | Available | 2024-08-13T15:51:33Z |
$ float image upload test2 --path /tmp/hello-world-latest.tar
Start uploading image test2 (localhost/hello-world:latest) from /tmp/hello-world-latest.tar
Progress: |>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>| 100.00% Complete (ETA. 0s)
Uploaded image /tmp/hello-world-latest.tar, time spent: 0s
name: test2
uri: localhost/hello-world
owner: admin
tags:
latest:
status: Ready
uri: file:///mnt/memverge/images/test2-latest.tar
locked: false
lastUpdated: 2023-04-17T19:26:38.584557866Z
lastPushed: 2023-04-17T19:26:38.584557866Z
size: Unknown
$ float image upload testinggzupload --path ./testinggzupload.tar.gz
Start uploading image testinggzupload (docker.io/library/testinggzupload:latest) from ./testinggzupload.tar.gz
Progress: |>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>| 100.00% Complete (ETA. 0s)
Uploaded image ./testinggzupload.tar.gz, time spent: 31m20s
name: testinggzupload
uri: docker.io/library/testinggzupload
owner: admin
tags:
latest:
status: Ready
uri: s3://opcenter-bucket-594424d0-c760-11ee-8abd-128ad93e7ec9/images/ntltcroq2g.tar
locked: false
lastUpdated: 2024-04-11T15:39:29.686393328Z
lastPushed: 2024-04-11T15:39:29.686393428Z
size: 1.33 GB
Note
The user that queries the local repository for the IMAGE ID must be the same user that executes the float image upload command. In this example, the user is root.
$ sudo podman images
REPOSITORY TAG IMAGE ID CREATED SIZE
quay.io/podman/hello latest 39ae24b9cabf 3 days ago 1.7 MB
$ sudo /opt/memverge/bin/float image upload testimage --image 39ae24b9cabf
Found image quay.io/podman/hello:latest, using it as repository and tag
Using /bin/podman to save image quay.io/podman/hello:latest to testimage.tar
Start uploading image testimage (quay.io/podman/hello:latest) from testimage.tar
...[edited]
float ldap
Use the float ldap command to manage LDAP configurations on the OpCenter. Use a subcommand with the -h flag to list the options.
Subcommands | Usage | Option | Option Definition |
add | Add a new LDAP configuration | --addr ldap_addr | LDAP server IP address |
--adminGroup group_name | LDAP admin group | ||
--anonymous | Use anonymous bind | ||
--base base_DN | Base DN used in ldap search | ||
--bindDN bind_DN | LDAP bind DN for authentication | ||
--bindPW bind_PW | LDAP bind password for authentication | ||
--cert /path/to/cert_file | Path to LDAP certificate file | ||
--connTimeout duration | LDAP connection timeout (default: 10s) | ||
--groupOU group_OU | LDAP group OU | ||
--key /path/to/key_file | Path to key file | ||
-n, --name ldap_conf | Name of the LDAP configuration | ||
--network tcp | udp | Transport protocol to connect to LDAP server | ||
--passwordStdin | Prompt for bindPassword | ||
--peopleOU people_OU | LDAP people OU | ||
--precheck | Run pre-check before adding LDAP config | ||
--useTLS | Use TLS for transport layer security | ||
delete | Delete an LDAP configuration | -i, --id ldap_config_id | ID of LDAP configuration to delete |
get | Get an LDAP configuration | -i, --id ldap_config_id | ID of LDAP configuration to get |
list (ls) | List all LDAP configurations | ||
update | Update an LDAP configuration | --addr ldap_addr | LDAP server IP address |
--adminGroup group_name | LDAP admin group | ||
--anonymous | Use anonymous bind | ||
--base base_DN | Base DN used in ldap search | ||
--bindDN bind_DN | LDAP bind DN for authentication | ||
--bindPW bind_PW | LDAP bind password for authentication | ||
--cert /path/to/cert_file | Path to LDAP certificate file | ||
--connTimeout duration | LDAP connection timeout (default: 10s) | ||
--groupOU group_OU | LDAP group OU | ||
-i, --id ldap_config_id | ID of LDAP configuration to update | ||
--key /path/to/key_file | Path to key file | ||
-n, --name ldap_conf | Name of the LDAP configuration | ||
--network tcp | udp | Transport protocol to connect to LDAP server | ||
--passwordStdin | Prompt for bindPassword | ||
--peopleOU people_OU | LDAP people OU | ||
--precheck | Run pre-check before adding LDAP config | ||
--useTLS | Use TLS for transport layer security |
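Example
A hedged sketch of adding and listing an LDAP configuration; the configuration name, server address, base DN, and bind DN are placeholders, and output is omitted.
$ float ldap add -n corp-ldap --addr 10.0.0.5 --base "dc=example,dc=com" --bindDN "cn=admin,dc=example,dc=com" --passwordStdin
$ float ldap list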
float license
Use the float license command to manage OpCenter licenses. Use a subcommand with the -h flag to list the options.
Subcommands | Usage | Option | Option Definition |
---|---|---|---|
acquire | Acquire license from MMCloud Portal | -A, --account acct_name | Username (email address) to access the MMCloud Portal |
 | | -P, --password passwd | Password to access the MMCloud Portal |
info | Display license information and status | | |
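Example
An illustrative check of the installed license; the account and password used with acquire are placeholders, and output is omitted.
$ float license info
$ float license acquire -A user@example.com -P 'portal-password'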
float list (squeue, ls)
Use the float list (or squeue or ls) command to show a filtered list of queued jobs. The command has no subcommands. Use the command with the -h flag to list the options.
Subcommands | Usage | Option | Option Definition |
---|---|---|---|
 | Show filtered list of queued jobs (default: list jobs from oldest to newest) | -A, --all | Clear all filters to show all jobs |
 | | -f, --filter filter | Filter(s) to apply to list of queued jobs. Use option multiple times to apply multiple filters combined with "and" operator. See details in Working with OpCenter/Filters/Job Filters. Example: -f status=executing -f timeRange=2023-10-22~ -f tags=kind:training,project:finance-\* |
 | | -o, --orderBy attribute | Attribute used to order job listing (prepend with "-" to reverse order). Supported values are (default: start): |
 | | -w, --windowSize num_jobs | Number of jobs displayed before header row is repeated (default: 10000) |
Examples
$ float squeue -f "imageID=*blast*"
+-------+------------------+...+-------+-----------+----------+----------------------+------------+
| ID | NAME |...| USER | STATUS | DURATION | SUBMIT TIME | COST |
+-------+------------------+...+-------+-----------+----------+----------------------+------------+
| m8t...| blast-c5.9xlarge | | admin | Completed | 8h44m18s | 2024-02-14T04:07:38Z | 5.4851 USD |
| w70...| blast-c5.9xlarge | | bean | Completed | 9h3m16s | 2024-02-14T04:22:45Z | 5.3502 USD |
+-------+------------------+...+-------+-----------+----------+----------------------+------------+
...[edited for clarity]
$ float squeue -f "imageID=*blast*" -f user=bean
+-------+------------------+...+-------+-----------+----------+----------------------+------------+
| ID | NAME |...| USER | STATUS | DURATION | SUBMIT TIME | COST |
+-------+------------------+...+-------+-----------+----------+----------------------+------------+
| w70...| blast-c5.9xlarge | | bean | Completed | 9h3m16s | 2024-02-14T04:22:45Z | 5.3502 USD |
+-------+------------------+...+-------+-----------+----------+----------------------+------------+
...[edited for clarity]
$ float squeue -o name
+--------+----------------------+------------------+-------+-----------+--------------+----------+------------+
| ID | NAME | WORKING HOST | USER | STATUS | SUBMIT TIME | DURATION | COST |
+-------------------------------+------------------+-------+-----------+--------------+----------+------------+
| SZd... | apple | 54.163.154.116...| admin | Executing | ...19:11:47Z | 1h35m1s | 0.1143 USD |
| btC... | banana | 34.234.64.157... | admin | Executing | ...19:12:19Z | 1h34m30s | 0.0926 USD |
| hnP... | camel | | admin | Completed | ...20:43:01Z | 3m28s | 0.0031 USD |
| 5d1... | cherry | 54.89.203.124... | admin | Executing | ...19:12:27Z | 1h34m22s | 0.0924 USD |
| grd... | python-t3a.xlarge | 34.229.17.82... | admin | Completed | ...20:43:48Z | 2m51s | 0.0026 USD |
| nVD... | tidyverse-t3a.xlarge | 3.89.224.246... | admin | Executing | ...19:14:39Z | 1h32m10s | 0.0903 USD |
+--------+----------------------+------------------+-------+-----------+--------------+----------+------------+
...[edited for clarity]
$ float squeue -f "imageID=*python:latest and status=Executing"
+-----------------------+-------------------------+------------------------------+--------+-----------+...
| ID | NAME | WORKING HOST | USER | STATUS |...
+-----------------------+-------------------------+------------------------------+--------+-----------+...
| w2763vpgwuy9cksihxbcd | python-d72rur-t3.medium | 44.220.82.26 (2Core4GB/Spot) | banana | Executing |...
| q6r4ome7wyywk33b5jo9g | testpy | 3.239.38.179 (2Core4GB/Spot) | bean | Executing |...
+-----------------------+-------------------------+------------------------------+--------+-----------+...
...[edited for clarity]
$ float squeue -f "imageID=*python* or imageID=*tidyverse*"
+-----------------------+-------------------------+------------------------------+--------+-----------+...
| ID | NAME | WORKING HOST | USER | STATUS |...
+-----------------------+-------------------------+------------------------------+--------+-----------+...
| w2763vpgwuy9cksihxbcd | python-d72rur-t3.medium | 44.220.82.26 (2Core4GB/Spot) | banana | Executing |...
| q6r4ome7wyywk33b5jo9g | testpy | 3.239.38.179 (2Core4GB/Spot) | bean | Executing |...
| bxulaob02n1d91cmymvxk | Rtest | 44.212.52.61 (2Core4GB/Spot) | bean | Executing |...
+-----------------------+-------------------------+------------------------------+--------+-----------+...
...[edited for clarity]
float log
Use the float log command to view and manage log files. Use a subcommand with the -h flag to list the options.
Subcommands | Usage | Option | Option Definition |
cat | Write log file contents to standard output | log_file | Log file whose contents are displayed |
-c, --cluster cluster_id | Cluster whose logs are displayed | ||
-i, --hid host_id | Host whose logs are displayed (default: OpCenter server) | ||
-j, --job job_id | Job whose logs are displayed (default: OpCenter) | ||
download | Download a zip file of selected logs associated with job | -i, --include logs1,logs2,... | Logs included in zip file (default: "all"). Options are: |
-j, --job job_id | Job whose log files are included | ||
--path /path/to/dir | Path to save zip file (default "./") | ||
list , ls | List all logs associated with target | --All | Display all log files |
-c, --cluster cluster_id | Cluster whose log files are listed | ||
-i, --hid host_id | Host whose log files are listed (default: OpCenter server) | ||
-j, --job job_id | Job whose log files are listed (default: OpCenter) | ||
-H, --readable | Display log size in human-readable format | ||
rm | Remove all logs associated with target | -i, --hid host_id | Host whose log files are removed (default: OpCenter server) |
-j, --job job_id | Job whose log files are removed (default: OpCenter) | ||
tail | Write last n lines of log file to standard output | log_file | Log file to display |
-f, --follow | Display new lines as they are appended to log file | ||
-c, --cluster cluster_id | Cluster whose log file lines are displayed | ||
-i, --hid host_id | Host whose log file lines are displayed (default: OpCenter server) | ||
-j, --job job_id | Job whose log file lines are displayed (default: OpCenter) | ||
-n, --num n | Number of lines to display (default: 100) |
Examples
$ float log tail --follow output -j XGiUDRto7kwofWBNPkiW5
Ready to prepare source data
Ready to download pbmc_1k_v3_fastqs from s3
Ready to download refdata-gex-GRCh38-2020-A from s3
Ready to run test
...[output edited]
$ float log ls -H
+---------------------+---------------+----------------------+
| LOG NAME | READABLE SIZE | LAST UPDATE TIME |
+---------------------+---------------+----------------------+
| opcenter.access_log | 7.40 MB | 2024-02-14T19:02:02Z |
| opcenter.log | 712.73 KB | 2024-02-14T18:55:28Z |
| upgrade.log | 1.92 KB | 2024-02-11T03:54:27Z |
| messages | 220.56 KB | 2024-02-14T18:48:48Z |
+---------------------+---------------+----------------------+
float login
Use the float login command, with a valid username and password, to log in to OpCenter. The command has no subcommands. Use the command with the -h flag to list the options.
Subcommands | Usage | Option | Option Definition |
---|---|---|---|
 | Log in to OpCenter | --info | Display login status |
Examples
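A representative login using the global flags; the OpCenter address, username, and password are placeholders.
$ float login -a 192.0.2.10 -u admin -p 'opcenter-password'
$ float login --info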
float logout
Use the float logout command to log the current user out of the OpCenter and invalidate the authorization token. The command has no subcommands. Use the command with the -h flag to list the options.
Example
$ float logout
Logout Succeeded!
$ float login --info
Error: cannot find any login session (code: 2004)
float mi
Use the float mi command to manage virtual machine images. Use a subcommand with the -h flag to list the options.
Subcommands | Usage | Option | Option Definition |
---|---|---|---|
list | Display information on available machine images | --source filter | Filter to select which source of machine images to show (default: "all") |
update | Upload machine image file | -f, --file /path/to/file | Path to machine image file |
Example
$ float mi list --source "official"
aws:
us-east-1:
- id: ami-01a1370c28ed96480
oid: mi-xw3smh9ojzacdpko7uyf0
name: rocky_94-nvidia_550
description: 'os: Rocky9.4, gpu: NVIDIA 550'
tags:
gpu: NVIDIA
os: rocky-9.4
source: official
cloudMachineImage:
arch: x86_64
id: ami-01a1370c28ed96480
name: Rocky-9.4-x86_64-MMC-v3.0.0-NVIDIA_550
description: ""
tags: {}
updateTime: 2025-04-02T23:01:59.139806718Z
- id: ami-05fa59fc8129caa16
oid: mi-ny39fy3h6d53lec7ka3fe
name: rocky_94
description: 'os: Rocky9.4'
tags:
os: rocky-9.4
source: official
cloudMachineImage:
arch: x86_64
id: ami-05fa59fc8129caa16
name: FLOAT-release_v3.0-20240716014405-8a0e60e3-8e27-4221-8bee-69944e051d58
description: FLOAT-release_v3.0-20240716014405
tags: {}
updateTime: 2025-04-02T23:01:59.139806718Z
float migrate
Use the float migrate command to move a job from one VM instance to another VM instance of the same type or of a different type. The command has no subcommands. Use the command with the -h flag to display the options.
Subcommands | Usage | Option | Option Definition |
---|---|---|---|
 | Migrate job to a new VM instance. If no options used, migrate to identical instance in same availability zone. | -c, --cpu min_cpu:max_cpu | Range of number of virtual CPUs to select from for new VM instance. If max_cpu omitted, value of twice min_cpu used. |
 | | -e, --env | Environment variables that apply when the migrated job resumes. Use format env_key=env_value. Include secrets in this format: "SECRET={secret:TEST_SECRET}" |
 | | -f, --force | Automatically answer "yes" at confirmation prompt |
 | | --gpu-count min_gpu:max_gpu | Range of number of GPUs to select from for new VM instance |
 | | --gpu-mem min_gpumem:max_gpumem | Range of memory size (in GB) of GPUs to select from for new VM instance |
 | | --gpu-vendor gpu_vendor | GPU vendor name (for example, nvidia or amd) |
 | | -t, --instType instance_type | VM instance type to migrate to (e.g., c5.xlarge in AWS). Do not combine with --cpu or --mem options. |
 | | -j, --job job_id | Job to migrate |
 | | -m, --mem min_mem:max_mem | Range of memory size (in GB) to select from for new VM instance. If max_mem omitted, value of twice min_mem used. |
 | | -P, --payType spot | ondemand | Pricing tier for VM instance (Spot or On-demand) |
 | | --rerun | Ignore snapshot and restart job from the beginning on new VM instance (job must be running) |
 | | --sync | Block all terminal input until job has migrated |
 | | -z, --zone availability_zone | Availability zone in which to execute job |
Examples
$ float migrate -j lL07E84pQQpYqCQ88xeIQ
entity:
id: i-0cfbd3f1f82087dd5
type: host
name: 192.168.0.2
status: normal
instanceType: c5.xlarge
startTime: "2022-09-23T15:09:09Z"
downTime: ""
$ float migrate -f --sync --payType ondemand -j tqrGc4Z6g18nkphzTeaxM
tqrGc4Z6g18nkphzTeaxM is now migrating...
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
tqrGc4Z6g18nkphzTeaxM has been migrated to 44.212.92.162 (4Core16GB/OnDemand).
float modify
Use the float modify command to change a subset of attributes associated with a running job. The command has no subcommands. Use the command with the -h flag to list the options.
Subcommands | Usage | Option | Option Definition |
---|---|---|---|
 | Modify certain attributes associated with running job | --addCustomTag tagName:tagValue | Tag name and value to associate with running job. Separate multiple tags with "," or include option multiple times to add multiple tags. |
 | | --addSecurityGroup sec_group | Security group (or tag in GCP) added to VM instance for this job. Include option multiple times to add multiple security groups. |
 | | --errPolicy err_policy | The policy to use if the job fails. The allowed policies are the following. |
 | | --force | Automatically answer "yes" at confirmation prompt |
 | | -j, --job job_id | Job to apply changes to |
 | | -M, --migratePolicy migrate_policy | New migrate policy to apply. See float submit for format. |
 | | -m, --monthly | Turn monthly subscription on (or off). Only used for Alicloud and Tencent cloud. |
 | | -P, --period | Duration of monthly subscription in months (default: 1). Only used for Alicloud and Tencent cloud. |
 | | --rmSecurityGroup sec_group | Security group (or tag in GCP) to remove from VM instance for this job. Include option multiple times to remove multiple security groups. |
 | | --snapshotInterval snapshot_interval | New periodic snapshot interval for this job. Use "disable" or "0" to turn off periodic snapshots. |
 | | -V, --vmPolicy vm_policy | New VM creation policy to apply. See float submit for format. |
Example
$ float modify --vmPolicy [SpotOnly=true] -j PF0bgCvlpdJkog0RCBZPg
Warning: Are you sure you want to modify PF0bgCvlpdJkog0RCBZPg?
New vmPolicy may trigger migration.(yes/No): yes
Successfully modified PF0bgCvlpdJkog0RCBZPg: --vmPolicy [SpotOnly=true]
$ float modify -j y78iefqeo8k6ng2rpvduj --addCustomTag run-name=task1
Successfully modified y78iefqeo8k6ng2rpvduj: --addCustomTag run-name=task1
$ float show -j y78iefqeo8k6ng2rpvduj
...
customTags:
run-name=task1: ""
...[edited]
$ float modify -j y78iefqeo8k6ng2rpvduj --addCustomTag project=RNA-sequencing,dept=Research
Successfully modified y78iefqeo8k6ng2rpvduj: --addCustomTag project=RNA-sequencing,dept=Research
$ float modify -j y78iefqeo8k6ng2rpvduj --addCustomTag PI=jones --addCustomTag funding=NHS
Successfully modified y78iefqeo8k6ng2rpvduj: --addCustomTag PI=jones --addCustomTag funding=NHS
$ float show -j y78iefqeo8k6ng2rpvduj
...
customTags:
PI=jones: ""
funding=NHS: ""
project=RNA-sequencing,dept=Research: ""
run-name=task1: ""
...[edited]
float nis
Use the float nis command to manage NIS configurations. Use a subcommand with the -h flag to list the options.
Subcommands | Usage | Option | Option Definition |
---|---|---|---|
add | Add a new NIS configuration | -n, --name nis_config | Name to associate with the NIS config |
 | | -D, --domain nis_domain | NIS domain |
 | | --precheck | Run precheck before adding NIS configuration |
 | | -S, --server nis_addr | IP address of NIS server |
delete | Delete an NIS configuration | nis_id | ID that identifies NIS configuration to delete |
get | Get NIS configuration | nis_id | ID that identifies NIS configuration to get |
list, ls | List all NIS configurations | | |
update | Update an NIS configuration | -i, --id nis_id | ID to identify NIS config |
 | | -D, --domain nis_domain | NIS domain |
 | | -n, --name nis_name | Name to identify NIS config |
 | | -S, --server nis_ip_addr | IP address of NIS server |
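Example
An illustrative sketch; the configuration name, NIS domain, and server address are placeholders, and output is omitted.
$ float nis add -n corp-nis -D example.nis.domain -S 10.0.0.6
$ float nis list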
float node
Use the float node command to create and manage nodes (host or cloud instance). Use a subcommand with the -h flag to list the options.
Subcommands | Usage | Option | Option Definition |
create | Create a new node (host or cloud instance) | --cphLimit limit | Maximum allowed cost per hour for this node |
-c, --cpu min_cpu:max_cpu | Range of number of virtual CPUs to select from for new node. If max_cpu omitted, value of twice min_cpu used. | ||
--imageID mi_id | Machine image ID | ||
-t, --instType instance_type | VM instance type (e.g., c5.xlarge in AWS). Do not combine with --cpu or --mem options. | ||
--maxCpuCores max_cpu | Maximum number of CPU cores | ||
--maxMemGB max_mem | Maximum memory capacity in GB | ||
-m, --mem min_mem:max_mem | Range of memory capacity to select from for new node. If max_mem omitted, value of twice min_mem used. | ||
--miTag mi_tag1,mi_tag2,... | Machine image tag(s) required for node | ||
--minCpuCores min_cores | Minimum number of CPU cores (default: 1) | ||
--minCudaCores min_cuda | Minimum number of Cuda cores | ||
--minGpuMemoryGB min_gmem | Minimum GPU memory capacity in GB | ||
--minGpuCores min_gpu | Minimum number of GPU cores | ||
--minMemGB min_cmem | Minimum CPU memory capacity in GB | ||
--name node_name | Name of node (optional) | ||
--onDemand | Create on-demand instance | ||
--onDemandCphLimit od_limit | Maximum allowed cost per hour for on-demand instance | ||
--optional-miTag mi_tag1,mi_tag2, ... | Optional machine image tags for this node | ||
--region region | Region to create node in | ||
--rootVolSizeGB root_vol | Minimum size of root volume in GB (default: 32) | ||
--spotCphLimit sp_limit | Maximum allowed cost per hour for spot instance | ||
--spotFirst | Create a spot instance if possible (default: true) | ||
--spotOnly | Create a spot instance only | ||
--zone | Availability zone in which to create node | ||
delete (del, destroy, rm) | Delete node by name | node_name | Name of node to delete |
list | List the managed worker nodes | -f, --filter filter | Filter(s) to apply to the list of nodes. Format is [attribute][operator][value]. Use option multiple times to apply multiple filters combined with "and" operator. See details in Working with OpCenter/Filters/Job Filters. Supported attributes are: |
-l, --limit num_results | Limit to the number of results displayed (0 means no limit) | ||
-o, --orderBy attribute | Attribute used to order node listing (prepend with "-" to reverse order). Nest attributes by separating with a comma. Supported values are: | ||
-s, --pos num_pos | Start position of the display of the results | ||
info | Show detailed information about a node | node_id | Node ID |
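Example
A hedged sketch of creating, listing, and deleting a worker node; the node name and the CPU and memory ranges are placeholders, and output is omitted.
$ float node create --name worker-1 -c 4:8 -m 16:32 --spotFirst
$ float node list
$ float node delete worker-1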
float project
Use the float project command to create and manage projects. Use a subcommand with the -h flag to list the options.
Subcommands | Usage | Option | Option Definition |
add (create, new) | Create new project with specified parameters | --dataVolume stringArray | Data volume(s) to run job. See float submit for syntax. |
-m, --instType instance_type | VM instance type to run job (e.g., c5.xlarge in AWS). Do not combine with --cpu or --mem options. | ||
-I, --maxIdleTime duration | Maximum idle time for nodes in the project | ||
-M, --maxNodes integer | Maximum number of nodes in project | ||
-mi mi_id | ID of machine image to use in project | ||
-n, --name string | Name of project | ||
--precheck | Run check before adding project | ||
-s, --subnet net1,net2,... | Subnet(s) in AWS or vSwitch in Alicloud to create nodes in. | ||
-t, --tag key1=value1,key2=value2,... | Tag(s) to associate with project | ||
-V, --vmPolicy stringArray | VM creation policy. See float submit for syntax. | ||
delete (del, remove, rm) | Delete project by name or ID | -i, --id project_id | ID of project to delete |
-n, --name project_name | Name of project to delete | ||
get | Get properties of project | -i, --id project_id | ID of project to query |
-n, --name project_name | Name of project to query | ||
jobs (jl, joblist, jobls, jls) | Display jobs in project | -i, --id project_id | ID of project to query |
-n, --name project_name | Name of project to query | ||
list (ls) | List all projects | ||
modify (update) | Modify project parameters | --dataVolume stringArray | Data volume(s) to run job. See float submit for syntax. |
-i, --id project_id | ID of project to modify | ||
-m, --instType instance_type | VM instance type to run job (e.g., c5.xlarge in AWS). Do not combine with --cpu or --mem options. | ||
-I, --maxIdleTime duration | Maximum idle time for nodes in the project | ||
-M, --maxNodes integer | Maximum number of nodes in project | ||
-mi mi_id | ID of machine image to use in project | ||
-n, --name string | Name of project | ||
-o, --overwrite | Overwrite existing parameter values | ||
-s, --status | New status of project | ||
-N, --subnet net1,net2,... | Subnet(s) in AWS or vSwitch in Alicloud to create nodes in. | ||
-t, --tag key1=value1,key2=value2,... | Tag(s) to associate with project |
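Example
An illustrative sketch; the project name, instance type, and node limit are placeholders, and output is omitted.
$ float project add -n rna-seq -m c5.xlarge -M 4
$ float project list
$ float project jobs -n rna-seq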
float promote (boost)
Use the float promote or float boost command (as an admin user) to move a job in the "Submitted" state to the front of the scheduling queue. The command has no subcommands. Use the command with the -h flag to list the options.
Subcommands | Usage | Option | Option Definition |
---|---|---|---|
 | Promote a job in the "Submitted" state to the front of the scheduling queue. Must be admin user. | -j, --job job_id | Job to promote |
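Example
A representative promotion of a queued job, where job_id is the ID of a job in the "Submitted" state; output is omitted.
$ float promote -j job_id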
float ps
Use the float ps command to show the complete process tree of a running job (the Linux ps command must be installed in the container). The command has no subcommands. Use the command with the -h flag to list the options.
Subcommands | Usage | Option | Option Definition |
---|---|---|---|
 | Show the complete process tree of job | --args podman_args | Arguments passed to podman |
 | | -j, --jobId job_id | Job to query |
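Example
An illustrative query, where job_id is the ID of a running job whose container includes the ps command; output is omitted.
$ float ps -j job_id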
float ptop (qtop)
Use the float ptop (or float qtop) command to show the top of the pending jobs queue visible to the current user. The command has no subcommands. Use the command with the -h flag to list the options.
Subcommands | Usage | Option | Option Definition |
---|---|---|---|
 | Show top of pending jobs queue visible to the current user (index indicates the order of the job in the pending queue) | -l, --limit integer | Maximum number of pending jobs to return (default: 32) |
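Example
A representative query that limits the display to the first ten pending jobs; output is omitted.
$ float ptop -l 10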
float queue
Use the float queue command to manage queues in an HPC cluster. Use a subcommand with the -h flag to list the options.
Subcommands | Usage | Option | Option Definition | |
add, create, new | Add a new queue to an existing cluster | -c, --cluster cluster_name | Cluster name | |
-D, --dataVolume stringArray | Data volume for jobs in the queue. See float submit for syntax. | |||
-H, --hyperThread int | Hyperthread setting for queue. 0 means default, 1 means disable, 1 means enable | |||
-m, --instType i_type1,i_type2,... | VM instance type(s) (e.g., c5.xlarge in AWS) | |||
-I, --maxIdleTime duration | Maximum idle time for nodes in the project | |||
-M, --maxNodes max_nodes | Maximum number of nodes for the queue | |||
--imageID mi_id | Machine image ID for jobs in the queue | |||
-n, --name queue_name | Name of queue | |||
--precheck | Run check before adding queue | |||
-Q, --qos conf1,conf2,... | QoS configuration(s) for the jobs in the queue | |||
--reconfigure | Reconfigure queue after modifying | |||
--storage storage_def | Storage for the jobs in the queue. See float storage for syntax. | |||
-s, --subnet net1,net2,... | Subnet(s) in AWS or vSwitch in Alicloud for jobs in the queue | |||
--useClusterSg | Use cluster security groups directly | |||
--useClusterStorage | Use cluster storage directly | |||
--useClusterSubnet | Use cluster subnets directly | |||
-V, --vmPolicy policy_def | VM creation policy for jobs in queue. See float submit for syntax. | |||
delete (del, rm, remove) | Delete a queue from an HPC cluster. Identify queue by name or ID. | -c, --clusterID cluster_id | ID of HPC cluster | |
-i, --id queue_id | ID of queue to delete | |||
-n, --name queue_name | Name of queue to delete | |||
get | Get properties of a queue. Identify queue by name or ID. | -c, --clusterID cluster_id | ID of HPC cluster | |
-i, --id queue_id | ID of queue to get | |||
-n, --name queue_name | Name of queue to get | |||
jobs (jl, joblist, jobls, jls) | List all jobs in a queue | -c, --clusterID cluster_id | ID of HPC cluster | |
-f, --filter filter_def | Filter to apply to job listing | |||
-i, --id queue_id | ID of queue whose jobs are listed | |||
-n, --name queue_name | Name of queue whose jobs are listed | |||
-s, --orderBy order_def | Criteria to order job listing | |||
-v, --verbose | Show extended information of jobs in queue | |||
ls | List available queues | -f, --filter filter_def | Filter to apply to queue listing | |
-s, --orderBy order_def | Criteria to order job listing | |||
-v, --verbose | Show extended information of queues | |||
modify (update) | Modify queue in an existing cluster | -c, --cluster cluster_name | Cluster name | |
-D, --dataVolume stringArray | Data volume for jobs in the queue. See float submit for syntax. | |||
-H, --hyperThread int | Hyperthread setting for queue. 0 means default, 1 means disable, 1 means enable | |||
-i, --id queue_id | ID of queue to modify | |||
-m, --instType i_type1,i_type2,... | VM instance type(s) (e.g., c5.xlarge in AWS) | |||
-I, --maxIdleTime duration | Maximum idle time for nodes in the project | |||
-M, --maxNodes max_nodes | Maximum number of nodes for the queue | |||
--imageID mi_id | Machine image ID for jobs in the queue | |||
-n, --name string | Name of queue | |||
--precheck | Run check before adding queue | |||
-Q, --qos conf1,conf2,... | QoS configuration(s) for the jobs in the queue | |||
--reconfigure | Reconfigure queue after modifying | |||
--storage storage_def | Storage for the jobs in the queue. See float storage for syntax. | |||
-s, --subnet net1,net2,... | Subnet(s) in AWS or vSwitch in Alicloud for jobs in the queue | |||
--useClusterSg | Use cluster security groups directly | |||
--useClusterStorage | Use cluster storage directly | |||
--useClusterSubnet | Use cluster subnets directly | |||
-V, --vmPolicy policy_def | VM creation policy for jobs in queue. See float submit for syntax. |
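Example
A hedged sketch of adding and inspecting a queue; the cluster name tinycluster and queue name normal are placeholders, and output is omitted.
$ float queue add -c tinycluster -n normal -m c5.xlarge --useClusterSubnet
$ float queue ls
$ float queue jobs -n normal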
float queues
Use the float queues command to get information about queues in an HPC cluster. The command has no subcommands. Use the -h flag to list the options.
Subcommands | Usage | Option | Option Definition |
---|---|---|---|
 | List available queues | -f, --filter filter_def | Filter to apply to queue listing |
 | | -s, --orderBy order_def | Criteria to order queue listing |
 | | -v, --verbose | Show extended information of queues |
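Example
An illustrative listing of all queues with extended information; add -f to narrow the listing. Output is omitted.
$ float queues -v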
float quota-policy
Use the float quota-policy command to define and manage quota (SurfZone) policies. Use a subcommand with the -h flag to list the options.
Subcommands | Usage | Option | Option Definition |
---|---|---|---|
add , create | Create quota policy | policy_name | Name to associate with the quota policy |
 | | --limit budget_limit | Maximum amount (in $) allowed to spend in one month |
 | | --action cancel | suspend | Action taken when quota limit reached (default: cancel) |
 | | --auto-resume=true | false | Action taken on suspended job when quota replenished (default: true) |
 | | --threshold threshold | Percentage of quota (budget) consumed to trigger alert (default: 80) |
info , show | Display quota policy details | policy_id | ID that identifies quota policy |
list | List available quota policies | | |
update , modify | Update parameters in existing quota policy | policy_id | ID that identifies quota policy |
 | | --name policy_name | New name to associate with the quota policy |
 | | --limit budget_limit | Updated maximum amount (in $) allowed to spend in one month |
 | | --action cancel | suspend | Updated action taken when quota limit reached |
 | | --autoresume true | false | Updated action taken on suspended job when quota replenished |
 | | --threshold threshold | Updated percentage of quota (budget) consumed to trigger alert |
delete , rm , remove | Delete a quota policy | policy_id | ID to identify quota policy |
Examples
$ float quota-policy list
id: u2w02ntshv61mxc7muaz0
name: fruit
metric: Cost
overageAction: cancel
autoResume: false
threshold: 80%
limit: 37
- id: 2f4qaesir4wxz71mazstq
name: legume
metric: Cost
overageAction: suspend
autoResume: true
threshold: 80%
limit: 35
$ float quota-policy update 2f4qaesir4wxz71mazstq --limit 50
id: 2f4qaesir4wxz71mazstq
name: legume
metric: Cost
overageAction: suspend
autoResume: true
threshold: 80%
limit: 50
float release
Use the float release
command to manage the OpCenter software. Use a subcommand with the -h
flag to list the options.
Subcommands | Usage | Option | Option Definition |
info
| Display information regarding new features and bug fixes in OpCenter release | -r, --release version | The release to display information about |
list , ls
| List available OpCenter releases | ||
migrate
| Migrate existing database to new database or new location | -d, --dbPath /path/to/db | Destination path of new DB or new DB location |
sync
| Sync CLI version with OpCenter version | ||
upgrade
| Upgrade OpCenter software | --force | Automatically answer "yes" at confirmation prompt |
-r, --release version |
The release to upgrade to (default: "latest"). Only the admin user can upgrade software. Cannot upgrade using web CLI console.
| ||
--sync | Wait for upgrade to complete and then sync the CLI to the new release |
Examples
$ float release ls
+----------+--------------------------------------+----------------------+-----------+
| VERSION | RELEASE | RELEASE TIME | SIZE |
+----------+--------------------------------------+----------------------+-----------+
| * v2.5.0 | FLOAT_v2.5.0-171088e-HalfMoonBay.bin | 2024-02-09T03:31:15Z | 220.65 MB |
| v2.4.1 | FLOAT_v2.4.1-0803674-Goa.bin | 2024-01-09T18:29:33Z | 219.32 MB |
+----------+--------------------------------------+----------------------+-----------+
$ float release sync
downloading ...
Progress: |>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>| 100.00% Complete (ETA. 0s)
The float binary is synced up with opcenter
Note
If you place the float
binary in a directory that is only writable by the root user, use sudo float release sync
float report
Use the float report
command to download reports from OpCenter. Use a subcommand with the -h
flag to list the options.
Subcommands | Usage | Option | Option Definition |
download
| Download OpCenter usage history report | --path /path/to/dir | Path to save report on local computer |
get
| Generate and display usage report with filter(s) applied | report_name | Name of report, e.g., usage_report_by_job |
-A, --all | Compile report from all usage data retained by this OpCenter. If -A not used, default filter applied (different for each report). | ||
-d, --date date_string | Date used to filter reports (default: current report) | ||
--filter filter | Filter(s) to apply to usage report. Use option multiple times to apply multiple filters. See Working with OpCenter/Filters/Job Filters for details. Example: timeRange=2010-10-22~ | | | |
-l, --limit num_records | Maximum number of records to display (default: 0 which means unlimited) | ||
-o, --orderBy field | Field to order by. Use 'cost' or 'jobs' (default). The column entitled 'JOB COUNT' is used to order entries when 'jobs' specified. | ||
-r, --refresh | Force the refresh of report | ||
ls , list
| List all available usage reports |
Examples:
$ float report get usage_report_by_user -A -f status=Cancelled
+-----------+-----+-----------+------------+--------------+----------------+--------------------+...
| USER NAME | FEE | JOB COUNT | WALL TIME | COMPUTE TIME | SPOT INSTANCES | ONDEMAND INSTANCES |...
+-----------+-----+-----------+------------+--------------+----------------+--------------------+...
| admin | 0 | 69 | 617h57m42s | 615h42m7s | 47 | 11 |...
| apple | 0 | 1 | 6h33m59s | 6h53m31s | 8 | 0 |...
...
(edited)
$ float report download --path ./temp.gzip
Downloaded to ./temp.gzip
$ file temp.gzip
temp.gzip: gzip compressed data, original size modulo 2^32 14336
float rerun (requeue, resubmit)
Use the float rerun
(or requeue
or resubmit
) command to re-submit a completed job, a canceled job, or a job that failed to complete. The command has no subcommands. Use the command with the -h
flag to list the options.
Subcommands | Usage | Option | Option Definition |
Re-submit completed or failed job | -j, --job job_id | Job to re-submit |
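Example
A minimal invocation is shown below; the job ID is a placeholder and the output is omitted:
$ float rerun -j ctZLDo7OFG4BuJ8ytiTem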
float restart (reboot)
Use the float restart
(or reboot
) command to restart OpCenter (terminates all login sessions). The command has no subcommands. Use the command with the -h
flag to list the options.
Subcommands | Usage | Option | Option Definition |
Restart OpCenter | -f, --force | Automatically answer "yes" at confirmation prompt |
Example
$ float list
+-----+---------------------+...+-----------+----------------------+----------+------------+
| ID | NAME |...| STATUS | SUBMIT TIME | DURATION | COST |
+-----+---------------------+...+-----------+----------------------+----------+------------+
| g...| tidyverse-t3.medium |...| Executing | 2023-12-11T23:22:17Z | 20m37s | 0.0066 USD |
+---------------------------+...+-----------+----------------------+----------+------------+
$ float restart
Warning: There are running jobs in server, do you want to restart forcibly?
Some job may fail if you do that!(yes/No): yes
Warning: Are you sure you want to restart OpCenter? All active sessions will be inactivated.(yes/No): yes
OpCenter is now restarting and will be online soon.
Pause briefly and then log back in
$ float list
+-----+---------------------+...+-----------+----------------------+----------+------------+
| ID | NAME |...| STATUS | SUBMIT TIME | DURATION | COST |
+-----+---------------------+...+-----------+----------------------+----------+------------+
| g...| tidyverse-t3.medium |...| Executing | 2023-12-11T23:22:17Z | 21m42s | 0.0072 USD |
+---------------------------+...+-----------+----------------------+----------+------------+
(edited for clarity)
float resume (recover, restore)
Use the float resume
(or recover
or restore
) command to resume a suspended job (see float suspend
command). The command has no subcommands. Use the command with the -h
flag to list the options.
Subcommands | Usage | Option | Option Definition |
Resume a suspended job. If no options are used, the job resumes on an instance identical to the original. | -c, --cpu min_cpu:max_cpu | Range of number of virtual CPUs to select from for new VM instance (can omit max_cpu). | |
-t, --instType instance_type |
Instance type for new VM (e.g., c4.xlarge in AWS). Do not combine with --cpu or --mem options.
| ||
-j, --job job_id | Job to resume | ||
-m, --mem min_mem:max_mem | Range of memory size (in GB) to select from for new VM instance (can omit max_mem). | ||
-P, --payType spot | ondemand | Pricing tier for VM instance (Spot or On-demand). |
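Example
For illustration, resume a suspended job and widen the memory range for the new VM instance; the job ID is a placeholder and the output is omitted:
$ float resume -j ctZLDo7OFG4BuJ8ytiTem -m 8:16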
float sbatch (submit)
The float sbatch
command is the same as float submit
.
float scancel (cancel)
The float scancel
command is the same as float cancel
.
float secret
Use the float secret
command to interact with the secret manager. Use the command with the -h
flag to list the options.
Subcommands | Usage | Option | Option Definition |
get
| Retrieve secret from the secret manager | secret_name | Name associated with secret |
ls , list
| List all secrets created by current user | ||
set , put
| Insert {name, value} pair into secret manager database | secret_name secret_value | {name, value} pair to insert |
--file /path/to/file | File containing secret_value. Use instead of specifying secret_value on command line. | ||
-f, --force | Overwrite existing secret value | ||
unset , rm , del , remove , delete
| Delete {name, value} pair from secret manager database | secret_name | Name associated with {secret_name, secret_value} pair to delete |
Example
$ float secret set s3access 1234567
Set s3access successfully
$ float secret ls
+----------+
| NAME |
+----------+
| s3access |
+----------+
$ float secret unset s3access
unset secret s3access successfully
$ cat value.txt
secret123
$ float secret set usersecret --file ./value.txt
Set usersecret successfully
float show
Use the float show
command to display the status of a job and content of job scripts. The command has no subcommands. Use the command with the -h
flag to list the options.
Subcommands | Usage | Option | Option Definition |
Show current status of job or VM instance, or show contents of scripts used for job | -C, --cluster cluster_name | cluster_id | Cluster in which job is running | |
--containerInitScript | Display contents of container init script | ||
-c, --content | Display contents of job script | ||
-i, --hid host_id | VM instance to query | ||
--hostInitScript | Display contents of host init script | ||
--hostTerminateScript | Display contents of host terminate script | ||
-j, --job job_id | Job to query | ||
-q, --queue queue_name | queue_id | Queue to query |
Example
$ float show -j FOIW5Y16KZgJ6Tsd02QuS
id: ctZLDo7OFG4BuJ8ytiTem
name: python-c5d.large
workingHost: 52.7.123.178 (2Core4GB/Spot)
user: admin
imageID: docker.io/bitnami/python:latest
imageDigest: sha256:24c1d45bf41c396184bd9808b307c67267a809754cd176ac8d91cceb47d0f3ef
output: |-
Getting image source signatures
Copying blob sha256:1acb894a7ceb1ba5362fb85123b0248a064ed3195aaa24af74e8cb710ca1c5a4
Copying config sha256:23bacce690702dac91557ef74ab312cb3db5a2b4bb54ada968d1352e7d9a110a
Writing manifest to image destination
Storing signatures
Loaded image: docker.io/bitnami/python:latest
First submit job ctZLDo7OFG4BuJ8ytiTem, call podman directly
No cmd args provided, launch job directly
4eda6b0d471fb31eecfd06b59e1d8c527310861009fa4baf967d834397934bac
status: Executing
.....[output edited]
$ float show -c -j S0JPnWd3a2hQmVmVWCMWc
#!/usr/bin/bash
LOG_PATH=$1
LOG_FILE=$LOG_PATH/output
touch $LOG_FILE
exec >$LOG_FILE 2>&1
echo "Congratulations! You have submitted your first job"
for(( c=1; c<3; c++))
do
if [[ $(($c % 3)) == 1 ]]; then
echo "Hello World!"
else
echo "Your next job will be more interesting" >&2
fi
sleep 20s
done
echo "Job complete"
float sinfo (hosts)
The float sinfo
command is the same as float hosts
.
float smtp
Use the float smtp
command to define and manage email (SMTP) configurations. Use a subcommand with the -h
flag to list the options.
Subcommands | Usage | Option | Option Definition |
add
| Add an SMTP configuration | -n, --name smtp_config | Name to associate with the SMTP config |
--from from_addr | Email address to use in From: field of email message | ||
-P, --password user_pw | Password for user | ||
--passwordStdin | Prompt user to enter password | ||
--port port_num | Port number for SMTP server | ||
-S, --server ip_addr | IP address of SMTP server | | | |
--useSSL | Use SSL for server connection | ||
--useTLS | Use TLS for server connection | | | |
-U, --user user_name | Name of SMTP user | ||
delete
| Delete an SMTP configuration | smtp_id | ID that identifies SMTP configuration to delete |
get
| Get SMTP configuration | smtp_id | ID that identifies SMTP configuration to get |
--password show_pwd | Password to get SMTP configuration | ||
list, ls
| List all SMTP configurations | ||
update
| Update an SMTP configuration | -i, --id smtp_id | ID to identify SMTP config |
--from from_addr | Email address to use in From: field of email message | ||
-n, --name smtp_name | Name to identify SMTP config | ||
-P, --password user_pw | Password for user | ||
--passwordStdin | Prompt user to enter password | ||
--port port_num | Port number for SMTP server | ||
-S, --server ip_addr | IP address of SMTP server | | | |
--useSSL | Use SSL for server connection | ||
--useTLS | Use TLS for server connection | | | |
-U, --user user_name | Name of SMTP user |
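Example
An illustrative configuration using the options listed above; the name, addresses, port number, and username are placeholders, and the output is omitted:
$ float smtp add -n alerts --from opcenter@example.com -S 10.0.0.25 --port 587 --useTLS -U smtpuser --passwordStdin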
float snapshot
Use the float snapshot
command to display information about snapshots. Use a subcommand with the -h
flag to list the options.
Subcommands | Usage | Option | Option Definition |
list
| List available snapshots | -A, --all | Show all snapshots associated with job |
-f, --filter filter |
Filter to apply to snapshots, for example: status=normal . The default filter is active=true . Use option multiple times to apply multiple filters combined with "and" operator.
| ||
-j, --jobID job_id | Job to query | ||
show , info
| Display information about snapshot | -s, --snapID snapshot_id | ID of snapshot to query |
Example:
$ float snapshot show -s jobSnap-Kp89d7qzJjiuhNeaOUfsA
id: jobSnap-Kp89d7qzJjiuhNeaOUfsA
jobID: h3xg4jpC8ydcgkVWBKj4S
volumeSnapshots:
- zone: us-east-1b
volumeId: vol-02786458cebdeca43
volumeSize: 6
status: completed
snapshotId: snap-08e5b6ed8b7834b96
createTime: "2023-04-18T15:51:17Z"
mountPoint: /mnt/float-data
cost: 0.0000 USD
- zone: us-east-1b
volumeId: vol-06dd42bf6b091466a
volumeSize: 6
status: completed
snapshotId: snap-0f2833cdb639fa4b5
createTime: "2023-04-18T15:51:17Z"
mountPoint: /mnt/float-image
cost: 0.0000 USD
- zone: us-east-1b
volumeId: vol-023c2e91e34f37335
volumeSize: 10
status: completed
snapshotId: snap-025a733db664e9054
createTime: "2023-04-18T15:51:17Z"
mountPoint: /data
cost: 0.0000 USD
status: normal
createTime: 2023-04-18T15:51:17.434161661Z
float squeue (list)
The float squeue
command is the same as float list
.
float status
Use the float status
command to show current status of the OpCenter. The command has no subcommands. Use the command with the -h
flag to list the options.
Example
$ float status
id: b39a9d89-cf9e-4b55-9399-f792c2bec020
Server Status: normal
API Request Status: normal
License Status: valid
Init Time: "2025-04-02T22:59:51Z"
Up Time: "2025-04-02T23:01:56Z"
Current Time: "2025-04-26T22:52:27Z"
float storage
Use the float storage
command to "register" (that is, pre-configure) storage services for use when submitting jobs. Use subcommand with the -h
flag to list the options.
Subcommands | Usage | Option | Option Definition |
delete , rm , remove | Delete a registered storage service | storage_id | Identifier associated with the storage service to delete |
info , show | Display information about a registered storage service | storage_id | Identifier associated with the storage service to query |
list | Display filtered list of registered storage services | -f, --filter filter |
Filter to apply to registered storage services. Use option multiple times to apply multiple filters combined with "and" operator. Simple filters consist of an attribute, operator, and value. Supported attributes include:
|
-o, --orderBy attribute | Attribute used to order listing (prepend with "-" to reverse order). Use any filter attribute (default: createdTime). Create nested ordering by connecting multiple attributes with ",". | ||
register (add) volume | Register an existing volume as a storage service | -n, --name storage_name | Name to associate with this storage service |
--id volume_id | Volume ID that identifies volume (for example, an identifier like vol-0da677a1f4967350a in AWS) | ||
--mode ro | rw | Access mode
| ||
-m, --mountPoint path | Path to directory where volume is mounted by container | ||
--permision normal | public | Storage service access permission
| ||
register (add) nfs | Register NFS-exported directory as a storage service | -n, --name storage_name | Name to associate with this storage service |
--mode ro | rw | Access mode
| ||
-m, --mountPoint path | Path to directory where volume is mounted by container | ||
--permision normal | public | Storage service access permission
| ||
--url nfs://nfs_server_ip/exported_dir | IP address of NFS server and exported directory, for example 192.168.1.1/home | ||
register (add) lustre | Register Lustre file system as a storage service | -n, --name storage_name | Name to associate with this storage service |
--mode ro | rw | Access mode
| ||
-m, --mountPoint path | Path to directory where volume is mounted by container | ||
--options string | Mount options for lustre file system, for example, [opts=rw, noexec, abort_recov, recovery_time_hard=120s] | ||
--permision normal | public | Storage service access permission
| ||
--url lustre://dns-name/mountname/exported_dir | DNS name of lustre server, file system's mount name, and exported directory, respectively. The mount name is the name you assign the file system when configuring FSx for Lustre on AWS. Naming the file system makes it easier to find. | ||
register (add) s3 | Register S3 bucket as a storage service | -n, --name storage_name | Name to associate with this storage service |
--bucket s3://bucket_name/folder | S3 bucket name and folder (optional) to use for storage service | ||
-c, --credential [accessKey=key_value, secret=secret_value,] | Access key string and access secret value, respectively | ||
--endpoint s3_endpoint | Endpoint for S3 bucket, for example, s3.us-east-1.amazonaws.com | ||
--mode ro | rw | Access mode
| ||
-m, --mountPoint path | Path to directory where volume is mounted by container | ||
--permision normal | public | Storage service access permission
| ||
register (add) gs | Register Google Cloud Storage as a storage service | -n, --name storage_name | Name to associate with this storage service |
--bucket gs://bucket_name/folder | Google Cloud Storage bucket name and folder (optional) to use for storage service | ||
-c, --credential [accessKey=key_value, secret=secret_value,] | Access key string and access secret value, respectively | ||
--endpoint gs_endpoint | Endpoint for GCS bucket, for example, storage.googleapis.com | ||
--mode ro | rw | Access mode
| ||
-m, --mountPoint path | Path to directory where volume is mounted by container | ||
--permision normal | public | Storage service access permission
| ||
register (add) omics | Register a new omics sequence readSet | -n, --name storage_name | Name to associate with this storage service |
--mode ro | rw | Access mode
| ||
-m, --mountPoint path | Path to directory where volume is mounted by container | ||
--options string | Mount options for this storage service | ||
--permision normal | public | Storage service access permission
| ||
readSetId string | readSet id for the omics storage | ||
--seqId string | Sequence id for the omics storage | ||
register | Register storage service using dataVolume format (instead of register volume|nfs|lustre|s3|gs ). | --dataVolume [string] | Parameters that define the storage service. Use the following format.
|
update volume | Update an existing volume as a storage service | storage_id | Identifier associated with this storage service |
--id volume_id | Volume ID that identifies volume (for example, an identifier like vol-0da677a1f4967350a in AWS) | ||
-n, --name storage_name | Name to associate with this storage service | ||
--mode ro | rw | Access mode
| ||
-m, --mountPoint path | Path to directory where volume is mounted by container | ||
--permision normal | public | Storage service access permission
| ||
update nfs | Update NFS-exported directory as a storage service | storage_id | Identifier associated with this storage service |
--mode ro | rw | Access mode
| ||
-m, --mountPoint path | Path to directory where volume is mounted by container | ||
-n, --name storage_name | Name to associate with this storage service | ||
--permision normal | public | Storage service access permission
| ||
--url nfs://nfs_server_ip/exported_dir | IP address of NFS server and exported directory, for example 192.168.1.1/home | ||
update lustre | Update Lustre file system as a storage service | storage_id | Identifier associated with this storage service |
--mode ro | rw | Access mode
| ||
-m, --mountPoint path | Path to directory where volume is mounted by container | ||
-n, --name storage_name | Name to associate with this storage service | ||
--options string | Mount options for lustre file system, for example, [opts=rw, noexec, abort_recov, recovery_time_hard=120s] | ||
--permision normal | public | Storage service access permission
| ||
--url lustre://dns-name/mountname/exported_dir | DNS name of lustre server, file system's mount name, and exported directory, respectively. The mount name is the name you assign the file system when configuring FSx for Lustre on AWS. Naming the file system makes it easier to find. | ||
update s3 | Update S3 bucket as a storage service | storage_id | Identifier associated with this storage service |
--bucket s3://bucket_name/folder | S3 bucket name and folder (optional) to use for storage service | ||
-c, --credential [accessKey=key_value, secret=secret_value,] | Access key string and access secret value, respectively | ||
--endpoint s3_endpoint | Endpoint for S3 bucket, for example, s3.us-east-1.amazonaws.com | ||
--mode ro | rw | Access mode
| ||
-m, --mountPoint path | Path to directory where volume is mounted by container | ||
-n, --name storage_name | Name to associate with this storage service | ||
--permision normal | public | Storage service access permission
| ||
update gs | Update Google Cloud Storage as a storage service | storage_id | Identifier associated with this storage service |
--bucket gs://bucket_name/folder | Google Cloud Storage bucket name and folder (optional) to use for storage service | ||
-c, --credential [accessKey=key_value, secret=secret_value,] | Access key string and access secret value, respectively | ||
--endpoint gs_endpoint | Endpoint for GCS bucket, for example, storage.googleapis.com | ||
--mode ro | rw | Access mode
| ||
-m, --mountPoint path | Path to directory where volume is mounted by container | ||
-n, --name storage_name | Name to associate with this storage service | ||
--permision normal | public | Storage service access permission
|
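Example
A sketch of registering an NFS export as a storage service; the name, server address, exported directory, and mount point are placeholders, and the output is omitted:
$ float storage register nfs -n shared-nfs --mode rw -m /mnt/shared --url nfs://192.168.1.1/home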
float submit
Use the float submit
(or float sbatch
) command to submit a job. The command has no subcommands. Use the command with the -h
flag to list the options.
Subcommands | Usage | Option | Option Definition |
Submit job for execution | --agentPort port_num | Agent port (default:443). If not using default, confirm OpCenter can access port. | |
--allowList instance_type |
Allowed instance type(s). Use format like "c5*" which allows any VM type in the c5 family. Include --allowList multiple times to add other VM instance types to the allow list. Default is [*] which allows all types.
| ||
--bandwidth bandwidth | Bandwidth (in Mbps) required for workload (only use in AliCloud) | ||
-A, --cmdArgs command_string | Commands that are executed immediately after container starts (can be used with or without a job script) | ||
-c, --cpu min_cpu:max_cpu | Range of number of virtual CPUs to select from to run job (can omit max_cpu). If max_cpu omitted, value set to twice min_cpu. | | | |
--cpuVendor cpu_vendor_name | CPU vendor of allowed VM instances. Allowed values are amd, intel, or " ". Default is " " which allows any vendor. Use with AWS only. | ||
--customTag tagName:tagValue |
Tag customized by each MMCloud subscriber, for example, to generate customized usage reports. Multiple custom tags can be applied by using --customTag multiple times in a single command.
| ||
-D, --dataVolume [size=vol_size,throughput=rate]:/data_dir or [mode=rw]vol_id :/container_mnt_pt/path or [mode=rw]nfs://ip_address/server_export_dir :/container_mnt_pt/path or [accesskey=x,secret=y,token=t,endpoint=z,mode=rw]s3://bucketname/path :/container_mnt_pt/path or [accesskey={secret:S3_BUCKET_ACCESS_KEY},secret={secret:S3_BUCKET_SECRET_KEY},endpoint=z,mode=rw]s3://bucketname/path :/container_mnt_pt/path or [credfile=aws.conf,profile=user1,endpoint=z,mode=rw]s3://bucketname/path :/container_mnt_pt/path or [opts=xx,yy]lustre://dnsname/mountname/path :/container_mnt_pt/path or [accesskey=xxx,secret=yyy]jfss3://bucketname :/container_mnt_pt/path |
Data volume mounted by container. Use option multiple times to attach multiple data volumes.
| ||
-d, --def definition_file |
Path to file if using definition file (yaml or json format) to provide input parameters to submit command
| ||
--denyList instance_type |
Excluded instance type(s). Use format like "c5*" which excludes all VM types in the c5 family. Include --denyList multiple times to exclude other VM instance types. Default is [ ] which does not exclude any types.
| ||
--disableImageVol | Disable float-image volume creation for job. Only use if job migration disabled. | ||
--dumpMode full | incremental | Setting for snapshot type. Full means complete memory snapshot taken every time. Incremental means only incremental changes captured after initial snapshot. Default is full. | ||
-e, --env env_key=env_value |
Environment variable setting for the job. Include --env multiple times to add multiple variables.
| ||
-E, --errPolicy err_policy |
Policy (AWS only) to use if the job fails. The choices are:
| ||
--extraContainerOpts | Extra options passed directly to container (enclose in quotes) | ||
-f, --force | Automatically answer "yes" at confirmation prompt | ||
--gateway gateway_id | ID of gateway to connect job to. Replacing gateway_id with the word "auto" directs the OpCenter to select a gateway automatically. | ||
--gpu-count min_gpu : max_gpu | Range of number of GPUs to select from to run job | ||
--gpu-disable | Disallow the use of GPUs for this job | ||
--gpu-mem min_gpu_mem : max_gpu_mem | Range of size of GPU memory to select from to run job | ||
--gpu-name GPU_NAME | Specific GPU to use for this job, for example, m60 or h100 or a100, and so on. | ||
--gpu-vendor GPU_VENDOR | Allowed GPU vendor to use for this job, for example, nvidia or amd. | ||
--hyperThread thread_setting | Hyper-thread setting for job (0 means use the setting configured by the cloud provider, 1 means disable, 2 means enable) | ||
-i, --image image_name | image_URI | Image name or image URI to pull container image for job | ||
--imageVolSize image_vol_size | Size of volume (in GB) to act as root volume for image (default: 6) | ||
--imageVolType image_vol_type | (AWS only) Type of EBS volume to use for image volume, for example, gp2 or gp3. | ||
-t, --instType instance_type |
VM instance type, for example, c5.4xlarge for AWS. Overrides --cpu or --mem options.
| ||
-j, --job job_script | Job script to run workload. Use format: path to local file, s3 file, OSS file or https | http file. | ||
-k, --keepDumpLog | Save dump log during job migration | ||
-K, --keepPrivateIP | Keep private IP address during job migration | ||
--mem min_mem:max_mem | Range of memory size (in GB) to select from to run job. If max_mem omitted, value set to twice min_mem. | ||
--metricsInterval metrics_int | Time in seconds between queries to obtain container metrics (default: 10s) | ||
-M, --migratePolicy [migrate_policy] |
Policy to determine auto-migration behavior. Format is [option1=value1,option2=value2...] . The available options with their default values are (if no units are attached, the value is a percentage):
| ||
--miTag [tag1:value1 tag2:value2...] | Tag(s) to select virtual machine image. Use option multiple times to submit multiple tag,value pairs. | ||
-n, --name job_name | Name to associate with job | ||
--noPublicIP |
No public IP address assigned to container host. Ensure that host can reach AWS services. AliCloud users must include --endpoint option in their job scripts.
| ||
-o, --output /path/to/dir |
Folder to save stdout and stderr as files with names: stdout.autosave.$jobid and stderr.autosave.$jobid. Options for path are:
| ||
--outputFlag | If included (set to true), job status is included in job output folder. | ||
-P, --publish host_port:container_port |
Rule for publishing container port to container host port, for example, 8080:80 . Include option multiple times to publish multiple ports. Confirm security group allows access to port.
| ||
--rootVolSize root_vol_size | Root volume size in GB to load base OS (default: 40) | ||
--securityGroup sec_group | AWS security group (or tag in GCP) added to VM instance for this job. Use option multiple times to apply multiple security groups. | ||
--shmSize shm_size |
Size of /dev/shm in format nu where n is a number and u is b, k, m, or g for bytes, KiB, MiB or GiB, respectively (default: 64m)
| ||
--snapLocation local | s3://bucketname |
Location to save snapshot image and metadata. Choices are:
| ||
--snapSkipOpenFileLis file1,file2,... | List of files to skip during snapshot. The format is: *.log,*.tmp | ||
-I, --snapshotInterval snap_int | Time between periodic snapshots in format HhMmSs where H, M, and S are numbers and h, m, and s are hours, minutes, and seconds, respectively. Default is 0 (disable). Minimum is 10m. | ||
--storage id | name:/container_mnt_pt/path/to/dir | Registered storage service included with job. Specify storage service by id or name. Specify path to where storage service is mounted on container. | ||
-s, --subnet subnet_id | AWS subnet ID or AliCloud vSwitch ID in which to execute the job. Default is that the OpCenter automatically selects the subnet. | ||
--swapDurationOnOOM duration | Duration for which swap memory is used to trigger OOM evasion (format: Nm). Default is 0 which means use system level setting. | ||
--swapSize swap_size | Swap size in GB to use for job. Default is 0 which means use system level swap size setting. | | | |
--swapUsageOnOOM swap_usage | Size of swap memory used (measured as a fraction of total swap size) to trigger OOM evasion (format: 0.5). Default is 0 which means use system level setting. | ||
--tag tag_name | Tag to select image version for job. Default is "latest" or only tag. | ||
--targetPort port_num | Port to connect job to on gateway. Use option multiple times to connect multiple ports. | ||
-T, --template template_name |
Template in format name:tag to use for submitting this job. Default tag is only tag or "latest".
| ||
-l, --timeLimit max_time | Maximum time that a job is allowed to run. Use format HhMmSs where H, M, and S are numbers and h, m, and s are hours, minutes, and seconds, respectively. Default is unlimited (use 0). | | | |
-V, --vmPolicy [vm_policy] |
VM creation policy. Format is [key1=value1,key2=value2...] . The allowed keys (and values) are:
| ||
--withRoot | Run job with root privileges | ||
-z, --zone availability_zone | Availability zone in which to execute job |
Examples
$ float submit -i tidyverse -j run_genericr.sh --cpu 2 --mem 4 --dataVolume [size=10]:/data
id: pw6nupnmbej0qedlvz3dx
name: tidyverse
user: admin
imageID: docker.io/rocker/tidyverse:latest
status: Submitted
submitTime: "2023-12-13T19:08:27Z"
duration: 0s
queueTime: 0s
cost: 0.0000 USD
inputArgs: -i tidyverse -c 2 -m 4 --dataVolume [size=10]:/data
cpu: 2
memGB: 4
vmPolicy:
policy: spotFirst
retryLimit: 3
optimize: true
retryInterval: 10m0s
migratePolicy:
evadeOOM: true
stepAuto: true
cpu:
upperBoundRatio: 90
lowerBoundRatio: 5
upperBoundDuration: 2m0s
lowerBoundDuration: 5m0s
step: 50
disable: true
mem:
upperBoundRatio: 90
lowerBoundRatio: 5
upperBoundDuration: 2m0s
lowerBoundDuration: 5m0s
step: 50
disable: true
actualEmission: 0.0000 g
baselineEmission: 0.0000 g
$ float submit -i docker.io/centos -j helloworld.sh -c 2 -m 4 --dataVolume [size=10]:/data --name hw -o nfs://172.31.81.17/mnt/memverge/shared
Log in to NFS server and check that output files are populated.
$ ls -l /mnt/memverge/shared/*
-rw-r--r-- 1 5001 5001 0 Feb 15 20:02 /mnt/memverge/shared/hw.08ogctot3ufqnyf9k3wg8.stderr.autosave
-rw-r--r-- 1 5001 5001 116 Feb 15 20:02 /mnt/memverge/shared/hw.08ogctot3ufqnyf9k3wg8.stdout.autosave
float suspend (hibernate)
Use the float suspend
(or hibernate
) command to temporarily suspend a job. In-memory state and relevant files are saved so that the job can resume executing at a later time. The command has no subcommands. Use the command with the -h
flag to list the options.
Subcommands | Usage | Option | Option Definition |
Suspend a running job | --cold | With this option, storage resources are reclaimed, which may save cost, but it slows the suspend operation (and the associated resume operation). Use if suspending for extended period. | |
-f, --force | Automatically answer "yes" to confirmation prompts | ||
-j, --job job_id | ID of job to suspend |
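Example
For illustration, suspend a job for an extended period; the job ID is a placeholder and the output is omitted:
$ float suspend -j ctZLDo7OFG4BuJ8ytiTem --cold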
float template
Use the float template
command to use a template or to display information about templates. Use a subcommand with the -h
flag to list the options.
Subcommands | Usage | Option | Option Definition |
delete , rm , remove
| Delete a template | -f | Force deletion of template |
-T, --template name:tag | Template name and tag. If tag omitted, default is "latest." | ||
deploy
| Submit a job using a template | -c, --cpu min_cpu:max_cpu | Range of number of virtual CPUs to select from to run job (can omit max_cpu, in which case max_cpu is set to twice the value of min_cpu). |
--gateway gw_id | ID of gateway to connect job to (if applicable). Use `auto` to have the system select a gateway automatically. | ||
-t, --instType instance_type | VM instance type (e.g., c5.xlarge for AWS). If used, this option overrides -c and -m options. | ||
-m, --mem min_mem:max_mem | Range of memory size (in GB) to select from to run job (can omit max_mem , in which case max_mem is set to twice the value of min_mem). | ||
--noPublicIP | No public IP address assigned to container host. Make sure host can reach internal cloud services. | ||
--securityGroup SG1,SG2,... | AWS security group(s) (or tag(s) in GCP) applied to VM instance for this job. Include option multiple times to apply multiple security groups (or tags). Use [ ] to clear all security groups or tags. | ||
--targetPort port_num1,port_num2,... | Port(s) used to connect to job, for example, 8787 for RStudio. Include multiple port numbers to connect multiple ports from a single server to the gateway. Use [ ] to clear all ports. | ||
-T, --template name:tag |
Template (in format name:tag ) to use for submitting this job. Default tag is only tag or "latest".
| ||
info , show
| Display information about a template | -T, --template name:tag |
Template (in format name:tag ) to query. Default tag is only tag or "latest".
|
list , ls
| Display available templates. Templates with source labeled "official" are from the MemVerge template repository. | ||
save
| Save submit string from an earlier job as a template | -f, --force | Force template save |
-j,--job job_id | ID of job whose submit string is saved as template | ||
--overwrite | Overwrite existing template | ||
-T, --template name:tag | Name and tag to save template as. Default tag is "latest". | ||
sync
| Sync OpCenter template library with the MemVerge template repository |
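Example
For illustration, list the available templates and then deploy a job from one; the template name and port number are placeholders, and the output is omitted:
$ float template ls
$ float template deploy -T rstudio:latest --targetPort 8787 --gateway auto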
float top
Use the float top
command to show a sorted list of information (including CPU and memory utilization) about current jobs. The display is updated at regular intervals. Enter q
to stop the display. The command has no subcommands. Use the command with the -h
flag to list the options.
Subcommands | Usage | Option | Option Definition |
Show utilization and other information about running jobs. Display updates continually. Enter q to exit.
| --interval top_int | Time in seconds (3 or greater) between queries to obtain container metrics (default: 3s). Format is ns where n is an integer, for example, 10s. | |
-j, --job_Id job_id | Job to query |
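Example
For illustration, refresh the display every 10 seconds (enter q to exit; output omitted):
$ float top --interval 10s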
float user
Use the float user
command to manage OpCenter users. Use a subcommand with the -h
flag to list the options.
Subcommands | Usage | Option | Option Definition |
add , create
| Add a new user | new_username | Username for new user |
--create | Create group if the group does not exist (default: false) | ||
--email email_address | Email address for new user | ||
-gid gid | Group ID to use for group. If value not provided, gid automatically assigned. | ||
-g, --groups group_name | Group associated with new user (default: no group). Use option multiple times to include multiple groups. | ||
--ldap | Associate username with LDAP directory (default: false) | ||
--passwd password | Password for new user (default: "memverge") | ||
--quota-policy policy_name | Name of quota (SurfZone) policy to associate with this user | ||
--uid uint32 | User Identifier (uid) for new user. For regular user accounts, uid normally starts at 1000. | ||
delete , remove , rm
| Delete a user | user_name | user_id | Username or uid of user to delete |
-f, --force | Remove all active tokens belonging to user and delete user immediately | ||
disable
| Disable a user's account without deleting it | user_name | user_id | Username or id of user to query |
enable
| Enable a user's account | user_name | user_id | Username or id of user to query |
info , show
| Display information about user | user_name | user_id | Username or id of user to query |
list , ls
| List all users | |
passwd
| Reset user's password | user_name | user_id | Username or id of user to update |
--passwd new_password | New password for user | ||
update
| Update information associated with a user | user_name | user_id | Username or id of user to update |
--email email_address | New email address for user | ||
-gid gid | New group ID to use for group. | ||
-g, --group group_name | New group associated with user | ||
--name user_name | New username for user | ||
--passwd password | New password for user | ||
--quota-policy policy_name | New quota (SurfZone) policy to associate with this user | ||
--uid uint32 | New uid for user |
Examples
$ float user add testcase2 --passwd secret123 --groups crops
username: testcase2
uid: 5007
gid: 5007
role: normal
group: crops
email: ""
type: builtin
enabled: true
ownGroup: ""
$ float group update crops --add testcase2 --admin
$ float user info testcase2
username: testcase2
uid: 5007
gid: 5007
role: normal
group: crops
email: ""
type: builtin
enabled: true
ownGroup: crops
$ float user update testcase2 --group default
username: testcase2
uid: 5007
gid: 5007
role: normal
group: default,crops
email: ""
type: builtin
enabled: true
ownGroup: crops
$ float user info user-five
username: user-five
uid: 2012
gid: 2010
role: normal
group: group-five
email: ""
type: ldap
enabled: true
ownGroup: ""
float version
Use the float version
command to display the version of the float
CLI client and the version of the OpCenter it is connected to. The command has no subcommands. Use the command with the -h
flag to list the options.
Example