New Features in MMCloud Dana Point 2.1.1 Release
Date Released
Released on 04-14-2023
Supported Clouds
MMCloud is designed to work on any cloud infrastructure. The Dana Point 2.1.1 release supports the following clouds:
- AWS
- Alibaba Cloud
New Features in Dana Point 2.1.1 Release
General
- SpotSurfer for Memory-optimized Applications
In earlier releases, a snapshot of the current state of a running job was taken only once, immediately before the job migrated to a new virtual machine instance. If a Spot Instance reclaim event caused the migration, the short time available to save the snapshot prevented SpotSurfer from working with applications that use a lot of memory. In this release, periodic snapshots are taken at fixed time intervals that are long enough to allow a snapshot to complete, that is, to be captured and written to persistent storage. If a snapshot cannot complete within the Spot Instance reclaim time window, the most recent complete snapshot is used to recover the job on a new virtual machine.
- Additional Authentication Methods
OpCenter maintains its own database of usernames and passwords for authenticating logins. This is the "built-in" method. Dana Point includes additional authentication methods:
- LDAP
- Local Linux /etc/passwd file
The authentication policy is specified in the opcenter.yaml configuration file. The policy can be changed without restarting OpCenter by using float config set to toggle the values of security.enableLdap and security.enableLocal between true and false. If both methods are set to true, authentication proceeds in this order: LDAP first, then the local Linux passwd file (if the username is not found in the LDAP directory), and then the "built-in" method (if the username is not found in /etc/passwd).
- Integration with AWS Billing
MMCloud has a usage-based billing model. OpCenter collects the information needed to render a bill for the jobs scheduled by that OpCenter instance. Dana Point supports the AWS API that allows MMCloud billing data to be pushed to AWS. AWS generates a bill for the subscriber, collects the payment, and distributes MemVerge's share back to MemVerge.
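The authentication order described above for the LDAP, local, and built-in methods can be illustrated with a short Python sketch. This is a toy model, not OpCenter code: the function name, the dictionaries standing in for the user stores, and the plain-text password comparison are all hypothetical. Only the fallback order and the meaning of the two flags (security.enableLdap and security.enableLocal) come from the description above. The sketch also assumes that fallback happens only when the username is missing from a store, not when the password is wrong.

```python
from typing import Mapping, Optional


def authenticate(
    username: str,
    password: str,
    ldap_users: Mapping[str, str],
    local_users: Mapping[str, str],
    builtin_users: Mapping[str, str],
    enable_ldap: bool = True,    # stands in for security.enableLdap
    enable_local: bool = True,   # stands in for security.enableLocal
) -> Optional[str]:
    """Return the name of the method that accepted the login, or None.

    Hypothetical model: each store is a plain dict of username -> password.
    """
    # Build the lookup chain in the documented order.
    stores = []
    if enable_ldap:
        stores.append(("ldap", ldap_users))
    if enable_local:
        stores.append(("local", local_users))
    stores.append(("built-in", builtin_users))  # always the last resort

    for method, users in stores:
        if username in users:
            # Assumption: the first store that knows the username decides
            # the outcome; a wrong password does not fall through.
            return method if users[username] == password else None
    return None
```

With both flags enabled, a user who exists only in the built-in database is still authenticated, but only after the LDAP and local lookups fail to find the username.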
Functional Improvements
- Support for AWS S3 and AliCloud OSS buckets as data volumes when submitting jobs
In addition to EBS volumes and NFS-mounted file systems, AWS S3 and AliCloud OSS buckets are supported with the dataVolume option to the float submit command.
- WaveRider with intelligent calculation of VM capacity when moving to larger (or smaller) VM sizes
If automatic job migration is enabled, jobs move to larger (or smaller) VMs as application resource demands change. If stepAuto=true is included in the migration policy, OpCenter calculates the optimal change in VM size (measured by memory and number of vCPUs) when moving from the current VM to the target VM.
- Upper limit on VM size when using WaveRider
When enabled, WaveRider moves a job to a larger VM if resource utilization (vCPUs or memory) exceeds specified thresholds. To cap the size of the largest VM allowed, include cpu.limit or mem.limit in the migrate policy.
- Upper limit on wall-clock time for a running job
When submitting a job, include the timeLimit option to automatically cancel the job when its wall-clock time reaches the specified limit.
- Display job script contents for current and archived jobs
Most, but not all, jobs require a job script. If a job is submitted with a job script, the contents of the job script can be displayed by entering float show --content --job <job_id>. Job scripts (if used) are available for current as well as archived jobs.
- Pass "run" command arguments directly to container or job script
When building a Docker container, you can include variables whose values are determined from "run" command arguments when the container is started. For example, this mechanism can be used to pass username/password strings to the container when it is started. Similarly, "run" command arguments can be used to populate variables in a job script that executes when the container runs. Use the cmdArgs option with float submit to pass arguments directly to the container or job script.
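To make the idea of populating job-script variables from "run" command arguments concrete, here is a minimal Python sketch of the receiving side. It assumes, purely for illustration, that the arguments reach the script as KEY=VALUE strings on its command line. That format, the function name parse_run_args, and the example keys are hypothetical and are not requirements of the cmdArgs option.

```python
import sys
from typing import Dict, List


def parse_run_args(argv: List[str]) -> Dict[str, str]:
    """Collect KEY=VALUE arguments into a dict; ignore anything else.

    Hypothetical helper: the KEY=VALUE convention is an assumption made
    for this sketch.
    """
    settings: Dict[str, str] = {}
    for arg in argv:
        # Split on the first "=" only, so values may contain "=" themselves.
        key, sep, value = arg.partition("=")
        if sep and key:
            settings[key] = value
    return settings


if __name__ == "__main__":
    settings = parse_run_args(sys.argv[1:])
    # Print only the variable names so that secrets such as passwords
    # never end up in the job log.
    print(sorted(settings))
```

Keeping credentials in arguments rather than in the image or the script means the same container or job script can be reused with different values on each submission.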
User Experience Improvements
- From the OpCenter landing page, a user can launch the MMCloud GUI. Using version 2 of the GUI, the user can (in addition to the functions in GUI version 1):
- Modify, migrate, or cancel running jobs
- Submit jobs with advanced runtime options, such as enabling stepAuto, or setting cpu.limit and mem.limit
- View a dashboard that integrates data on cloud spending and savings in a single screen — view by week, month, year, application, or user
Bug fixes
This release fixes the following.
- Improve support for large-scale deployments (over 1,000 worker nodes)
- Fix issue where the price limit was not checked when selecting a VM instance