Storage

Configuring Storage for MMBatch Checkpoints

MMBatch supports the following storage types and file systems for checkpoint data:

  • AWS EFS

  • JuiceFS with AWS S3

  • AWS FSx Lustre

Each of these can be configured as a mount point in the EC2 launch template.

Examples:

  • AWS EFS - code block below. See here for a complete CloudFormation example.

    # Create mount point and mount EFS
    # (the efs mount type requires the amazon-efs-utils package)
    mkdir -p /mmc-checkpoint
    mount -t efs ${BatchEFSFileSystem}:/ /mmc-checkpoint
    # Persist the mount across reboots
    echo "${BatchEFSFileSystem}:/ /mmc-checkpoint efs defaults,_netdev 0 0" >> /etc/fstab
    chown ec2-user:ec2-user /mmc-checkpoint
    

  • JuiceFS with AWS S3 - code block below. See here for a complete CloudFormation example.

    • Create IAM roles, S3, Redis and Required Infra for JuiceFS

      BatchInstanceRole:
          Type: AWS::IAM::Role
          Properties:
            AssumeRolePolicyDocument:
              Version: "2012-10-17"
              Statement:
                - Effect: Allow
                  Principal:
                    Service: ec2.amazonaws.com
                  Action: sts:AssumeRole
            Path: "/"
            ManagedPolicyArns:
              - arn:aws:iam::aws:policy/service-role/AmazonEC2ContainerServiceforEC2Role
              - arn:aws:iam::aws:policy/AmazonS3FullAccess
              - arn:aws:iam::aws:policy/CloudWatchAgentServerPolicy
            Policies:
              - PolicyName: "JuiceFSpolicy"
                PolicyDocument:
                  Version: "2012-10-17"
                  Statement:
                    - Effect: Allow
                      Action:
                        - "elasticache:*"
                      Resource: !Sub "arn:aws:elasticache:${AWS::Region}:${AWS::AccountId}:cluster/mm-engine-${UniquePrefix}"
                    - Effect: Allow
                      Action:
                        - "s3:*"
                      Resource: !Sub "arn:aws:s3:::mm-engine-juice-fs-${UniquePrefix}/*"
            RoleName: !Sub "mm-batch-instance-role-${UniquePrefix}"
      
      JuiceFSS3Bucket:
          Type: AWS::S3::Bucket
          Properties:
            BucketName: !Sub "mm-engine-juice-fs-${UniquePrefix}"
            BucketEncryption:
              ServerSideEncryptionConfiguration:
                - ServerSideEncryptionByDefault:
                    SSEAlgorithm: AES256
            PublicAccessBlockConfiguration:
              BlockPublicAcls: true
              BlockPublicPolicy: true
              IgnorePublicAcls: true
              RestrictPublicBuckets: true
      

    • Launch Template

      # Placeholder directory; it is replaced by a symlink into JuiceFS below
      mkdir -p /mmc-checkpoint
      chmod 777 /mmc-checkpoint

      # Install the JuiceFS client
      curl -sSL https://d.juicefs.com/install | sh -

      # Format the file system (S3 for data, Redis for metadata)
      /usr/local/bin/juicefs format \
      --storage s3 \
      --bucket "https://${JuiceFSS3BucketName}.s3.${AWS::Region}.amazonaws.com" \
      --trash-days=0 \
      "rediss://${RedisClusterEndpoint}:6379/1" \
      juicefs-metadata

      # Mount JuiceFS in the background and log to /tmp/juicefs-mount.log
      nohup /usr/local/bin/juicefs mount \
      "rediss://${RedisClusterEndpoint}:6379/1" \
      --cache-dir /mnt/jfs_cache \
      --cache-size 102400 \
      /mnt/jfs > /tmp/juicefs-mount.log 2>&1 &

      # Block until the mount is available
      echo "Waiting for /mnt/jfs to be mounted..."
      while ! mountpoint -q /mnt/jfs; do
          sleep 2
          echo "Still waiting for /mnt/jfs..."
      done
      echo "/mnt/jfs is now mounted."

      MOUNTPOINT=/mnt/jfs
      CHECKPOINT_DIR=$MOUNTPOINT/mmc-checkpoint

      # Ensure the checkpoint directory exists inside the JuiceFS mount
      mkdir -p "$CHECKPOINT_DIR"
      chmod 777 "$CHECKPOINT_DIR"

      # Replace /mmc-checkpoint with a symlink to the JuiceFS directory
      if [ -e /mmc-checkpoint ]; then
          echo "/mmc-checkpoint exists. Deleting it to recreate as symlink."
          rm -rf /mmc-checkpoint
      fi
      ln -s "$CHECKPOINT_DIR" /mmc-checkpoint
      echo "Symlink created: /mmc-checkpoint -> $CHECKPOINT_DIR"
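
The wait loop in the launch template polls without an upper bound, so an instance whose JuiceFS mount never comes up would stay in its launch script indefinitely. A bounded variant might look like the sketch below. The helper name `wait_until` and its arguments are illustrative assumptions, not part of MMBatch or JuiceFS. It avoids the `${...}` form so that it can sit in a CloudFormation `!Sub` block without escaping.

```shell
#!/usr/bin/env bash
# Hypothetical helper: retry a command until it succeeds or a timeout elapses.
# Usage: wait_until <timeout_seconds> <interval_seconds> <command> [args...]
# The interval should be at least 1, otherwise the timeout never advances.
wait_until() {
    local timeout=$1 interval=$2
    shift 2
    local waited=0
    # Re-run the command until it returns 0
    until "$@"; do
        # Give up once the accumulated wait reaches the timeout
        if [ "$waited" -ge "$timeout" ]; then
            return 1
        fi
        sleep "$interval"
        waited=$((waited + interval))
    done
    return 0
}
```

In the launch template this could replace the `while` loop, for example as `wait_until 120 2 mountpoint -q /mnt/jfs || exit 1`, where the 120-second limit is an arbitrary choice to tune per environment.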
      

  • AWS FSx Lustre
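
The source gives no snippet for FSx Lustre. A launch-template fragment along the lines of the EFS example might look like the sketch below. It assumes Amazon Linux 2 (other distributions install the Lustre client differently), and `FSxDnsName` and `FSxMountName` are hypothetical template parameters standing in for the file system's DNS name and mount name.

```shell
# Assumption: Amazon Linux 2, where the Lustre client comes from amazon-linux-extras
amazon-linux-extras install -y lustre

# Create mount point and mount FSx for Lustre
# FSxDnsName and FSxMountName are hypothetical parameters, not from the source
mkdir -p /mmc-checkpoint
mount -t lustre -o relatime,flock ${FSxDnsName}@tcp:/${FSxMountName} /mmc-checkpoint
chmod 777 /mmc-checkpoint
```

The `chmod 777` mirrors the permissions used in the JuiceFS example; a tighter mode may be preferable if all jobs run as a known user.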

The checkpoint path, /mmc-checkpoint in these examples, can be configured through the RESTful API (see here for reference).
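
Whichever backend is used, jobs need /mmc-checkpoint to exist and accept writes. A quick check at the end of the launch script could catch a failed mount early. The function below is a minimal sketch; the name `check_checkpoint_dir` is an assumption and not part of MMBatch.

```shell
#!/usr/bin/env bash
# Hypothetical helper: confirm a checkpoint directory exists and accepts writes.
check_checkpoint_dir() {
    local dir=$1 probe
    # -d follows symlinks, so a symlinked /mmc-checkpoint is handled too
    [ -d "$dir" ] || { echo "missing: $dir" >&2; return 1; }
    # Create and remove a temporary probe file to prove the directory is writable
    probe=$(mktemp "$dir/.probe.XXXXXX") || { echo "not writable: $dir" >&2; return 1; }
    rm -f "$probe"
    echo "ok: $dir"
}
```

A call such as `check_checkpoint_dir /mmc-checkpoint || exit 1` would stop the launch script when the directory is unusable. The check only proves writability. It does not confirm that the directory is backed by the intended file system, which `mountpoint -q` can verify for directly mounted paths.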

Supported Storage for User Scratch Data

Types of storage and file systems supported for user scratch data:

  • EBS

  • JuiceFS with AWS S3

  • AWS FSx Lustre