Deployment of AWS cluster with custom odyhpc AMIs (ii – cluster design)

Case 4

When any of m5dn.24xlarge, m5n.24xlarge, m5zn.12xlarge, m5zn.metal, c5n.18xlarge, c5n.metal, c6gn.16xlarge, r5dn.24xlarge or r5n.24xlarge is used as the compute instance type, AWS ParallelCluster must be told that network communications should use EFA (Elastic Fabric Adapter). The scripts for Example 1 should look as follows:
          [cluster wrfcluster]
          …
          master_instance_type = c5.large
          compute_instance_type = c5n.18xlarge
          cluster_type = spot
          enable_efa = compute
          disable_hyperthreading = true
          initial_queue_size = 4
          max_queue_size = 4
          maintain_initial_size = false
          …

          [cluster wrfarmcluster]
          …
          master_instance_type = c6g.large
          compute_instance_type = c6gn.16xlarge
          cluster_type = ondemand
          enable_efa = compute
          initial_queue_size = 4
          max_queue_size = 4
          maintain_initial_size = false
          …
Using instances with EFA capabilities is strongly recommended for very large workloads.
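
As a quick sanity check before creating a cluster, the rule above can be sketched in a few lines of Python. This is only an illustration and not part of AWS ParallelCluster: the helper name clusters_missing_efa and the sample section noefa are hypothetical, and the set of instance types is simply the list given in this post rather than an authoritative AWS list.

```python
import configparser

# Instance types named in this post as needing EFA (assumption: not an
# exhaustive or official list, just the ones mentioned in the text).
EFA_INSTANCE_TYPES = {
    "m5dn.24xlarge", "m5n.24xlarge", "m5zn.12xlarge", "m5zn.metal",
    "c5n.18xlarge", "c5n.metal", "c6gn.16xlarge",
    "r5dn.24xlarge", "r5n.24xlarge",
}


def clusters_missing_efa(config_text):
    """Return the names of [cluster ...] sections whose compute instance
    type is in EFA_INSTANCE_TYPES but which lack 'enable_efa = compute'."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    missing = []
    for section in parser.sections():
        if not section.startswith("cluster "):
            continue  # skip sections that are not cluster definitions
        instance_type = parser.get(section, "compute_instance_type", fallback="")
        efa = parser.get(section, "enable_efa", fallback="")
        if instance_type in EFA_INSTANCE_TYPES and efa != "compute":
            missing.append(section.split(" ", 1)[1])
    return missing


# Hypothetical, shortened configuration used only to exercise the check.
SAMPLE = """
[cluster wrfcluster]
compute_instance_type = c5n.18xlarge
enable_efa = compute

[cluster noefa]
compute_instance_type = c6gn.16xlarge
"""

print(clusters_missing_efa(SAMPLE))
```

The function takes the configuration as a string, so a real file would have to be read first and must not contain placeholder lines such as the "…" used in the listings above, which configparser cannot parse.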
