♊️ GemiNews 🗞️

Demo 1: Embeddings + Recommendation | Demo 2: Bella RAGa | Demo 3: NewRetriever | Demo 4: Assistant function calling

🗞️ Rocky Linux 8 and CentOS 7 versions of HPC VM image now generally available

🗿 Semantically Similar Articles (by :title_embedding)

Rocky Linux 8 and CentOS 7 versions of HPC VM image now generally available

2024-03-22 - Rohit Ramu (from Google Cloud Blog)

Today we’re excited to announce the general availability of Rocky Linux 8-based and CentOS 7-based HPC Virtual Machine (VM) images for high-performance computing (HPC) workloads, with a focus on tightly-coupled workloads, such as weather forecasting, fluid dynamics, and molecular modeling.

[Technology] 🌎 https://cloud.google.com/blog/topics/hpc/ga-rocky-linux-8-and-centos-7-versions-of-hpc-vm-image/
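The "Semantically Similar Articles" list above is ranked by comparing title embeddings. As a minimal sketch of that ranking step (pure-Python cosine similarity; the `title_embedding` field name mirrors the heading above, while `most_similar` and the article dict shape are hypothetical):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def most_similar(query_embedding, articles, top_k=3):
    """Return the top_k articles whose title_embedding is closest to the query."""
    return sorted(
        articles,
        key=lambda art: cosine_similarity(query_embedding, art["title_embedding"]),
        reverse=True,
    )[:top_k]
```

In practice a vector index (e.g. pgvector) does this ranking in the database rather than in application code.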

🗿 article.to_s

------------------------------
Title: Rocky Linux 8 and CentOS 7 versions of HPC VM image now generally available
Summary: Today we’re excited to announce the general availability of Rocky Linux 8-based and CentOS 7-based HPC Virtual Machine (VM) images for high-performance computing (HPC) workloads, with a focus on tightly-coupled workloads, such as weather forecasting, fluid dynamics, and molecular modeling.
With the HPC VM image, we have made it easy to build an HPC-ready VM instance, incorporating our best practices for running HPC on Google Cloud, including:

- VMs ready for HPC out of the box - No need to manually tune performance, manage VM reboots, or stay up to date with the latest Google Cloud updates for tightly-coupled HPC workloads, especially with our regular HPC VM image releases. Reboots are triggered automatically when tunings require them, and this process is managed for you by the HPC VM image.
- Networking optimizations for tightly-coupled workloads - Optimizations that reduce latency for small messages are included, which benefits applications that depend heavily on point-to-point and collective communications.
- Compute optimizations - Optimizations that reduce system jitter are included, which makes single-node performance consistent, an important factor in improving scalability.
- Improved application compatibility - Alignment with the node-level requirements of the Intel HPC platform specification enables a high degree of interoperability between systems.
Performance measurement using HPC benchmarks
We have compared the performance of the HPC VM images against the default CentOS 7 and GCP-optimized Rocky Linux 8 images across Intel MPI Benchmarks (IMB).
The benchmarks were run against the following images.
HPC Rocky Linux 8
- Image name: hpc-rocky-linux-8-v20240126
- Image project: cloud-hpc-image-public

Default GCP Rocky Linux 8
- Image name: rocky-linux-8-optimized-gcp-v20240111
- Image project: rocky-linux-cloud

Each cluster of machines was deployed with compact placement with max_distance=1, meaning all VMs were placed on hardware that was physically on the same rack to minimize network latency.
Intel MPI Benchmark (IMB) Ping-Pong
IMB Ping-Pong measures the latency when transferring a fixed-sized message between two ranks on different VMs. We saw up to a 15% improvement when using the HPC Rocky Linux 8 image compared to the default GCP Rocky Linux 8 image.
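IMB-MPI1 prints each benchmark's results as a plain-text table with #bytes, #repetitions, and t[usec] columns. A minimal sketch of extracting (message size, latency) pairs from that output, for plotting or comparing two images (the function name is hypothetical; the column layout follows Intel MPI Benchmarks' standard output, which can vary slightly between benchmarks):

```python
def parse_imb_table(text):
    """Extract (message_size_bytes, latency_usec) rows from IMB-MPI1 output."""
    rows = []
    in_table = False
    for line in text.splitlines():
        if line.strip().startswith("#bytes"):
            in_table = True  # the header row marks the start of the table
            continue
        if in_table:
            cols = line.split()
            if len(cols) < 3:
                break  # a blank or short line ends the table
            rows.append((int(cols[0]), float(cols[2])))
    return rows
```

With both images' outputs parsed this way, the per-message-size latencies can be compared directly.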
Benchmark setup
- 2 x h3-standard-88
- MPI library: Intel OneAPI MPI library 2021.11.0
- MPI benchmarks application: Intel MPI Benchmarks 2019 Update 6
- MPI environment variables:
  - I_MPI_PIN_PROCESSOR_LIST=0
  - I_MPI_FABRICS=shm:ofi
  - FI_PROVIDER=tcp
- Command line: mpirun -n 2 -ppn 1 -bind-to core -hostfile <hostfile> IMB-MPI1 Pingpong -msglog 0:16 -iter 50000


Results

[Figure: Pingpong 1 PPN - Rocky Linux 8 (lower is better)]

Intel MPI Benchmark (IMB) AllReduce - 1 process per node
The IMB AllReduce benchmark measures the collective latency among multiple ranks across VMs. It reduces a vector of a fixed length with the MPI_SUM operation.
To isolate networking performance, we initially show 1 PPN (process-per-node) results (1 MPI rank) on 8 VMs.
We saw an improvement of up to 35% when comparing the HPC Rocky Linux 8 image to the default GCP Rocky Linux 8 image.
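The quoted percentages are relative latency reductions (lower is better). As a sanity check of that arithmetic (the numbers below are illustrative, not measured values from the blog):

```python
def improvement_pct(baseline_usec, hpc_usec):
    """Relative latency improvement of the HPC image over the baseline image."""
    return (baseline_usec - hpc_usec) / baseline_usec * 100.0

# Illustrative: a drop from 100 us to 65 us is the "up to 35%" case.
print(improvement_pct(100.0, 65.0))
```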
Benchmark setup
- 8 x h3-standard-88
- 1 process per node
- MPI library: Intel OneAPI MPI library 2021.11.0
- MPI benchmarks application: Intel MPI Benchmarks 2019 Update 6
- MPI environment variables:
  - I_MPI_FABRICS=shm:ofi
  - FI_PROVIDER=tcp
  - I_MPI_ADJUST_ALLREDUCE=11
- Command line: mpirun -n 8 -ppn 1 -bind-to core -hostfile <hostfile> IMB-MPI1 Allreduce -msglog 0:16 -iter 50000 -npmin 8


Results

[Figure: Allreduce 1 PPN - Rocky Linux 8 (lower is better)]

Intel MPI Benchmark (IMB) AllReduce - 1 process per core (88 processes per node)
We show 88 PPN results where there are 88 MPI ranks/node and 1 thread/rank (704 ranks).
For this test, we saw an improvement of up to 25% when comparing the HPC Rocky Linux 8 image to the default GCP Rocky Linux 8 image.
Benchmark setup
- 8 x h3-standard-88
- 1 process per core (88 processes per node)
- MPI library: Intel OneAPI MPI library 2021.11.0
- MPI benchmarks application: Intel MPI Benchmarks 2019 Update 6
- MPI environment variables:
  - I_MPI_FABRICS=shm:ofi
  - FI_PROVIDER=tcp
  - I_MPI_ADJUST_ALLREDUCE=11
- Command line: mpirun -n 704 -ppn 88 -bind-to core -hostfile <hostfile> IMB-MPI1 Allreduce -msglog 0:16 -iter 50000 -npmin 704


Results

[Figure: Allreduce 88 PPN - Rocky Linux 8 (lower is better)]

The latency, bandwidth, and jitter improvements in the HPC VM image have consistently resulted in higher MPI workload performance. We plan to update this blog as more performance results become available.
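Each mpirun command above takes a -hostfile argument, which is a plain-text file listing one host per line. A small helper for generating it (the node-naming scheme is hypothetical; real hostnames come from your cluster's inventory):

```python
def write_hostfile(path, node_names):
    """Write one hostname per line, the format mpirun -hostfile expects."""
    with open(path, "w") as f:
        for name in node_names:
            f.write(name + "\n")

# e.g. the 8 nodes used in the AllReduce runs above
nodes = [f"hpc-node-{i}" for i in range(8)]
```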
Cloud HPC Toolkit and the HPC VM image
You can use the HPC VM image through the Cloud HPC Toolkit, an open-source tool that simplifies the process of deploying environments for a variety of workloads, including HPC, AI, and machine learning. In fact, the Toolkit blueprints and Slurm images based on Rocky Linux 8 and CentOS 7 use the HPC VM image by default. Using the Cloud HPC Toolkit, you can add customization on top of the HPC VM image, including installing new software and changing configurations, making it even more useful. 
By using the Cloud HPC Toolkit to customize images based on the HPC VM Image, it is possible to create and share blueprints for producing optimized and specialized images, improving reproducibility while reducing setup time and effort.
How to get started
You can create an HPC-ready VM by using the following options:
- Cloud HPC Toolkit
- Google Cloud CLI
- Google Cloud console (note: the image is available through Cloud Marketplace in the console)
- SchedMD's Slurm workload manager, which uses the HPC VM image by default. For more information, see Creating Intel Select Solution verified clusters.
- Omnibond CloudyCluster, which uses the HPC VM image by default.

Author: Rohit Ramu
PublishedDate: 2024-03-22
Category: Technology
NewsPaper: Google Cloud Blog
Tags: Compute, HPC
{"id"=>863,
"title"=>"Rocky Linux 8 and CentOS 7 versions of HPC VM image now generally available",
"summary"=>"Today we’re excited to announce the general availability of Rocky Linux 8-based and CentOS 7-based HPC Virtual Machine (VM) images for high-performance computing (HPC) workloads, with a focus on tightly-coupled workloads, such as weather forecasting, fluid dynamics, and molecular modeling. …",
"content"=>nil,
"author"=>"Rohit Ramu",
"link"=>"https://cloud.google.com/blog/topics/hpc/ga-rocky-linux-8-and-centos-7-versions-of-hpc-vm-image/",
"published_date"=>Fri, 22 Mar 2024 16:00:00.000000000 UTC +00:00,
"image_url"=>nil,
"feed_url"=>"https://cloud.google.com/blog/topics/hpc/ga-rocky-linux-8-and-centos-7-versions-of-hpc-vm-image/",
"language"=>nil,
"active"=>true,
"ricc_source"=>"feedjira::v1",
"created_at"=>Sun, 31 Mar 2024 21:42:31.207927000 UTC +00:00,
"updated_at"=>Mon, 13 May 2024 18:44:20.526911000 UTC +00:00,
"newspaper"=>"Google Cloud Blog",
"macro_region"=>"Technology"}