The Roofline model's ability to extract key computational characteristics and abstract away the complexity of modern memory hierarchies has made Roofline-based analysis an increasingly popular tool in the HPC community.

In its most standard form, the Roofline model bounds floating-point performance (GFLOP/s) as a function of machine peak performance, machine peak bandwidth, and the arithmetic intensity of the application. The resulting curve (hollow purple in the original figure) can be viewed as a performance envelope under which kernel or application performance exists. The ridge point on the Roofline is called the 'machine balance' point. If an application's arithmetic intensity is lower than this point, it is usually considered bandwidth bound, i.e., bound by how fast data can be moved through the memory system rather than by how fast calculations can be done on the CPU cores or the GPU SMs. To optimize in this case, memory inefficiencies are usually good places to examine, such as the memory access pattern, data locality, and cache reuse. On the other hand, if the application's arithmetic intensity is higher than the machine balance, the application is more likely to be limited by how fast the computation can be done.
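The bound described here can be written as attainable performance = min(peak performance, peak bandwidth × arithmetic intensity), with the machine balance given by peak performance divided by peak bandwidth. The following is a minimal Python sketch of that relationship. The function names and the machine numbers (1000 GFLOP/s peak, 100 GB/s bandwidth) are hypothetical and chosen only for illustration; they do not describe any particular system.

```python
def machine_balance(peak_gflops, peak_bw_gbs):
    """Ridge point of the Roofline, in FLOP/byte."""
    return peak_gflops / peak_bw_gbs


def attainable_gflops(intensity, peak_gflops, peak_bw_gbs):
    """Roofline bound: the lower of the compute ceiling and the
    bandwidth ceiling at the given arithmetic intensity (FLOP/byte)."""
    return min(peak_gflops, intensity * peak_bw_gbs)


def classify(intensity, peak_gflops, peak_bw_gbs):
    """Label a kernel by which side of the machine balance it falls on."""
    if intensity < machine_balance(peak_gflops, peak_bw_gbs):
        return "bandwidth bound"
    return "compute bound"


if __name__ == "__main__":
    # Hypothetical machine, for illustration only.
    PEAK_GFLOPS = 1000.0   # peak floating-point rate, GFLOP/s
    PEAK_BW_GBS = 100.0    # peak memory bandwidth, GB/s

    for ai in (0.25, 20.0):  # example arithmetic intensities, FLOP/byte
        bound = attainable_gflops(ai, PEAK_GFLOPS, PEAK_BW_GBS)
        label = classify(ai, PEAK_GFLOPS, PEAK_BW_GBS)
        print(f"AI={ai} FLOP/byte: at most {bound} GFLOP/s ({label})")
```

With these assumed numbers the machine balance is 10 FLOP/byte, so a kernel at 0.25 FLOP/byte is capped at 25 GFLOP/s by memory bandwidth, while a kernel at 20 FLOP/byte is capped by the 1000 GFLOP/s compute ceiling.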
The Roofline performance model offers an intuitive and insightful way to compare application performance against machine capabilities, track progress towards optimality, and identify bottlenecks, inefficiencies, and limitations in software implementations and architecture designs.
Performance models and tools are an integral part of the performance analysis and performance optimization process for users who seek higher performance and better utilization of the hardware.