OREANDA-NEWS. Computing cores aren’t getting faster. Rather processors are getting more parallel.  This has been the trend for the last decade, and is likely to continue.

If you’re a researcher, you can take advantage of parallel computing to accelerate your scientific application with OpenACC. It’s an approach that’s resulted in big leaps forward for many of your colleagues.

The HPC community has embraced OpenACC because it simplifies parallel programming for modern processors, like GPUs. Indeed, 8,000-plus researchers and scientists have already adopted the OpenACC programming standard since it was developed four years ago by leading HPC vendors like Cray, PGI and NVIDIA.

To get this power in the hands of more researchers, we’re releasing our NVIDIA OpenACC Toolkit. It’s a free, all-in-one suite of OpenACC parallel programming tools.

OpenACC Toolkit, Free for Academia

Our OpenACC Toolkit makes it easier than ever to jump in with OpenACC. If you’re a researcher, it provides virtually everything you need to quickly and easily program GPUs.

It features the industry-leading PGI Accelerator Fortran/C Workstation Compiler Suite for Linux, which supports the OpenACC 2.0 standard. With the OpenACC Toolkit, we’re making the compiler free to academic developers and researchers for the first time (commercial users can sign up for a free 90-day trial).

It also includes the NVProf Profiler, which gives guidance on where to add OpenACC “directives,” or simple compiler hints, to accelerate your code. And it includes simple, real-world code samples to help you get started.

OpenACC: Code Once, Run More Places

These simple directives do more than just let researchers take advantage of the power of accelerated computing. They also let your existing CPU code remain intact. So all the time spent building code isn’t wasted.

A key feature of OpenACC is performance portability, and the PGI OpenACC compiler takes this to a new level. The compiler can, for the first time, speed up OpenACC code on x86 multi-core CPUs, as well as on GPUs.

So, when you don’t have a system with GPUs, the compiler will parallelize the code on x86 CPU cores for performance boost. When a GPU is present, the compiler will parallelize the code for the GPU. The result: 5-10x faster performance than with multi-core CPUs.

delivers-portability-web

This x86 CPU portability feature is in beta today with key customers. And we’re planning to make it widely available in the fourth quarter of this year.

Computational Chemistry App: 12x Faster, Under 100 Lines of Code

Janus Juul Eriksen, a post-doc at qLEAP Center for Theoretical Chemistry at Aarhus University, provides a case study in the benefits of OpenACC. He maintains an application known as LS-DALTON that’s used for complex large-scale molecular simulations.

He wanted to use LS-DALTON simulate larger scientific problems on the GPU-accelerated Titan supercomputer at Oak Ridge National Labs. But, like many researchers, Eriksen, who taught himself to code Fortran, has no formal education in computer science.

But with OpenACC, he was able to accelerate key algorithms in LS-DALTON up to 12x over CPU version in just days. And he didn’t have to change any of the algorithms in his application to run on one of the world’s most powerful supercomputers.

ls-dalton-web