Skip to contents

Using OpenMP with Rcpp

The RcppOpenMP R package provides an example of parallelizing a C++ routine with OpenMP directly inside an Rcpp function. It is the pure-C++ companion to rcpp-c-and-openmp, which instead parallelizes a separate C routine.

Usage

To install the package, you must first have a compiler on your system that is compatible with R and that supports OpenMP. For help on obtaining a compiler consult either macOS or Windows guides.

With a compiler in hand, one can then install the package from GitHub by:

# install.packages("remotes")
remotes::install_github("coatless-rd-rcpp/rcpp-openmp")
library("RcppOpenMP")

Implementation Details

The goal of this project is to show how a single Rcpp function can be parallelized with OpenMP without leaving C++. There are two moving parts: the Rcpp function that carries an OpenMP directive, and the Makevars files that turn OpenMP on in a portable way.

The example routine computes the Euclidean norm of each row of a numeric matrix. The work for each row is independent, so the loop over rows is a natural candidate for parallelization.

.
├── DESCRIPTION                         # Package metadata
├── NAMESPACE                           # Function and dependency registration
├── R                                   # R functions
   ├── RcppExports.R                   # Autogenerated R to C++ bindings by Rcpp
   └── RcppOpenMP-package.R            # Package documentation
├── README.md
├── man                                 # Package documentation
   ├── RcppOpenMP-package.Rd
   └── parallel_row_norms.Rd
└── src                                 # Compiled code
    ├── Makevars                        # Enable OpenMP on Unix-alikes (macOS/Linux)
    ├── Makevars.win                    # Enable OpenMP on Windows
    ├── RcppExports.cpp                 # Autogenerated R bindings
    └── parallel_row_norms.cpp          # The OpenMP-parallelized Rcpp routine

The Parallel C++ Routine

The routine lives in src/parallel_row_norms.cpp. The <omp.h> header is included only when the compiler advertises OpenMP support through the _OPENMP macro, which keeps the routine compiling cleanly on toolchains that lack OpenMP. The #pragma omp parallel for directive splits the outer loop over rows across the available threads.

#include <Rcpp.h>
#include <cmath>

#ifdef _OPENMP
#include <omp.h>
#endif

// [[Rcpp::export]]
Rcpp::NumericVector parallel_row_norms(Rcpp::NumericMatrix x) {
  int n = x.nrow();
  int p = x.ncol();
  Rcpp::NumericVector out(n);

  #pragma omp parallel for
  for (int i = 0; i < n; i++) {
    double ss = 0.0;
    for (int j = 0; j < p; j++) {
      double v = x(i, j);
      ss += v * v;
    }
    out[i] = std::sqrt(ss);
  }

  return out;
}

The exported function also carries //' roxygen comments (trimmed from the excerpt here); Rcpp::compileAttributes() copies these into R/RcppExports.R, which is where the package’s R help page is generated from.

Enabling OpenMP with Makevars

OpenMP is enabled through src/Makevars (Unix-alikes) and src/Makevars.win (Windows). Rather than hard-coding a flag such as -fopenmp, both files use R’s $(SHLIB_OPENMP_CXXFLAGS) variable, which expands to the correct OpenMP flag for whichever C++ compiler R was built with. The same variable is supplied to PKG_LIBS so the OpenMP runtime is linked in.

PKG_CXXFLAGS = $(SHLIB_OPENMP_CXXFLAGS)
PKG_LIBS = $(SHLIB_OPENMP_CXXFLAGS)

Because the parallel #pragma lives in a .cpp file, the C++ flag $(SHLIB_OPENMP_CXXFLAGS) is the correct and sufficient choice here. The rcpp-c-and-openmp example places its directive in a .c file instead, so it additionally sets the matching PKG_CFLAGS = $(SHLIB_OPENMP_CFLAGS) for that translation unit alongside the C++ flag.

Thread Safety

The body of the parallel loop reads only the input matrix and writes to the pre-allocated output vector, with each thread writing a distinct index. It never allocates an Rcpp object or calls into the R API, which is what keeps the routine thread-safe. Touching the R API from inside an OpenMP region, for example by creating an Rcpp::NumericVector or calling an R function, is not safe, because R is single-threaded. Allocate any R objects before the parallel region, as the output vector is allocated here, and confine the loop body to plain C++.

The number of threads is governed by the OpenMP runtime, for example the OMP_NUM_THREADS environment variable, and not by the function itself.

DESCRIPTION

Surfacing the C++ code with Rcpp requires Rcpp under both LinkingTo (for the headers used at compile time) and Imports (so it is available at run time).

LinkingTo:
    Rcpp
Imports:
    Rcpp (>= 1.0.12)

License

GPL (>= 2)