Skip to contents

Generating Random Numbers in Parallel within C++

The RcppRNGParallel R package shows how to draw random numbers in parallel so that each thread produces an independent, reproducible stream. It uses the dqrng package’s generators, which can jump a stream ahead to a non-overlapping substream.

Usage

To install the package, you must first have a compiler on your system that is compatible with R and that supports OpenMP. For help on obtaining a compiler consult either macOS or Windows guides.

With a compiler in hand, one can then install the package from GitHub by:

# install.packages("remotes")
remotes::install_github("coatless-rd-rcpp/rcpp-rng-parallel")
library("RcppRNGParallel")

Implementation Details

Parallel random number generation is easy to get wrong. Sharing one generator across threads is a data race, and giving each thread a freshly built generator is worse than it looks: seed them all the same and every thread draws the identical sequence, seed them from the clock and the streams may overlap. The robust approach is to seed a single generator once and then hand each thread a copy advanced to its own substream. The dqrng xoshiro256plus generator supports exactly this through long_jump(), which skips so far ahead that the substreams cannot overlap. The example draws uniform variates this way.

.
├── DESCRIPTION                         # Package metadata
├── NAMESPACE                           # Function and dependency registration
├── R                                   # R functions
   ├── RcppExports.R                   # Autogenerated R to C++ bindings by Rcpp
   └── RcppRNGParallel-package.R       # Package documentation
├── README.md
├── man                                 # Package documentation
   ├── RcppRNGParallel-package.Rd
   ├── draw_numeric_unif.Rd
   └── max_cores.Rd
├── rngparallel.Rproj
└── src                                 # Compiled code
    ├── Makevars                        # Enable OpenMP
    ├── Makevars.win
    ├── RcppExports.cpp                 # Autogenerated R bindings
    └── parallel_rng.cpp                # The parallel sampler using dqrng substreams

The Parallel Sampler

The sampler lives in src/parallel_rng.cpp. A single dqrng::xoshiro256plus generator is seeded on the host. Inside the OpenMP parallel region, each thread copies that generator and calls long_jump() with a count unique to the thread, landing it on a substream that no other thread visits. Each thread then fills its share of the output.

// [[Rcpp::export]]
std::vector<double> draw_numeric_unif(unsigned int n,
                                      double min = 0.0, double max = 1.0,
                                      int seed = 42, int n_cores = 1) {
  std::vector<double> sample_values(n);
  if (n_cores < 1) n_cores = 1;

  // Base generator, seeded once on the host.
  dqrng::xoshiro256plus base(seed);

  #pragma omp parallel num_threads(n_cores)
  {
    // Each thread copies the base generator and jumps to an independent,
    // non-overlapping substream so the threads never overlap.
    dqrng::xoshiro256plus eng(base);
    eng.long_jump(omp_get_thread_num() + 1);

    dqrng::uniform_distribution dist(min, max);

    #pragma omp for schedule(static)
    for (unsigned int i = 0; i < n; ++i) {
      sample_values[i] = dist(eng);
    }
  }

  return sample_values;
}

The source guards the OpenMP directives with #ifdef _OPENMP and provides a serial fallback, and it carries //' roxygen comments (trimmed from the excerpt here) that Rcpp::compileAttributes() copies into R/RcppExports.R. From R:

draw_numeric_unif(100, n_cores = 2, seed = 42)

Reproducibility and Independence

Because the substreams are deterministic given the seed and the jump counts, the result is reproducible for a fixed seed and n_cores within a given build. The schedule(static) clause fixes which thread handles which index, so the mapping from index to substream does not vary between runs. Because each thread occupies its own non-overlapping substream, the threads never produce the same values, which is what makes the parallel draws statistically sound. Changing n_cores repartitions the work, so it also changes the values, and an OpenMP build and the serial fallback draw from different streams by design.

This corrects a common mistake: handing each thread a default-constructed engine. That leaves every thread with the same seed, so the sampler would simply repeat the same sequence on every core.

DESCRIPTION

The dqrng generators are header-only and are reached through LinkingTo, together with their own dependencies BH and sitmo. Rcpp provides the bridge to R.

LinkingTo:
    Rcpp,
    dqrng,
    BH,
    sitmo

src/Makevars and src/Makevars.win turn on OpenMP with R’s $(SHLIB_OPENMP_CXXFLAGS) variable:

PKG_CXXFLAGS = $(SHLIB_OPENMP_CXXFLAGS)
PKG_LIBS = $(SHLIB_OPENMP_CXXFLAGS)

License

GPL (>= 2)