If you are struggling with large data sets in R, you might benefit from a performance boost by using data.table. However, when loading data.table, at least on macOS, you may receive a warning that no OpenMP support has been detected and that data.table is operating in single-threaded mode. This limits the benefits of using data.table in the first place, because it cannot take full advantage of the underlying hardware.
data.table 1.14.0 using 1 threads (see ?getDTthreads). Latest news: r-datatable.com
This installation of data.table has not detected OpenMP support.
It should still work but in single-threaded mode.
If this is a Mac, please ensure you are using R>=3.4.0 and have followed our Mac instructions here:
This warning message should not occur on Windows or Linux. If it does, please file a GitHub issue.
What is the issue here?
OpenMP is an API for shared-memory multithreading, and the clang compiler that ships with Xcode on macOS does not support it out of the box. Apple does not include the libomp.dylib run-time library with its compiler, and the compiler rejects the -fopenmp option, which we can check by issuing the following command.
$ clang -c omp.c -fopenmp
clang: error: unsupported option '-fopenmp'
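The command above assumes that a file named omp.c already exists. A self-contained variant of the same check could look like the sketch below; the helper name has_openmp and the tiny test program are made up for illustration and are not part of any toolchain.

```shell
# Hypothetical helper: succeed if the given compiler (default: clang) can
# build a minimal OpenMP program; fail otherwise.
has_openmp() {
  cc="${1:-clang}"
  tmpdir="$(mktemp -d)" || return 2
  # Minimal program that calls into the OpenMP run-time
  printf '#include <omp.h>\nint main(void) { return omp_get_max_threads() > 0 ? 0 : 1; }\n' \
    > "$tmpdir/omp.c"
  if "$cc" -fopenmp "$tmpdir/omp.c" -o "$tmpdir/omp" >/dev/null 2>&1; then
    status=0
  else
    status=1
  fi
  rm -rf "$tmpdir"
  return "$status"
}

# Example: report whether the default clang supports OpenMP
# has_openmp clang && echo "OpenMP supported" || echo "no OpenMP support"
```

Passing a different compiler path as the first argument runs the same check against that compiler instead of the system clang.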
One way to get OpenMP support in clang is to 1) install the latest official LLVM release and 2) instruct R to compile data.table with OpenMP support using a Makevars file. Other R packages that support OpenMP will also benefit from this setup.
How to proceed?
On macOS, it is highly recommended to use the Homebrew package manager to ensure both automation and reproducibility. First, we need to make sure that we have the latest version of Xcode installed.
Then, if Homebrew is not already installed, we can install it using the following command (for more details, check the Homebrew website).
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
Now, it’s time to install the latest version of the LLVM compiler.
brew update && brew install llvm
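Once the install finishes, it can help to confirm where Homebrew put the new compiler, since that location is what the compiler paths used later have to point at. The sketch below wraps brew --prefix llvm, a standard Homebrew command, in a small helper; the function name llvm_clang_path is made up for illustration.

```shell
# Hypothetical helper: print the path of the clang binary inside Homebrew's
# LLVM installation, or fail if Homebrew is not available.
llvm_clang_path() {
  if ! command -v brew >/dev/null 2>&1; then
    echo "brew not found on PATH" >&2
    return 1
  fi
  # brew --prefix <formula> prints the formula's installation prefix
  llvm_prefix="$(brew --prefix llvm)" || return 1
  echo "$llvm_prefix/bin/clang"
}

# Example (on a machine with Homebrew's llvm installed):
#   "$(llvm_clang_path)" --version
```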
The LLVM formula is keg-only, which in Homebrew’s terms means that it is installed alongside the system’s compiler rather than replacing it. In other words, Homebrew will not symlink its binaries into
/usr/local/. So, to instruct R to build new packages using the newly installed compiler, one needs to create a
Makevars file and place it in
Let’s edit the
Makevars file to include the paths of the new compiler and the compiler options required to support OpenMP. For example, on my machine running macOS Big Sur 11.2.3 (at the time of writing this post), my
Makevars file looks as follows.
# General note
# Homebrew bin / opt / lib locations
# MacOS Xcode header location
# (do "xcrun -show-sdk-path" in terminal to get path)
# Make using all cores (set # to # of cores on your machine)
# GNU version
# LLVM (Clang) compiler options
FLIBS=-L$(HL)/gcc/$(GNU_VER) -lgfortran -lquadmath -lm
# STD libraries
STD_FLAGS=-g -O3 -Wall -pedantic -mtune=native -pipe
# Preprocessor FLAGS
# NB: -isysroot refigures the include path to the Xcode SDK we set above
CPPFLAGS=-isysroot $(XH) -I$(HI) \
-I$(HO)/llvm/include -I$(HO)/openssl/include \
# Linker flags (suggested by homebrew)
# Flags for OpenMP support that should allow packages that want to use
# OpenMP to do so (data.table), and other packages that bork with
# -fopenmp flag (stringi) to be left alone
Note that you can run xcrun -show-sdk-path to get the path to the Xcode SDK on your system. In addition, to find the value for GNU_VER, check the contents of the directory /usr/local/lib/gcc/, which in my case shows that GNU version 11 is installed.
Note: After every GCC upgrade, GNU_VER must be updated to match the installed version.
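Looking up the version by hand is easy to forget after an upgrade, so the lookup can be scripted. The following is a minimal sketch, assuming the GCC directory contains one subdirectory per major version, as in the example above; the helper name latest_gcc_ver is hypothetical.

```shell
# Hypothetical helper: print the highest purely numeric entry name found in
# a GCC library directory (defaults to the path mentioned in the text).
latest_gcc_ver() {
  gcc_dir="${1:-/usr/local/lib/gcc}"
  # Keep only numeric names, sort them numerically, and take the largest
  ls "$gcc_dir" 2>/dev/null | grep -E '^[0-9]+$' | sort -n | tail -n 1
}

# Example:
#   latest_gcc_ver                # inspect /usr/local/lib/gcc
#   latest_gcc_ver /some/other/dir
```

The printed value is what GNU_VER in the Makevars file should be set to. If the directory is missing or has no numeric entries, the helper prints nothing.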
With the above Makevars in place, let’s start a new R session and compile data.table from source.
install.packages("data.table", type = "source",
repos = "https://Rdatatable.gitlab.io/data.table")
To check whether we have successfully compiled data.table with OpenMP support, let’s load the library.
library(data.table)
# data.table 1.14.0 using 4 threads (see ?getDTthreads). Latest news: r-datatable.com
The message above shows that data.table is now working in multithreaded mode and is currently using 4 threads. To change the number of threads used by data.table, and to make the setting persistent across sessions, edit ~/.Rprofile and include the following command.
# data.table configuration
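One way to add such a setting from the shell, assuming data.table's setDTthreads() function is the call you want in your profile, is sketched below. The helper name set_dt_threads and the choice to skip files that already contain such a call are illustrative assumptions.

```shell
# Hypothetical helper: append a data.table thread setting to an R profile
# file unless the file already contains a setDTthreads call.
set_dt_threads() {
  n_threads="$1"
  profile="${2:-$HOME/.Rprofile}"
  if grep -q 'setDTthreads' "$profile" 2>/dev/null; then
    echo "setDTthreads already configured in $profile" >&2
    return 0
  fi
  # Assumption: setDTthreads(n) is the desired way to fix the thread count
  printf '\n# data.table configuration\ndata.table::setDTthreads(%s)\n' "$n_threads" >> "$profile"
}

# Example: use 4 threads in every new R session
#   set_dt_threads 4
```

Running the helper a second time leaves the file unchanged, so an existing setting has to be edited by hand.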
Makevars configuration was adapted from the following gist.