Why is R slow? Some explanations and an MKL/OpenBLAS setup to try to fix it

Original: 2017-11-20
Updated: 2017-12-02

Introduction

Many users tell me that R is slow. For old R releases that is 100% true, given that old R versions used R's own numerical libraries instead of optimized numerical libraries.

But numerical libraries do not explain the whole story. In many cases slow code execution can be attributed to inefficient code, and more precisely to not following one or more of these good practices:

I would add another good practice: "Use the tidyverse". Since tidyverse packages such as dplyr benefit from Rcpp, their C++ backend can make them faster than the equivalent operations in base (i.e. plain vanilla) R.

The idea of this post is to clarify some ideas. R does not compete with C or C++ because they are different kinds of languages. R with the data.table package may compete with Python with the numpy library. This does not mean that I'm defending R over Python or vice versa. The reason is that both R and Python are implemented as interpreters while C and C++ are implemented as compilers, and this means that C and C++ will generally be faster because, in really oversimplified terms, compiled code is closer to the machine.

How to be sure if you have the right setup

Open RStudio and run sessionInfo(). If you read something like:

Matrix products: default
BLAS/LAPACK: /opt/intel/compilers_and_libraries_2018.0.128/linux/mkl/lib/intel64_lin/libmkl_gf_lp64.so

Or like:

Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/openblas/libblas.so.3
LAPACK: /usr/lib/x86_64-linux-gnu/libopenblasp-r0.2.20.so

The important thing is to read libmkl, or openblas/libblas and libopenblas, at the end of these lines. Either one means that you are OK and using your resources properly.

But, if you see something like this:

Matrix products: default
BLAS: /opt/R/R-3.4.2-defaults/lib/R/lib/libRblas.so
LAPACK: /opt/R/R-3.4.2-defaults/lib/R/lib/libRlapack.so

with variants like libRblas/libRlapack or plain libblas/liblapack (with no openblas in the path) at the end of the lines, then you are wasting time because of an inefficient setup, and I invite you to reinstall R properly.
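The visual check above can also be done programmatically. What follows is only a minimal sketch: the helper uses_optimized_library() is a hypothetical name I made up for illustration, it only looks for "mkl" or "openblas" in the path, and it assumes R 3.4.0 or newer, where sessionInfo() reports the BLAS and LAPACK paths.

```r
# Hypothetical helper (not part of R): TRUE when a library path mentions
# MKL or OpenBLAS, FALSE otherwise. Other optimized libraries are not covered.
uses_optimized_library <- function(path) {
  setNames(grepl("mkl|openblas", path, ignore.case = TRUE), names(path))
}

# Paths copied from the sessionInfo() examples above.
example_paths <- c(
  mkl      = "/opt/intel/compilers_and_libraries_2018.0.128/linux/mkl/lib/intel64_lin/libmkl_gf_lp64.so",
  openblas = "/usr/lib/x86_64-linux-gnu/openblas/libblas.so.3",
  default  = "/opt/R/R-3.4.2-defaults/lib/R/lib/libRblas.so"
)
print(uses_optimized_library(example_paths))

# The same check against the running session.
info <- sessionInfo()
current_paths <- c(BLAS = info$BLAS, LAPACK = info$LAPACK)
print(uses_optimized_library(current_paths))
```

If both entries for the running session come back FALSE, the session is probably in the third situation described above.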

Basic setup for Ubuntu Desktop

As an Ubuntu user I can say the basic R installation from the Canonical or CRAN repositories works for most of the things I do on my laptop.

When I use RStudio Server Pro© that's a different story. There I really want to optimize things, because when I work with large data (e.g. 100 GB in RAM) even a 3% gain in resource efficiency or execution time is valuable.

Installing R with OpenBLAS will give you a tremendous performance boost, and that will work for most laptop situations. I explain how to do that in detail for Ubuntu 17.10 and Ubuntu 16.04, but a general setup would be as simple as one of these two options:

Option (1) is a substitute for option (2). It's totally up to you which one to use, and both will give you a really fast R compared to installing it without OpenBLAS.

Basic setup for OS X

Like I explained in this post, one reason to take some time to install R properly is when data.table or other packages return curious messages when you load them. In particular, R binaries for OS X are not optimized, and if you install and load data.table it will show this message:

This installation of data.table has not detected OpenMP support. 
It will still work but in single-threaded mode. 
If this is a Mac and you obtained the Mac binary of data.table from CRAN, 
CRAN's Mac does not support OpenMP.

If that's your case, you will really benefit from following my OS X post and installing R using Homebrew. I know compiling from source is slow, but it is not cool to have a cool MacBook© and do data analysis using one core of the processor.
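To see whether your installation is affected, you can compare the number of cores R detects with the number of threads data.table will use. This is only a sketch: it assumes data.table is installed (it is skipped otherwise), and a value of 1 is merely a hint that OpenMP support is missing, since the thread count can also be limited on purpose.

```r
# Cores detected on this machine (can be NA on some platforms).
cores <- parallel::detectCores()
cat("Cores detected by R:", cores, "\n")

# Threads data.table will use; only queried if the package is installed.
if (requireNamespace("data.table", quietly = TRUE)) {
  dt_threads <- data.table::getDTthreads()
  cat("Threads used by data.table:", dt_threads, "\n")
  if (dt_threads == 1) {
    cat("data.table is running in single-threaded mode.\n")
  }
} else {
  cat("data.table is not installed, so there is nothing to check.\n")
}
```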

Basic setup for Windows

I am kinda ignorant about Windows. When I used it I realized there are no numerical libraries that can be installed as easily as, or more easily than, what I explain in the rest of the post.

Microsoft© R Open is an R distribution that comes with the Intel© MKL numerical libraries enabled by default, so I'd install that on Windows. Its graphical installer is also straightforward.

Benchmarking different R setups

I already use R with OpenBLAS just like the setup above. I will compile parallel R instances to do the benchmarking.

Installing Intel© MKL numerical libraries

My benchmarks indicate that in my case it is worth taking the time to install Intel© MKL. Execution time is strongly reduced for some operations compared to R with OpenBLAS.

Run this to install MKL:

Installing CRAN R with MKL

To compile it from source (in this case it's the only option) run these lines:

Installing CRAN R with OpenBLAS

So as not to interfere with my working installation (installed using apt-get), I decided to compile a parallel instance from source:

Installing CRAN R with no optimized numerical libraries

There is a lot of discussion and strong evidence from different stakeholders in the R community indicating that this is by far the least efficient option. I compiled this just to make a complete benchmark:

Installing Microsoft© R Open with MKL

This R version includes MKL by default and it's supposed to be easy to install. I could not make it run, and that's bad because different articles (like this post by Brett Klamer) state that this R version is really efficient, though no different from standard CRAN R with MKL numerical libraries.

In any case here's the code to install this version:

Update: I followed the same steps above and it works on Ubuntu 16.04 but I still can't install it on a machine with Ubuntu 17.10.

Benchmark results

My scripts above edit ~/.profile. This lets me open RStudio and work with differently configured R instances on my computer.

I released the benchmark results and scripts on GitHub. The idea is to run the same scripts from AT&T© and Microsoft© to see how different setups perform.
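To get a quick feel for your own setup without downloading anything, a much smaller check in the same spirit can be run directly in R. This is only a sketch and not the AT&T© or Microsoft© scripts: the matrix size n and the helper elapsed() are hypothetical choices of mine, and the timings will depend on your machine.

```r
set.seed(2017)

# Hypothetical problem size; increase it to make the differences more visible.
n <- 500
a <- matrix(rnorm(n * n), nrow = n)

# A symmetric positive definite matrix, as required by chol().
spd <- crossprod(a) + diag(n)

# Hypothetical helper: elapsed wall-clock seconds for one expression.
elapsed <- function(expr) system.time(expr)[["elapsed"]]

timings <- c(
  matrix_multiply = elapsed(a %*% a),
  cholesky        = elapsed(chol(spd)),
  svd             = elapsed(svd(a))
)
print(timings)
```

Running the same lines in each R instance gives a rough comparison of the numerical libraries, since all three operations are handed to BLAS/LAPACK.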

To work with CRAN R with MKL I had to edit ~/.profile because of how I configured the instances. So I had to run nano ~/.profile and comment out the last part of the file to obtain this result:

After that I logged out and back in, then opened RStudio to run the benchmark.

The other two cases are similar, and the benchmark results were obtained by editing ~/.profile, logging out and in, and opening RStudio with the corresponding instance.

As an example, this result starts with the R version and the corresponding numerical libraries used in that session. All other results are reported in the same way.

And here are the results from the AT&T© script:

And here are the results from the Microsoft© script:

Task                            CRAN R with MKL (seconds)   CRAN R with OpenBLAS (seconds)   CRAN R with no optimized libraries (seconds)
Matrix multiply                                     5.985                           13.227                                         165.18
Cholesky Factorization                              1.061                            2.349                                         26.762
Singular Value Decomposition                        7.268                           18.325                                         47.076
Principal Components Analysis                      14.932                           40.612                                        162.338
Linear Discriminant Analysis                       26.195                            43.75                                        117.537
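The differences are easier to read as ratios. The sketch below only divides the timings reported in the table above; the column names are my own labels.

```r
# Timings in seconds, copied from the table above.
timings <- data.frame(
  task = c("Matrix multiply", "Cholesky Factorization",
           "Singular Value Decomposition", "Principal Components Analysis",
           "Linear Discriminant Analysis"),
  mkl          = c(5.985, 1.061, 7.268, 14.932, 26.195),
  openblas     = c(13.227, 2.349, 18.325, 40.612, 43.75),
  no_optimized = c(165.18, 26.762, 47.076, 162.338, 117.537)
)

# How many times faster MKL is than each alternative, per task.
speedup <- data.frame(
  task                = timings$task,
  mkl_vs_openblas     = round(timings$openblas / timings$mkl, 1),
  mkl_vs_no_optimized = round(timings$no_optimized / timings$mkl, 1)
)
print(speedup)
```

For these five tasks MKL comes out between roughly 1.7 and 2.7 times faster than OpenBLAS, and the gap with respect to R without optimized libraries is much larger.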

Concluding remarks

The benchmarks presented here are in no way a definitive end to the long discussion on numerical libraries. My results provide some evidence that, because it is faster for some operations, I should use MKL.

One of the advantages of the setup I explained is that you can use MKL with Python. In that case numpy calculations will be boosted.

Using MKL with AMD© processors might not provide an important improvement compared to using OpenBLAS. This is because MKL uses specific processor instructions that work well with i3 or i5 processors but not necessarily with non-Intel models.