Parallelisation
One reason for long runtimes is the serial execution of code within R. In the most basic case we have, for example, a for-loop with many iterations that are processed one after another. We already know that for-loops are not the most efficient construct in R (although they have become considerably faster over the years), so a first step is to replace the loop with one of the *apply family of functions.
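As a minimal sketch of this first step (the data and the squaring task are made up purely for illustration), the same computation can be written as an explicit for-loop or with sapply():

    # A toy task: square each element of a vector.
    x <- 1:1e5

    # For-loop version: fill a pre-allocated result vector one
    # iteration at a time.
    res_loop <- numeric(length(x))
    for (i in seq_along(x)) {
      res_loop[i] <- x[i]^2
    }

    # *apply version: express the same computation as a function
    # applied to every element.
    res_apply <- sapply(x, function(v) v^2)

    identical(res_loop, res_apply)  # TRUE

Both versions still run on a single core, one element after another.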
This is, however, only the first step towards fast calculations. Since the iterations of a for-loop usually do not depend on each other, they could be processed all at once, or at least in groups. In other words, a for-loop is an easy candidate for parallelisation.
We then have two different strategies: either we stay on the same computer and make use of the existing cores (most likely somewhere between 4 and 8), or we send the work to different computers, run it there and collect the results afterwards. For the first approach we can use a concept known as forking; for the second, something more sophisticated such as the Message Passing Interface (MPI) is needed.
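A minimal sketch of the forking approach, assuming a Unix-like system (forking is not available on Windows) and using a made-up placeholder function slow_task() to stand in for one independent loop iteration:

    library(parallel)

    # Placeholder for one expensive, independent iteration.
    slow_task <- function(i) {
      Sys.sleep(0.1)  # stand-in for real work
      i^2
    }

    # How many cores does this machine offer?
    n_cores <- detectCores()

    # mclapply() forks the current R process and distributes the
    # iterations over the available cores; the result is a list with
    # one element per input value.
    res <- mclapply(1:32, slow_task, mc.cores = n_cores)

For the multi-machine route, the parallel package can also drive a cluster of worker processes (created with makeCluster() and used via parLapply()); MPI backends are provided by additional packages such as Rmpi.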