This tutorial explains different methods for using the cloud to run an analysis in parallel.
Running the analysis in parallel
So far our analysis is run sequentially so it will not be able to
take advantage of the multiple cores available on a cloud machine. To
make use of parallel computing we can set up our script to run each
model on a separate core. Because this example is so simple I have added
a delay in the function so we can see the benefit of running in
parallel. There are many ways to do this and I will explain a few
options. One option is to create a parallel backend in R using something
like the future
package. This can be quite straight forward
and there are packages to connect this with familiar methods of
iterating (eg for loops, lapply
or
purrr::map
). This can get a bit tricky in that you are
typically running only parts of the script on multiple cores and you
need to make sure the right dependencies are available on the workers.
future
manages this for the most part but see https://future.futureverse.org/articles/future-4-issues.html
for tips when that fails.