Preface

This is the website/book associated with the UCMexus Conservation Genomics Workshop 2022, held January 10th and 11th in Mérida, Mexico.

0.1 Setting up your computer

This course covers topics in landscape genetics/genomics, and relies heavily on the R programming language. In order to follow along with the code and be successful in running all of the examples, it is imperative to have very recent versions of R and RStudio, and updated versions of a number of packages.

The following is a description of the software needed to engage in the course. This setup was tested on a Mac running Mojave 10.14.6, but should also work on most other Mac or Windows operating systems.

If you are running Linux, there will be some external dependencies to install (such as the GEOS library), that are actually wrapped up in the binary versions of packages ‘terra’ and ‘sf’ on CRAN for Mac and Windows.

0.1.1 Step 1. Get the latest versions of R and RStudio

  • First, install the latest version of R. Go to https://cran.r-project.org/ and follow the appropriate link to Download and Install, depending on your operating system (Linux, MacOS, or Windows).
    • For Mac, you can download R-4.1.2.pkg and install.
    • For Windows, first go into the base directory and Download R 4.1.2, and install it. THEN, go back to “R for Window” page where you clicked into base, and download and install the Rtools as well. The latter gives you tools for building packages, which is required for a few packages we use.
  • Download and install the latest stable version of RStudio for your operating system. Go to https://www.rstudio.com/products/rstudio/download/ and choose the big blue download button for “RStudio Desktop, Open Source License, Free,” then hit the download button on the next page and follow instructions to install RStudio.

0.1.2 Step 2. Install a number of R packages that are relatively easy to install

Our work will require a number of packages that can be found in binary form on CRAN. As such, installing them is typically not to arduous.

Sometimes, when installing packages, you may get a message telling you that a later version of the package you want is available in source form than in binary form. Typically, it is still easiest, fastest, and usually reliable, to just use the binary form. So, after R gives you such a message, if you see R asking a question like:

Do you want to install from sources the package which needs compilation? (Yes/no/cancel)

Usually the appropriate answer is to type no into the console, and hit return.

I have found that sometimes, when requesting the installation of a large number of packages, there can be the occasional problem. Sometimes, an error message tells you which package (often a dependency) failed to install. If that is the case, try to install the failed package by itself with the install.packages(), function, directly, and then re-run the install.packages() command that originally failed.

You can install packages in a few rounds of different types. Note that I tend to use the RStudio CRAN mirror.

0.1.2.4 The ‘gradientForest’ package

Now, we come to a somewhat harder package to install, because it requires some compilation. We use these packages for the gradient random forest analysis on the last day. ‘gradientForest’ requires the dependancy ‘extendedForest,’ an R package for classification and regression based on forest trees using random inputs.

Both of these packages require compilation, it seems. With the newest version of R, ‘extendedForest’ is installed automatically as a dependency.

To compile packages might require a little more configuration, etc.

If you are using a Mac

You should be running R 4.0.0 or above. In that case, to build packages, you need the XCode Command line tools. If you don’t already have these, then you should install them. Doing so requires administrator access on your computer. Open the “Terminal” app and type:

xcode-select --install

into the command line. When asked for your password, provide it. Then click to agree on the software license agreement, and finally click the blue “install” button that comes up on the other screen.

Then, you also need to get the gFortran compiler for Mac. Information about this can be found by going to https://mac.r-project.org/tools/. Find the link (inside the yellow boxes) for your appropriate type of mac (intel or ARM64), and download the appropriate installer for the gFortran compiler. Then install it. In my case, since I have an Intel Mac, I downloaded, gfortran-8.2-Mojave.dmg. To install it, you have to double click it and then go inside the folder that creates, to find the gfortran.pkg, which you can double click to launch the installer. Install it to the default location.

On my Mac running Mojave (10.14.6) I got errors when trying to compile this package. It was unable to find the C ‘stdlib.h’ header file. So, I ended up also having to do this:

sudo installer -pkg /Library/Developer/CommandLineTools/Packages/macOS_SDK_headers_for_macOS_10.14.pkg -target /

If you have compilation problems on your Mac you might try something similar. But you might have to name the headers for your own system. perhaps by tab completing on:

sudo installer -pkg /Library/Developer/CommandLineTools/Packages/macOS_SDK_headers_for_macOS_10

If you have a Windows computer

We haven’t tested it, but you should be just fine so long as you installed the Rtools as described above.

Finally, if you have set up your system on Mac or Windows, install the package

The package we need is not on CRAN. Rather, it is on the R-forge repository.

In this case, we might be asked:

Package which is only available in source form, and may need compilation of
  C/C++/Fortran: ‘extendedForest’
Do you want to attempt to install these from sources? (Yes/no/cancel)

And, for this one, we need to respond, Yes

install.packages("gradientForest", repos="https://R-Forge.R-project.org")

If all goes well, this should compile for you.

0.1.3 Step 3. Make sure you have git and an account on GitHub

In a two-day workshop, we don’t have time to go deeply, if much at all, into the many uses of the version control software, git, and the cloud-based code management system GitHub, that is built upon git. But, if you are interested in version control for your analyses, and you are interested in using GitHub to share and present the results of your research, then you really will want to become proficient with both git and GitHub.

Fortunately, there is an outstanding, free book on the web that goes into great detail about how to use git and GitHub with R and RStudio. It is available at https://happygitwithr.com/, and it is well worth a read, and particularly following the steps in:

  • Chapter 4. Register a GitHub account.
  • Chapter 6. Install Git (Note: Mac users , if xcode-select --install ran successfully, then git will have been installed).
  • Chapter 7. Introduce yourself to git

If you want to use GitHub, you will also have to establish an SSH public/private key pair to authenticate your computer to GitHub. That is described in:

  • Chapter 10: Set up keys for SSH