
Problems setting root.dir in knitr

RStudio sets the working directory to the project directory, but knitr sets the working directory to the directory containing the .Rmd file. This causes problems when your R Markdown file sources files using paths relative to the project directory. Specifically, knitr tells you it can’t find those files:

Error in file(filename, "r", encoding = encoding) : 
  cannot open the connection
In addition: Warning message:
In file(filename, "r", encoding = encoding) :
  cannot open file 'home/sus/Documents/research_phd/analysis/phenodata_explore.R': No such file or directory

knitr has an option for dealing with this – root.dir – and Phil Mike Jones even gives a nice little knitr chunk example of using it.

But when I tried it, knitr kept failing to find my file.

```{r "setup", include=FALSE}
knitr::opts_knit$set(root.dir = "~/Documents/research_phd/")
# the source() call for my script followed here, in the same chunk
```

It turns out I should have read the documentation more closely.

Knitr’s settings must be set in a chunk before any chunks which rely on those settings to be active. It is recommended to create a knit configuration chunk as the first chunk in a script with cache = FALSE and include = FALSE options set. This chunk must not contain any commands which expect the settings in the configuration chunk to be in effect at the time of execution.

Breaking it into two chunks separating the knitr configuration from the sourcing solved the problem.

```{r "knitr config", cache = FALSE, include=FALSE}
knitr::opts_knit$set(root.dir = "~/Documents/research_phd/")
```

```{r "setup", echo = FALSE}
# source() calls with paths relative to root.dir go here
```

Pinus contorta distribution map in #rstats

I made a map in R for the first time last week using these guides by Kim Gilbert and Mollie Taylor.

Pinus contorta range map including all subspecies. White areas within the distribution boundary contain no lodgepole. Based on Little 1971.


As you can see, I wasn’t able to show the holes in the distribution properly. Ideally, they would be actual holes showing the base map. I couldn’t get geom_map to not fill in the holes, so I overfilled them with white.

The code for the map is below and the shapefile I used is from the USGS GECSC Tree Species Distribution Maps for North America.

If anyone’s got a shapefile for just subspecies latifolia or a more recent distribution map, I’d love to use it.

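library(maptools)      # readShapePoly(); maptools was retired from CRAN in 2023
library(RColorBrewer)  # brewer.pal()
library(ggplot2)       # fortify(), geom_map()
library(ggmap)         # get_map(), ggmap(); Google tiles now need an API key, see register_google()

pcontorta <- readShapePoly("pinucont.shp")
colors <- brewer.pal(9, "BuGn") # make pretty color palette

basemap <- get_map(location = c(lon = -120, lat = 50), # build basemap of Western North America
  color = 'color',
  source = 'google',
  maptype = 'terrain',
  zoom = 4)
basemap <- ggmap(basemap)

pcontorta.points <- fortify(pcontorta) # polygons as a data frame of long/lat points
pcontorta.holes <- pcontorta.points[pcontorta.points$hole, ] # only the rings flagged as holes

lodgepole <- geom_map(inherit.aes = FALSE, # make a layer for the lodgepole distribution
  mapping = aes(map_id = id), # geom_map() needs map_id to link data rows to map polygons
  data = pcontorta.points,
  map = pcontorta.points,
  fill = colors[9],
  alpha = .5)

holes <- geom_map(inherit.aes = FALSE, # fill the holes with white
  mapping = aes(map_id = id),
  data = pcontorta.holes,
  map = pcontorta.holes,
  fill = "#FFFFFF",
  alpha = 1)

basemap + lodgepole + holes + # put it all together
  xlab("Longitude") + ylab("Latitude") +
  ggtitle("Lodgepole Pine distribution")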
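
One possible way around the white overfill, sketched here on toy data rather than the lodgepole shapefile: newer versions of ggplot2 (3.2.0 and up) give geom_polygon a subgroup aesthetic that cuts inner rings out of a polygon instead of filling them. This swaps geom_map for geom_polygon, and the data frame and column names below (range_with_hole, ring) are made up for illustration.

```r
library(ggplot2)

# Toy data (hypothetical): one square "range" with a square gap in the middle.
# `ring` separates the outer boundary from the hole.
range_with_hole <- data.frame(
  x    = c(0, 4, 4, 0,   1, 1, 3, 3),
  y    = c(0, 0, 4, 4,   1, 3, 3, 1),
  id   = "range",
  ring = rep(c("outer", "hole"), each = 4)
)

p <- ggplot(range_with_hole, aes(x, y)) +
  # `group` identifies the polygon, `subgroup` identifies its rings,
  # so the inner ring is left unfilled rather than painted over.
  geom_polygon(aes(group = id, subgroup = ring),
               fill = "darkgreen", alpha = 0.5) +
  coord_equal()

p
```

With fortified shapefile data the analogous mapping would presumably be group = id and subgroup = group, drawn on top of the ggmap base layer with inherit.aes = FALSE, but that is untested against the lodgepole shapefile.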

Importing date data into R from Excel

If you’re importing dates from Excel into R with the gdata package, you might get some funny numbers instead of the dates you’re expecting. These numbers are day counts from an origin date, which falls either at the start of 1900 or on January 1, 1904, depending on which version of Excel the dates were entered in.

This StackOverflow question helped me get numbers like 40693 looking more like dates. So something like

as.Date(40693, origin="1899-12-30")

worked for me. I expected the origin to be 1900-01-01, but that was off by two days: Excel numbers January 1, 1900 as day 1 rather than day 0, and it also counts a February 29, 1900 that never existed. 1899-12-30 gave me the right dates, so I won’t argue. I will, however, be double checking the other dates for weirdness.
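
Here is a minimal sketch of the same conversion wrapped up for both Excel date systems, using only base R. The helper name excel_serial_to_date and its date_system argument are hypothetical, not part of gdata, and the 1900 origin is only right for dates from March 1, 1900 onward.

```r
# Hypothetical helper: convert Excel serial day numbers to R Dates.
excel_serial_to_date <- function(serial, date_system = c("1900", "1904")) {
  date_system <- match.arg(date_system)
  origin <- switch(date_system,
    "1900" = "1899-12-30",  # correct for serials from 1 March 1900 onward
    "1904" = "1904-01-01"   # serial 0 is 1 January 1904
  )
  as.Date(serial, origin = origin)
}

format(excel_serial_to_date(40693))                        # "2011-05-30"
format(excel_serial_to_date(39231, date_system = "1904"))  # same day, 1904 system
```

The two systems differ by 1462 days, so the same calendar day is stored as 40693 in a 1900-system workbook and as 39231 in a 1904-system one. Which system a given file uses is something to check in Excel itself.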

Making R go faster

If you need to speed up more than just your for loops in R, you might want to see Noam Ross’s FasteR! HigheR! StrongeR! – A Guide to Speeding Up R Code for Busy People. It’s super practical and easy to understand.