Category Archives: Fixes

Importing date data into R from Excel

If you’re importing dates from Excel into R with the gdata package, you might get some funny numbers instead of the dates you’re expecting. These numbers are Excel serial dates: a count of days from an origin in either 1900 or 1904, depending on which date system the workbook uses (the 1904 system was the default in older Mac versions of Excel).

This StackOverflow question helped me get numbers like 40693 looking more like dates. So something like

as.Date(40693, origin="1899-12-30")

worked for me. I don’t really know why I had to use 1899-12-30 for the origin – I thought I should use 1900-01-01, but that was off by a couple of days. 1899-12-30 gave me the right dates, so I won’t argue. I will, however, be double-checking the other dates for weirdness.
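
A likely explanation for the odd origin is that Excel’s 1900 date system counts January 1, 1900 as day 1 rather than day 0, and also treats 1900 as a leap year, so dates after February 1900 end up shifted by two days relative to 1900-01-01. Here is a minimal sketch of a conversion helper. The names excel_to_date() and date_system are hypothetical (they are not part of gdata or base R), and the 1904 branch assumes the workbook really uses Excel’s 1904 date system.

```r
# Hypothetical helper: convert Excel serial day numbers to R Dates.
# "1900" is the usual Windows date system; "1904" is the older Mac one.
excel_to_date <- function(serial, date_system = c("1900", "1904")) {
  date_system <- match.arg(date_system)
  # 1899-12-30 absorbs both the "day 1" offset and Excel's phantom 1900-02-29
  origin <- if (date_system == "1900") "1899-12-30" else "1904-01-01"
  as.Date(serial, origin = origin)
}

format(excel_to_date(40693))          # "2011-05-30"
format(excel_to_date(40693, "1904"))  # same serial read with the 1904 origin
```

If the converted dates look about four years off, that is a hint the file was saved with the other date system.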

Slow for loops in R

When I wrote my first for loop ever (in R), it was a pretty exciting moment in my life. There was definitely some happy dancing. Some of my for loops were really fast, but others that seemed pretty similar to my eyes took a really long time. I had a hard time figuring out why. If you are in the same boat, I recommend reading this section of Thomas Girke’s Programming in R manual. It’s much faster than the “trial-and-error with occasional advice from other R users” approach I took.
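
One common culprit, and this is only an assumption about what might be going on rather than a summary of the manual, is growing an object inside the loop instead of allocating it once. The sketch below compares three ways of computing the same squares. The function names are made up for illustration.

```r
# Grows the result one element at a time, so R copies it on every iteration.
grow <- function(n) {
  out <- c()
  for (i in seq_len(n)) out <- c(out, i^2)
  out
}

# Allocates the full-length vector once and fills it in place.
prealloc <- function(n) {
  out <- numeric(n)
  for (i in seq_len(n)) out[i] <- i^2
  out
}

# No explicit loop at all: arithmetic in R is vectorized.
vectorized <- function(n) seq_len(n)^2

n <- 1e4
# All three give the same answer; only the running time differs.
stopifnot(identical(grow(n), prealloc(n)), identical(prealloc(n), vectorized(n)))

system.time(grow(n))
system.time(prealloc(n))
system.time(vectorized(n))
```

The timings will vary by machine, so try it with a larger n to see the gap widen.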

Error installing R package e1071

I wanted to play around with the Floyd-Warshall algorithm today. Lucky for me, it’s included as part of the e1071 package. When I tried to install it – from the CRAN repository and from source – I got

ERROR: 'configure' exists but is not executable

The solution (from a recent post on Jonathan Callahan’s blog) is to “set the TMPDIR environment variable which R will use as the compilation directory.” For me, this meant

$ mkdir ~/tmp
$ export TMPDIR=~/tmp
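
To check that R actually picked up the variable, something like the following can be run in an R session started from the same shell. This is only a sketch: it assumes R was launched after the export, and the install.packages() call is left commented out because it needs a network connection.

```r
# Value R inherited from the shell; an empty string means TMPDIR was not set.
Sys.getenv("TMPDIR")

# Per-session temporary directory; it should sit under the TMPDIR above.
tempdir()

# Then retry the installation:
# install.packages("e1071")
```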

R adding ‘X.’ to column names

I received some data recently in an Excel file. I opened it in LibreOffice Calc to have a quick look, saved it as a csv, and (tried) to get down to business in R. But all of my column names were an awful mess. Instead of

SPU_Number  SPU_Name  Long_Site  Site

my column names were prepended with X. and appended with a period, like so:

X.SPU_Number.  X.SPU_Name.  X.Long_Site.  X.Site.

I’d used read.csv(), which uses read.table(). This StackOverflow answer clued me in to the fact that R thought there were special characters at the beginning and end of my column names. I reopened the file in LibreOffice Calc and, annoyingly, saw absolutely no special characters at the beginning and end of my column names. But when I opened up the file in a normal text editor, I saw that my column names had all been single-quoted!

To prevent this from happening to you, make sure you check “Edit Filter Settings” in the Save As dialogue and then make sure both “Save cell content as shown” and “Quote all text cells” are unchecked. Or don’t use LibreOffice Calc – Excel won’t screw up your column names like this.
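
If re-exporting the file isn’t an option, the quotes can also be dealt with on the R side. The sketch below uses the column names from above with one made-up data row. By default read.csv() only treats double quotes as quoting characters, so the single quotes end up in the names and get turned into X. and a trailing period. Passing quote = "'" tells it to strip them instead.

```r
# Column names from the post; the data row is invented for illustration.
csv_text <- "'SPU_Number','SPU_Name','Long_Site','Site'\n1,'Alpha','North Site','N'\n"

# Default behaviour: single quotes are kept, then mangled into valid names.
bad <- read.csv(text = csv_text)
names(bad)   # "X.SPU_Number." "X.SPU_Name." "X.Long_Site." "X.Site."

# Treat single quotes as the quoting character.
good <- read.csv(text = csv_text, quote = "'")
names(good)  # "SPU_Number" "SPU_Name" "Long_Site" "Site"
```

With a real file, the same quote argument should work with a file path in place of text =.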