
Linking C with R in Windows

R is great for many aspects of statistical analysis, but it is often criticized for running iterative programs (e.g., for loops) slowly. Many new packages attempt to rectify this by allowing the user to 1) use multiple processing cores (see Todd Jobe’s Blog for an example), or 2) call compiled code from within R (the purpose of this post). Both solutions take time to implement but often pay off in the long run. I have found that solution #1 is usually best if there are a few iterations that each take several seconds to process, and solution #2 is usually best if there are many thousands of iterations over a very trivial calculation.

There is actually a decent amount on the web about how to format your syntax in R and C to make things work out well (see this page for example); however, almost all of the instructions and tutorials out there assume that you are able to compile your C code yourself. If you’re like me and you use a PC that runs Windows, you may not be very familiar with compiling your own code. This is where I can lend some instruction based upon my own experiences.

Tutorial Outline:
1) Getting Windows ready to go
2) Writing the C code
3) Compiling the C code
4) Loading and calling C code in R

1) Getting Windows ready to go:

First you are going to have to install some tools that will allow you to build R packages and compile C code from the command line (R installation and Administration – Appendix D).

Luckily all of the tools you will need have been bundled together for you as Rtools.exe which can be downloaded from any CRAN mirror (e.g., http://cran.cs.wwu.edu/bin/windows/Rtools)

Once you have downloaded and installed those tools, you will need to change the PATH environment variable. I think this is necessary so that the R tools you just installed can be called from the command line or DOS prompt. Here is a link that describes three different ways to perform this: R FAQ: 2.15. We need to redefine the path such that it includes (see R installation and Administration – Appendix D):

PATH=c:\Rtools\bin;c:\Rtools\gcc-4.6.3\bin;c:\R-2.15.1\bin\i386;<others>

Note that there may be a way to have this path changed for you automatically when installing Rtools, although I found that I had to perform this task manually. I will describe in detail how to do this because information on this topic is scant on the web. As noted above, there are three ways to change the PATH environment variable; here I will describe the third method (quoted directly from R FAQ: 2.15):

“For all applications via Windows. How you set an environment variable is system specific: under Windows 2000/XP/2003 you can use `System’ in the control panel or the properties of `My Computer’ (under the `Advanced’ tab). Under Vista, go to `User Accounts’ in the Control Panel, and select your account and then `Change my environment variables’.”

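Once you are in the environment variables dialog, select “New”. Name the new variable “PATH” and then set the value to (at a minimum) the directory list below. Enter only the list itself, without a leading “PATH=”, because the variable name goes in its own field. If a PATH variable is already listed, select it and choose “Edit” instead, and add these directories to the front of its value.

c:\Rtools\bin;c:\Rtools\gcc-4.6.3\bin;c:\R\R-2.15\bin\i386;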

Note that c:\R\R-2.15\bin\i386 above should be changed to match the directory of the R installation that you downloaded Rtools for. Then click “OK” twice; your environment variables should now be set, and you should be ready to compile some C code with R via the command line.

After performing these steps be sure to restart your machine.

2) Writing the C code:

Before we compile our C code we need to generate some sample code or lift some from the web. Here are two example functions written in C that you can copy and paste into a text file you name “sequence_examples.c” (exactly what you name it does not matter).

/*
Filename: "sequence_examples.c"
Return vectors of sequentially summed values
Arguments:
start -- value to start the sum at
size -- the number of elements to return
sumVect -- the vector of summed output values
*/


void sumSeq(int *start, int *size, int *sumVect){
    /*
    This function provides a simple sequential sum
    where F[n] = F[n-1] + n
    */

    int i, j ;
    j = 0 ;
    for(i = *start; i < (*start + *size); i++){
        if(i == *start){
            sumVect[j] = i ;
        }
        else{
            sumVect[j] = sumVect[j-1] + i ;
        }
        j ++ ;
    }
}

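void fiboSeq(int *size, int *sumVect){
    /*
    This function returns the Fibonacci sequence
    where F[n] = F[n-1] + F[n-2]
    */

    int i ;
    /* only write the seed values that fit, so a size
       smaller than 2 never overruns sumVect */
    if(*size > 0){
        sumVect[0] = 0 ;
    }
    if(*size > 1){
        sumVect[1] = 1 ;
    }
    for(i = 2; i < *size; i++){
        sumVect[i] = sumVect[i-1] + sumVect[i-2] ;
    }
}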
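
Both functions above follow the conventions that .C expects: every argument arrives as a pointer, the function returns void, and the results are written into a vector that R allocated beforehand. As a further sketch of the same pattern, the function below fills a double vector instead of an int vector, which corresponds to an R numeric vector passed with as.double(). The name fiboSeqDouble and its argument names are my own and are not part of the original file. I am assuming here that you want more than 47 elements, which is where a 32-bit int would overflow. Keep in mind that doubles only hold integers exactly up to 2^53, so very long sequences will eventually lose exactness.

```c
/*
Hypothetical companion to fiboSeq (not part of the original file)
Return the Fibonacci sequence as doubles
Arguments:
size -- the number of elements to fill
fibVect -- caller-allocated vector of at least *size doubles
*/

void fiboSeqDouble(int *size, double *fibVect){
    int i ;
    /* guard the two seed values so short vectors are never overrun */
    if(*size > 0){
        fibVect[0] = 0.0 ;
    }
    if(*size > 1){
        fibVect[1] = 1.0 ;
    }
    for(i = 2; i < *size; i++){
        fibVect[i] = fibVect[i-1] + fibVect[i-2] ;
    }
}
```

If you add this to sequence_examples.c and recompile, the matching R call would look like .C("fiboSeqDouble", size = as.integer(60), fibVect = as.double(rep(0, 60)))$fibVect.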

3) Compiling the C code:

Once you have R ready to go and your bit of C code, you’ll need to compile it into a .dll file that can be dynamically loaded into R. To compile the .c file you must open the Windows Command Prompt, a program that lets you type commands directly to the operating system. Once the Command Prompt is open, you’ll have to direct it towards the folder that contains the .c file you wish to compile. For example, if my file was on my desktop, I would enter the following into the Command Prompt:

cd C:\Users\dmcglinn\Desktop

The cd in the above statement stands for change directory. Now all you have to do is send the compile command to the Command Prompt with your file name. For example, if I wanted to compile the file sequence_examples.c I would send the following command to the Command Prompt:

R CMD SHLIB sequence_examples.c

If no errors occur (error messages will appear in the Command Prompt, and they are usually fairly informative), then two files should be produced: sequence_examples.o and sequence_examples.dll. The file sequence_examples.dll is the only one you will need to link in with R.

4) Loading and calling C code in R:

Now that our C code is compiled we are ready to bring it into the R environment like so:

dyn.load("sequence_examples.dll")

.C("sumSeq", start = as.integer(10), size = as.integer(5),
   sumVect = as.integer(rep(0, 5)))

.C("fiboSeq", size = as.integer(5),
   sumVect = as.integer(rep(0, 5)))

These R commands should return lists that contain the inputs to the C function and the resulting output. Because we are usually only interested in the output, it is a good idea to be more specific about which portion of the list you would like returned.

.C("sumSeq", start = as.integer(10), size = as.integer(5),
   sumVect = as.integer(rep(0, 5)))$sumVect

.C("fiboSeq", size = as.integer(5),
   sumVect = as.integer(rep(0, 5)))$sumVect

That’s all there is to it. To make this cleaner, I would next place my call to the .C function in an R wrapper function that handles the passing of my R variables to the C functions for me.
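
One job of such a wrapper is making sure that sumVect is at least size elements long, because the C code cannot see the length of the R vector it receives. As a complementary sketch on the C side, the variant below takes the allocated length as an extra argument and reports a status code back to R rather than writing past the end of the vector. The name fiboSeqChecked, the vectLen and status arguments, and the status values are all assumptions for illustration and are not part of the original file. The limit of 47 elements assumes a 32-bit int.

```c
/*
Hypothetical checked variant of fiboSeq (not part of the original file)
Arguments:
size -- the number of elements requested
vectLen -- the length of sumVect as allocated in R
sumVect -- the vector of output values
status -- set to 0 on success, 1 if the request was rejected
*/

/* F[46] is the largest term that fits in a 32-bit int, so at most
   47 elements (F[0] to F[46]) can be filled without overflow */
#define MAX_FIBO_TERMS 47

void fiboSeqChecked(int *size, int *vectLen, int *sumVect, int *status){
    int i ;
    /* reject negative sizes, requests larger than the allocated
       vector, and requests that would overflow an int */
    if(*size < 0 || *size > *vectLen || *size > MAX_FIBO_TERMS){
        *status = 1 ;
        return ;
    }
    if(*size > 0){
        sumVect[0] = 0 ;
    }
    if(*size > 1){
        sumVect[1] = 1 ;
    }
    for(i = 2; i < *size; i++){
        sumVect[i] = sumVect[i-1] + sumVect[i-2] ;
    }
    *status = 0 ;
}
```

An R wrapper around this could pass length(sumVect) as vectLen, then check the status element of the returned list and call stop() when it is not 0.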

Please leave feedback or suggestions below. Good luck!

11 Comments

  1. By: Luke on March 3, 2011 at 12:48 pm      

    Thank you soooo much for posting this. You rock. Thank you.

  2. By: Dan McGlinn on March 10, 2011 at 9:29 am      

    Hey Luke,
    I’m glad this was helpful to you!
    Dan

  3. By: Brady on April 14, 2011 at 10:30 am      

    Dan,

    I hope all is well in NC. The prairies are greening up beautifully out here (though a little slower than normal, we need some rain).

    You might want to check out the ‘inline’ package. It does the compiling, linking, and loading of C/C++ code stored as strings in R. I’ve been experimenting/playing with it, along with the ‘Rcpp’ package. Both work quite nicely. I’m using linux, however, haven’t tried it in windows.

    All the best,

    Brady

  4. By: Dan McGlinn on April 14, 2011 at 10:45 am      

    Hey Brady,

    Great to hear from you! I miss the prairies particularly at this time of year! Thanks for the tips on those packages. I had tried to get them to work briefly but had little success. Maybe I should take another shot at it.

    Dan

  5. By: Tom Boucher on June 29, 2011 at 11:39 am      

    Hi All,
    I have the ‘inline’ package running on Windows 7, with some very simple C code being compiled, linked and loaded. So far, this is very nice. Thanks for your help.

    Best,

    T-

  6. By: Carol on November 11, 2011 at 5:48 pm      

    Thank you soooo much for this. You saved my life!

  7. By: Freddy on May 14, 2012 at 6:36 pm      

    Muchas gracias, excelente documentación.

  8. By: Philipp on November 7, 2012 at 6:23 pm      

    Hello,

    some feedback:

    * Second PATH example is listing the last path twice.
    * C source has HTML entities instead of operators.
    * R source lacks the final closing parentheses.
    * Fibonacci breaks on size smaller than 2.

    Thanks.

  9. By: Dan McGlinn on January 17, 2013 at 12:48 am      

    Hey Philipp,

    Thank you for pointing out those mistakes. I think I fixed them all except for the note about the Fibonacci function. I was able to get the existing code to work when the size was smaller than 2. You do have to make sure to also update the size of the sumVect argument. For example,

    .C("fiboSeq", size = as.integer(1), sumVect = as.integer(rep(0, 1)))$sumVect

    will correctly return 0.

    Thanks again!
    Dan

  10. By: lisa on April 18, 2013 at 9:52 pm      

    Thanks it’s gorgeous!

    I spent 2 days trying to link R with C and only succeeded last night on my Mac. Now I can fully use it on my office desktop (which runs Windows)!

    However, when I compiled my .c file in cmd there was a warning that I didn’t understand. It does not affect the generation of the .o and .dll files, so I just ignored it.

    But I’d very much appreciate it if you could help me with the warning :)

    Thanks anyway! It’s really awesome!

    ————- My warnings:
    cygwin warning:
    MS DOS style path detected: …
    Preferred POSIX equivalent is: /cygdrive/….
    CYGWIN environment variable option “nodosfilewarning” turns off this warning

  11. By: Dan McGlinn on April 25, 2013 at 1:32 am      

    Hey Lisa, I’m glad the post was useful for you. I’m not sure about that warning. It sounds like it’s just a cygwin warning and not an error coming from the C compiler, so yes, it’s safe to ignore. If you are still curious about the warning, I would post a question at http://stackoverflow.com/
