Question of the Week

Using Deducer to work with R

2011-08-16 by andyw. 1 comments

If one checks out the initial question that prompted this series, a common theme in the answers is that one should use the GUI as a tool to help one build code (and not just as a crutch to do the analysis). Being able to view the code produced by the GUI should help beginner R users learn the commands (and facilitate scripting analysis in the future).  The following blog post highlights one of the popular GUI extensions to R, the Deducer package, and my initial impressions of the package.

It is obvious the makers of the Deducer package have spent an incredible amount of time creating a front end GUI to build plots using the popular ggplot2 package. While it has other capabilities, it is probably worth checking out for this feature alone (although it appears R-Commander recently added similar capabilities as well).

Installation: Long story short, on my Windows 7 laptop I was able to install Deducer once I updated to the latest version of R (2.13 as of writing this post), and installed some missing Java dependencies. After that all was well, and installing Deducer is no different than installing any other package.

What does Deducer do? Deducer adds GUI functionality to accomplish the following tasks:

  • load in data from various formats (csv, SPSS, etc.)
  • view data + variable types in separate data viewer
  • conduct data transformations (recode, edit factors, transformations, transpose, merge)
  • statistical analysis (mean differences, contingency tables, regression analysis)
  • A GUI for building plots using the ggplot2 package

Things I really like:

  • data viewer (with views for spreadsheet and variable view)
  • ability to import data from various formats
  • regression model explorer
  • interactive building of plots (the ability to update currently built plots is pretty awesome)

Minor Quibbles:

  • I would like all of the commands to open in their own call window, not just plots (or be able to route commands to an open (or defined script window) ala log files.
  • I am unable to use the console window if another deducer window is open (e.g. data view, regression model explorer).

Overall I’m glad I checked out the package. I suspect I will be typing library(Deducer) in the future when I am trying to make some plots with the ggplot2 package. The maintainers of the package did a nice job with including a set of commands that are essential for data analysis, along with an extensive set of online tutorials and a forum for help with the software. While the code Deducer produces by point and click is not always the greatest for learning the R language, it is a start in the right direction for those wishing to learn R.

Using Emacs to work with R

2011-08-02 by chlalanne. 10 comments

A simple yet efficient way to work with R consists in writing R code with your favorite text editor and sending it to the R console. This allows to build efficient R code in an incremental fashion. A good editor might even provide syntax highlighting, parenthesis matching, and a way to send a selected portion of code to R directly. That may appear a crude way of working with R, especially for those used to statistical packages featuring a spreadsheet-view of the data and a lot of menus and buttons with which the user can interact. However, R is a statistical language and offers a lot more interactivity, though that might hardly be reduced in a series of click and go actions. So, basically, let’s keep it simple and just use an R-aware text editor.

Well, install Emacs if it is not already present on your system, and you’re almost done. Emacs is a powerful tool (it’s difficult to say it is just an editor) for programmers and users dealing with text file. It offers a lot of functionalities and will be suitable for the basic copy/paste activity described above. But, wait. There is more to see with the ESS package. Now, you will have access to a lot of R-specific functionalities, including syntax highlighting, auto-indentation of code, line-by-line evaluation, etc. and you won’t have to open an external R console: everything can be done from within Emacs.

A nice overview of Emacs capabilities has been given by Dirk Eddelbuettel in his answer to the following question on SO.

In the following we will describe one possible way of working with ESS. This will be oriented towards users which have minimal experience with Emacs. First, you will need to learn how to perform basic text operations with Emacs. Since Emacs is very sophisticated, finding out how to simply select text and copy might be a challenge. So invest some time in finding out how to do that efficiently. Reading the manual might help. For Mac OS X, you can use Aquamacs which supports native shortcuts, among other things.

Working with ESS does not differ from working with R. The same rules for organizing code should apply. We suggest using a separate directory for every project, which resides in a parent directory called for example R, which resides in some directory which is easily accessible. For Unix type OS (Linux and Mac OS X) this would be your home directory. For Windows, I recommend to point Emacs home directory to the directory where all your source resides, this might involve setting some environmental variables for older versions of Windows.

As you see the initial investment might seem a bit daunting. But do not worry, the rewards as with R are great.

To start ESS simply start Emacs and press M-x R. You should see something like that:

Start ESS with Emacs

This how Emacs window looks on Mac OS X using Aquamacs Emacs distribution. You should see something similar with other Emacs distributions. One of the defining features of Emacs is its mini-buffer. It is the line at the bottom of the window, which contains the line M-x R. Every shortcut you press is reflected in this mini-buffer. If Emacs is stuck and does not respond look at the mini-buffer first, chances are that you inadvertently used some keyboard shortcut to invoke some Emacs command and now it is waiting for your further input. The way to get out of this is to quit the current command in the mini-buffer using the shortcut C-g. This will probably be the most used shortcut for first time Emacs users.

After pressing M-x R press enter. You should get the following prompt:

Select R directory

Select the directory (you can use Tab to complete long directory names) and press Enter. What you get is similar to usual R prompt:

R start

The improvement on the usual prompt is that it is also simple text file containing everything R produced. So for example you can scroll to your previous command and put the cursor on it, press Enter and it will be reproduced.

After starting R process, we suggest dividing emacs in two windows (Emacs terminology). Then on the left you can have a source code, which will be sent to the R process on the right. The relevant shortcuts are C-x 3 for splitting windows vertically, C-x 1 for making the current buffer the only window and C-x 2 for splitting windows horizontally. Here is how it looks after C-x 3:

Sample R ESS workflow

When sending code to R, it is advisable to keep distinction between functions and R statements. You can do this by keeping all your functions in one file called 10code.R for example. Then you can simply load this file using load ESS file option (shortcut C-c C-l). The advantage of this approach is that it sources all the functions and produces nothing in the R buffer. If there is an error in your code then ESS shows a message in the minibuffer and you can investigate it by pressing C-c `.

The other pieces of code are R statements, which should be kept self-explanatory: load data, clean data, fit statistical model, inspect the results, produce the final results. The source code for these statements should be the current status of the project. The intention is that after your project is finished, sourcing those source files should allow to reproduce the entire project. You can use git or other CVS for tracking history, since Emacs has good support for working with them.

When working with R files, you can work with one R statement at a time, which you send to the R process via the “Eval function, paragraph, statement” command, aka C-c C-c. This command sends the paragraph to the active R process, i.e. the text which is delimited by new lines. This is handy since you can group R statements into tasks, and send whole task to the R process. It also does not require selecting text, which is also very convenient. The shortcut C-c C-c has the advantage that it moves the cursor to R window, so you can immediately inspect R output.

In sum, the basic workflow for working with R and ESS is moving a lot between windows and buffers. To facilitate this you can use the following shortcuts, which should be put in your .emacs file:

(define-key global-map [f1] 'Control-X-prefix)
(define-key global-map [f3] 'find-file)
(define-key global-map [f2] 'save-buffer)
(define-key global-map [f8] 'kill-buffer)
(define-key global-map [f5] 'switch-to-buffer)
(define-key global-map [f6] 'other-window)
(define-key global-map [f9] 'ess-load-file)

Other specific ESS settings you can use are the following:

(setq comint-input-ring-size 1000)
(setq ess-indent-level 4)
(setq ess-arg-function-offset 4)
(setq ess-else-offset 4)

This tells ESS to make the tab 4 characters wide (the default is 2), which is a personal preference for some, and expands the number of your issued commands ESS saved in the history.

For working with R process directly, the following shortcuts can be very useful:

(add-hook 'inferior-ess-mode-hook
    '(lambda nil
          (define-key inferior-ess-mode-map [\C-up]
          (define-key inferior-ess-mode-map [\C-down]
          (define-key inferior-ess-mode-map [\C-x \t]

This recalls the R statement from your R statement history, but it tries to match it with the one which is already on your line. So, for example, typing pl in R process and pressing \C-up (that’s control and the up arrow) will cycle through all the statements which start with pl, so it will recall for example all the plot(... commands.

Another setting which you might find useful is:

 (setq ess-ask-about-transfile t)

This way ESS always asks where to save the text in the buffer with R process. You can number these files according to date, so you will always have another way to track what exactly you were doing. The only caveat of this option is that for some reason ESS sets the R buffer to read only, after loading the R. The shortcut for making buffer writable is C-x C-q.

So this is one way of working with ESS and R. Other ways are possible, Emacs and ESS are very customizable, so you can set it the way you like it. You will soon learn than using Emacs to interact with R overcome the limitations raised by using different task-oriented tools while allowing you to document and version your code, Sweave or TeXify R output in an easy way. The following screencast shows a live Emacs session where the use of R is interleaved with various Emacs tools.

A short intro to Emacs+ESS on Vimeo