Chapter 1 Introduction

This document explains how to use R, subject to some caveats: starting in chapter 3, I only cover the packages I use; chapter 4 only covers the analytical tools I use; and, to keep it short and correct, I have skipped discussion of statistical methodology and pretty graphs.

Why R instead of Stata?

  • It’s free, so you and others can run your code now and into the future.
  • While Stata allows the by prefix with only a select few commands, R’s data.table package lets us run an arbitrary block of code by group, the equivalent of:

    # Stata pseudocode
    by g: {
      ...
      ...
    } if condition
  • Objects are treated symmetrically, so we don’t have to use different rules for data tables, matrices, functions and variables. Nor do we have to learn new rules for when those objects are in Stata vs Mata or when data is loaded in memory vs on disk. Nor do we have to shuttle objects around between the various levels encountered in Stata.
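
To make the by-group point concrete, here is a minimal sketch of what the Stata pseudocode above might look like in data.table. The table DT, its columns g and x, and the condition x > 1 are hypothetical names made up for illustration; only the DT[i, j, by] form itself comes from the package.

```r
library(data.table)

# Hypothetical toy data: g is the grouping variable, x is a numeric value
DT <- data.table(
  g = c("a", "a", "a", "b", "b"),
  x = c(1, 2, 3, 4, 5)
)

# DT[i, j, by]: keep the rows where i is TRUE, then evaluate j within each group of by
res <- DT[x > 1, {
  m <- mean(x)   # intermediate result, computed separately for each group
  list(n = .N, mean_x = m, max_dev = max(abs(x - m)))
}, by = g]

print(res)
```

The braces in j can hold any number of statements, which plays the role of the { ... } block in the pseudocode; the last expression, a list, becomes the columns of the result.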

1.1 Getting started with R

Go to the R project site and follow the instructions to download the latest version from CRAN, the Comprehensive R Archive Network.

After installation, I know of two good options for running R on Windows:

  • Hook up R to Notepad++ with NppToR. This allows you to edit R scripts alongside all other documents. Commands are passed to the R console by customizable keyboard shortcuts, and switching between multiple running R consoles is easy.

  • Install the RStudio development environment. It puts everything in a monolithic window, which means you cannot maximize the R console on one monitor while viewing a help file on another. Although you can open multiple scripts, searching text across them is not supported. It also lacks drag-and-drop (of scripts into the editor or data into the console). On the plus side, it is cross-platform; does not require administrator privileges to install; and has “development” features like project management and report compilation.

1.2 Reading this book

code that you can type at the console appears in boxes like this.
# Results from code appear in boxes like this.

The comment symbol, #, appears in the results box so that you can safely copy-paste the results into the console along with the code. Unfortunately, R does not have any syntax for block comments, so every commented line needs its own #.
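
As a small illustration of the convention (the vector x is a made-up example), the input and its commented result can be pasted into the console together. The if (FALSE) wrapper at the end is a common workaround for the missing block comment, not a language feature, and the code inside it must still be syntactically valid R.

```r
x <- c(1, 2, 3)
mean(x)
# [1] 2

# Workaround sketch: code inside if (FALSE) is parsed but never run
if (FALSE) {
  stop("this line is never executed")
}
```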

Some notes warn about inconsistencies and pitfalls and should be read.

Other notes give technical details and can be skipped the first time through.

Exercises are interspersed and should be pretty quick to do.

1.3 About this book

The book’s R code was run with R, which is maintained by the R Foundation. The book was written in the RStudio IDE and typeset with R Markdown and bookdown, both by RStudio. Icons are from the “Very Basic. Android L Lollipop” set by Ivan Boyko, licensed under CC BY 3.0. There is no explicit or implied endorsement of this document by any of the parties mentioned.