Code styling

The philosophy of style

At emLab, we view code not just as a set of instructions for a computer, but as a medium of communication between researchers. In a collaborative environment where projects span years and multiple contributors, style is a form of documentation.
The primary goals of our code style guide are:

  1. Reduced Cognitive Load: When every script follows the same conventions, your brain doesn’t have to “re-parse” the structure of a file every time you switch projects.
  2. Meaningful Diffs: Standardized formatting ensures that GitHub “diffs” reflect actual logic changes rather than trivial changes in whitespace or indentation.
  3. Error Prevention: Many style rules (like those enforced by linters) catch common bugs—such as unused variables or misplaced parentheses—before the code is even run.
  4. Inclusivity: Clean, well-spaced code is more accessible to junior researchers and external collaborators.
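
To make the "meaningful diffs" goal concrete, the sketch below uses Python's standard difflib module to compare hypothetical versions of a short script. The script contents and the helper name changed_lines are invented for illustration. The point is only that a formatting-only revision can touch every line, while a real logic change under consistent formatting touches just the line that changed.

```python
import difflib


def changed_lines(before: str, after: str) -> int:
    """Count the added and removed lines in a unified diff of two texts."""
    diff = difflib.unified_diff(
        before.splitlines(), after.splitlines(), lineterm="", n=0
    )
    return sum(
        1
        for line in diff
        if line.startswith(("+", "-"))
        and not line.startswith(("+++", "---"))  # skip the file headers
    )


# Hypothetical script contents, used only for illustration.
original = "x = load(path)\ny = clean(x)\nz = fit(y)\n"

# Same logic, but a collaborator's editor changed the spacing on every line.
reformatted = "x=load( path )\ny=clean( x )\nz=fit( y )\n"

# One real logic change, with the formatting left untouched.
logic_change = "x = load(path)\ny = clean(x, drop_na=True)\nz = fit(y)\n"

noise = changed_lines(original, reformatted)
signal = changed_lines(original, logic_change)
print(noise, signal)
```

Here the spacing-only revision produces three times as many changed lines as the genuine logic change, which is the kind of noise a shared formatter is meant to remove.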

Style guides by language

We maintain specific style preferences for each language used at the lab. These guides provide the rules; the "Code formatters" section below explains how to apply them automatically.

R and the Tidyverse

Our R code follows the Tidyverse Style Guide. You should read the full guide, but the following requirements apply to all emLab projects:

  • Object names: Use snake_case for variables and functions (e.g., sea_surface_temp, not SeaSurfaceTemp or sst).
  • Spacing: Always put a space after a comma, and around most binary operators (==, +, -, <-, etc.).
  • Pipes: Use the native pipe |> (R 4.1+) or the magrittr pipe %>%. Always put a space before the pipe and a new line after it.
  • Indentation: Use 2 spaces for indentation. Never use tabs.
  • Long lines: Limit code to 80 characters per line. This ensures scripts are readable on smaller laptop screens and in side-by-side GitHub views.
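
Several of these rules are mechanical enough to check by script. The sketch below is written in Python purely for illustration: the helper style_problems and the sample R snippets are hypothetical, and it is not a substitute for a real formatter. It flags tabs, lines longer than 80 characters, and indentation that is not a multiple of 2 spaces. Continuation lines that are deliberately aligned to an opening parenthesis could trigger false positives on the last check.

```python
MAX_WIDTH = 80  # line-length limit from the list above


def style_problems(r_source: str) -> list[str]:
    """Return human-readable problems found in a string of R code."""
    problems = []
    for number, line in enumerate(r_source.splitlines(), start=1):
        if "\t" in line:
            problems.append(f"line {number}: contains a tab")
        if len(line) > MAX_WIDTH:
            problems.append(f"line {number}: longer than {MAX_WIDTH} chars")
        # Count leading spaces only; tabs are reported separately above.
        indent = len(line) - len(line.lstrip(" "))
        if line.strip() and indent % 2 != 0:
            problems.append(f"line {number}: indent is not a multiple of 2")
    return problems


# Hypothetical R snippet that follows the rules above.
compliant = """\
mean_temp <- sea_surface_temp |>
  filter(year >= 2000, !is.na(temp_c)) |>
  summarize(mean_temp_c = mean(temp_c))
"""

# The same pipeline with a tab on line 2 and a 3-space indent on line 3.
non_compliant = (
    "mean_temp <- sea_surface_temp |>\n"
    "\tfilter(year >= 2000) |>\n"
    "   summarize(mean_temp_c = mean(temp_c))\n"
)

print(style_problems(compliant))
print(style_problems(non_compliant))
```

In practice a formatter fixes these issues rather than just reporting them, so a check like this is mainly useful for seeing what the rules mean.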

Other languages

For other languages, we defer to established industry standards:

  • Python: We follow PEP 8, which is the de facto standard for Python code.
  • Stata: We follow the principles outlined in the Stata Guide and common research best practices.
  • General (C++, Java, etc.): For languages not explicitly covered, we follow the Google Style Guides.
  • SQL: Use uppercase for keywords (e.g., SELECT, FROM, WHERE) and snake_case for table and column names.
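
As a small illustration of the Python and SQL conventions together, the sketch below uses Python's built-in sqlite3 module with an in-memory database. The table catch_records, its columns, and the function total_catch_by_species are hypothetical names chosen for the example, not part of any emLab project.

```python
import sqlite3

# SQL keywords in uppercase; table and column names in snake_case.
TOTAL_CATCH_QUERY = """
    SELECT species_name, SUM(catch_kg) AS total_catch_kg
    FROM catch_records
    WHERE landing_year >= ?
    GROUP BY species_name
    ORDER BY species_name
"""


def total_catch_by_species(connection, start_year):
    """Return (species_name, total_catch_kg) rows from start_year onward."""
    return connection.execute(TOTAL_CATCH_QUERY, (start_year,)).fetchall()


# Build a throwaway in-memory table so the example is self-contained.
connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE catch_records "
    "(species_name TEXT, landing_year INTEGER, catch_kg REAL)"
)
connection.executemany(
    "INSERT INTO catch_records VALUES (?, ?, ?)",
    [
        ("anchovy", 2019, 120.0),
        ("anchovy", 2021, 80.0),
        ("sardine", 2021, 50.5),
        ("sardine", 2022, 49.5),
    ],
)

rows = total_catch_by_species(connection, 2020)
print(rows)
```

The Python side follows PEP 8 naming (snake_case functions and variables, an uppercase module-level constant), and the query passes the year as a parameter instead of formatting it into the SQL string.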

Code formatters

A code formatter is a tool that automatically adjusts your code’s layout—spacing, indentation, line breaks, and alignment—to match a specific style guide. Instead of manually fixing every comma or bracket, the formatter does it for you instantly.
At emLab, we use a new generation of “Rust-based” tools. They are significantly faster than older tools and can be configured to run automatically every time you save a file.

Note: To maintain consistency across emLab, the use of these formatters is required for all new code development.

R formatting: Air (required for R users)

Air is the required high-performance R formatter for emLab projects. It is developed by the Posit team and ensures strict adherence to Tidyverse standards.

Install the extension

  • In Positron or VS Code, open the Extensions Marketplace (Cmd+Shift+X on macOS, Ctrl+Shift+X on Windows or Linux).
  • Search for and install “Air” (published by Posit).

Enable “format on save”

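To make formatting automatic, open your settings.json (Cmd+Shift+P on macOS or Ctrl+Shift+P on Windows or Linux, then search for “Open User Settings (JSON)”) and add the following. If the file already contains settings, merge the "[r]" entry into the existing top-level braces rather than pasting a second pair: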

{
  "[r]": {
    "editor.defaultFormatter": "Posit.air-vscode",
    "editor.formatOnSave": true
  }
}

Python formatting: Ruff (required for Python users)

Ruff is the required formatter and linter for all Python development at emLab. It is extremely fast and replaces several older tools (such as Black, isort, and Flake8) with a single tool and editor extension.

Install the extension

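  • In Positron or VS Code, open the Extensions Marketplace (Cmd+Shift+X on macOS, Ctrl+Shift+X on Windows or Linux).
  • Search for and install “Ruff” (published by Astral Software). The extension ID is still charliermarsh.ruff, which is why that name appears in the settings.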

Enable “format on save”

Open your settings.json and paste the following:

{
  "[python]": {
    "editor.defaultFormatter": "charliermarsh.ruff",
    "editor.formatOnSave": true,
    "editor.codeActionsOnSave": {
      "source.fixAll.ruff": "explicit",
      "source.organizeImports.ruff": "explicit"
    }
  }
}
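
To give a feel for the kind of issue a linter flags, here is a toy check for unused imports built on Python's standard ast module. This is only an illustration of the idea: it is not how Ruff is implemented, it ignores cases such as wildcard imports and names used only inside strings, and the function name unused_imports and the sample source are hypothetical.

```python
import ast


def unused_imports(source: str) -> list[str]:
    """Return imported names that are never referenced in the source."""
    tree = ast.parse(source)
    imported = set()
    used = set()
    for node in ast.walk(tree):
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            for alias in node.names:
                # "import a.b" binds the top-level name "a".
                imported.add((alias.asname or alias.name).split(".")[0])
        elif isinstance(node, ast.Name):
            used.add(node.id)
    return sorted(imported - used)


# Hypothetical script: "os" is imported but never used.
sample = """\
import os
import sys
from pathlib import Path

print(Path(sys.argv[0]).name)
"""

print(unused_imports(sample))
```

A real linter covers far more rules and can apply many of the fixes for you, which is what the code actions in the settings above request.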

RStudio code formatting

If you are using RStudio (where Air is not as tightly integrated), you are still required to keep your code formatted to the same standard.

You can use the styler package:

  1. Install the package: install.packages("styler").
  2. Use the Addins menu in the top toolbar to select “Style active file” before every commit.
  3. Navigate to Tools -> Global Options -> Code -> Saving and check “Ensure that source files end with a newline”.

It is still possible to use Air in RStudio by first downloading the command-line tool and then setting Air as an external formatter, following the instructions from Posit. Note that there is currently no way to use Air on Quarto documents in RStudio.