Why this matters

  • Our analyses go to policymakers, on-the-ground implementers, funders, and the scientific community.
  • Our code has real-world impact, and coding mistakes have real-world consequences.

Two goals drive all the code we write:

  • Accuracy — the code does what we think it does.
  • Reproducibility — someone else (or future you) can re-run it and get the same result.

‘Universal requirements’ vs. ‘best practices’

The SOP distinguishes two tiers for guidelines:

  • Universal requirements — the baseline everyone follows, on every project.
  • Best practices — what we aim for, adjusted per project.

Today’s session will focus on universal requirements, particularly the new ones.

Universal requirements at a glance

All code at emLab must:

  • Live in a repo under the emlab-ucsb GitHub organization, with a README and a logical structure.
  • Have every change tracked in Git with clear commit messages.
  • [NEW] Use a code formatter — Air (R) or Ruff (Python), run on save.
  • [NEW] Follow branch-based development — no direct commits to main.
  • [NEW] Go through code review via pull requests, with code review integrated into the entire project planning process.

1 · Code formatters

Style is documentation

  • Code at emLab needs to be durable
  • We share code publicly, hand off projects between team members, take leave, and return to our own work years later.
  • Style isn’t just aesthetic — it’s how your work stays usable:
    • diffs reflect real changes, not whitespace noise
    • a collaborator can read unfamiliar code without reverse-engineering it
    • future you can pick up where you left off

Formatters are required [NEW]

A formatter handles all the spacing, indentation, and line-break decisions automatically. Set it up once and forget about it.

Important

Formatters are required for all new code development.

Formatters can be configured to “run on save”, so there’s nothing to remember.

Air — required for R [NEW]

Air is the required R formatter (built by Posit, enforces Tidyverse style).

  1. In Positron / VS Code, open Extensions (Cmd+Shift+X) and install “Air” (by Posit).
  2. Turn on format-on-save in your settings.json:
{
  "[r]": {
    "editor.defaultFormatter": "Posit.air-vscode",
    "editor.formatOnSave": true
  }
}

Ruff — required for Python [NEW]

Ruff is the required Python formatter.

  1. Install the “Ruff” extension from the marketplace.
  2. Enable format-on-save:
{
  "[python]": {
    "editor.defaultFormatter": "charliermarsh.ruff",
    "editor.formatOnSave": true,
    "editor.codeActionsOnSave": {
      "source.fixAll.ruff": "explicit",
      "source.organizeImports.ruff": "explicit"
    }
  }
}
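
Outside the editor, Ruff can also be run from a terminal, for example just before a commit. This is a minimal sketch that assumes the ruff command-line tool is installed (it skips the formatting step otherwise). The file name example.py and its contents are hypothetical.

```shell
set -euo pipefail

# Work in a throwaway directory so no real project files are touched.
workdir="$(mktemp -d)"
cd "$workdir"

# A hypothetical, deliberately cramped Python file.
printf 'def add(a,b):\n    return a+b\n' > example.py

if command -v ruff >/dev/null 2>&1; then
  # Rewrite the file in place using Ruff's formatter.
  ruff format example.py
  # --check only reports whether anything would change; it edits nothing.
  ruff format --check example.py
else
  echo "ruff is not installed; skipping the formatting step"
fi

cat example.py
```

With format-on-save enabled in the editor, this manual step should rarely be needed.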

On RStudio

Styling is still required if you use RStudio, where Air is less integrated. Use the styler package:

  • install.packages("styler"), then Addins → “Style active file” before each commit.
  • Tools → Global Options → Code → Saving → check “Ensure that source files end with a newline.”

2 · Branch-based development

The universal requirement [NEW]

All code development happens on a separate branch. We don’t commit directly to main.

  • Your work stays isolated until it’s ready.
  • A dead end never touches main — abandon the branch and move on.
  • Branches are what make code review possible.

What a branch is

Git history is a sequence of snapshots:

A --- B --- C --- D   (main)

A branch is a named pointer that moves forward independently:

A --- B --- C --- D           (main)
                  \
                   E --- F    (my-branch)

main is untouched until you explicitly merge.

When the work is ready, you merge back into main:

A --- B --- C --- D ----------- G   (main)
                  \           /
                   E --- F ---      (my-branch → merged)

main now includes the work; the branch can be deleted.
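
The three diagrams above can be reproduced in a throwaway repository. This sketch uses empty commits labeled A through G purely for illustration; the placeholder identity and the temporary directory are assumptions, and the commits on main stand in for history that already exists.

```shell
set -euo pipefail

repo="$(mktemp -d)"
cd "$repo"
git init --quiet
git symbolic-ref HEAD refs/heads/main      # name the first branch "main"
git config user.name "Example Author"      # placeholder identity, this repo only
git config user.email "author@example.org"

# Snapshots A-D on main.
for label in A B C D; do
  git commit --quiet --allow-empty -m "$label"
done

# Snapshots E-F on a separate branch; main does not move.
git switch --quiet -c my-branch
for label in E F; do
  git commit --quiet --allow-empty -m "$label"
done

# Merge back into main; --no-ff records the merge as its own commit, G.
git switch --quiet main
git merge --quiet --no-ff -m "G" my-branch

# Draw the history as a text graph.
git log --graph --oneline --all
```

The graph printed at the end has the same shape as the last diagram: a side line for E and F that rejoins main at G.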

Why not commit to main?

main should always run successfully. It’s the reviewed, stable version of the project.

  • Branching creates an isolated checkpoint: work in the branch → PR → review → merge back to main.
  • This matters even when you’re working alone — reviewing your own diff before merging catches things.
  • Repos can be set to automatically block direct pushes to main; if you hit that error, you need a branch.

The workflow

  1. Start from an up-to-date main: git switch main && git pull
  2. Create a branch: git switch -c add-spatial-aggregation
  3. Work and commit in logical units
  4. Push: git push -u origin <branch>
  5. Open a pull request and merge
  6. Delete the branch after merge
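
The six steps can be walked through end to end in a sandbox. This is a sketch under stated assumptions: a local bare repository stands in for the GitHub remote, the file names and identity are placeholders, and the local merge in step 5 only stands in for a pull request that would be reviewed and merged on GitHub.

```shell
set -euo pipefail

sandbox="$(mktemp -d)"

# Stand-in for the GitHub repository: a local bare repo used as "origin".
git init --quiet --bare "$sandbox/origin.git"
git --git-dir="$sandbox/origin.git" symbolic-ref HEAD refs/heads/main

# A working copy with one commit on main, so there is something to pull.
git init --quiet "$sandbox/work"
cd "$sandbox/work"
git symbolic-ref HEAD refs/heads/main
git config user.name "Example Author"
git config user.email "author@example.org"
git remote add origin "$sandbox/origin.git"
echo "# demo" > README.md
git add README.md
git commit --quiet -m "Initial commit"
git push --quiet -u origin main

# 1. Start from an up-to-date main.
git switch --quiet main && git pull --quiet

# 2. Create a branch.
git switch --quiet -c add-spatial-aggregation

# 3. Work and commit in logical units.
echo "aggregate by region" > aggregate.txt
git add aggregate.txt
git commit --quiet -m "Add spatial aggregation step"

# 4. Push the branch.
git push --quiet -u origin add-spatial-aggregation

# 5. On GitHub, a pull request would be opened, reviewed, and merged here.
#    The local merge and push below only stand in for that merge.
git switch --quiet main
git merge --quiet --no-ff -m "Merge add-spatial-aggregation" add-spatial-aggregation
git push --quiet origin main

# 6. Delete the branch after the merge, locally and on the remote.
git branch --quiet -d add-spatial-aggregation
git push --quiet origin --delete add-spatial-aggregation

git log --graph --oneline
```

In a real project, steps 5 and 6 usually happen through the GitHub interface, followed by git switch main && git pull locally.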

Keep branches small and current

  • Stay current: pull from main periodically — the longer a branch diverges, the harder it will be to review and merge.
  • Stay focused: one branch, one task. Short-lived branches are easier to review and less likely to conflict.
  • Are big changes needed? Break them into branches that build on each other.
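
One way to stay current is to fetch the remote and merge origin/main into your branch, which is what a merge-style pull does in two steps. The sketch below is an illustration only: the bare repository, the simulated teammate, and all file names are assumptions made so the example can run on its own.

```shell
set -euo pipefail

sandbox="$(mktemp -d)"

# Local bare repo standing in for the GitHub remote.
git init --quiet --bare "$sandbox/origin.git"
git --git-dir="$sandbox/origin.git" symbolic-ref HEAD refs/heads/main

# Your working copy, with an initial commit pushed to main.
git init --quiet "$sandbox/work"
cd "$sandbox/work"
git symbolic-ref HEAD refs/heads/main
git config user.name "Example Author"
git config user.email "author@example.org"
git remote add origin "$sandbox/origin.git"
echo "# demo" > README.md
git add README.md
git commit --quiet -m "Initial commit"
git push --quiet -u origin main

# You start a branch and commit to it.
git switch --quiet -c my-branch
echo "draft analysis" > analysis.txt
git add analysis.txt
git commit --quiet -m "Start analysis"

# Meanwhile, a teammate's change lands on main. A second clone pushing to
# main simulates a pull request of theirs being merged.
git clone --quiet "$sandbox/origin.git" "$sandbox/teammate"
git -C "$sandbox/teammate" config user.name "Example Teammate"
git -C "$sandbox/teammate" config user.email "teammate@example.org"
echo "shared helper" > "$sandbox/teammate/helper.txt"
git -C "$sandbox/teammate" add helper.txt
git -C "$sandbox/teammate" commit --quiet -m "Add shared helper"
git -C "$sandbox/teammate" push --quiet origin main

# Bring the latest main into your branch.
git fetch --quiet origin
git merge --quiet -m "Merge main into my-branch" origin/main

git log --graph --oneline
```

After the merge, the branch contains both your work and the teammate's change, so the eventual pull request shows only your own differences from main.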

GitHub Issues

  • Issues are an excellent project management and code development tool
  • Use an Issue to flag a bug, propose a change, or raise a question — before writing any code.
  • Keeps the conversation in the repo, not in Slack.
  • Reference issues from commits and PRs (“Closes #42”).
  • When you come back to the code later, you’ll know why decisions were made.
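
A commit can carry the issue reference in its message body. This sketch assumes an issue numbered 42 exists in the repository, as in the example above; the subject line, file name, and identity are hypothetical.

```shell
set -euo pipefail

repo="$(mktemp -d)"
cd "$repo"
git init --quiet
git config user.name "Example Author"
git config user.email "author@example.org"

echo "fix" > fix.txt
git add fix.txt

# The first -m is the subject line; the second becomes the message body,
# which is where the issue reference goes.
git commit --quiet -m "Fix projection mismatch in aggregation" -m "Closes #42"

# Show the full message of the most recent commit.
git log -1 --format=%B
```

On GitHub, a closing keyword such as this one links the commit or PR to the issue and closes the issue once the change reaches the default branch.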

3 · Pull requests & code review

Pull requests [NEW]

A pull request is how you propose merging your branch into main — and how code review happens.

  • GitHub shows exactly which lines changed.
  • Collaborators can comment, ask questions, and approve.
  • Leaves a record of what changed and why — useful when something breaks six months later.

All changes to main come through a PR.

Opening a PR

Push your branch, then open a PR. Fill out:

  • Base: the primary branch you’re merging into — usually main.
  • Compare: the new branch you’ve been working on and are merging from.
  • Title: a clear, specific summary of what changed.
  • Description: what changed, why, and what a reviewer should focus on. Link issues (“Closes #42”).

Not ready for review? Open a Draft PR.

Reviewing a PR

A reviewer looks at both the code and the analytical approach. GitHub gives three options for the reviewer:

  • Comment: provide feedback without approving or blocking.
  • Approve: sign off, with or without feedback. Minor issues may remain, but the author can resolve them without a re-review. (Generally recommended!)
  • Request changes: substantive issues must be resolved, and a re-review is required before merging.

In practice, we recommend using Comment + Approve on small, low-risk PRs — reserve Request changes for substantive concerns.

Types of code review

Pull requests

Review type | Timing | What to check | Required?
Self-review | Ongoing; before every PR; revisit after receiving PR comments | Accuracy, logic, efficiency | Always
Team member peer review | Ongoing; for every PR | Accuracy, logic, efficiency | Whenever there’s more than one researcher on the project
GitHub Copilot AI “peer” review | Ongoing; for every PR | Accuracy, logic, efficiency | Optional; use with caution, your mileage may vary

Reproducibility check

Review type | Timing | What to check | Required?
External peer reviewer | End of project and before submission; again before publication | A designated reviewer re-runs the full analysis independently and regenerates all outputs | Publications and external-facing deliverables
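
A reproducibility check comes down to re-running the analysis and comparing the regenerated outputs with the originals. In this minimal sketch, run_analysis is a hypothetical stand-in for a project's full pipeline, and the directory names and CSV contents are made up for illustration.

```shell
set -euo pipefail

workdir="$(mktemp -d)"

# Hypothetical stand-in for the full analysis: writes one output file
# into the directory passed as the first argument.
run_analysis() {
  mkdir -p "$1"
  printf 'region,total\nnorth,12\nsouth,30\n' > "$1/summary.csv"
}

# The author's original run and the reviewer's independent re-run.
run_analysis "$workdir/author_outputs"
run_analysis "$workdir/reviewer_outputs"

# diff -r compares every file in the two directories; it exits 0 only
# when all outputs are identical.
if diff -r "$workdir/author_outputs" "$workdir/reviewer_outputs" >/dev/null; then
  status="reproduced"
else
  status="mismatch"
fi

echo "$status"
```

A byte-for-byte comparison is strict. Outputs that embed timestamps or depend on unseeded random numbers will show up as mismatches, which is often worth knowing.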

Best practices

If your code is being reviewed:

  • Use coding best practices (good documentation, formatters, function- or task-based programming, etc.).
  • Keep PRs small and focused.
  • Write a clear description of what changed and what the reviewer should look for.
  • Make sure it runs end-to-end before requesting review.

If you’re reviewing:

  • Know what kind of review is being asked for.
  • Be specific; label things “Required:” vs. “Nit:”.
  • Prioritize correctness over style.
  • For reproducibility review, actually run the code.
  • Be timely — a stale PR gets harder to merge as the rest of the project moves on.

Project management for planning code review [NEW]

The fundamental shift: code review is planned for at the start of a project, not triggered reactively by a journal requirement or an error someone found.

A code review plan answers four questions:

  • What gets reviewed · when · by whom · to what standard

  • Budget 10–20% of coding time for it.
  • Explicitly discuss the code review plan:

    • During SOP and budget development
    • During staffing decisions
    • At project kickoff
    • Throughout the project
    • At the project exit interview
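
As a quick worked example of the 10–20% guideline, the sketch below turns a coding estimate into a review budget. The 200-hour figure is hypothetical.

```shell
set -euo pipefail

coding_hours=200   # hypothetical project estimate

# Integer arithmetic: 10% and 20% of the coding hours.
review_low=$(( coding_hours * 10 / 100 ))
review_high=$(( coding_hours * 20 / 100 ))

echo "Budget ${review_low}-${review_high} hours for code review"
```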

Some closing thoughts on code review

  • Writing good code has always been important; code review just ensures we do it every time.
  • AI is making code review more important than ever
  • Finding a mistake during code review is much better than finding it after it’s in a report or a paper.
  • Let’s embrace why code review matters, and plan realistically for how much work it takes.

The rest of the SOP

Lots of updated material to check out!

Recap

Three new universal requirements:

  • Format on save — Air for R, Ruff for Python.
  • Branch-based development — never commit to main.
  • Code review through pull requests — planned from the start.

All three exist for the same reason: accurate, reproducible science.

Find everything in the SOP

The full Standard Operating Procedures live at:

emlab-ucsb.github.io/SOP

Thank you!