1.4 GitHub Structure

1.4.1 Project Repository Structure

The structure of each emlab repository on GitHub will likely vary depending on the needs of the project, but the following structure is suggested as a starting point.

A documents (or docs) folder may be useful for storing the source files used to generate text-based documents or presentations, such as markdown files.

A results folder may be useful for storing plots or other types of results generated by the project. Use some discretion here, since some results are better treated as “processed” or “output” data. Results in the form of figures or workspace image files, however, might live here.

A scripts folder may be useful for storing the code files that do everything from processing the raw data to running the analysis and generating outputs.

A functions folder may be useful for storing the code files that define functions used by many scripts.

Different types of projects may require more or fewer folders and these are only meant to act as suggestions. Regardless, the structure of the repository should be sufficiently organized such that it can be easily navigated and understood by others by the time the project is completed.
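As a minimal sketch, the suggested folders could be created from the command line as shown below. The project name example-project and the .gitkeep placeholder files are assumptions made for illustration: git does not track empty directories, and .gitkeep is only a common naming convention for a placeholder, not a git feature.

```shell
set -eu

# Hypothetical project location, created in a scratch directory
repo="$(mktemp -d)/example-project"

# The four folders suggested above
mkdir -p "$repo/docs" "$repo/results" "$repo/scripts" "$repo/functions"

# Git ignores empty directories, so add a placeholder file to each one
for dir in docs results scripts functions; do
  touch "$repo/$dir/.gitkeep"
done

# List the resulting top-level layout
ls -A "$repo"
```

Once real files exist in a folder, its placeholder can be deleted.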

1.4.2 A repo inside a repo

Sometimes a project has more than one paper or analysis section. In some edge cases, we might want multiple “paper folders” within a single “project folder”, which means having a repo inside a repo. If that makes sense for you, your project, and your team, then git submodules are your solution. To read more on when and how to use submodules, see the git submodules documentation.

Including submodules in your workflow is simple. Here’s an example. You are working on a big project called “Blue Future”. The project has six PIs, 13 Research Specialists, two PostDocs, and three PhD Students. After a long kick-off meeting, the team realizes that the project will produce two papers and a ShinyApp. You are all determined to keep everything in the same folder, but correctly categorized and organized. As such, you go to GitHub and create the following four repositories:

  • Blue Future
  • Paper 1
  • Paper 2
  • ShinyApp

You’ll clone the Blue Future repo onto your computer, using the usual:

git clone https://github.com/emlab-ucsb/BlueFuture.git

Now, rather than cloning the repos for each paper and the app into separate folders of their own, navigate into your local BlueFuture folder and add each one as a submodule:

git submodule add https://github.com/emlab-ucsb/Paper1.git

This will clone the Paper 1 repo, but not without first telling the BlueFuture repo about it (just so that you don’t end up tracking things twice). You can repeat the operation for Paper2 and ShinyApp. That’s it!
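The whole sequence can be rehearsed offline before touching the real repositories. The sketch below is only an illustration under stated assumptions: the GitHub repos are replaced by throwaway local repos in a scratch directory, the user name and email are made up, and the protocol.file.allow setting is needed only because the stand-ins are local paths (recent git versions refuse local-path submodules by default). With real GitHub URLs, the plain git submodule add command above is enough.

```shell
set -eu

# Scratch area; nothing here touches GitHub
work="$(mktemp -d)"
mkdir -p "$work/remotes"

# Made-up identity so commits succeed on machines without git configured
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"

# Local stand-ins for the four GitHub repos, each with one empty commit
for name in BlueFuture Paper1 Paper2 ShinyApp; do
  git init --quiet "$work/remotes/$name"
  git -C "$work/remotes/$name" commit --quiet --allow-empty -m "Initial commit"
done

# Clone the parent project, then register each child repo as a submodule
git clone --quiet "$work/remotes/BlueFuture" "$work/BlueFuture"
cd "$work/BlueFuture"
for name in Paper1 Paper2 ShinyApp; do
  # protocol.file.allow is required only for these local stand-in paths
  git -c protocol.file.allow=always submodule --quiet add "$work/remotes/$name" "$name"
done

# "submodule add" stages .gitmodules and the new folders; commit to record them
git commit --quiet -m "Add Paper1, Paper2, and ShinyApp as submodules"

# Inspect what the parent repo now knows about its submodules
cat .gitmodules
git submodule status
```

Note that the parent repo records each submodule as a pointer to a specific commit, so the additions only reach teammates after they are committed and pushed. Collaborators who clone BlueFuture afterwards get empty submodule folders unless they clone with git clone --recurse-submodules or run git submodule update --init inside the clone.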