5  Reports and Publications

This section describes best practices related to emLab reports and publications.

5.1 emLab Affiliation

Since emLab is joint between MSI and Bren and also its own research entity, we recommend emLab staff use the following affiliations for any emLab publication:

  1. Marine Science Institute, University of California, Santa Barbara, Santa Barbara, CA, USA
  2. Bren School of Environmental Science & Management, University of California, Santa Barbara, Santa Barbara, CA, USA
  3. Environmental Markets Lab, University of California, Santa Barbara, Santa Barbara, CA, USA

Note that journals may list affiliation addresses slightly different than what is listed above.

5.2 Reports

Many final deliverables for our projects include creating reports. To aid in the final report development, please see the Report template for information on a draft cover page, inside page, logos to use, and suggested citation.

To ensure future searchability, please upload all final reports to Zenodo - a public repository that creates a DOI and link for all uploaded information - if the report can be made publicly available. If you don’t have an account, you can sign up here with your email, GitHub account, or ORCID.

We have created an emLab repository for all of our reports to live. To upload to the emLab community repository, use this link and fill out all of the required information.

5.3 Author Contribution

Transparency in authors’ contribution is a critical component of open science. Determining authorship is ultimately the responsibility of the project leads (i.e. principal investigator and/or first author), but because author contribution is not always straightforward, we follow McNutt et al. (2018) and others in recommending the Contributor Roles Taxonomy (CRediT) system. For a simplified version of this most relevant to our project infrastructure, feel free to use and modify our Author Contribution Template, which allows individuals to identify their contributions or to opt-out of participation in a paper. We recommend discussing authorship expectations early in a project to avoid future complications.

5.4 Making Your Data Publicly Available

Generally, emLab researchers should be prepared to share a public GitHub repository (see Section 6.4) and a Dryad data repository with publications. The repositories contain code scripts and data (respectively) needed to conduct the study. Increasingly, journals are requiring both sets of documentation at the point of submission or publication.

Dryad is an open-access repository of research data that makes data searchable, freely accessible, and citable. Data repositories do not need to be associated with a publication. Dryad is free for UCSB affiliates with a UCSB NetID. The following steps document the process for creating a Dryad repository. Review Dryad’s Best Practices website for more useful information.

  1. If you do not already have one, visit the ORCID site to obtain an ORCID identifier. You will need this to deposit data on Dryad.
  2. Use your ORCID username and password to sign into Dryad.
  3. Under My Datasets, select “+ Start New Dataset.”
  4. Fill in the required fields and describe your dataset. If your data is related to a manuscript in progress or a published article, you will be asked to provide the Manuscript Number or DOI.
  5. Upload your files. Note that while you cannot load a folder to Dryad, you can load a .zip file that includes a folder structure. This approach is suggested as it allows for improved organization.
  6. Review and submit. In this phase you can choose to “Enable Private for Peer Review,” which keeps your dataset private during your article’s review period. You will have access to a private dataset download URL that you can share with collaborators or the journal. If you make this selection, your dataset will not enter curation or be published. If you do not select “Enable Private for Peer Review,” you will submit your dataset to Dryad for curation. Skip to step 9 if you did not “Enable Private for Peer Review.”
  7. If you chose “Enable Private for Peer Review,” copy the URL to share with collaborators or the journal. Those who use this link may be asked to provide an email address in order to obtain the dataset. Note that in this case, users will receive an email from Dryad with instructions for how to download the dataset – notify users to check their Spam folder for this email.
  8. Once the manuscript is accepted, you can go back to your dataset and submit it to Dryad for curation. Make sure to incorporate any relevant changes that occurred during the revision period.
  9. During curation, Dryad will check your submission to ensure the validity of files and metadata. Dryad may contact you questions, suggestions, and/or identified problems.
  10. Once the dataset is approved, the Dryad DOI is officially registered and made public. Note that you can contact Dryad in order to delay the publication of your data until your publication date. Include DOI in your manuscript’s Data Availability statement, or consider citing it the References section.

5.5 Preparing a Public GitHub Repository

As with data, we strive to make all our code available. This provides a roadmap of converting input data into tangible results, which may be of interest for external people seeking to replicate our study or for internal emLabers seeking to understand what someone did a few years ago.

5.5.1 Documentation

One of the most important things to include in the repo is a README.md file. This will be automatically displayed as rendered markdown on GitHub, and should provide a simple explanation of what’s in the repo, how to run it, and how it was run in the past. If possible / necessary, you might want to include a file structure (Take a look at using startR::create_readme() for automating this). If relevant, you might want to include the title of the paper / project, and a link to any online material (e.g. the publication itself).

In paragraph or bullet-list form, make sure to specify the following:

  • Operative System(s) in which the project was run (e.g. MacOSX Catalina or Ubuntu 18.4)
  • The version of R / STATA / MATLAB / Julia / Python… including major and minor (e.g. R 3.6.2)
  • Any special mentions of performance needed (e.g. “This analyses requires a machine with at least 32 GB RAM and 16 cores”)
  • Link to any relevant data repositories
  • Any relevant contact information, should interested people have trouble running your code

When in doubt, check out the repository that Grant McDermott and Matt Burgess provided for their Science paper on Effort reduction and bycatch.

5.5.2 Sanitizing the repository

When tracking a project, we’ll usually end up with many small, meaningless commit messages such as “fixed typo”, “fixed bug”, or “actually fixed bug”. While these small incremental changes allow us to revert back during the production process, in the end, we may not want to have the full list of bug fixes and meaningless commit messages visible. Thankfully, Git allows us to clean things up a bit using git rebase. Here’s an example of what your code might look like:

871adf OK, plot actually done       --- newer commit
0c3317 Whoops, not yet...
87871a Plot finalized
afb581 Fix this and that
4e9baa Fixed typo on x-axis
d94e78 Plot model output
6394dc Fixing model                 --- older commit

The top 6 commit messages are all related to each other. And, had you been making this plot at 9 am and not 3 pm, it would all have been a single push. Instead, we might want this to look like this:

871adf OK, plot actually done       --- newer commit -┐
0c3317 Whoops, not yet...                             |
87871a Plot finalized                                 |
afb581 Fix this and that                              | ---- Join all this into one
4e9baa Fixed typo on x-axis                           |
d94e78 Plot model output           -------------------┘
6394dc Fixing model                 --- older commit

In this case, we want to merge the last 6 commits into one. We want it to look like this in the end:

84d1f8 Plot model output                          --- newer commit (result of rebase, combining 6 messages)
6394dc Fixing model                               --- older commit

We can do so by running the following line:

git rebase --interactive HEAD~6

Notice that I’ve specified the value 6 after the argument HEAD~. If you don’t want to count the number of commits, you can simply reference the last commit (by its hash) that you want to leave out. For example, we wanted to leave out the Fixing model commit, with hash (6394dc). Therefore, we can also run:

git rebase --interactive 6394dc

Whichever way you go, your predetermined text editor will open. You’ll see a list of commits, containing the ones you want. (Head’s up, the older one will be on top). At the bottom of the page, you’ll see the following list of possible instructions:

# Commands:
# p, pick <commit> = use commit
# r, reword <commit> = use commit, but edit the commit message
# e, edit <commit> = use commit, but stop for amending
# s, squash <commit> = use commit, but meld into previous commit
# f, fixup <commit> = like "squash", but discard this commit's log message
# x, exec <command> = run command (the rest of the line) using shell
# b, break = stop here (continue rebase later with 'git rebase --continue')
# d, drop <commit> = remove commit
# l, label <label> = label current HEAD with a name
# t, reset <label> = reset HEAD to a label
# m, merge [-C <commit> | -c <commit>] <label> [# <oneline>]
# .       create a merge commit using the original merge commit's
# .       message (or the oneline, if no original merge commit was
# .       specified). Use -c <commit> to reword the commit message.

You’ll need to preface the hash with whichever command (or shortcut) you want to use. You might want to reword a commit (i.e. remove all those “F%@#!!”), so you’ll use r. You might want to pick the head of the commit, so you’ll use p. You might want to squash multiple commits into one, so you’ll use s. In the example above, you’ll have to edit the first word of each hash to make it look like this:

pick d94e78 Plot model output            --- older commit
s 4e9baa Fixed typo on x-axis
s afb581 Fix this and that
s 87871a Plot finalized
s 0c3317 Whoops, not yet...
s 871adf OK, plot actually done          --- newer commit

Now, simply save and close the file; you’ll be prompted back to your command line. The next thing to do is to give the new commit a name. Your editor will pop up. You can use the default message, or replace it with something like “Plot model output”. Save the file, close it, and push your changes. You can read much more about this on Git’s help page for Rewriting History (sounds cool, right?).

5.6 Sharing public data, Shiny apps, and tools on our website

Any time emLab develops publicly available datasets, Shiny apps, or tools, we would love to share these on the emLab website. Any time you release a publicly available dataset, Shiny app, or other tool, please Slack or email Erin O’Reilly (eoreilly@ucsb.edu) the following information: name of the dataset app or tool, link, date published, and the emLab project it is associated with.