Learning Objectives
Following this assignment students should be able to:
- use version control to keep track of changes to code
- collaborate with someone else via a remote repository
Reading
Lecture Notes
How To
Work through the exercises in this assignment alongside the Version Control lecture notes. Start at the beginning of the notes and complete each exercise at the point where the notes link to it.
Exercises
Set Up Git (15 pts)
Install Git for your operating system following the setup instructions. Then create a new repo at the GitHub organization for the class:
- Navigate to GitHub in a web browser and log in.
- Click the + at the upper right corner of the page and choose New repository.
- Choose the class organization (e.g., dcsemester) as the Owner of the repo.
- Fill in a Repository name that follows the form FirstnameLastname.
- Select Private.
- Select Initialize this repository with a README.
- Click Create Repository.
Next, set up a project for this assignment in RStudio with the following steps:
- File -> New Project -> Version Control -> Git
- Navigate to your new Git repo -> Click the Clone or download button -> Click the Copy to clipboard button.
- Paste this in the Repository URL: field.
- Leave Project directory name: blank; it is automatically given the repo name.
- Choose a location under Create project as subdirectory of:.
- Click Create Project.
- Check to make sure you have a Git tab in the upper right window.
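RStudio's New Project dialog runs a clone for you. For reference, here is a minimal command-line sketch of the same step. It assumes git is installed, and it uses a throwaway local repository as a stand-in for your GitHub repo so that it runs offline. The class-repo directory and the demo identity are hypothetical. With a real repo you would pass the copied URL to git clone instead of a local path.

```shell
set -eu

# Throwaway workspace so nothing in your real projects is touched.
demo_dir=$(mktemp -d)
cd "$demo_dir"

# Stand-in for the repo you created on GitHub (an assumption for this demo):
# a repository that already contains a README.
git init --quiet class-repo
cd class-repo
git config user.name "Demo Student"          # hypothetical identity
git config user.email "student@example.com"
printf '# FirstnameLastname\n' > README.md
git add README.md
git commit --quiet -m "Initial commit"
cd ..

# The clone itself. With a real repo, replace the path with the copied URL.
git clone --quiet "$demo_dir/class-repo" FirstnameLastname
cd FirstnameLastname

# The clone remembers where it came from as the remote named origin.
git remote -v
ls
```

The remote named origin is what the Push and Pull buttons in the Git tab talk to later in this assignment.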
First Solo Commit (15 pts)
This is a follow up to Set Up Git.
In fish-analysis.R, add a comment above the creation of fish_data describing what this code does.
Commit this change to version control with a good commit message. Then check to see if you can see this commit in the history.
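If you are curious what the Git tab is doing behind the scenes, the stage, commit, and history steps can be sketched on the command line as follows. This is an illustration in a throwaway repository, not your class repo. The contents of fish-analysis.R, the commit messages, and the demo identity are assumptions made up for the example.

```shell
set -eu

# Throwaway repository so nothing in your class repo is touched.
demo_dir=$(mktemp -d)
cd "$demo_dir"
git init --quiet
git config user.name "Demo Student"          # hypothetical identity
git config user.email "student@example.com"

# Stand-in for the real script (contents assumed for illustration).
printf '%s\n' 'fish_data <- read.csv("Gaeta_etal_CLC_data.csv")' > fish-analysis.R
git add fish-analysis.R
git commit --quiet -m "Add script that loads the fish data"

# Add a comment above the creation of fish_data.
printf '%s\n' \
  '# Read the fish size data from the CSV file' \
  'fish_data <- read.csv("Gaeta_etal_CLC_data.csv")' > fish-analysis.R

# Stage the change (the checkbox in the Git tab), then commit it.
git add fish-analysis.R
git commit --quiet -m "Add comment describing how fish_data is created"

# Check that the commit shows up in the history.
git log --oneline
```

A good commit message says what changed and why in a short line, so the history stays readable when you look back at it.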
Second Solo Commit (15 pts)
This is a follow up to First Solo Commit.
You discover that the device used to measure the scale length of the fish in Gaeta_etal_CLC_data.csv is not accurate for scales smaller than 1 mm. Use dplyr to remove the fish with a scalelength of less than 1 mm from fish_data. The new dataset will have 4,029 rows.
Commit this change to version control with a good commit message.
Commit Multiple Files (15 pts)
This is a follow up to Second Solo Commit.
After talking to a colleague, you realize that Gaeta_etal_CLC_data.csv is only the first in a series of similar files that you will receive. To help keep track of files, you decide to number them. Rename the Gaeta_etal_CLC_data.csv file to Gaeta_etal_CLC_data_1.csv manually, using the Files tab in RStudio. You’ll also need to change the first line of fish-analysis.R so that the script will still work.
To include all of these changes in a single commit, stage both data files and the saved R script and then commit with a good commit message.
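The following command-line sketch shows how git reports a manual rename before and after staging. It runs in a throwaway repository. The one-row CSV, the script contents, and the commit messages are placeholders made up for the demo; only the file names and column names come from this assignment.

```shell
set -eu

# Throwaway repository with stand-in files (contents assumed).
demo_dir=$(mktemp -d)
cd "$demo_dir"
git init --quiet
git config user.name "Demo Student"          # hypothetical identity
git config user.email "student@example.com"
printf 'lakeid,length,scalelength\nAL,167,2.70\n' > Gaeta_etal_CLC_data.csv
printf '%s\n' 'fish_data <- read.csv("Gaeta_etal_CLC_data.csv")' > fish-analysis.R
git add .
git commit --quiet -m "Add fish data and analysis script"

# Rename the file by hand (like the Files tab) and update the script.
mv Gaeta_etal_CLC_data.csv Gaeta_etal_CLC_data_1.csv
printf '%s\n' 'fish_data <- read.csv("Gaeta_etal_CLC_data_1.csv")' > fish-analysis.R

# Before staging: the old file is listed as deleted, the new one as untracked.
git status --short

# Stage the deletion, the new file, and the edited script together.
git add --all

# After staging: git pairs the two data files into one rename entry (R).
staged_status=$(git status --short)
printf '%s\n' "$staged_status"

git commit --quiet -m "Number the first data file and update the script"
```

Git does not record renames as such. It detects them by comparing file contents, which is why the R only appears once both the deletion and the new file are staged.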
Git will initially think you’ve deleted Gaeta_etal_CLC_data.csv and created a new file Gaeta_etal_CLC_data_1.csv. But once you click on both the old and new files to stage them, Git will recognize the rename, combine the two entries into one, and mark it with an R.
Pushing Changes (20 pts)
Now that you’ve set up your GitHub repository for collaborating with your colleague and made some changes, you’d better share some work with them so they can see what you’re doing.
- To look at the relationship between the length of each fish’s body and the size of its scale across the different lakes sampled in these data, create a scatterplot with length on the x-axis and scalelength on the y-axis, then color the points using lakeid.
- Commit this change.
- Once you’ve committed the change, click the Push button in the upper right corner of the window and then click OK when git is done pushing.
- You should be able to see the changes you made on GitHub.
- Email your teacher to let them know you’ve finished this exercise. Include in the email a link to your GitHub repository.
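Committing only records a change on your own computer; pushing is what sends it to the remote. The sketch below shows the difference on the command line. It assumes a local bare repository (remote.git, a hypothetical name) as a stand-in for GitHub so that it runs offline, and the script contents are a placeholder rather than the real plotting code.

```shell
set -eu

demo_dir=$(mktemp -d)
cd "$demo_dir"

# A local bare repository stands in for GitHub (assumption for this demo).
git init --quiet --bare remote.git

# Your working copy, connected to the stand-in remote as origin.
git init --quiet work
cd work
git config user.name "Demo Student"          # hypothetical identity
git config user.email "student@example.com"
git remote add origin "$demo_dir/remote.git"

# Placeholder for the script with your new scatterplot code.
printf '%s\n' '# scatterplot of length vs. scalelength goes here' > fish-analysis.R
git add fish-analysis.R
git commit --quiet -m "Add scatterplot of length vs. scalelength by lake"

# The commit exists only locally until it is pushed.
# --set-upstream links the local branch to its copy on the remote.
git push --quiet --set-upstream origin HEAD

# The branch line now shows the remote branch it tracks, with nothing ahead.
git status --short --branch
```

In a project cloned through RStudio, origin and the upstream link are already configured, so a plain git push (or the Push button) is enough.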
Pulling and Pushing (20 pts)
This is a follow up to Pushing Changes.
STOP: Make sure you sent your teacher an email following the last exercise with a link to your GitHub repository, and wait until your teacher has told you they’ve updated your repository before doing this one.
While you were working on your plot of size among lakes, your colleague (who has suddenly developed some pretty impressive computational skills) wrote some code to generate a histogram of scale lengths. To get it you’ll need to pull the most recent changes from GitHub.
- On the Git tab click on the Pull button with the blue arrow. You should see some text that looks like:
  From github.com:ethanwhite/gryffindorforever
     1e24ac8..815e600  master     -> origin/master
  Updating 1e24ac8..815e600
  Fast-forward
   testme.txt | 1 +
   1 file changed, 1 insertion(+)
   create mode 100644 youareawesome.txt
- Click OK.
- You should see the new lines of code in your fish-analysis.R.
  ggplot(fish_data, aes(x = scalelength, fill = length_cat)) + geom_histogram()
- Modify this code to look at narrower ranges of scale size classes by setting the bins argument to 80.
- Save this plot as scale_hist_by_length.jpg using ggsave.
- Commit the new code and resulting .jpg file by adding both files to the stage and committing with a good commit message, then push this to GitHub.
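The full round trip in this exercise (a colleague pushes, you pull, then you commit and push your own follow-up) can be sketched on the command line as below. Everything runs in a throwaway directory. The bare repository remote.git stands in for GitHub, the directory names yours and colleague are hypothetical, and the .jpg is an empty placeholder rather than a real plot written by ggsave.

```shell
set -eu

demo_dir=$(mktemp -d)
cd "$demo_dir"
git init --quiet --bare remote.git           # stand-in for GitHub (assumption)

# Your copy: one commit, pushed to the remote.
git init --quiet yours
cd yours
git config user.name "Demo Student"          # hypothetical identity
git config user.email "student@example.com"
git remote add origin "$demo_dir/remote.git"
printf '%s\n' '# scatterplot code goes here' > fish-analysis.R
git add fish-analysis.R
git commit --quiet -m "Add scatterplot"
git push --quiet --set-upstream origin HEAD
cd ..

# Your colleague clones the repo, adds the histogram code, and pushes.
git clone --quiet "$demo_dir/remote.git" colleague
cd colleague
git config user.name "Demo Colleague"        # hypothetical identity
git config user.email "colleague@example.com"
printf '%s\n' \
  'ggplot(fish_data, aes(x = scalelength, fill = length_cat)) + geom_histogram()' \
  >> fish-analysis.R
git commit --quiet -am "Add histogram of scale lengths"
git push --quiet
cd ..

# Back in your copy: pull brings in the colleague's commit (a fast-forward).
cd yours
git pull --quiet --ff-only
git log --oneline

# Make your follow-up change, stage both files, commit, and push.
printf '%s\n' '# modified histogram and ggsave() call go here' >> fish-analysis.R
: > scale_hist_by_length.jpg                 # empty placeholder for the saved plot
git add fish-analysis.R scale_hist_by_length.jpg
git commit --quiet -m "Use narrower histogram bins and save the plot"
git push --quiet
```

Pulling before you start editing keeps your history a straight line on top of your colleague's work, which is why the pull reports a fast-forward.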