Learning Objectives
Following this assignment students should be able to:
- use, modify, and write custom functions
- use the output of one function as the input of another
- understand and use the basic relational operators
- use an if statement to evaluate conditionals
Reading
- Topics
  - Functions
  - Conditionals
- Readings
Lecture Notes
Exercises
Writing Functions (5 pts)
Write a function that converts pounds to grams (there are 453.592 grams in one pound). It should take a value in pounds as the input and return the equivalent value in grams (i.e., the number of pounds times 453.592). Use that function to calculate how many grams there are in 3.75 pounds.
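The general pattern of a conversion function can be sketched with a different unit pair. This is only an illustration, not the answer to this exercise: the function name convert_inches_to_cm and the input value are hypothetical.

```r
# Hypothetical example: convert a length in inches to centimeters.
# There are 2.54 centimeters in one inch.
convert_inches_to_cm <- function(inches) {
  cm <- inches * 2.54
  return(cm)
}

# Call the function and store the returned value
length_cm <- convert_inches_to_cm(10)
print(length_cm)
```

The value passed in the call (10) is matched to the argument inches, and whatever is given to return() becomes the result of the call.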
Use and Modify (10 pts)
The length of an organism is typically strongly correlated with its body mass. This is useful because it allows us to estimate the mass of an organism even if we only know its length. This relationship generally takes the form:
Mass = a * Length^b
where the parameters a and b vary among groups. This allometric approach is regularly used to estimate the mass of dinosaurs since we cannot weigh something that is only preserved as bones.
The following function estimates the mass of an organism in kg based on its length in meters for a particular set of parameter values, those for Theropoda (where a has been estimated as 0.73 and b has been estimated as 3.63; Seebacher 2001).
    get_mass_from_length_theropoda <- function(length){
      mass <- 0.73 * length ^ 3.63
      return(mass)
    }
- Add a comment to this function so that you know what it does.
- Use this function to print out the mass of a Spinosaurus that is 16 m long based on its reassembled skeleton.
- Create a new version of this function called get_mass_from_length() that estimates the mass of an organism in kg based on its length in meters by taking length, a, and b as parameters. This lets us pass the function all 3 values that it needs to estimate a mass as parameters, which makes it much easier to reuse for all of the non-theropod species. Use this new function to estimate the mass of a Sauropoda (a = 214.44, b = 1.46) that is 26 m long.
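One way to approach the generalization step is sketched below, assuming the parameters are simply added as extra arguments. It is a sketch rather than the required solution, and the 10 m test length is an arbitrary value chosen to check that the two versions agree.

```r
# Original, Theropoda-only version from the exercise
get_mass_from_length_theropoda <- function(length) {
  mass <- 0.73 * length ^ 3.63
  return(mass)
}

# Generalized sketch: a and b are passed in instead of hard-coded
get_mass_from_length <- function(length, a, b) {
  mass <- a * length ^ b
  return(mass)
}

# Both calls should give the same estimate for a 10 m theropod
get_mass_from_length_theropoda(10)
get_mass_from_length(10, a = 0.73, b = 3.63)
```

Checking the new function against the old one with the same parameter values is a quick way to confirm that the rewrite did not change the calculation.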
Combining Functions (10 pts)
This is a follow up to Use and Modify.
Measuring things using the metric system is the standard approach for scientists, but when communicating your results more broadly it may be useful to use different units (at least in some countries). Write a function that converts kilograms into pounds (there are 2.205 pounds in a kilogram). Use that function along with your get_mass_from_length() function from Use and Modify to estimate the weight, in pounds, of a 12 m long Stegosaurus. In Stegosauria, a has been estimated as 10.95 and b has been estimated as 2.64 (Seebacher 2001).
Choice Operators (10 pts)
Create the following variables.
    w <- 10.2
    x <- 1.3
    y <- 2.8
    z <- 17.5
    colors <- c("red", "blue", "green")
    masses <- c(45.2, 36.1, 27.8, 81.6, 42.4)
    dna1 <- "attattaggaccaca"
    dna2 <- "attattaggaacaca"
Use them to print whether or not the following statements are TRUE or FALSE.
- w is greater than 10
- "green" is in colors
- x is greater than y
- Each value in masses is greater than 40.
- 2 * x + 0.2 is equal to y
- dna1 is the same as dna2
- dna1 is not the same as dna2
- w is greater than x, or y is greater than z
- x times w is between 13.2 and 13.5
- Each mass in masses is between 30 and 50.
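The operators these statements call for can be sketched on a separate set of values. All variable names and values below are hypothetical and chosen only for illustration; they are not the exercise variables.

```r
# Hypothetical variables, for illustration only
p <- 4.5
q <- 7
sizes <- c(12, 30, 18)
sites <- c("north", "south")

p > q                     # single comparison
"south" %in% sites        # membership test
sizes > 15                # element-wise: one result per value
p != q                    # not equal
p > 1 | q < 1             # TRUE if either condition holds
p * 2 > 8 & p * 2 < 10    # TRUE only if both hold ("between")

# Decimal arithmetic is stored approximately, so == can surprise you
0.1 + 0.2 == 0.3                    # FALSE
isTRUE(all.equal(0.1 + 0.2, 0.3))   # TRUE: compares within a tolerance
```

When an equality test on decimal numbers gives an unexpected FALSE, comparing within a tolerance with all.equal() is one common workaround.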
Simple If Statement (10 pts)
To determine if a file named thesis_data.csv exists in your working directory you can use the following code to get a list of available files and directories:
    list.files()
- Use the %in% operator to write a conditional statement that checks to see if thesis_data.csv is in this list.
- Write an if statement that loads the file using read.csv() only if the file exists.
- Add an else clause that prints “OMG MY THESIS DATA IS MISSING. NOOOO!!!!” if the file doesn’t exist.
- Make sure your actual thesis data is backed up.
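The check-then-load pattern can be sketched in a throwaway directory so that it runs anywhere. The directory, the file name example_data.csv, and the printed messages are all hypothetical. The sketch passes the directory to list.files() explicitly instead of relying on the working directory.

```r
# Hypothetical setup: write a small file into a temporary directory
data_dir <- tempfile("example_dir")
dir.create(data_dir)
write.csv(data.frame(id = 1:3),
          file.path(data_dir, "example_data.csv"),
          row.names = FALSE)

# Load the file only if its name appears in the directory listing
if ("example_data.csv" %in% list.files(data_dir)) {
  example_data <- read.csv(file.path(data_dir, "example_data.csv"))
  print("File loaded")
} else {
  print("File not found")
}
```

Only one of the two branches runs: the else branch is skipped whenever the condition in the if statement is TRUE.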
Size Estimates by Name (20 pts)
This is a follow up to Use and Modify.
To make it even easier to work with your dinosaur size estimation functions you decide to create a function that lets you specify which dinosaur group you need to estimate the size of by name and then have the function automatically choose the right parameters.
Create a new function get_mass_from_length_by_name() that takes two arguments, the length and the name of the dinosaur group. Inside this function use if/else if/else statements to check to see if the name is one of the following values and if so set a and b to the appropriate values.
- Stegosauria: a = 10.95 and b = 2.64 (Seebacher 2001).
- Theropoda: a = 0.73 and b = 3.63 (Seebacher 2001).
- Sauropoda: a = 214.44 and b = 1.46 (Seebacher 2001).
If the name is not any of these values set a and b to NA.
Once the function has assigned a and b have it run get_mass_from_length() with the appropriate values and return the estimated mass.
Run the function for:
- A Stegosauria that is 10 meters long.
- A Theropoda that is 8 meters long.
- A Sauropoda that is 12 meters long.
- An Ankylosauria that is 13 meters long.
Challenge (optional): If the name is not one of the values that have a and b values, print out a message saying that the function doesn’t know how to convert that group and including the group’s name, like “No known estimation for Ankylosauria” (the function paste() will be helpful here). Doing this successfully will modify your answer for the fourth case above (Ankylosauria).
Challenge (optional): Change your function so that it uses two different values of a and b for Stegosauria. When Stegosauria is greater than 8 meters long use the equation above. When it is less than 8 meters long use a = 8.5 and b = 2.8. Run the function for a Stegosauria that is 6 meters long.
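The if/else if/else structure for choosing a value by name can be sketched with a different, hypothetical task: picking a conversion factor to grams from a unit name. The function name convert_to_grams and the unit labels are assumptions for illustration. The pounds factor is the one given in Writing Functions.

```r
# Hypothetical example: choose a conversion factor based on a unit name
convert_to_grams <- function(value, unit) {
  if (unit == "kg") {
    factor <- 1000
  } else if (unit == "lb") {
    factor <- 453.592
  } else {
    # Unknown unit: fall back to NA and report which name was not recognized
    factor <- NA
    print(paste("No known conversion for", unit))
  }
  return(value * factor)
}

convert_to_grams(2, "kg")
convert_to_grams(1, "lb")
convert_to_grams(5, "stone")   # prints a message and returns NA
```

Because any arithmetic with NA gives NA, an unrecognized name produces a missing result instead of a misleading number.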
DNA or RNA (15 pts)
Write a function that determines if a sequence of base pairs is DNA, RNA, or if it is not possible to tell given the sequence provided. RNA has the base Uracil ("u") instead of the base Thymine ("t"), so sequences with u’s are RNA, sequences with t’s are DNA, and sequences with neither are unknown.
You can check if a string contains a character (or a longer substring) in R using grepl(substring, string), so grepl("u", sequence) will check if the string in the sequence variable has the base u.
Name the function dna_or_rna() and have it take sequence as an argument. Have the function return one of three outputs: "DNA", "RNA", or "UNKNOWN". Add documentation describing what the function does. Call the function on each of the following sequences.
    seq1 <- "ttgaatgccttacaactgatcattacacaggcggcatgaagcaaaaatatactgtgaaccaatgcaggcg"
    seq2 <- "gauuauuccccacaaagggagugggauuaggagcugcaucauuuacaagagcagaauguuucaaaugcau"
    seq3 <- "gaaagcaagaaaaggcaggcgaggaagggaagaagggggggaaacc"
Challenge (optional): Figure out how to make your function work with both upper and lower case letters, or even strings with mixed capitalization.
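How grepl() behaves can be sketched on ordinary words. The strings and variable names below are hypothetical and unrelated to the sequences above.

```r
# Hypothetical strings, for illustration only
word1 <- "banana"
word2 <- "cherry"

has_an_1 <- grepl("an", word1)   # TRUE: "an" occurs in "banana"
has_an_2 <- grepl("an", word2)   # FALSE: "an" does not occur in "cherry"

# grepl() returns TRUE or FALSE, so it can be used directly as a condition
if (grepl("an", word1)) {
  print("word1 contains 'an'")
} else {
  print("word1 does not contain 'an'")
}
```

Note the argument order: the pattern to look for comes first and the string to search comes second.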
Climate Space Rewrite (20 pts)
This is a follow up to Climate Space.
Producing a plot of occurrences on the available climate space for each of the three species required a lot of repetition of very similar code. Whenever this happens, it is usually an indication that a function could be used instead. Such functions reduce the repetition in producing the three species plots, which enables you to save time and prevent errors by not having to rewrite the same code multiple times.
- Create a function to download occurrence data and extract the corresponding climate data, which should return a dataset of all the bioclim variables for a single species. Because the latitude and longitude columns for each occurrence dataset have different names you can select and set them to the same name using the column index, instead of the column name, to get only those columns (e.g., select(longitude = 2, latitude = 3)).
- Create a second function for plotting the occurrences for a single species onto the available climate space, then use this function to generate separate plots for each of the three tree species.