Solutions to the exercises of this homework 6 should, just as for HW1-HW5, be written in an R-Markdown document with output: github_document. Both the R-Markdown document (.Rmd file) and the compiled Markdown document (.md file), as well as any figures needed for properly rendering the Markdown file on GitHub, should be uploaded to your Homework repository as part of a HW6 folder. Code should be written clearly in a consistent style, see in particular Hadley Wickham’s tidyverse style guide. As an example, code should be easily readable and avoid unnecessary repetition of variable names.
Note that there are new data sets available in the HW_data repository. Download them by opening the associated R-project and issuing a “pull”. If it fails, delete the HW_data folder on your computer and clone the repository again according to the instructions in HW2.
Deadline for the homework is 2019-12-15 at 23.59. Submission occurs as usual by creating a new issue with the title “HW6 ready for grading” in your repository. Please also add a link from your repository’s README.md file to HW6/HW6.md.
In this exercise we will revisit the Nobel Foundation API known from HW5 and wrap its use with the purrr package.
Use the API to extract the id of all Nobel laureates in economics from 1969 to 2019. Use purrr functionality to create a data.frame laureates containing five columns: year, category (always economics), firstname, surname and id. Note: It might be worthwhile to cache the result of your query using cache=TRUE in the code chunk of the knitr document in order to avoid calling the API whenever you compile your document.
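As a rough illustration of the purrr step, the sketch below flattens a nested list into a data frame with the five requested columns. It does not call the API: mock_response is a hand-made stand-in for a parsed JSON reply, its nesting (prizes, then laureates) is an assumption about the API output, and all names, years and ids in it are made up. The helper prize_to_rows is hypothetical as well.

```r
library(purrr)
library(dplyr)

# Mock of a parsed JSON reply. The structure is an assumption about the API
# output, and every value below is invented for illustration only.
mock_response <- list(
  prizes = list(
    list(
      year = "2001", category = "economics",
      laureates = list(
        list(id = "1", firstname = "Ann", surname = "Alpha"),
        list(id = "2", firstname = "Bob", surname = "Beta")
      )
    ),
    list(
      year = "2002", category = "economics",
      laureates = list(
        list(id = "3", firstname = "Cyd", surname = "Gamma")
      )
    )
  )
)

# Hypothetical helper: turn one prize into one row per laureate.
prize_to_rows <- function(prize) {
  prize$laureates %>%
    map(~ tibble(
      year = as.integer(prize$year),
      category = prize$category,
      firstname = .x$firstname,
      surname = .x$surname,
      id = .x$id
    )) %>%
    bind_rows()
}

# The outer map() walks over the prizes, bind_rows() stacks the results.
laureates <- mock_response$prizes %>%
  map(prize_to_rows) %>%
  bind_rows()
```

With a real reply, only the object passed to map() would change; the structure of that reply should be checked before relying on the field names used here.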
Use the API and purrr functionality to loop over all the above ids in laureates. For each id determine the day of birth and the gender of the laureate in economics. Add these as columns day_of_birth and gender to the above laureates frame and store the result in a data.frame denoted laureates_info. Note: It might be worthwhile to cache the result of your query using cache=TRUE in the code chunk of the knitr document in order to avoid calling the API whenever you compile your document.
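One possible shape for the loop over ids is sketched below. The function fetch_laureate is a hypothetical stand-in for a single API request per id; here it only looks up made-up values in a small list, and the field names born and gender are assumptions about the API output.

```r
library(purrr)
library(dplyr)

# Mock version of the laureates frame; all values are invented.
laureates <- tibble(
  year = c(2001L, 2001L, 2002L),
  category = "economics",
  firstname = c("Ann", "Bob", "Cyd"),
  surname = c("Alpha", "Beta", "Gamma"),
  id = c("1", "2", "3")
)

# Hypothetical stand-in for one API call per id (no network access here).
fetch_laureate <- function(id) {
  mock_db <- list(
    "1" = list(born = "1940-05-17", gender = "female"),
    "2" = list(born = "1935-11-02", gender = "male"),
    "3" = list(born = "1951-01-30", gender = "male")
  )
  mock_db[[id]]
}

# map() returns one list element per id, in the same order as laureates$id.
details <- map(laureates$id, fetch_laureate)

# map_chr() with a string extracts that named field from every element.
laureates_info <- laureates %>%
  mutate(
    day_of_birth = as.Date(map_chr(details, "born")),
    gender = map_chr(details, "gender")
  )
```

Keeping the request in its own function means the cached chunk only needs to contain the map() call, while the column handling can live in an uncached chunk.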
What is the proportion of female laureates among all current laureates in economics?
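A minimal sketch of the proportion, using a made-up frame and assuming that gender is coded as the strings "female" and "male": the mean of a logical vector is the share of TRUE values.

```r
# Mock frame with invented values; the coding of gender is an assumption.
laureates_info <- data.frame(
  id = c("1", "2", "3", "4"),
  gender = c("male", "female", "male", "male")
)

# Share of rows where gender equals "female".
prop_female <- mean(laureates_info$gender == "female")
```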
Assuming that the award is given on the 1st of December of each award year, compute the age (in years) at the time of the award for each laureate and add this to the laureates_info frame. Illustrate this age as a function of the award year for the laureates. Also add an appropriate smooth function using geom_smooth. Interpret the overall result, i.e. how the age at the time of the award has evolved over time.
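The age computation and the plot could look roughly as follows. The data are synthetic, the division by 365.25 is a simple approximation of a year length chosen for this sketch, and the loess smoother is only one of several reasonable choices.

```r
library(dplyr)
library(ggplot2)

# Synthetic data: one invented laureate per year, no real birth dates.
laureates_info <- tibble(
  year = 1990:2009,
  day_of_birth = as.Date("1930-06-15") + 200 * (0:19)
)

laureates_info <- laureates_info %>%
  mutate(
    # Award date assumed to be the 1st of December of the award year.
    award_date = as.Date(paste0(year, "-12-01")),
    # Age in years, approximating a year by 365.25 days.
    age = as.numeric(award_date - day_of_birth) / 365.25
  )

# Scatter plot of age against award year with a loess smoother on top.
p <- ggplot(laureates_info, aes(x = year, y = age)) +
  geom_point() +
  geom_smooth(method = "loess", formula = y ~ x) +
  labs(x = "Award year", y = "Age at award (years)")
```

Printing p renders the figure; the interpretation then rests on whether the smoothed curve rises, falls or stays flat over the award years.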
One part of the examination of this course is a project in which you demonstrate all the skills acquired in this course. For this, read the course instructions about the project work, which have been updated as part of this homework.
Note: As an exception, you can ask Erik and Jan for help with this exercise during the three remaining classes. They will not provide help with the project work by email, nor will they make special appointments for you to drop by. You are, to some degree, on your own for the project, but help is provided in these last classes.
The repository to peer-review will be assigned on 2019-12-16. Deadline for the peer-review is Wed, 2019-12-18 at 12:00 (noon).