The final part of the course is an individual project in the form of a data blog post. The post should illustrate an issue or problem using a unique data set collected by yourself and at the same time demonstrate the use of the tools taught in this course. The deadline for the project is 2020-01-15 at 18:00. Hand-in occurs by raising an issue with the title “Project ready for grading” in your PR_<github_username> repository. Shortly after the deadline, we will clone all project repos with such an issue to a local installation. This version will count as your hand-in version. On 2020-01-17 the projects will be presented orally (5-minute presentation). Presence is compulsory for the session you are presenting in, so please reserve that day in your calendar.
The blog posts below can serve as inspiration and give a rough idea of the amount of work expected in the projects.
Data sources: During the course, you were introduced to a lot of possible data sources. Additional public web based data sources could, e.g., be the Stockholm Open Data Portal or the Öppna Data site.
The project work has the following elements:
#rstats post, something which might interest your fellow students. Your post can be about a serious matter, but it can also be about a not so serious one. However, make it clear before writing who your intended readership is (general public, fellow B.Sc. students, R users, ornithologists, …).
wordcountaddin for RStudio to count the words in your report.
The biggest challenge of the project will be to be realistic about what you can achieve within the given deadline. Once you have an estimate of how much that could be, take 50% of that and you are still likely to be busy. Make sure you have a working project early on and then scale up iteratively, so you’re always ready. Start early.
Every student will get a private PR_<username> GitHub repository as part of the mt5013-ht19 organisation, which only you and the teachers of the course have read/write access to (similar setup as your HW_<username> repo). You should make sub-directories Data, R, Report and Presentation and store all your files there. At the project deadline we will pull all repos which have an issue “Project work submitted”.
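As a minimal sketch, the four sub-directories could be created from within R as shown below. The function name create_project_dirs is a hypothetical helper, not something provided by the course, and the sketch assumes it is called with the root of your cloned PR_<username> repository.

```r
# Hypothetical helper: create the sub-directories named in the project description.
# Assumes `root` is the top-level folder of your cloned PR_<username> repository.
create_project_dirs <- function(root = ".") {
  dirs <- file.path(root, c("Data", "R", "Report", "Presentation"))
  for (d in dirs) {
    # showWarnings = FALSE makes the call harmless if the folder already exists
    dir.create(d, showWarnings = FALSE, recursive = TRUE)
  }
  invisible(dirs)
}
```

Calling create_project_dirs() in the repository root is enough. Git does not track empty directories, so a folder only shows up on GitHub once it contains at least one committed file.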
At submission, your repo should at least contain the following files:
PR_jensjensen/Report/report.Rmd
PR_jensjensen/Report/report.html
PR_jensjensen/Presentation/presentation.Rmd
PR_jensjensen/Presentation/presentation.html
where jensjensen is to be substituted with your GitHub user name. Note: It’s important to use exactly the filenames above, since we will extract the report and presentation automatically from your repository. Furthermore, ensure that any support files, like data files, graphics, etc., which are needed to compile the .Rmd documents are included in the repository. One exception is data acquired using private API keys. For this case, make an R/query_data.R script, which imports the API key stored somewhere outside the git repository, does all the work and finally stores the data using save in the Data directory. Your R Markdown report should then access the data by using load. See HW_data/swedavia_api.R for an example.
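A rough sketch of what such an R/query_data.R script could look like is given below. It is an illustration under assumptions, not the course's reference solution: the environment variable name MY_API_KEY, the function names, the file name project_data.RData and the placeholder data are all hypothetical, and fetch_data only stands in for the real API request.

```r
# R/query_data.R (sketch): all names below are hypothetical placeholders.

# Read the API key from an environment variable (e.g. defined in ~/.Renviron),
# so the key itself is never stored inside the git repository.
get_api_key <- function(var = "MY_API_KEY") {
  key <- Sys.getenv(var)
  if (!nzchar(key)) stop("Environment variable ", var, " is not set.")
  key
}

# Stand-in for the real API request; replace the body with your own query.
# It returns a small dummy data frame so the sketch runs without network access.
fetch_data <- function(api_key) {
  data.frame(id = 1:3, value = c(10.5, 11.2, 9.8))
}

# Query the data and store the result with save() in the Data directory.
query_and_save <- function(data_dir = "Data", var = "MY_API_KEY") {
  project_data <- fetch_data(get_api_key(var))
  dir.create(data_dir, showWarnings = FALSE, recursive = TRUE)
  out_file <- file.path(data_dir, "project_data.RData")
  save(project_data, file = out_file)
  invisible(out_file)
}
```

Under these assumptions, query_and_save() is run once from the repository root, and Report/report.Rmd then reads the stored object with load("../Data/project_data.RData"). The relative path reflects that an R Markdown document is by default knitted with its own folder as working directory.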
The data remain private as part of the PR_<username> repository, but your project report HTML file will be made accessible to the teachers and students of the course.
In contrast to the homework exercises, we use HTML as backend; see e.g. Section 3.1 of R Markdown: The Definitive Guide for options to customize the output. To ensure portability of the HTML reports, please use the following as part of your YAML header:
---
title: The snappy title of your project
author: Jens Jensen <jens.jensen@student.su.se>
date: 2020-01-15
output:
  html_document:
    self_contained: true
    toc: true
    toc_depth: 2
---
Additional note: It’s possible that we succeed in creating a project repo template, which already contains some of the necessary files and directories.
Final note: For the presentation, the data wrangling steps do not need to be repeated. One can either import all data generated by report.Rmd using load or use some other type of caching.
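One possible way to share the wrangled data between report and presentation is sketched below. The helper names store_wrangled and restore_wrangled, the object name wrangled and the file name wrangled.RData are hypothetical. The default paths assume that both .Rmd files sit one level below the repository root, next to the Data directory.

```r
# Hypothetical helper pair; object and file names are placeholders.

# Called at the end of the wrangling steps in report.Rmd.
store_wrangled <- function(wrangled, file = "../Data/wrangled.RData") {
  dir.create(dirname(file), showWarnings = FALSE, recursive = TRUE)
  save(wrangled, file = file)
  invisible(file)
}

# Called in the setup chunk of presentation.Rmd instead of repeating the wrangling.
restore_wrangled <- function(file = "../Data/wrangled.RData") {
  if (!file.exists(file)) stop("Knit report.Rmd first to create ", file)
  env <- new.env()
  load(file, envir = env)  # load() restores the object under its saved name
  env$wrangled
}
```

Loading into a separate environment keeps the presentation's workspace free of objects it does not ask for. As another type of caching, the knitr chunk option cache = TRUE could be used for expensive chunks.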
The project will be graded based on the following five dimensions, which have equal weight:
Good luck!