Read R4DS chapter 7 about exploratory data analysis.
Solve Exploring Categorical Data and Exploring Numerical Data of the Exploratory Data Analysis course at DataCamp.
Use filter to extract the product groups c("Vitt vin", "Rött vin", "Rosévin", "Mousserande vin") with vintage 2011-2018. Try and compare the following bar charts:
ggplot with aes(x = Argang) and geom_bar()
ggplot with aes(x = Argang), geom_bar() and facet_wrap(~ Varugrupp) (try adding scales = "free_y" to facet_wrap)
ggplot with aes(x = Argang, fill = Varugrupp) and each of geom_bar(), geom_bar(position = "dodge") and geom_bar(position = "fill")
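As a rough sketch of how these pieces fit together, the code below filters a small made-up table and builds the bar chart variants. The tibble products and its values are hypothetical stand-ins for the real data; only the column names Varugrupp and Argang and the group names come from the exercise.

```r
library(tidyverse)

# Hypothetical toy data standing in for the real product table
products <- tibble(
  Varugrupp = c("Vitt vin", "Rött vin", "Rött vin", "Rosévin",
                "Mousserande vin", "Öl", "Rött vin"),
  Argang    = c(2015, 2016, 2009, 2017, 2012, NA, 2018)
)

# Keep the four wine groups and vintages 2011-2018 (rows with NA vintage are dropped)
wines <- products %>%
  filter(
    Varugrupp %in% c("Vitt vin", "Rött vin", "Rosévin", "Mousserande vin"),
    between(Argang, 2011, 2018)
  )

# One bar per vintage, all groups pooled
p_all <- ggplot(wines, aes(x = Argang)) + geom_bar()

# One panel per group, each with its own y-axis
p_facet <- p_all + facet_wrap(~ Varugrupp, scales = "free_y")

# Groups shown by fill colour: stacked counts, side-by-side counts, proportions
p_base    <- ggplot(wines, aes(x = Argang, fill = Varugrupp))
p_stacked <- p_base + geom_bar()
p_dodge   <- p_base + geom_bar(position = "dodge")
p_fill    <- p_base + geom_bar(position = "fill")
```

Printing any of the plot objects (for example p_dodge) draws it. The stacked and dodged versions show counts, while position = "fill" rescales every bar to height 1 so that only the proportions within each vintage are visible.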
Recreate the following plot (Red wines in the regular range)
Make a box plot (geom_boxplot) of PrisPerLiter on the log scale, with x = Varugrupp. Try coord_flip to improve readability.
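One possible way to set this up is sketched below on made-up prices. The values are hypothetical, and scale_y_log10 is just one of several ways to get a log-scale axis.

```r
library(tidyverse)

# Hypothetical prices per litre; only the column names come from the exercise
wines <- tibble(
  Varugrupp    = rep(c("Vitt vin", "Rött vin", "Rosévin"), each = 4),
  PrisPerLiter = c(80, 105, 130, 900, 90, 120, 250, 4000, 85, 100, 110, 160)
)

p_box <- ggplot(wines, aes(x = Varugrupp, y = PrisPerLiter)) +
  geom_boxplot() +
  scale_y_log10() +  # log-scale price axis
  coord_flip()       # horizontal boxes, so group names read left to right
```

Because scale_y_log10 transforms the data before the box plot statistics are computed, the quartiles and whiskers are calculated on the log-transformed prices.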
The following code transforms the medals data to “long” format (more about this next time!), which is easier to work with in ggplot:
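library(tidyverse)  # provides read_csv, select, gather and %>%
# Drop the Total column, then stack the three medal columns into Denomination/Number pairs
medal_long <- read_csv("Class_files/Winter_medals2019-10-30.csv") %>%
  select(-Total) %>%
  gather(Denomination, Number, c("Gold", "Silver", "Bronze"))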
Check the result with glimpse(medal_long). Use group_by and summarise to aggregate the total number of medals of each denomination (Gold/Silver/Bronze) for each country. Illustrate the relative proportions of the denominations, e.g. by geom_bar with stat = "identity" and position = "fill".
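The aggregation and plot could look roughly like the sketch below. It uses a small made-up table instead of the real file, and the column names Country and Year are assumptions, since the exercise only names Denomination and Number.

```r
library(tidyverse)

# Hypothetical long-format data: one row per country, year and denomination
medal_long <- tibble(
  Country      = rep(c("A", "B"), each = 6),
  Year         = rep(rep(c(2014, 2018), each = 3), times = 2),
  Denomination = rep(c("Gold", "Silver", "Bronze"), times = 4),
  Number       = c(1, 2, 3, 2, 0, 1, 4, 4, 2, 1, 3, 0)
)

# Total medals per country and denomination, summed over years
medal_totals <- medal_long %>%
  group_by(Country, Denomination) %>%
  summarise(Number = sum(Number), .groups = "drop")

# Bars rescaled to height 1, so the fill shows relative proportions
p_medals <- ggplot(medal_totals,
                   aes(x = Country, y = Number, fill = Denomination)) +
  geom_bar(stat = "identity", position = "fill")
```

stat = "identity" tells geom_bar to use the Number column as bar height instead of counting rows; geom_col is a shorthand for the same thing.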
The file Class_files/MM2001_results.csv contains the age, sex, and grade in the course Matematik I (MM2001) of 3201 students aged 18-40 years. An NA in the grade column means that the student has registered but has not yet completed the course. Use ggplot to explore relations between the variables.
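As a possible starting point, the sketch below uses a small made-up table. The column names age, sex and grade, the grade labels, and all values are assumptions, not taken from the real file.

```r
library(tidyverse)

# Hypothetical toy data; NA in grade marks a course not yet completed
results <- tibble(
  age   = c(19, 21, 24, 30, 18, 22, 35, 27),
  sex   = c("F", "M", "F", "M", "M", "F", "F", "M"),
  grade = c("A", "C", NA, "F", "B", NA, "D", "C")
)

# Flag whether the course has been completed
results <- results %>%
  mutate(completed = !is.na(grade))

# Age distribution for completed vs. not yet completed
p_age <- ggplot(results, aes(x = completed, y = age)) +
  geom_boxplot()

# Grade proportions within each sex (NA appears as its own category)
p_sex <- ggplot(results, aes(x = sex, fill = grade)) +
  geom_bar(position = "fill")
```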