Solve the chapters Downloading Files and Using API Clients, Using httr to interact with APIs directly, and Handling JSON and XML on DataCamp.
Read: Chapters 1-3 in An introduction to APIs
Resources:
Write a function
get_pi <- function(start, numberOfDigits){
...
}
that calls a-pi and returns the digits of \(\pi\) from position start to position start + numberOfDigits - 1. A sample call is https://api.pi.delivery/v1/pi?start=1000&numberOfDigits=5.
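The request part of such a function could be sketched as follows. The helper names pi_url and parse_pi are made up for illustration, and the shape of the reply (a JSON object with a single content field) is an assumption that should be checked against a real response.

```r
library(httr)
library(jsonlite)

# Hypothetical helper: build the request URL for the a-pi endpoint.
pi_url <- function(start, numberOfDigits) {
  modify_url(
    "https://api.pi.delivery/v1/pi",
    query = list(
      # format() avoids scientific notation such as "1e+05" for large values
      start = format(start, scientific = FALSE),
      numberOfDigits = format(numberOfDigits, scientific = FALSE)
    )
  )
}

# Hypothetical helper: extract the digits from the JSON text of a reply.
# ASSUMPTION: the reply looks like {"content": "<digits>"}.
parse_pi <- function(json_text) {
  fromJSON(json_text)$content
}

pi_url(1000, 5)
parse_pi('{"content": "12345"}')  # canned reply, not real API output
```

With these two pieces, get_pi could call GET on pi_url(start, numberOfDigits) and pass content(response, "text") on to parse_pi.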
A TimeEdit schedule can be obtained in JSON format by changing .html to .json in the URL. Try it in a web browser with https://cloud.timeedit.net/su/web/stud1/ri107455X48Z06Q5Z16g3Y05y5006Y48Q02gQY6Q55727.html
Part 1: We can use GET to import it into R:
library(httr)
schema_response <- GET("https://cloud.timeedit.net/su/web/stud1/ri107455X48Z06Q5Z16g3Y05y5006Y48Q02gQY6Q55727.json")
schema_json <- content(schema_response, "text")
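Before parsing, it can be worth checking that the request actually succeeded. A minimal sketch, where the function name fetch_json_text is made up for illustration:

```r
library(httr)

# Hypothetical helper: download a URL and return the body as text,
# failing early if the request did not succeed.
fetch_json_text <- function(url) {
  response <- GET(url)
  stop_for_status(response)  # turn HTTP errors (4xx/5xx) into R errors
  # Report the media type so that an unexpected format is noticed
  message("Content type: ", http_type(response))
  # Stating the encoding explicitly avoids httr having to guess it
  content(response, as = "text", encoding = "UTF-8")
}
```

Calling fetch_json_text with the .json URL above should then give the same text as schema_json, provided the server answers with a success status.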
The result can be explored with jsonedit in the package listviewer:
library(knitr)
library(tidyverse)
library(jsonlite)
library(listviewer)
jsonedit(schema_json)
We convert it with fromJSON and select the reservations:
schema_df <- fromJSON(schema_json)$reservations
The result is now a data.frame with six columns, where the last column contains a vector in each cell. In order to extract elements from this column, we use mutate in combination with map_chr. The family of map functions comes from the purrr package, which is part of the tidyverse; more about them later. Here map_chr(columns, 1) corresponds to sapply(columns, function (x) x[1]) in base R.
schema_df %>% mutate(sal = map_chr(columns, 3),
kurs = map_chr(columns, 1),
tid = paste(starttime, endtime, sep = " - ")) %>%
select(kurs, datum = startdate, tid, sal) %>%
kable()
kurs | datum | tid | sal |
---|---|---|---|
MT5013 | 2019-11-07 | 13:15 - 16:00 | Sal 22. Kräftriket hus 5 |
MT5013 | 2019-11-08 | 09:15 - 12:00 | Sal 22. Kräftriket hus 5 |
MT5013 | 2019-11-12 | 09:15 - 12:00 | Sal 22. Kräftriket hus 5 |
MT5013 | 2019-11-13 | 09:15 - 12:00 | Sal 22. Kräftriket hus 5 |
MT5013 | 2019-11-19 | 09:15 - 12:00 | Sal 22. Kräftriket hus 5 |
MT5013 | 2019-11-20 | 09:15 - 12:00 | Sal 22. Kräftriket hus 5 |
MT5013 | 2019-11-22 | 09:15 - 12:00 | Sal 22. Kräftriket hus 5 |
MT5013 | 2019-11-26 | 09:15 - 12:00 | Sal 22. Kräftriket hus 5 |
MT5013 | 2019-11-27 | 09:15 - 12:00 | Sal 22. Kräftriket hus 5 |
MT5013 | 2019-12-04 | 09:15 - 12:00 | Sal 22. Kräftriket hus 5 |
MT5013 | 2019-12-06 | 09:15 - 12:00 | Sal 22. Kräftriket hus 5 |
MT5013 | 2019-12-10 | 09:15 - 12:00 | Sal 22. Kräftriket hus 5 |
MT5013 | 2019-12-11 | 09:15 - 12:00 | Sal 22. Kräftriket hus 5 |
MT5013 | 2019-12-13 | 09:15 - 12:00 | Sal 22. Kräftriket hus 5 |
MT5013 | 2019-12-18 | 15:00 - 18:00 | Sal 36. Kräftriket hus 5 |
MT5013 | 2020-01-15 | 14:00 - 17:00 | Sal 36. Kräftriket hus 5 |
MT5013 | 2020-01-17 | 09:00 - 15:00 | Sal 14. Kräftriket hus 5 |
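To see what map_chr does with a list column without contacting TimeEdit, the following sketch builds a small JSON string by hand. The field names mimic the ones used above, but the exact structure of a real TimeEdit reply is an assumption, the empty middle element is made up, and the room names are shortened.

```r
library(jsonlite)
library(purrr)

# Hand-written stand-in for a TimeEdit reply (structure assumed)
toy_json <- '{
  "reservations": [
    {"startdate": "2019-11-07", "starttime": "13:15", "endtime": "16:00",
     "columns": ["MT5013", "", "Sal 22"]},
    {"startdate": "2019-12-18", "starttime": "15:00", "endtime": "18:00",
     "columns": ["MT5013", "", "Sal 36"]}
  ]
}'

toy_df <- fromJSON(toy_json)$reservations

# `columns` is a list column: one character vector per row
str(toy_df$columns)

# Extract the third element of each vector, tidyverse and base-R style
sal_purrr <- map_chr(toy_df$columns, 3)
sal_base  <- sapply(toy_df$columns, function(x) x[3])
identical(sal_purrr, sal_base)
```

One practical difference is that map_chr always returns a character vector or fails, whereas the return type of sapply depends on what the function happens to return.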
Fun fact: The schedule on our GitHub page is based on such a TimeEdit API call (see schedule.html), which makes it easy to adapt the page each year the course is given.
Part 2: The schedule for room 14 during the 2nd part of the winter semester can be found on
https://cloud.timeedit.net/su/web/stud1/ri107455X28Z07Q5Z76g0Y05y5076Y31Q09gQY6Q55777.html
If you generate a table at Statistikdatabasen, you will find a link “API för denna tabell” (“API for this table”) that gives a URL and a query to be sent by POST in order to fetch the table. Try fetching a table with httr::POST; the query should be placed in the body.
Note: At the end of the query you may change "format": "px" to "format": "json" in order to get a reply in JSON format. Fetch a table, examine its structure and try to extract a suitable data.frame. See SCB_KPI_API.R for a simple example.
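A request of this kind could be sketched as follows. The function name fetch_scb_table, the variable code Tid and the top filter in the example query are illustrative assumptions; in practice both the URL and the query should be copied from the “API för denna tabell” link of the table you generated.

```r
library(httr)
library(jsonlite)

# Illustrative query; copy the real one from Statistikdatabasen.
# list("12") keeps `values` a JSON array even with auto_unbox = TRUE.
query <- list(
  query = list(
    list(code = "Tid",
         selection = list(filter = "top", values = list("12")))
  ),
  response = list(format = "json")  # "px" in the query shown by SCB
)

body_json <- toJSON(query, auto_unbox = TRUE)

# Hypothetical helper: POST the query to the table URL given by SCB
# and return the reply as text.
fetch_scb_table <- function(table_url, body_json) {
  response <- POST(table_url,
                   body = as.character(body_json),
                   content_type_json())
  stop_for_status(response)
  content(response, as = "text", encoding = "UTF-8")
}
```

The returned text can then be passed to fromJSON and inspected with str before deciding how to build the data.frame.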
Note: A dedicated R package, pxweb, exists for querying the SCB database through the API in a slightly more comfortable way. See the vignette for a demonstration.
The last day's hourly temperatures (parameter 1) at Bromma (station 97200) can be fetched from SMHI as follows (replace xml with json if you want to change the format):
temp_response <- GET("https://opendata-download-metobs.smhi.se/api/version/1.0/parameter/1/station/97200/period/latest-day/data.xml")
http_type(temp_response)
## [1] "application/xml"
We extract the XML-content by
library(xml2)
temp_xml <- read_xml(temp_response)
class(temp_xml)
## [1] "xml_document" "xml_node"
The structure can be viewed by opening the URL above in your web browser. We see that the temperatures can be found at the XPath "/metObsSampleData/value/value":
xml_ns_strip(temp_xml) # Advanced (beyond the course scope): strip the default namespace
xml_find_all(temp_xml, "/metObsSampleData/value/value")
## {xml_nodeset (25)}
## [1] <value>3.0</value>
## [2] <value>2.9</value>
## [3] <value>3.0</value>
## [4] <value>3.1</value>
## [5] <value>3.4</value>
## [6] <value>3.5</value>
## [7] <value>3.6</value>
## [8] <value>3.5</value>
## [9] <value>3.4</value>
## [10] <value>3.4</value>
## [11] <value>3.6</value>
## [12] <value>4.0</value>
## [13] <value>4.4</value>
## [14] <value>5.0</value>
## [15] <value>5.3</value>
## [16] <value>5.3</value>
## [17] <value>5.4</value>
## [18] <value>5.0</value>
## [19] <value>4.9</value>
## [20] <value>5.2</value>
## ...
(use xml_text to get the values).
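The last step can be tried without contacting SMHI on a small hand-written document. In the sketch below the namespace URI, the date elements and their timestamps are assumptions; only the nesting follows the XPath above.

```r
library(xml2)

# Hand-written stand-in for the SMHI reply; the namespace URI and the
# <date> elements are assumptions, the nesting follows the XPath above.
toy_xml <- read_xml('<metObsSampleData xmlns="http://example.org/metobs">
  <value><date>2019-11-25T12:00:00Z</date><value>3.0</value></value>
  <value><date>2019-11-25T13:00:00Z</date><value>2.9</value></value>
</metObsSampleData>')

xml_ns_strip(toy_xml)  # drop the default namespace so plain XPath matches
temp_nodes <- xml_find_all(toy_xml, "/metObsSampleData/value/value")

# xml_text() returns character strings, so convert to numbers explicitly
temps <- as.numeric(xml_text(temp_nodes))
temps
```

Without the call to xml_ns_strip, the same XPath would find nothing here, because every element belongs to the default namespace.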
Systembolaget’s API uses XML; the list of stores from HW4 can be fetched by
stores_response <- GET("https://www.systembolaget.se/api/assortment/stores/xml")
http_type(stores_response)
## [1] "application/xml"
We extract XML with
stores_xml <- read_xml(stores_response)
and look at the first with
xml_find_first(stores_xml, "/ButikerOmbud/ButikOmbud")
## {xml_node}
## <ButikOmbud type="StoreAssortmentViewModel">
## [1] <Typ>Butik</Typ>
## [2] <Nr>0102</Nr>
## [3] <Namn>Fältöversten</Namn>
## [4] <Address1>Karlaplan 13</Address1>
## [5] <Address2/>
## [6] <Address3>115 20</Address3>
## [7] <Address4>STOCKHOLM</Address4>
## [8] <Address5>Stockholms län</Address5>
## [9] <Telefon>08/662 22 89</Telefon>
## [10] <ButiksTyp/>
## [11] <Tjanster/>
## [12] <SokOrd>STOCKHOLM;STHLM;ÖSTERMALM;KARLAPLANSRONDELLEN;FÄLTAN</SokOrd>
## [13] <Oppettider>2019-11-26;10:00;19:00;;;0;_*2019-11-27;10:00;19:00;;;0;_*20 ...
## [14] <RT90x>6582011</RT90x>
## [15] <RT90y>1630064</RT90y>
In order to extract the names we may use
xml_find_all(stores_xml, "//Namn")[1:10]
## {xml_nodeset (10)}
## [1] <Namn>Fältöversten</Namn>
## [2] <Namn/>
## [3] <Namn>Garnisonen</Namn>
## [4] <Namn>Norra Djurgårdsstaden</Namn>
## [5] <Namn/>
## [6] <Namn>Sergel</Namn>
## [7] <Namn>PK-Huset</Namn>
## [8] <Namn/>
## [9] <Namn>Marieberg</Namn>
## [10] <Namn/>
Evidently, not all stores have names.
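One possible way to keep store numbers and names aligned, and to mark the empty names as missing, is sketched below on a hand-written document. The element names follow the output above, but the Nr values are made up.

```r
library(xml2)

# Hand-written stand-in for the store list; the Nr values are made up
toy_stores <- read_xml('<ButikerOmbud>
  <ButikOmbud><Nr>0001</Nr><Namn>Garnisonen</Namn></ButikOmbud>
  <ButikOmbud><Nr>0002</Nr><Namn/></ButikOmbud>
  <ButikOmbud><Nr>0003</Nr><Namn>Sergel</Namn></ButikOmbud>
</ButikerOmbud>')

stores <- xml_find_all(toy_stores, "/ButikerOmbud/ButikOmbud")

# xml_find_first() returns exactly one result per store, so the
# columns stay aligned row by row
stores_df <- data.frame(
  nr   = xml_text(xml_find_first(stores, "Nr")),
  namn = xml_text(xml_find_first(stores, "Namn")),
  stringsAsFactors = FALSE
)

# An empty <Namn/> element gives "", which we recode as missing
stores_df$namn[stores_df$namn == ""] <- NA
stores_df
```

Searching relative to each ButikOmbud node, rather than with "//Namn" over the whole document, is a design choice that guarantees one row per store even if an element were absent.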
data.frame
.