The Rpoet package is a wrapper for the PoetryDB API, which lets developers and other users retrieve a large collection of English-language poetry from nearly 130 authors (as of this writing). The package provides a simple R interface for interacting with and accessing the PoetryDB database. This vignette introduces the basic functionality of Rpoet and shows some example usages of the package.
First Steps
If you have not done so already, install Rpoet using the install.packages() function.
install.packages('Rpoet')
The latest development version can also be installed using the devtools function install_github.
devtools::install_github('aschleg/Rpoet')
After Rpoet is installed, load it using the library() function.
library(Rpoet)
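For scripts that are run more than once, the install step can be guarded so that the package is only installed when it is missing. The following is a minimal sketch; install_if_missing is a hypothetical helper written for this vignette, not a function provided by Rpoet or devtools. The installer argument is an assumption added so the behaviour can be checked without contacting CRAN.

```r
# Hypothetical helper (not part of Rpoet): install a package only if it is missing.
# `installer` defaults to install.packages but can be swapped out for testing.
install_if_missing <- function(pkg, installer = utils::install.packages) {
  if (requireNamespace(pkg, quietly = TRUE)) {
    # The package is already available, so nothing needs to be installed.
    return(invisible(FALSE))
  }
  installer(pkg)
  invisible(TRUE)
}

# Usage (commented out because it may download from CRAN):
# install_if_missing('Rpoet')
# library(Rpoet)
```

The function returns TRUE invisibly when it had to call the installer and FALSE otherwise.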
Using Rpoet
The get.poetry function acts as the interface to the PoetryDB API. The only required parameter is input_term, which must be one of the following values or a comma-delimited combination of them:
- 'author'
- 'title'
- 'lines'
- 'linecount'
The search_term parameter should correspond to the given input_term. For example, to find the poems and sonnets of William Shakespeare, we would call get.poetry like so:
get.poetry('author', 'William Shakespeare')
In the case of searching for a particular poem or sonnet:
get.poetry('title', 'Paradise Lost')
For users who know of a certain line in a poem and want the full poem:
get.poetry('lines', 'But thou contracted to thine own bright eyes,')
Limiting Returned Results
In the examples given above, all of the data found by the API is returned. The fields returned can be restricted with the output parameter. Like input_term, output can be any one or a comma-delimited combination of 'author', 'title', 'lines' or 'linecount'. For example, the following returns only the titles and line counts of Shakespeare's poems rather than the full result.
get.poetry('author', 'William Shakespeare', 'title,linecount')
If we only wanted to get the lines of John Milton's Paradise Lost, the function would look similar to the following:
get.poetry('title', 'Paradise Lost', 'lines')
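To make the roles of the three arguments concrete, the sketch below shows how they could map onto a PoetryDB request path of the form /<input field>/<search term>/<output field>. It is an illustration only and is not taken from Rpoet's source: build_poetrydb_url is a hypothetical helper, and the base URL https://poetrydb.org is an assumption. No request is sent; the function only assembles a string.

```r
# Hypothetical helper (not part of Rpoet): assemble a PoetryDB-style request URL.
# The base URL is an assumption made for illustration.
build_poetrydb_url <- function(input_term, search_term = NULL, output = NULL,
                               base_url = 'https://poetrydb.org') {
  # The input field(s) always follow the base URL.
  parts <- c(base_url, input_term)
  if (!is.null(search_term)) {
    # Percent-encode spaces; commas and semicolons are left untouched.
    parts <- c(parts, utils::URLencode(search_term))
  }
  if (!is.null(output)) {
    # Optional output field(s) restrict what is returned.
    parts <- c(parts, output)
  }
  paste(parts, collapse = '/')
}

build_poetrydb_url('author', 'William Shakespeare', 'title,linecount')
# "https://poetrydb.org/author/William%20Shakespeare/title,linecount"
build_poetrydb_url('title', 'Paradise Lost', 'lines')
# "https://poetrydb.org/title/Paradise%20Lost/lines"
```

Leaving out output drops the last path segment, which corresponds to requesting every available field.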
Combination Searches
Multiple input and search terms can be passed in the input_term and search_term parameters. Terms passed in input_term must be delimited by commas, while terms in search_term must be delimited by semicolons. There must be exactly one search term for each input term, given in the same order. For example, suppose we want the full title and the line count of John Milton's poems that have Paradise Lost in the title.
get.poetry('title,author', 'Paradise Lost;Milton', 'title,linecount')
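The pairing rule can be checked before a call is made. The sketch below splits both arguments on their delimiters and confirms that the counts match; pair_terms is a hypothetical helper for illustration and is not part of Rpoet, which may or may not perform a similar check internally.

```r
# Hypothetical helper (not part of Rpoet): pair each input term with its search term.
pair_terms <- function(input_term, search_term) {
  # Input terms are comma-delimited; search terms are semicolon-delimited.
  inputs <- trimws(strsplit(input_term, ',', fixed = TRUE)[[1]])
  searches <- trimws(strsplit(search_term, ';', fixed = TRUE)[[1]])
  if (length(inputs) != length(searches)) {
    stop('Each input term needs exactly one search term.')
  }
  # Named character vector: names are input fields, values are search terms.
  stats::setNames(searches, inputs)
}

pair_terms('title,author', 'Paradise Lost;Milton')
```

For the call above, the result pairs title with Paradise Lost and author with Milton. Passing 'title,author' with only 'Milton' stops with an error instead of producing a mismatched query.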
As another example, suppose we are interested in finding all of William Shakespeare's poems and sonnets that are fourteen lines long (a sonnet is a 14-line poem).
fourteen_lines <- get.poetry('author,linecount', 'William Shakespeare;14', 'title,linecount')
nrow(fourteen_lines)
Other Examples
Getting Available Authors and Titles
A list of the authors and titles in PoetryDB can be retrieved by setting the input_term parameter to 'author' or 'title' while leaving the search_term parameter NULL. The example below counts how many authors and poem titles are currently in PoetryDB.
authors <- get.poetry('author')
titles <- get.poetry('title')
print(c(paste('Number of authors:', length(authors$authors), sep = ' '),
paste('Number of titles:', length(titles$titles), sep = ' ')))
We see there are 129 authors and just under 3,000 titles in PoetryDB at the time of this writing. The average number of poems per author in the database can therefore be calculated:
length(titles$titles) / length(authors$authors)
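As a rough check on the figures quoted in the text, the same division can be done by hand. The sketch below uses 129 authors and takes 3,000 as an upper bound for "just under 3,000" titles, so the result is an upper bound on the average rather than the exact value returned by the live database.

```r
# Counts quoted in the text; 3000 is an assumed upper bound for "just under 3,000".
n_authors <- 129
n_titles_upper <- 3000

# Upper bound on the average number of poems per author, to one decimal place.
round(n_titles_upper / n_authors, 1)
# 23.3
```

The average is therefore a little over 23 poems per author.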