Package: archiveRetriever 0.4.1

Lukas Isermann

archiveRetriever: Retrieve Archived Web Pages from the 'Internet Archive'

Scrape content from archived web pages stored in the 'Internet Archive' (<https://archive.org>) using a systematic workflow. Get an overview of the mementos available for the respective homepage, retrieve the URLs and links of the page, and finally scrape the content. The final output is stored in tibbles, which can then easily be used for further analysis.

Authors: Lukas Isermann [aut, cre], Konstantin Gavras [aut]

# Install 'archiveRetriever' in R:

install.packages('archiveRetriever', repos = c('https://liserman.r-universe.dev', 'https://cloud.r-project.org'))

Bug tracker: https://github.com/liserman/archiveretriever/issues

On CRAN:

4.94 score, 18 stars, 16 scripts, 237 downloads, 4 exports, 45 dependencies

Last updated from: 50bef5c1ed. Checks: 9 OK. Indexed: yes.

Target	Result	Time
linux-devel-x86_64	OK	184
source / vignettes	OK	193
linux-release-x86_64	OK	170
macos-release-arm64	OK	162
macos-oldrel-arm64	OK	137
windows-devel	OK	136
windows-release	OK	139
windows-oldrel	OK	171
wasm-release	OK	122

Exports: archive_overview, retrieve_links, retrieve_urls, scrape_urls

Dependencies: anytime askpass BH cli cpp11 curl dplyr farver generics ggplot2 glue gridExtra gtable httr isoband jsonlite labeling lifecycle lubridate magrittr mime openssl pillar pkgconfig purrr R6 RColorBrewer Rcpp rlang rvest S7 scales selectr stringi stringr sys tibble tidyr tidyselect timechange utf8 vctrs viridisLite withr xml2

Readme and manuals

Help Manual

Help page	Topics
archive_overview: Getting a first glimpse of mementos available in the Internet Archive	archive_overview
retrieve_links: Retrieving Links of Lower-level web pages of mementos from the Internet Archive	retrieve_links
retrieve_urls: Retrieving Urls from the Internet Archive	retrieve_urls
scrape_urls: Scraping Urls from the Internet Archive	scrape_urls
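
The four help topics above map onto the workflow in the package description: overview, URL retrieval, link retrieval, scraping. The following is a minimal sketch of how the steps might be chained, not an official example. The wrapper name `run_archive_workflow`, the homepage, the date range, and the XPath `//title` are illustrative assumptions. Every call inside the function needs network access to the Internet Archive, and the results depend on what is archived at the time.

```r
# Hypothetical wrapper around the four exported functions.
# Nothing runs until the function is called; calling it needs network access.
run_archive_workflow <- function(homepage = "www.nytimes.com",
                                 startDate = "2020-10-01",
                                 endDate = "2020-10-03") {
  if (!requireNamespace("archiveRetriever", quietly = TRUE)) {
    stop("Package 'archiveRetriever' is not installed.")
  }

  # Step 1: overview of the mementos available for the homepage
  overview <- archiveRetriever::archive_overview(
    homepage = homepage, startDate = startDate, endDate = endDate
  )

  # Step 2: URLs of the mementos stored in the Internet Archive
  urls <- archiveRetriever::retrieve_urls(
    homepage = homepage, startDate = startDate, endDate = endDate
  )

  # Step 3: links to lower-level pages found on the first memento
  links <- archiveRetriever::retrieve_links(ArchiveUrls = urls[1])

  # Step 4: scrape content from the first memento.
  # Paths is a named vector of XPath expressions; "//title" is only an
  # assumed selector and would be replaced by site-specific paths.
  content <- archiveRetriever::scrape_urls(
    Urls = urls[1],
    Paths = c(title = "//title")
  )

  list(overview = overview, urls = urls, links = links, content = content)
}
```

Calling `res <- run_archive_workflow()` would return a list with one element per step. In practice the selectors passed to `Paths` have to be adapted to the structure of the archived site.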