Update zip/txt retrieval (#71)

ropensci · Sep 2, 2024 · ee02847
1 parent 2a73b8e
commit ee02847

Showing 2 changed files with 3 additions and 3 deletions.
diff --git a/README.Rmd b/README.Rmd
@@ -119,7 +119,7 @@ Yes! The package respects [these rules](https://www.gutenberg.org/policy/robot_a
 
 * Project Gutenberg allows wget to harvest Project Gutenberg using [this list of links](https://www.gutenberg.org/robot/harvest?filetypes[]=html). The gutenbergr package visits that page once to find the recommended mirror for the user's location.
 * We retrieve the book text directly from that mirror using links in the same format. For example, Frankenstein (book 84) is retrieved from `https://www.gutenberg.lib.md.us/8/84/84.zip`.
-* We retrieve the .zip file rather than txt to minimize bandwidth on the mirror.
+* We give priority to retrieving the `.zip` file to minimize bandwidth on the mirror. `.txt` files are only retrieved if there is no `.zip`.
 
 Still, this package is *not* the right way to download the entire Project Gutenberg corpus (or all from a particular language). For that, follow [their recommendation](https://www.gutenberg.org/policy/robot_access.html) to use wget or set up a mirror. This package is recommended for downloading a single work, or works for a particular author or topic.
 

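The retrieval order described in the updated bullet (try `.zip` first, fall back to `.txt` only when no `.zip` exists) can be sketched as follows. This is a minimal illustration, not gutenbergr's actual implementation: the function `pick_download_url` and the `exists` callback are hypothetical names introduced here, and the example uses Python rather than R only to keep the sketch self-contained.

```python
from typing import Callable, Optional


def pick_download_url(base_url: str, exists: Callable[[str], bool]) -> Optional[str]:
    """Return the preferred download URL for a book, or None if nothing is available.

    base_url is the mirror path without a file extension (an assumption of
    this sketch), e.g. "https://www.gutenberg.lib.md.us/8/84/84".
    exists is a hypothetical callback that reports whether a URL is available.
    """
    # Try the smaller .zip first to minimize bandwidth on the mirror,
    # then fall back to the plain .txt file.
    for extension in (".zip", ".txt"):
        candidate = base_url + extension
        if exists(candidate):
            return candidate
    return None


# Usage with a stand-in availability check: pretend only the .txt file exists.
base = "https://www.gutenberg.lib.md.us/8/84/84"
available = {base + ".txt"}
chosen = pick_download_url(base, lambda url: url in available)
```

Passing the availability check in as a callback keeps the priority logic separate from any network access, so the fallback order can be exercised without contacting a mirror.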
diff --git a/README.md b/README.md
@@ -230,8 +230,8 @@ to the best of our ability. Namely:
 - We retrieve the book text directly from that mirror using links in the
   same format. For example, Frankenstein (book 84) is retrieved from
   `https://www.gutenberg.lib.md.us/8/84/84.zip`.
-- We retrieve the .zip file rather than txt to minimize bandwidth on the
-  mirror.
+- We give priority to retrieving the `.zip` file to minimize bandwidth
+  on the mirror. `.txt` files are only retrieved if there is no `.zip`.
 
 Still, this package is *not* the right way to download the entire
 Project Gutenberg corpus (or all from a particular language). For that,