Add support for local files #262

Merged
merged 48 commits into from
Sep 9, 2021
Changes from 47 commits
701fbc9
Add support for local files
mre Jun 16, 2021
d5bb7ee
Or Patterns (Rust 1.53)
mre Jun 17, 2021
f9bf52e
Add support for base_dir
mre Jun 20, 2021
4fbd337
Add install target and fix build phony
mre Jun 22, 2021
185645a
Update docs
mre Jun 22, 2021
bfa3b1b
Introduce Base type, which can be a path or URL
mre Jun 22, 2021
887f1b9
Split up file checking into file discovery and validation of path exists
mre Jun 30, 2021
d51a49d
Move uri to types
mre Jun 30, 2021
d924c25
Non-existing directories are fine for URI base for files
mre Jun 30, 2021
f5ee472
explicit naming
mre Jul 4, 2021
ee70e13
Check real link to file
mre Jul 4, 2021
daa5be4
Add/change file link tests
mre Jul 4, 2021
a3fd85d
Exclude anchor links
mre Jul 4, 2021
1546d6e
Normalize path; fix tests
mre Jul 4, 2021
afdb721
Fix lints
mre Jul 5, 2021
4f9dc67
fix test
mre Jul 5, 2021
04bf838
lint
mre Jul 5, 2021
b06afb7
fix test
mre Jul 5, 2021
495f856
cleanup
mre Jul 5, 2021
5a2e107
linting
mre Sep 1, 2021
dd3205a
wip
mre Sep 2, 2021
b7c129c
Fix resolving absolute paths
mre Sep 2, 2021
03f5df9
Add fixtures for offline testing
mre Sep 2, 2021
82652a6
Add test
mre Sep 2, 2021
87fd90f
cargo fmt
mre Sep 3, 2021
9163066
Reintegrate master
mre Sep 3, 2021
57af648
fix tests after making base dir mandatory
mre Sep 3, 2021
b3c5d12
Fix clippy lints
mre Sep 3, 2021
f143087
Relative path not needed
mre Sep 3, 2021
f472820
String allocation not needed
mre Sep 3, 2021
00ddb6d
Filter out directories with suffixes that look like extensions
mre Sep 5, 2021
b2ce613
Fix build errors; cleanup code
mre Sep 6, 2021
5d0b952
Remove anchor from file links
mre Sep 6, 2021
4827ecf
Fix clippy warnings
mre Sep 6, 2021
a28f932
Fix wildcard test
mre Sep 6, 2021
8353ab1
Update docs
mre Sep 6, 2021
0c5dcf3
whoops
mre Sep 6, 2021
67268ed
Clean up params and fragment handling
mre Sep 7, 2021
ffab034
Revert refactor for removing params and fragments
mre Sep 7, 2021
f3fe46a
Merge branch 'master' of github.com:lycheeverse/lychee into local-files
mre Sep 7, 2021
24ea248
Update docs
mre Sep 7, 2021
a75cae5
Add failing test
mre Sep 8, 2021
93948d7
Avoid double-encoding already encoded destination paths
mre Sep 8, 2021
a1acf7b
Reintegrate master
mre Sep 8, 2021
a41e81c
Merge branch 'master' into local-files
mre Sep 8, 2021
2a4170e
Add test for `+` encoding
mre Sep 9, 2021
d743657
formatting
mre Sep 9, 2021
de55fbd
Add TODO for fixing URL encoding for paths
mre Sep 9, 2021
6 changes: 3 additions & 3 deletions .github/workflows/release.yml
@@ -43,17 +43,17 @@ jobs:
  fail-fast: false
  steps:
  - name: Install musl tools
- if: contains(matrix.target, 'musl')
+ if: ${{ contains(matrix.target, 'musl') }}
  run: sudo apt-get install -y musl-tools

  - name: Install arm tools
- if: contains(matrix.target, 'arm')
+ if: ${{ contains(matrix.target, 'arm') }}
  run: |
  echo "GNU_PREFIX=arm-linux-gnueabihf-" >> $GITHUB_ENV
  sudo apt-get install -y binutils-arm-linux-gnueabihf

  - name: Install aarch64 tools
- if: contains(matrix.target, 'aarch64')
+ if: ${{ contains(matrix.target, 'aarch64') }}
  run: |
  echo "GNU_PREFIX=aarch64-linux-gnu-" >> $GITHUB_ENV
  sudo apt-get install -y binutils-aarch64-linux-gnu
2 changes: 1 addition & 1 deletion .github/workflows/rust.yml
@@ -56,7 +56,7 @@ jobs:
  - run: cargo-publish-all --dry-run

  publish:
- if: startsWith(github.ref, 'refs/tags/')
+ if: ${{ startsWith(github.ref, 'refs/tags/') }}
  needs:
  - test
  - lint
9 changes: 9 additions & 0 deletions Cargo.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

6 changes: 5 additions & 1 deletion Makefile
@@ -18,10 +18,14 @@ docker-run: ## Run Docker image
  docker-push: ## Push image to Docker Hub
  docker push $(IMAGE_NAME)

- .PHONY: build-local
+ .PHONY: build
  build: ## Build Rust code locally
  cargo build

+ .PHONY: install
+ install: ## Install project locally
+ cargo install --path lychee-bin
+
  .PHONY: run
  run: ## Run Rust code locally
  cargo run
11 changes: 9 additions & 2 deletions README.md
@@ -161,11 +161,15 @@ lychee ~/projects/*/README.md

# check links in local files (lychee supports advanced globbing and ~ expansion):
lychee "~/projects/big_project/**/README.*"

# ignore case when globbing and check result for each link:
lychee --glob-ignore-case --verbose "~/projects/**/[r]eadme.*"

# check links from epub file (requires atool: https://www.nongnu.org/atool)
acat -F zip {file.epub} "*.xhtml" "*.html" | lychee -

# check links in directory; block network requests
lychee --offline path/to/directory
```

### GitHub token
@@ -202,14 +206,16 @@ FLAGS:
  -i, --insecure Proceed for server connections considered insecure (invalid TLS)
  -n, --no-progress Do not show progress bar.
  This is recommended for non-interactive shells (e.g. for continuous integration)
+ --offline Only check local files and block network requests
  --require-https When HTTPS is available, treat HTTP links as errors
  --skip-missing Skip missing input files (default is to error if they don't exist)
  -V, --version Prints version information
  -v, --verbose Verbose program output

  OPTIONS:
  -a, --accept <accept> Comma-separated list of accepted status codes for valid links
- -b, --base-url <base-url> Base URL to check relative URLs
+ -b, --base <base> Base URL or website root directory to check relative URLs e.g.
+ https://example.org or `/path/to/public`
  --basic-auth <basic-auth> Basic authentication support. E.g. `username:password`
  -c, --config <config-file> Configuration file to use [default: ./lychee.toml]
  --exclude <exclude>... Exclude URLs from checking (supports regex)
@@ -310,7 +316,8 @@ Try one of these links to get started:
  - [good first issues](https://github.com/lycheeverse/lychee/issues?q=is%3Aissue+is%3Aopen+label%3A%22good+first+issue%22)
  - [help wanted](https://github.com/lycheeverse/lychee/issues?q=is%3Aissue+is%3Aopen+label%3A%22help+wanted%22)

- Lychee is written in Rust. Install [rust-up](https://rustup.rs/) to get started. Begin my making sure the following commands succeed without errors.
+ Lychee is written in Rust. Install [rust-up](https://rustup.rs/) to get started.
+ Begin my making sure the following commands succeed without errors.

  ```bash
  cargo test # runs tests
4 changes: 2 additions & 2 deletions examples/collect_links/collect_links.rs
@@ -14,12 +14,12 @@ async fn main() -> Result<()> {
  ];

  let links = Collector::new(
- None, // base_url
+ None, // base
  false, // don't skip missing inputs
  10, // max concurrency
  )
  .collect_links(
- inputs, // base_url
+ inputs, // base url or directory
  )
  .await?;

4 changes: 2 additions & 2 deletions fixtures/TEST.md
@@ -1,5 +1,5 @@
- This link should be ignored as it is not a fully qualified URL.
- ![Logo](awesome.png)
+ Check file link
+ ![Logo](../assets/banner.svg)

  ![Anchors should be ignored](#awesome)

2 changes: 1 addition & 1 deletion fixtures/TEST_SCHEMES.txt
@@ -1,3 +1,3 @@
  slack://channel?id=123
- file://foo/bar
+ file:///test_folder/test_file
  https://example.org
Empty file.
21 changes: 21 additions & 0 deletions fixtures/offline/about/index.html
@@ -0,0 +1,21 @@
<html>
<head>
<title>About</title>
</head>
<body>
<h1>About</h1>
<p>
<ul>
<li>
<a href="https://example.org">example</a>
</li>
<li>
<a href="/">home</a>
</li>
<li>
<a href="/post1">Post 1</a>
</li>
</ul>
</p>
</body>
</html>
Empty file.
21 changes: 21 additions & 0 deletions fixtures/offline/blog/post1/index.html
@@ -0,0 +1,21 @@
<html>
<head>
<title>Post 2</title>
</head>
<body>
<h1>Post 2 Title</h1>
<p>
<ul>
<li>
<a href="/">home</a>
</li>
<li>
<a href="/post1">Post 1</a>
</li>
<li>
<a href="../about">Relative</a>
</li>
</ul>
</p>
</body>
</html>
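Reviewer sketch, not lychee's implementation: the fixture above mixes root-relative links (`/`, `/post1`) with a relative one (`../about`). Assuming a site root directory such as the one passed via `--base`, links like these could be mapped to candidate paths on disk roughly as follows. The function name `resolve_local` and its parameters are hypothetical, fragments are simply dropped, and `..` segments are not normalized.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper: map a link found in `source` to a path on disk.
/// `root` plays the role of the website root directory.
fn resolve_local(root: &Path, source: &Path, link: &str) -> Option<PathBuf> {
    // Drop the fragment, e.g. "/about#fragment" becomes "/about".
    let target = link.split('#').next().unwrap_or("");
    if target.is_empty() {
        // Pure anchor links such as "#awesome" do not point to another file.
        return None;
    }
    if let Some(rest) = target.strip_prefix('/') {
        // Root-relative link: resolve against the site root.
        Some(root.join(rest))
    } else {
        // Relative link: resolve against the directory of the source file.
        source.parent().map(|dir| dir.join(target))
    }
}
```

With root `/site` and source `/site/blog/post1/index.html`, `/about#fragment` maps to `/site/about`, while `../about` maps to `/site/blog/post1/../about`, which a later step would still have to normalize before checking that the path exists.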
18 changes: 18 additions & 0 deletions fixtures/offline/blog/post2/index.html
@@ -0,0 +1,18 @@
<html>
<head>
<title>Post 1</title>
</head>
<body>
<h1>Post 1 Title</h1>
<p>
<ul>
<li>
<a href="/">home</a>
</li>
<li>
<a href="/post2">Post 2</a>
</li>
</ul>
</p>
</body>
</html>
27 changes: 27 additions & 0 deletions fixtures/offline/index.html
@@ -0,0 +1,27 @@
<html>
<head>
<title>Index</title>
</head>
<body>
<h1>Index Title</h1>
<p>
<ul>
<li>
<a href="/">home</a>
</li>
<li>
<a href="/about">About</a>
</li>
<li>
<a href="/about#fragment">About</a>
</li>
<li>
<a href="/another page">About</a>
</li>
<li>
<a href="/another%20page">About</a>
</li>
</ul>
</p>
</body>
</html>
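Reviewer sketch, not the code from this PR: the fixture above links to both `/another page` and `/another%20page`, which relates to the commit "Avoid double-encoding already encoded destination paths". One way to encode a path only once, using just the standard library, is to leave existing `%XX` escapes untouched. The function name `encode_path_once` and the set of characters kept as-is are assumptions made for illustration.

```rust
/// Hypothetical helper: percent-encode a path, but keep valid `%XX`
/// escapes as they are so that encoded input is not encoded twice.
fn encode_path_once(path: &str) -> String {
    let bytes = path.as_bytes();
    let mut out = String::with_capacity(bytes.len());
    let mut i = 0;
    while i < bytes.len() {
        let b = bytes[i];
        // A '%' followed by two hex digits is treated as an existing escape.
        let is_escape = b == b'%'
            && i + 2 < bytes.len()
            && bytes[i + 1].is_ascii_hexdigit()
            && bytes[i + 2].is_ascii_hexdigit();
        if is_escape {
            out.push_str(&path[i..i + 3]);
            i += 3;
        } else if b.is_ascii_alphanumeric() || b"-._~/".contains(&b) {
            // Unreserved characters and the path separator are kept.
            out.push(b as char);
            i += 1;
        } else {
            // Everything else is encoded byte by byte.
            out.push_str(&format!("%{:02X}", b));
            i += 1;
        }
    }
    out
}
```

Under these assumptions both `/another page` and `/another%20page` come out as `/another%20page`, so the two links in the fixture would resolve to the same target.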
16 changes: 10 additions & 6 deletions lychee-bin/src/main.rs
@@ -70,10 +70,7 @@ use anyhow::{anyhow, Context, Result};
  use headers::{authorization::Basic, Authorization, HeaderMap, HeaderMapExt, HeaderName};
  use http::StatusCode;
  use indicatif::{ProgressBar, ProgressStyle};
- use lychee_lib::{
- collector::{Collector, Input},
- ClientBuilder, ClientPool, Response,
- };
+ use lychee_lib::{ClientBuilder, ClientPool, Collector, Input, Response};
  use openssl_sys as _; // required for vendored-openssl feature
  use regex::RegexSet;
  use ring as _; // required for apple silicon
@@ -178,6 +175,13 @@ async fn run(cfg: &Config, inputs: Vec<Input>) -> Result<i32> {
  let include = RegexSet::new(&cfg.include)?;
  let exclude = RegexSet::new(&cfg.exclude)?;

+ // Offline mode overrides the scheme
+ let schemes = if cfg.offline {
+ vec!["file".to_string()]
+ } else {
+ cfg.scheme.clone()
+ };
+
  let client = ClientBuilder::builder()
  .includes(include)
  .excludes(exclude)
@@ -193,14 +197,14 @@ async fn run(cfg: &Config, inputs: Vec<Input>) -> Result<i32> {
  .method(method)
  .timeout(timeout)
  .github_token(cfg.github_token.clone())
- .schemes(HashSet::from_iter(cfg.scheme.clone()))
+ .schemes(HashSet::from_iter(schemes))
  .accepted(accepted)
  .require_https(cfg.require_https)
  .build()
  .client()
  .map_err(|e| anyhow!(e))?;

- let links = Collector::new(cfg.base_url.clone(), cfg.skip_missing, max_concurrency)
+ let links = Collector::new(cfg.base.clone(), cfg.skip_missing, max_concurrency)
  .collect_links(&inputs)
  .await
  .map_err(|e| anyhow!(e))?;
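Reviewer sketch: the offline handling added to `run` can be read in isolation as a small pure function. The helper name `effective_schemes` is hypothetical and does not exist in lychee; it only restates the `if cfg.offline { ... } else { ... }` branch from the diff, followed by the conversion to a `HashSet`.

```rust
use std::collections::HashSet;

/// Hypothetical helper restating the offline branch from `run`:
/// in offline mode only the `file` scheme is kept, otherwise the
/// configured schemes are passed through unchanged.
fn effective_schemes(offline: bool, configured: &[String]) -> HashSet<String> {
    if offline {
        // Offline mode overrides whatever schemes were configured.
        std::iter::once("file".to_string()).collect()
    } else {
        configured.iter().cloned().collect()
    }
}
```

Keeping the decision in one place like this would make the override easy to unit-test without building a client.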
23 changes: 16 additions & 7 deletions lychee-bin/src/options.rs
@@ -1,9 +1,8 @@
- use std::{fs, io::ErrorKind, path::PathBuf, str::FromStr};
+ use std::{convert::TryFrom, fs, io::ErrorKind, path::PathBuf, str::FromStr};

  use anyhow::{anyhow, Error, Result};
  use lazy_static::lazy_static;
- use lychee_lib::collector::Input;
- use reqwest::Url;
+ use lychee_lib::{Base, Input};
  use serde::Deserialize;
  use structopt::{clap::crate_version, StructOpt};

@@ -76,6 +75,10 @@ macro_rules! fold_in {
  };
  }

+ fn parse_base(src: &str) -> Result<Base, lychee_lib::ErrorKind> {
+ Base::try_from(src)
+ }
+
  #[derive(Debug, StructOpt)]
  #[structopt(
  name = "lychee",
@@ -161,6 +164,11 @@ pub(crate) struct Config {
  #[serde(default)]
  pub(crate) scheme: Vec<String>,

+ /// Only check local files and block network requests.
+ #[structopt(long)]
+ #[serde(default)]
+ pub(crate) offline: bool,
+
  /// URLs to check (supports regex). Has preference over all excludes.
  #[structopt(long)]
  #[serde(default)]
@@ -223,10 +231,11 @@ pub(crate) struct Config {
  #[serde(default = "method")]
  pub(crate) method: String,

- /// Base URL to check relative URLs
- #[structopt(short, long, parse(try_from_str))]
+ /// Base URL or website root directory to check relative URLs
+ /// e.g. https://example.org or `/path/to/public`
+ #[structopt(short, long, parse(try_from_str = parse_base))]
  #[serde(default)]
- pub(crate) base_url: Option<Url>,
+ pub(crate) base: Option<Base>,

  /// Basic authentication support. E.g. `username:password`
  #[structopt(long)]
@@ -311,7 +320,7 @@ impl Config {
  accept: None;
  timeout: TIMEOUT;
  method: METHOD;
- base_url: None;
+ base: None;
  basic_auth: None;
  github_token: None;
  skip_missing: false;
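Reviewer sketch: `parse_base` delegates to `Base::try_from`, and the commit list describes `Base` as a type "which can be a path or URL". The definition of `Base` is not part of the diff shown here, so the following is only a std-only stand-in for the idea. The variant names, the `String` payload for URLs, the `String` error type, and the prefix check are all assumptions; the real type lives in `lychee_lib` and may differ.

```rust
use std::convert::TryFrom;
use std::path::PathBuf;

/// Hypothetical stand-in for a base that is either a directory or a URL.
#[derive(Debug, PartialEq)]
enum Base {
    /// Website root directory on disk, e.g. `/path/to/public`.
    Local(PathBuf),
    /// Remote base, e.g. `https://example.org` (kept as a plain string here).
    Remote(String),
}

impl<'a> TryFrom<&'a str> for Base {
    type Error = String;

    fn try_from(src: &'a str) -> Result<Self, Self::Error> {
        if src.is_empty() {
            // Assumption: an empty base is rejected in this sketch.
            return Err("base must not be empty".to_string());
        }
        // Simplification: only http(s) prefixes count as URLs here.
        if src.starts_with("http://") || src.starts_with("https://") {
            Ok(Base::Remote(src.to_string()))
        } else {
            Ok(Base::Local(PathBuf::from(src)))
        }
    }
}
```

A function with the shape of `parse_base` can then be handed to `parse(try_from_str = ...)`, so that `--base https://example.org` and `--base /path/to/public` both parse into the same option field.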