refactor: logic for listing folder content
Before, we used a `HEAD` request and checked `Link` headers for folder
content. However, there are limits on how big headers can be and on how
many `Link` headers can be returned. This is a problem for folders with
many files.

Therefore, we refactor the code to use a `GET` request that lists a
folder by returning an HTML index page using a `<table>` with some
conventions. Not only does this let each remote backend run standalone,
listing contents as HTML, it also lets us parse the response for
structured data listing.

All remote backends are refactored to the new convention.

Refactored `lsjson` to parse the HTML response from a remote backend and
list items from it. Also let `lsjson` return each item as a chunk.
Similarly, `lsf`, which uses `lsjson`, returns each item as a chunk.

Added an `http` backend that uses logic similar to `lsjson` to handle
various HTML index pages.

Added a `serve` command to serve a remote from config. Only `http` is
supported for now.
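For illustration, the structured listing could be recovered from such an HTML index along these lines. This is a hypothetical sketch, not the actual `lsjson` code; it assumes rows carry `<a href>`, `<data value>`, and `<time datetime>` elements per the convention:

```typescript
// Hypothetical item shape; the real `lsjson` output may differ.
interface Item {
  name: string;
  isDirectory: boolean;
  size?: number;
  modTime?: string;
}

// Sketch of parsing an HTML index page into structured items.
function parseIndex(html: string): Item[] {
  const items: Item[] = [];
  // Each <tr> is expected to hold one file or folder.
  for (const row of html.match(/<tr>[\s\S]*?<\/tr>/g) ?? []) {
    const href = row.match(/href="([^"]+)"/)?.[1];
    if (!href) continue; // header rows have no link
    const size = row.match(/<data[^>]*value="(\d+)"/)?.[1];
    const modTime = row.match(/<time[^>]*datetime="([^"]+)"/)?.[1];
    items.push({
      name: decodeURIComponent(href),
      isDirectory: href.endsWith("/"), // trailing slash marks a folder
      size: size ? Number(size) : undefined,
      modTime,
    });
  }
  return items;
}
```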
sntran committed Jun 25, 2024
1 parent 050041e commit 2c40eb0
Showing 33 changed files with 1,765 additions and 659 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -2,3 +2,4 @@
node_modules
package.json
package-lock.json
.DS_Store
33 changes: 19 additions & 14 deletions CONTRIBUTING.md
@@ -12,7 +12,7 @@ fs.
- `deno init backend/remote`
- Edit "backend/remote/main.ts" and "backend/remote/main_test.ts" for the new
backend.
- Implements a `fetch` export function that handles "HEAD", "GET", "PUT" and
- Implements a `fetch` default export function that handles "GET", "PUT" and
"DELETE".
- Uses `backend/local/main.ts` as reference, or the boilerplate below:

@@ -55,20 +55,25 @@ function router(request: Request) {
status,
headers,
});
}
}

const exports = {
export default {
fetch: router,
};

export {
// For Cloudflare Workers.
exports as default,
router as fetch,
};

// Learn more at https://deno.land/manual/examples/module_metadata#concepts
if (import.meta.main) {
Deno.serve(router);
}
```
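Put together, a minimal standalone backend following this convention might look like the sketch below. This is illustrative only, not part of the actual commit; the listing content and routes are placeholders:

```typescript
// Hypothetical minimal backend: the module's default export exposes a
// `fetch` method, so the same module works on Cloudflare Workers and can
// be served standalone.
function router(request: Request): Response {
  const { pathname } = new URL(request.url);
  if (request.method === "GET" && pathname.endsWith("/")) {
    // Folder listing rendered as an HTML <table>, one row per item.
    const html = `<table>
      <tr><td><a href="file.txt" type="text/plain">file.txt</a></td>
          <td><data value="42">42 B</data></td></tr>
    </table>`;
    return new Response(html, { headers: { "Content-Type": "text/html" } });
  }
  // A real backend would also handle GET for files, PUT and DELETE.
  return new Response(null, { status: 404 });
}

export default { fetch: router };

// Runnable standalone, e.g. with `Deno.serve(router)`.
```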

## Architecture

- `GET /folder/`: displays HTML page with folder content.
- `GET /file`: fetches file.
- `PUT /folder/`: creates folder.
- `PUT /file`: uploads file.
- `DELETE /folder/`: deletes folder.
- `DELETE /file`: deletes file.

For displaying folder content, any HTML can be used, but the listing itself
should be in a `<table>`, with one row for each file or folder inside. Each
item should have an `<a>` whose `href` attribute points to the file or
folder, and optionally a `type` attribute giving the item's MIME type. A
`<data>` element should be used for the file size, and `<time>` for the file
modification time.
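Concretely, an index page following this convention might look like the fragment below (illustrative markup only, not a required template):

```html
<table>
  <tr>
    <td><a href="docs/">docs/</a></td>
    <td><data value="0"></data></td>
    <td><time datetime="2024-06-25T00:00:00Z">2024-06-25</time></td>
  </tr>
  <tr>
    <td><a href="README.md" type="text/markdown">README.md</a></td>
    <td><data value="1024">1 KiB</data></td>
    <td><time datetime="2024-06-24T12:00:00Z">2024-06-24</time></td>
  </tr>
</table>
```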
8 changes: 1 addition & 7 deletions backend/alias/main.ts
@@ -34,12 +34,6 @@ function router(request: Request): Promise<Response> {
return fetch(join(remote, pathname), request);
}

const exports = {
export default {
fetch: router,
};

export {
// For Cloudflare Workers.
exports as default,
router as fetch,
};
25 changes: 7 additions & 18 deletions backend/alias/main_test.ts
@@ -1,38 +1,27 @@
import { join } from "../../deps.ts";
import { assert } from "../../dev_deps.ts";

import { fetch } from "./main.ts";
import backend from "./main.ts";

Deno.test("local path", async () => {
const requestInit = {
method: "HEAD",
method: "GET",
};

const files: string[] = [];

const cwd = Deno.cwd();
const url = new URL(`/backend?remote=${cwd}`, import.meta.url);
const url = new URL(`/backend/?remote=.`, import.meta.url);

const request = new Request(url, requestInit);
const { headers, body } = await fetch(request);

assert(!body, "should not have body");
const response = await backend.fetch(request);
const html = await response.text();

const links = headers.get("Link")?.split(",");
assert(Array.isArray(links), "should have Link headers");

let index = 0;
for await (let { name, isDirectory } of Deno.readDir(join(cwd, "backend"))) {
if (isDirectory) {
name += "/";
}
const link = links![index++];
assert(
link.includes(`<${encodeURIComponent(name)}>`),
`should have ${name} enclosed between < and > and percent encoded`,
html.includes(` href="${name}`),
`should have link to ${name}`,
);

const [_, uri] = link.match(/<(.*)>/) || [];
files.push(decodeURIComponent(uri));
}
});
21 changes: 7 additions & 14 deletions backend/chunker/main.ts
@@ -1,3 +1,5 @@
#!/usr/bin/env -S deno serve --allow-all

/**
* Chunker
*
@@ -99,7 +101,7 @@
* to missing chunk errors (especially missing last chunk) than format with
* metadata enabled.
*/
import { crypto, toHashString } from "../../deps.ts";
import { crypto, encodeHex } from "../../deps.ts";
import { Chunker } from "../../lib/streams/chunker.ts";
import { fetch } from "../../main.ts";

@@ -116,6 +118,7 @@ interface Metadata {

export const options = {
string: [
"hash_type",
"name_format",
"meta_format",
/**
@@ -129,6 +132,7 @@ export const options = {
boolean: ["fail_hard"],
default: {
chunk_size: 2 * 1024 * 1024 * 1024, // 2 GiB
hash_type: "md5", // | "sha1" | "md5all" | "sha1all" | "md5quick" | "sha1quick" | "none"
meta_format: "simplejson", // | "none"
name_format: "*.rclone_chunk.###",
start_from: 1,
@@ -289,7 +293,7 @@ async function router(request: Request): Promise<Response> {
ver: METADATA_VERSION,
size: fileSize,
nchunks: chunkIndex - startFrom,
md5: toHashString(await digestStream),
md5: encodeHex(await digestStream),
};
// Adds metadata.
await fetch(`${remote}/${fileName}`, {
@@ -362,17 +366,6 @@ async function upload(url: string | URL, { headers, body }: RequestInit) {
});
}

const exports = {
export default {
fetch: router,
};

export {
// For Cloudflare Workers.
exports as default,
router as fetch,
};

// Learn more at https://deno.land/manual/examples/module_metadata#concepts
if (import.meta.main) {
Deno.serve(router);
}
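As a side note on the `name_format` default above (`*.rclone_chunk.###`), chunk names could be derived roughly as follows. This is a hypothetical sketch, not the actual chunker implementation; it assumes `*` stands for the file name and a run of `#` for the zero-padded chunk number:

```typescript
// Sketch: expand a chunker name format into a concrete chunk name.
// `*` is replaced by the file name; `#`s by the zero-padded index.
function chunkName(
  fileName: string,
  index: number,
  format = "*.rclone_chunk.###",
): string {
  return format
    .replace("*", fileName)
    .replace(/#+/, (hashes) => String(index).padStart(hashes.length, "0"));
}

// chunkName("10M.bin", 1) → "10M.bin.rclone_chunk.001"
```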
9 changes: 3 additions & 6 deletions backend/chunker/main_test.ts
@@ -1,8 +1,5 @@
import { crypto, toHashString, mkBuffer } from "../../deps.ts";
import {
assert,
assertEquals,
} from "../../dev_deps.ts";
import { crypto, mkBuffer, encodeHex } from "../../deps.ts";
import { assert, assertEquals } from "../../dev_deps.ts";
import { fetch as fetchMemory } from "../memory/main.ts";
import { fetch } from "./main.ts";

@@ -13,7 +10,7 @@ const buffer = mkBuffer(1024 * 1024 * 10); // 10M
const file = new File([buffer], "10M.bin", {
type: "application/octet-stream",
});
const MD5 = toHashString(await crypto.subtle.digest("MD5", buffer));
const MD5 = encodeHex(await crypto.subtle.digest("MD5", buffer));

let chunkSize = 1024 * 1024 * 4; // 4M

1 change: 1 addition & 0 deletions backend/crypt/PathCipher.ts
@@ -27,6 +27,7 @@ export default function PathCipher(
const nameCipher = new AES(nameKey);

function encryptName(name: string) {
if (name === "") return "";
const ciphertext = encoder.encode(name);
const paddedCipherText = pad(ciphertext, Padding.PKCS7, 16);
const rawCipherText = Encrypt(nameCipher, nameTweak!, paddedCipherText);
93 changes: 93 additions & 0 deletions backend/crypt/README.md
@@ -0,0 +1,93 @@
# Crypt

`crypt` remotes encrypt and decrypt other remotes.

A remote of type crypt does not access a storage system directly, but instead
wraps another remote, which in turn accesses the storage system. This is similar
to how alias, union, chunker and a few others work. It makes the usage very
flexible, as you can add a layer, in this case an encryption layer, on top of
any other backend, even in multiple layers. Rclone's functionality can be used
as with any other remote, for example you can mount a crypt remote.

Accessing a storage system through a crypt remote realizes client-side
encryption, which makes it safe to keep your data in a location you do not
fully trust. When working against the crypt remote, rclone automatically
encrypts (before uploading) and decrypts (after downloading) on your local
system as needed on the fly, leaving the data encrypted at rest in the
wrapped remote. If you access the storage system using an application other
than rclone, or access the wrapped remote directly using rclone, there will
not be any encryption/decryption: downloading existing content will just give
you the encrypted (scrambled) form, and anything you upload will not become
encrypted.

The encryption is a secret-key (also called symmetric-key) encryption
algorithm, where a password (or pass phrase) is used to generate the real
encryption key. The password can be supplied by the user, or you may choose
to let rclone generate one. It will be stored in the configuration file, in
a lightly obscured form. If you are in an environment where you cannot keep
your configuration secured, you should add configuration encryption as
protection. As long as you have this configuration file, you will be able to
decrypt your data. Without the configuration file, as long as you remember
the password (or keep it in a safe place), you can re-create the
configuration and regain access to the existing data. You may also configure
a corresponding remote in a different installation to access the same data.
See below for guidance on changing the password.

Encryption uses a cryptographic salt to permute the encryption key so that
the same string may be encrypted in different ways. When configuring the
crypt remote, you may enter a salt, or let rclone generate a unique one. If
omitted, rclone uses a built-in unique string. Normally in cryptography, the
salt is stored together with the encrypted content and does not have to be
memorized by the user. This is not the case in rclone, because rclone does
not store any additional information on the remotes. Use of a custom salt is
effectively a second password that must be memorized.

File content encryption is performed using NaCl SecretBox, based on the
XSalsa20 cipher and Poly1305 for integrity. Names (file and directory names)
are also encrypted by default, but this has some implications and can
therefore be turned off.

### Standard options

Here are the Standard options specific to crypt (Encrypt/Decrypt a remote).

#### --crypt-remote

Remote to encrypt/decrypt.

Normally should contain a ':' and a path, e.g. "myremote:path/to/dir",
"myremote:bucket" or maybe "myremote:" (not recommended).

Properties:

- Config: remote
- Env Var: RCLONE_CRYPT_REMOTE
- Type: string
- Required: true

#### --crypt-password

Password or pass phrase for encryption.

**NB** Input to this must be obscured - see [obscure](/cmd/obscure/).

Properties:

- Config: password
- Env Var: RCLONE_CRYPT_PASSWORD
- Type: string
- Required: true

#### --crypt-password2

Password or pass phrase for salt.

Optional but recommended. Should be different from the previous password.

**NB** Input to this must be obscured - see [obscure](/cmd/obscure/).

Properties:

- Config: password2
- Env Var: RCLONE_CRYPT_PASSWORD2
- Type: string
- Required: false
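Using these options, a crypt remote's configuration might look like the fragment below (section name and values are illustrative; per the notes above, passwords must be stored in obscured form):

```ini
[secret]
type = crypt
remote = myremote:path
password = <obscured password>
password2 = <obscured salt password>
```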