Merge pull request #47 from dreammis/gt7-7d25a21d-be49-4630-bc72-4df3…
…2162a5d0 🌐 Add LLM Translations
Showing 1 changed file with 320 additions and 0 deletions.
content/posts/how-to/how-to-install-paperless-ngx-on-your-nas/index.en.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,320 @@ | ||
--- | ||
title: "A Must-Have for NAS Experts: A Comprehensive Guide to Easy Document Management with Paperless-ngx" | ||
date: 2023-12-13T12:59:45+08:00 | ||
categories: | ||
- NAS Tutorials | ||
draft: false | ||
# url : /posts/xxx # Specify URL | ||
toc: true | ||
description: Explore Paperless-ngx, the new intelligent document management solution for middle-aged men. | ||
--- | ||
|
||
## 1. Introduction | ||
|
||
After all the tinkering, I realized that `90% of NAS applications are useless`. | ||
|
||
![](9058fdc48d0956b6f4f84b29e4a4a940.jpg) | ||
|
||
Only `2 or 3` of them are actually useful.
|
||
Most of the time, after following various guides and tutorials to set them up, we just `leave them there untouched`. | ||
|
||
In the 8 years since my first NAS, the Star Snail, I have researched countless self-hosted applications.
|
||
The protagonist I am introducing today belongs to the remaining 10% that are actually useful. | ||
|
||
To be more precise, it ranks in the top 1% of that useful 10%.
|
||
It helps me not only in daily life and at work, but above all with `document management and file search`.
|
||
Thanks to it, I have saved at least `500 hours`. | ||
|
||
--- | ||
|
||
Before formally introducing it, I want to talk about another topic: why I have always been `unable to leave the Apple ecosystem`. | ||
|
||
The Apple ecosystem gives me `convenient systems`, `security`, and `smooth integration of all devices`, but one of the biggest reasons I stay is Apple's powerful `photo-OCR function`.
|
||
For example, suppose I need a `chat screenshot` from a conversation I had with a seller a while ago, to use as evidence. Instead of going through my images one by one, I can simply search for the keyword `screwdriver`, and Apple Photos takes me straight to the image that contains the word `screwdriver`.
|
||
![image-20231213105413936](image-20231213105413936.png) | ||
|
||
![image-20231213105424615](image-20231213105424615.png) | ||
|
||
If you are not familiar with this feature yet, don't rush to try it yourself. | ||
|
||
The toy I am introducing today offers:
|
||
- The same effect as the Apple feature. | ||
- Hosted on `your NAS`. | ||
- Complete control over your data. | ||
|
||
![image-20231213110140424](image-20231213134631394.png) | ||
|
||
- It also supports `online preview`: | ||
|
||
![image-20231213131049730](image-20231213131049730.png) | ||
|
||
- It supports all `digital documents`: Not only images, but also PDFs, Word documents, Excel spreadsheets, and even Markdown files. It truly achieves document digitization, unified management, and efficient search. | ||
|
||
![image-20231213110911380](image-20231213110911380.png) | ||
|
||
This is the new toy I am bringing to you today, Paperless-ngx. As the name suggests, it is all about going paperless. | ||
|
||
It can help you organize your contracts, `physical documents`, bills, and more, while also managing digital documents (Word, Excel, PDF, etc.). | ||
|
||
![paperless-ngx-banner](paperless-ngx-banner.png) | ||
|
||
--- | ||
|
||
## Introduction to Paperless-ngx | ||
|
||
Paperless-ngx is not just a document management system. It is a complete solution that converts your physical files into searchable online archives, reducing the use of paper. Its core features include: | ||
|
||
- **Document organization and indexing**: Organize scanned documents using tags, correspondents, types, and more. | ||
- **OCR text recognition**: Perform optical character recognition on documents to enable text search and selection, even for documents that consist only of scanned images.
- **Multi-language support**: Utilize the open-source Tesseract engine to support over 100 languages. | ||
- **Long-term storage format**: Save documents in PDF/A format, designed for long-term storage. | ||
- **Intelligent tagging and classification**: Automatically add tags, correspondents, and document types using machine learning. | ||
- **Wide range of file support**: Support for PDF documents, images, plain text files, Office documents, and more. | ||
- **Customizable file management**: Paperless-ngx manages file names and folders, supporting different configurations. | ||
- **Modern web application**: Customizable dashboard, filters, batch editing, drag and drop upload, custom views, shared links, and more. | ||
- **Full-text search**: Auto-complete, relevance ranking, and highlighting of matched query parts. | ||
- **Email handling**: Import documents from email accounts and configure multiple accounts and rules. | ||
- **Multi-user permission system**: Built-in robust multi-user permission system. | ||
- **Multi-core system optimization**: Parallel processing of multiple documents. | ||
|
||
--- | ||
|
||
Setup Steps: | ||
|
||
## 1. Key Points | ||
|
||
`Follow for free` to stay on track. | ||
|
||
## 2. Docker Management GUI Tools | ||
|
||
#### Synology DSM 7.2 or above can directly use *Container Manager* | ||
|
||
![container-manager-1](images/container-manager-1.png) | ||
|
||
#### QNAP ContainerStation | ||
|
||
![container-station-1](images/container-station-1.png) | ||
|
||
![container-station-2](images/container-station-2.png) | ||
|
||
#### Install Portainer Yourself | ||
|
||
Tutorial reference: | ||
|
||
[Install Portainer in NAS in 30 seconds](/how-to-install-portainer-in-nas/) | ||
|
||
## 3. File Station | ||
|
||
- Open File Station and create a `paperless-ngx` folder in the docker folder. | ||
|
||
![image-20231212182330224](image-20231212182330224.png) | ||
|
||
- Create the following directories inside the `paperless-ngx` folder: | ||
- consume | ||
- data | ||
- export | ||
- media | ||
- pgdata | ||
- redisdata | ||
|
||
![image-20231212182342117](image-20231212182342117.png) | ||
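If you would rather create these folders over SSH than click through File Station, the step can be sketched as a small shell function. This is only a sketch: the function name `create_paperless_dirs` is made up for illustration, and the default path assumes the same `/volume1/docker/paperless-ngx` location used in the configuration later in this guide.

```shell
# Create the six folders that the compose file maps into the containers.
# Usage: create_paperless_dirs [base_dir]
create_paperless_dirs() {
    # Assumed default; pass another path if your share lives elsewhere.
    base="${1:-/volume1/docker/paperless-ngx}"
    for name in consume data export media pgdata redisdata; do
        mkdir -p "$base/$name" || return 1
    done
}
```

After pasting the function into an SSH session, running `create_paperless_dirs` with no argument should give the same layout as the screenshot above. On some NAS systems you may need `sudo` to write to the docker share.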
|
||
## 4. Container Manager | ||
|
||
I am using Synology's Container Manager for this setup; the steps in Portainer and QNAP Container Station are similar:
|
||
### Upload Configuration | ||
|
||
![image-20231212182358034](image-20231212182358034.png) | ||
|
||
Copy the following configuration: | ||
|
||
```yaml
version: "3.4"
services:
  # Redis: task queue broker for Paperless-ngx
  broker:
    image: library/redis:7
    restart: unless-stopped
    volumes:
      - /volume1/docker/paperless-ngx/redisdata:/data

  # PostgreSQL: document metadata database
  db:
    image: library/postgres:15
    restart: unless-stopped
    volumes:
      - /volume1/docker/paperless-ngx/pgdata:/var/lib/postgresql/data
    environment:
      POSTGRES_DB: paperless
      POSTGRES_USER: paperless
      POSTGRES_PASSWORD: paperless

  # Paperless-ngx web application
  webserver:
    image: paperlessngx/paperless-ngx:latest
    restart: unless-stopped
    depends_on:
      - db
      - broker
      - gotenberg
      - tika
    ports:
      - "28000:8000" # change it if you like
    healthcheck:
      test: ["CMD", "curl", "-fs", "-S", "--max-time", "2", "http://localhost:8000"]
      interval: 30s
      timeout: 10s
      retries: 5
    volumes:
      - /volume1/docker/paperless-ngx/data:/usr/src/paperless/data
      - /volume1/docker/paperless-ngx/media:/usr/src/paperless/media
      - /volume1/docker/paperless-ngx/export:/usr/src/paperless/export
      - /volume1/docker/paperless-ngx/consume:/usr/src/paperless/consume
    environment:
      PAPERLESS_REDIS: redis://broker:6379
      PAPERLESS_DBHOST: db
      PAPERLESS_TIKA_ENABLED: 1
      # Gotenberg listens on port 3000, Tika on port 9998
      PAPERLESS_TIKA_GOTENBERG_ENDPOINT: http://gotenberg:3000
      PAPERLESS_TIKA_ENDPOINT: http://tika:9998
      PAPERLESS_OCR_LANGUAGES: chi-sim chi-tra # change it if you like
      PAPERLESS_OCR_LANGUAGE: eng+chi_sim # change it if you like
      USERMAP_UID: 0
      USERMAP_GID: 0
      PAPERLESS_TIME_ZONE: Asia/Shanghai # change it if you like
    dns:
      - 8.8.8.8
      - 8.8.4.4

  # Gotenberg: converts Office documents to PDF
  gotenberg:
    image: gotenberg/gotenberg:7.10
    restart: unless-stopped
    command:
      - "gotenberg"
      - "--chromium-disable-javascript=true"
      - "--chromium-allow-list=file:///tmp/.*"

  # Tika: extracts text and metadata from Office documents
  tika:
    image: apache/tika:latest
    restart: unless-stopped
```
Explanation of the configuration (customizable):

> I have marked the parts of the file above that can safely be changed with "# change it if you like". Beginners should leave the rest alone.

- `ports` under webserver: you can change the host port, for example "`38000:8000`"; `do not modify the 8000 at the end`, which is the port inside the container.
- `PAPERLESS_OCR_LANGUAGES`: the additional `supported languages` to install for Paperless, here chi-sim and chi-tra (Simplified Chinese, Traditional Chinese). You can add any language you want, such as jpn. In addition, the system already includes English, German, Italian, etc.
- `PAPERLESS_OCR_LANGUAGE`: the `default language for OCR`; I have set it to English and Simplified Chinese here. Note that this setting uses an underscore (chi_sim), while the package names in `PAPERLESS_OCR_LANGUAGES` use a hyphen (chi-sim).
- `PAPERLESS_TIME_ZONE`: set your time zone.
|
||
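The settings above only take effect when the project is created or recreated. If you prefer the command line to the Container Manager GUI, the same configuration can be checked and started over SSH. The sketch below makes several assumptions: that you saved the configuration as `docker-compose.yml` inside the `paperless-ngx` folder (a hypothetical file location), that your NAS ships the `docker compose` v2 plugin (older systems may only have `docker-compose`), and that the helper name `start_paperless_stack` is purely illustrative.

```shell
# Validate the compose file, then start all services in the background.
# Usage: start_paperless_stack [path/to/docker-compose.yml]
start_paperless_stack() {
    # Assumed location of the saved configuration.
    compose_file="${1:-/volume1/docker/paperless-ngx/docker-compose.yml}"
    if [ ! -f "$compose_file" ]; then
        echo "Compose file not found: $compose_file" >&2
        return 1
    fi
    # "config --quiet" only checks the file and prints nothing when it is valid.
    docker compose -f "$compose_file" config --quiet || return 1
    # "up -d" creates the containers, or recreates those whose settings changed.
    docker compose -f "$compose_file" up -d
}
```

Validating first is a small design choice: an indentation mistake in the YAML is reported before anything is started. On Synology you will likely have to run this as root or with `sudo`.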
### Wait: | ||
|
||
![image-20231212182432126](image-20231212182432126.png) | ||
|
||
### Done: | ||
|
||
![image-20231212182442058](image-20231212182442058.png) | ||
|
||
## 5. Usage | ||
|
||
Open the program in your browser at `http://[ip]:[port]`.
|
||
> ip is the IP address of your NAS (mine is 172.16.22.22), and the port is defined in the configuration file above. If you follow my tutorial, it is 28000. | ||
|
||
![image-20231213115505007](image-20231213115505007.png) | ||
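The first start can take a while, so the page may not load immediately. As a rough sketch, you can poll the address from any machine that has `curl`, using the same kind of check the container's own healthcheck performs. The helper names `paperless_url` and `wait_for_paperless` are made up for this example, and the default port 28000 assumes you kept the port mapping from this tutorial.

```shell
# Build the web UI address from the NAS IP and the host port.
# Usage: paperless_url <ip> [port]
paperless_url() {
    printf 'http://%s:%s\n' "$1" "${2:-28000}"
}

# Poll the address until it answers or the attempts run out.
# Usage: wait_for_paperless <ip> [port] [attempts]
wait_for_paperless() {
    url="$(paperless_url "$1" "$2")"
    attempts="${3:-30}"
    while [ "$attempts" -gt 0 ]; do
        # -f makes curl fail on HTTP errors, --max-time bounds each try.
        if curl -fsS --max-time 2 -o /dev/null "$url"; then
            echo "Paperless-ngx is answering at $url"
            return 0
        fi
        attempts=$((attempts - 1))
        sleep 10
    done
    echo "No answer from $url" >&2
    return 1
}
```

For the setup in this article the call would be `wait_for_paperless 172.16.22.22 28000`.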
|
||
But you don't have a username and password yet, so let's `create an account and password`:
|
||
Select the webserver container and open the terminal: | ||
|
||
![image-20231212182454272](image-20231212182454272.png) | ||
|
||
> `python3 manage.py createsuperuser`
|
||
Enter the following information: | ||
|
||
- username | ||
- password | ||
|
||
![image-20231212182500769](image-20231212182500769.png) | ||
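The same command can also be run over SSH with `docker exec` instead of the Container Manager terminal. This is a sketch under assumptions: the container name depends on how your project was named, so the helper looks for a running container whose name contains `webserver`, and both function names are hypothetical.

```shell
# Print the name of the first running container whose name contains "webserver".
find_webserver_container() {
    docker ps --filter "name=webserver" --format '{{.Names}}' | head -n 1
}

# Run the interactive superuser prompt inside the webserver container.
# Usage: create_paperless_superuser [container_name]
create_paperless_superuser() {
    container="${1:-$(find_webserver_container)}"
    if [ -z "$container" ]; then
        echo "No running webserver container found" >&2
        return 1
    fi
    # -it keeps the session interactive so you can type the username and password.
    docker exec -it "$container" python3 manage.py createsuperuser
}
```

If the automatic lookup picks the wrong container, pass the exact name shown in Container Manager as the first argument. As before, `sudo` may be required on your NAS.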
|
||
## 6. Special Features Showcase | ||
|
||
### Home Page: | ||
|
||
![image-20231212182524985](image-20231212182524985.png) | ||
|
||
### Test PDF File: | ||
|
||
![image-20231212182535297](image-20231212182535297.png) | ||
|
||
The text has been extracted: | ||
|
||
![image-20231212182553777](image-20231212182553777.png) | ||
|
||
#### Online Preview: | ||
|
||
![image-20231212182603518](image-20231212182603518.png) | ||
|
||
#### Search Function: | ||
|
||
![image-20231212182610480](image-20231213134836910.png) | ||
|
||
### Images: | ||
|
||
![image-20231212182618189](image-20231212182618189.png) | ||
|
||
In the edit view, you can see the recognized result and make modifications: | ||
|
||
![image-20231212182641514](image-20231212182641514.png) | ||
|
||
#### Search: | ||
|
||
![image-20231212182625422](image-20231213134930564.png) | ||
|
||
### Trying with Word Files: | ||
|
||
|
||
|
||
![image-20231212182740034](image-20231212182740034.png) | ||
|
||
![image-20231212182653729](image-20231213135041397.png) | ||
|
||
### Other Apps / Support | ||
|
||
You can also download the third-party app `paperless_app`.
|
||
![image-20231213123109606](image-20231213123109606.png) | ||
|
||
You can also scan with another app, such as the free Microsoft Lens, and then import the results into Paperless-ngx (the recognition is better).
|
||
![Screenshot from Microsoft Lens on an iPhone](36f6ef86-6b7c-4c39-bff4-21cc62f1202d.jpg) | ||
|
||
![Screenshot from Microsoft Lens on an iPhone](31173df1-6cbb-4e55-ad5f-0041273d89d7.jpg) | ||
|
||
You can also connect your physical printer and have it upload to Paperless-ngx automatically:
|
||
![image-20231213123313667](image-20231213123313667.png) | ||
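Automatic upload works because Paperless-ngx imports any file that appears in its consumption directory, which is the `consume` folder mapped in the configuration above. A device or script therefore only needs to drop files there. The following is a minimal sketch of that idea; the function name `send_to_paperless` is hypothetical, and the default path assumes the folder layout from this tutorial.

```shell
# Copy a scanned file into the Paperless-ngx consume folder.
# Usage: send_to_paperless <file> [consume_dir]
send_to_paperless() {
    src="$1"
    # Assumed default; matches the consume volume in the compose file.
    consume_dir="${2:-/volume1/docker/paperless-ngx/consume}"
    if [ ! -f "$src" ]; then
        echo "Not a file: $src" >&2
        return 1
    fi
    if [ ! -d "$consume_dir" ]; then
        echo "Missing folder: $consume_dir" >&2
        return 1
    fi
    cp "$src" "$consume_dir/"
}
```

For example, `send_to_paperless ~/scans/contract.pdf` (a made-up file name) would hand the file to Paperless-ngx, which then runs OCR on it and adds it to the archive.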
|
||
If you have more ideas, please feel free to share. | ||
|
||
## Finally | ||
|
||
If you like this article, please remember to like, bookmark, and follow [Dad's Digital Garden](https://example.com). We will keep bringing you practical guides to self-hosted applications. Together, let's take control of our own data and create our own digital world!
|
||
If you encounter any problems or have any suggestions during the setup process, please feel free to leave a comment below for discussion and learning. |