# Paperless-NGX / Document Pipeline
Paperless-NGX runs on ereshkigal as a Podman compose stack managed by the
tsunaminoai.docPipeline NixOS module
(modules/nixos/containers/doc-pipeline/default.nix).
- Web UI: http://ereshkigal:8011
- Module option: `tsunaminoai.docPipeline.enable = true;`
## Architecture
```mermaid
flowchart TD
    subgraph voile["voile (Synology DSM) - 10.0.0.2"]
        nfs["/volume2/Books"]
        nfs --> consume["paperless-consume/\n(drop PDFs here for auto-ingest)"]
        nfs --> export["paperless-export/\n(archival copies)"]
    end
    subgraph mokou["mokou (GTX 1080)"]
        ollama["Ollama :11434\nqwen2.5-vl:7b"]
    end
    subgraph ereshkigal["ereshkigal - 10.0.0.1 / 192.168.0.20"]
        nfs_mount["/mnt/voile/documents\n(NFSv4.1 over vmbr1 10G)"]
        subgraph paperless["paperless_default Podman network"]
            broker["paperless-broker\nvalkey:9-alpine :6379"]
            db["paperless-db\npostgres:17-alpine"]
            web["paperless-web\npaperless-ngx :8011"]
        end
        nfs_mount --> web
        broker --> web
        db --> web
    end
    consume --> nfs_mount
    web --> export
    web -->|"AI tagging / LLM\nhttp://mokou.ts.net:11434"| ollama
```
NFS uses the dedicated 10G backhaul (vmbr1 bridge, 10.0.0.0/24, MTU 9000)
between ereshkigal and voile, not the regular LAN.
## Enabling the module
```nix
# hosts/x86_64-nixos/ereshkigal/default.nix
tsunaminoai.docPipeline = {
  enable = true;
  ollamaHost = "mokou.${config.tsunaminoai.nix.tailscaleDomain}";
  ollamaModel = "qwen2.5-vl:7b";
  paperlessPort = 8011;
  # voileSharePath and timezone use correct defaults
};
```
## Module options
| Option | Default | Description |
|---|---|---|
| `enable` | `false` | Enable the pipeline |
| `paperlessPort` | `8011` | Host port for Paperless-NGX web UI |
| `ollamaHost` | `"localhost"` | Hostname/IP of the Ollama inference server |
| `ollamaModel` | `"qwen2.5-vl:7b"` | Ollama model used for VLM document tagging |
| `voileSharePath` | `"/volume2/Books"` | NFS export path on voile (Synology DSM) |
| `timezone` | `"America/Indiana/Indianapolis"` | Timezone for Paperless-NGX |
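
For orientation, the options in the table above would typically be declared along the following lines. This is a minimal sketch and not the module's actual source: the option names, defaults, and descriptions come from the table, while the types (`lib.types.port`, `lib.types.str`) are assumptions.

```nix
# Hypothetical sketch of the option declarations; the types are assumed.
{ lib, ... }:
{
  options.tsunaminoai.docPipeline = {
    enable = lib.mkEnableOption "the Paperless-NGX document pipeline";

    paperlessPort = lib.mkOption {
      type = lib.types.port;
      default = 8011;
      description = "Host port for Paperless-NGX web UI";
    };

    ollamaHost = lib.mkOption {
      type = lib.types.str;
      default = "localhost";
      description = "Hostname/IP of the Ollama inference server";
    };

    ollamaModel = lib.mkOption {
      type = lib.types.str;
      default = "qwen2.5-vl:7b";
      description = "Ollama model used for VLM document tagging";
    };

    voileSharePath = lib.mkOption {
      type = lib.types.str;
      default = "/volume2/Books";
      description = "NFS export path on voile (Synology DSM)";
    };

    timezone = lib.mkOption {
      type = lib.types.str;
      default = "America/Indiana/Indianapolis";
      description = "Timezone for Paperless-NGX";
    };
  };
}
```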
## NFS mount
The module mounts `voile:/volume2/Books` at `/mnt/voile/documents` via
NFSv4.1 with `x-systemd.automount` and a 10-minute idle timeout. The
corresponding systemd mount unit is `mnt-voile-documents.mount`.
The paperless-web container depends on this unit, so it will not start until
the NFS share is available. The mount itself waits on
network-bonds-ready.service, which polls for bond0/bond1 readiness on
ereshkigal.
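
The mount and its ordering could be expressed roughly as follows. This is a sketch under stated assumptions rather than the module's real code: the `x-systemd.requires`/`x-systemd.after` options are one possible way to wait on `network-bonds-ready.service`, and the service name `podman-paperless-web` assumes the container is defined through `virtualisation.oci-containers` with the Podman backend.

```nix
# Hypothetical sketch; the real module may wire these dependencies differently.
{
  fileSystems."/mnt/voile/documents" = {
    device = "voile:/volume2/Books";
    fsType = "nfs";
    options = [
      "nfsvers=4.1"
      "x-systemd.automount"
      "x-systemd.idle-timeout=10min"
      # Assumed mechanism for waiting on the bond readiness check.
      "x-systemd.requires=network-bonds-ready.service"
      "x-systemd.after=network-bonds-ready.service"
    ];
  };

  # Assumed unit name: oci-containers names its services "<backend>-<container>".
  systemd.services."podman-paperless-web" = {
    requires = [ "mnt-voile-documents.mount" ];
    after = [ "mnt-voile-documents.mount" ];
  };
}
```

After a rebuild, `systemctl status mnt-voile-documents.mount` on ereshkigal shows whether the share is currently mounted.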
Two subdirectories on the share are bind-mounted into the container:
| Share path | Container path | Purpose |
|---|---|---|
| `paperless-consume/` | `/usr/src/paperless/consume` | Drop PDFs here for auto-ingest |
| `paperless-export/` | `/usr/src/paperless/export` | Archival copies written back by Paperless |
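
As an illustration, the two bind mounts might look like this in an `oci-containers` definition. The host paths are derived from the mount point and share paths above; the `:rw` mode and the use of `virtualisation.oci-containers` are assumptions.

```nix
# Hypothetical sketch of the bind mounts for paperless-web.
{
  virtualisation.oci-containers.containers."paperless-web".volumes = [
    # host path (on the NFS mount) : container path : mode
    "/mnt/voile/documents/paperless-consume:/usr/src/paperless/consume:rw"
    "/mnt/voile/documents/paperless-export:/usr/src/paperless/export:rw"
  ];
}
```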
## Ollama / LLM integration
LLM tagging and titling are handled by the `paperless-gpt` sidecar, not by
Paperless-NGX itself (the native `PAPERLESS_AI_*` variables are unreleased as
of v2.14). See AI → paperless-gpt for full details.
OCR runs on every ingested document (PAPERLESS_OCR_MODE=redo) so tesseract
always extracts a text layer. The paperless-web container has
--add-host=host.containers.internal:host-gateway so it can reach mokou via
the host’s routing table (Tailscale). Use mokou’s Tailscale FQDN for
ollamaHost so the address stays stable regardless of DHCP assignment.
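
The two container settings mentioned above could be set as in the sketch below. The OCR mode and the `--add-host` flag are taken from the text; passing the `timezone` option through `PAPERLESS_TIME_ZONE` and the use of `virtualisation.oci-containers` are assumptions about how the module is written.

```nix
# Hypothetical sketch of the relevant paperless-web settings.
{ config, ... }:
let
  cfg = config.tsunaminoai.docPipeline;
in
{
  virtualisation.oci-containers.containers."paperless-web" = {
    environment = {
      # Re-run OCR on every ingested document.
      PAPERLESS_OCR_MODE = "redo";
      # Assumed mapping of the module's timezone option.
      PAPERLESS_TIME_ZONE = cfg.timezone;
    };
    # Lets the container resolve the host gateway and use the host's routes.
    extraOptions = [ "--add-host=host.containers.internal:host-gateway" ];
  };
}
```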
## Podman stack
All three containers share the paperless_default Podman network and are
wired into a single podman-compose-paperless-root.target. The target is
wanted by multi-user.target.
Persistent data lives in named Podman volumes (created by one-shot systemd services):
| Volume | Contents |
|---|---|
| `paperless_redis` | Valkey queue state |
| `paperless_pgdata` | PostgreSQL database |
| `paperless_data` | Paperless index, settings, ML models |
| `paperless_media` | Scanned document files |
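
A one-shot volume service and the root target could be sketched as follows. The target name and its `multi-user.target` dependency come from the text; the service name `podman-volume-paperless_pgdata` and the inspect-or-create script are assumptions modelled on the pattern commonly generated for compose-style stacks.

```nix
# Hypothetical sketch; the unit name for the volume service is assumed.
{ pkgs, ... }:
{
  systemd.services."podman-volume-paperless_pgdata" = {
    path = [ pkgs.podman ];
    serviceConfig = {
      Type = "oneshot";
      RemainAfterExit = true;
    };
    # Create the named volume only if it does not exist yet.
    script = ''
      podman volume inspect paperless_pgdata || podman volume create paperless_pgdata
    '';
    partOf = [ "podman-compose-paperless-root.target" ];
    wantedBy = [ "podman-compose-paperless-root.target" ];
  };

  systemd.targets."podman-compose-paperless-root" = {
    wantedBy = [ "multi-user.target" ];
  };
}
```

The same pattern would repeat for `paperless_redis`, `paperless_data`, and `paperless_media`.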
## Backups
The module hooks into the existing borgmatic.configurations.voile job (from
tsunaminoai.borg). Before each borg backup, three pre-hooks run:
- Postgres dump: `pg_dump` inside `paperless-db`, written to `/var/backup/paperless/paperless-db-YYYYMMDD.sql`. Dumps older than 3 days are pruned automatically.
- `paperless_media` snapshot: contents copied to `/var/backup/paperless/data/media/` via a temporary `busybox` container.
- `paperless_data` snapshot: contents copied to `/var/backup/paperless/data/appdata/` via a temporary `busybox` container.
/var/backup/paperless is already included in borgmatic’s source_directories
(via the borg module’s /var/backup entry), so no additional path registration
is needed.
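
The three pre-hooks can be pictured as a single script along these lines. This is a sketch, not the module's implementation: the database user and name (`paperless`), the `services.borgmatic.configurations` option prefix, and the top-level `before_backup` key (accepted by some borgmatic versions, replaced by other hook syntax in others) are all assumptions.

```nix
# Hypothetical sketch; DB credentials and hook wiring are assumed.
{ pkgs, ... }:
let
  podman = "${pkgs.podman}/bin/podman";

  paperlessPreBackup = pkgs.writeShellScript "paperless-pre-backup" ''
    set -euo pipefail
    backup_dir=/var/backup/paperless
    mkdir -p "$backup_dir/data/media" "$backup_dir/data/appdata"

    # 1. Postgres dump, then prune dumps older than roughly 3 days.
    ${podman} exec paperless-db pg_dump -U paperless paperless \
      > "$backup_dir/paperless-db-$(date +%Y%m%d).sql"
    find "$backup_dir" -maxdepth 1 -name 'paperless-db-*.sql' -mtime +3 -delete

    # 2. Copy the media volume out through a temporary busybox container.
    ${podman} run --rm -v paperless_media:/src:ro -v "$backup_dir/data/media:/dst" \
      docker.io/busybox cp -a /src/. /dst/

    # 3. Same for the application data volume.
    ${podman} run --rm -v paperless_data:/src:ro -v "$backup_dir/data/appdata:/dst" \
      docker.io/busybox cp -a /src/. /dst/
  '';
in
{
  services.borgmatic.configurations.voile.before_backup = [ "${paperlessPreBackup}" ];
}
```

Mounting the source volumes read-only (`:ro`) is a choice made for this sketch so that a backup run cannot modify live data.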
Note
The busybox snapshots pull `docker.io/busybox` if not cached. Pre-pull it
after first deploy: `podman pull docker.io/busybox`