paperless-gpt

paperless-gpt is a sidecar container that adds LLM-powered auto-tagging, titling, correspondent assignment, and date extraction to Paperless-NGX. It runs alongside the Paperless stack on ereshkigal and delegates inference to Ollama on mokou.

Architecture

```mermaid
flowchart LR
    subgraph paperless_default["paperless_default Podman network"]
        web["paperless-web\n:8000 (internal)"]
        gpt["paperless-gpt\n:8080 (internal)\n:8013 (host)"]
    end
    nginx["nginx\n:8014 HTTPS"]
    mokou["Ollama on mokou\n:11434"]

    gpt -->|"REST API + token"| web
    gpt -->|"LLM / VLM calls"| mokou
    nginx -->|"proxy"| gpt
```

paperless-gpt and paperless-web share the paperless_default Podman network, so they communicate by container hostname (webserver) rather than via the host.

How it works

Processing is triggered by tags on documents:

| Tag | Trigger |
| --- | --- |
| paperless-gpt | Manual: tag a document to process it on demand |
| paperless-gpt-auto | Automatic: background worker polls on paperless-gpt’s own interval (not set by this module) |
| paperless-gpt-ocr-auto | OCR re-processing via VLM |

Recommended setup: create a Paperless workflow (Admin → Workflows) with trigger “Document Added” and action “Assign tag paperless-gpt-auto”. Every newly ingested document is then processed in the background, which avoids the 60s timeout on the UI endpoint.

For already-existing documents, tag them manually with paperless-gpt and use the UI to review and apply suggestions.

Module options

Configured under tsunaminoai.docPipeline.paperlessGpt:

| Option | Default | Description |
| --- | --- | --- |
| enable | true | Enable the sidecar |
| manualTag | "paperless-gpt" | Tag for on-demand processing |
| autoTag | "paperless-gpt-auto" | Tag for background auto-processing |
| port | 8013 | Host HTTP port (HTTPS = port + 1) |
| llmModel | "granite3.1-dense:8b" | Ollama model for text tagging/titling |
| tokenLimit | "4000" | Max tokens of document text sent to the LLM (TOKEN_LIMIT) |
| contextLength | "8192" | Ollama context window size (OLLAMA_CONTEXT_LENGTH) |

The vision model for OCR is inherited from docPipeline.ollamaModel (qwen2.5vl:7b).
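As a rough sketch, setting these options in a host configuration might look like the following. The option names and defaults are taken from the table above; everything is left at its default here purely for illustration.

```nix
{
  tsunaminoai.docPipeline.paperlessGpt = {
    enable = true;
    port = 8013;                       # host HTTP port; HTTPS is then port + 1 = 8014
    llmModel = "granite3.1-dense:8b";  # Ollama model for tagging/titling
    tokenLimit = "4000";               # string, exported as TOKEN_LIMIT
    contextLength = "8192";            # string, exported as OLLAMA_CONTEXT_LENGTH
  };
}
```

Keeping tokenLimit well below contextLength should leave room in the context window for the prompt template and the model's response.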

Persistence and custom fields

paperless-gpt keeps three pieces of state under its /app working directory, all mounted to host dirs under /var/lib/paperless-gpt so they survive a container recycle:

| Container path | Host path | Contents |
| --- | --- | --- |
| /app/prompts | /var/lib/paperless-gpt/prompts | Editable prompt templates (*.toml) |
| /app/config | /var/lib/paperless-gpt/config | settings.json: custom-field enable/select/write-mode |
| /app/db | /var/lib/paperless-gpt/db | modification_history.db: OCR page cache + undo history |
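The mounts above could be expressed roughly as follows. This is a sketch, not the module's actual definition: it assumes the sidecar is declared through NixOS's virtualisation.oci-containers under the container name paperless-gpt, and the directory ownership and mode in the tmpfiles rules are placeholders.

```nix
{
  # Assumption: the container is named "paperless-gpt" under oci-containers.
  virtualisation.oci-containers.containers.paperless-gpt.volumes = [
    "/var/lib/paperless-gpt/prompts:/app/prompts"
    "/var/lib/paperless-gpt/config:/app/config"
    "/var/lib/paperless-gpt/db:/app/db"
  ];

  # Create the host directories at boot so the bind mounts have a target.
  # Owner and mode are assumptions; adjust to the user the container runs as.
  systemd.tmpfiles.rules = [
    "d /var/lib/paperless-gpt/prompts 0750 root root -"
    "d /var/lib/paperless-gpt/config  0750 root root -"
    "d /var/lib/paperless-gpt/db      0750 root root -"
  ];
}
```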

Custom fields are UI-only — there is no env var for them. Auto-generation of custom fields is driven solely by the custom_fields_enable flag in settings.json, which is set via the UI (Settings → Custom Fields). Before /app/config was persisted, every recycle reset this to false and silently turned the feature off. With the volume in place you enable it once and it sticks:

  1. paperless-gpt UI → Settings → enable Custom Fields
  2. Select at least one custom field (required — it warns if enabled with none selected)
  3. Set write mode = Append (adds new fields without overwriting existing ones)

Note: the AUTO_GENERATE_TITLE / _TAGS / _CORRESPONDENTS / _DOCUMENT_TYPE / _CREATED_DATE toggles default to on when unset, so those are auto-generated for paperless-gpt-auto documents without any extra config — only custom fields require the one-time UI step above.

SOPS secret

The Paperless API token is stored in SOPS and injected into the container via a systemd EnvironmentFile. A sops.templates entry formats it as KEY=VALUE:

sops.secrets."paperless/api-token" = {};

sops.templates."paperless-gpt-env" = {
  content = "PAPERLESS_API_TOKEN=${config.sops.placeholder."paperless/api-token"}";
  mode = "0400";
};

The podman container gets --env=PAPERLESS_API_TOKEN (no value — this form reads from the process environment, which systemd populates from the EnvironmentFile).
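The wiring described above might look like this. It is a sketch under two assumptions: the container is defined via virtualisation.oci-containers with the podman backend (which generates a unit named podman-<container>.service), and the container is called paperless-gpt.

```nix
{ config, ... }:
{
  # systemd loads KEY=VALUE pairs from the rendered sops template into the
  # environment of the podman process.
  systemd.services."podman-paperless-gpt".serviceConfig.EnvironmentFile =
    config.sops.templates."paperless-gpt-env".path;

  virtualisation.oci-containers.containers.paperless-gpt.extraOptions = [
    # No value given: podman copies the variable from its own environment.
    "--env=PAPERLESS_API_TOKEN"
  ];
}
```

oci-containers also offers an environmentFiles option that hands the file to podman directly; whether that suits this module better has not been verified here.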

To create the secret:

# On the machine with your SOPS key
sops secrets.yaml
# Add:  paperless/api-token: <token from Paperless Admin → Tokens>

Granian / IPv6 binding

Paperless-NGX uses the Granian ASGI server, which binds IPv6-only (:::8000) by default. paperless-gpt resolves webserver to an IPv4 address inside the Podman network, so its connections are refused.

Fix: set GRANIAN_HOST=0.0.0.0 in the paperless-web container environment (not just PAPERLESS_BIND_ADDR — Granian reads its own env var directly).
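In Nix terms the fix could look like the fragment below. It assumes the Paperless web container is declared through virtualisation.oci-containers under the name paperless-web; adapt the attribute path to wherever the container is actually defined.

```nix
{
  # Assumption: container name "paperless-web" under oci-containers.
  virtualisation.oci-containers.containers.paperless-web.environment = {
    # Make Granian listen on IPv4 so paperless-gpt can reach webserver:8000.
    GRANIAN_HOST = "0.0.0.0";
  };
}
```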

Tag creation

CREATE_NEW_TAGS=true allows paperless-gpt to create tags in Paperless that don’t yet exist. Without this, it can only assign pre-existing tags.

To bootstrap a useful starting set, use the Paperless API:

# fish shell: read the API token from the decrypted secret on ereshkigal
set TOKEN (ssh ereshkigal sudo cat /run/secrets/paperless/api-token)
set BASE "https://ereshkigal.armadillo-banfish.ts.net:8012"

# Starter tag set; adjust to taste
set tags tax-return tax-document bank-statement pay-stub invoice receipt \
  investment loan contract agreement legal identity medical prescription \
  employment insurance vehicle education government correspondence personal

# POST each tag and print the HTTP status code next to its name
for tag in $tags
  curl -sf -o /dev/null -w "%{http_code}  $tag\n" \
    -X POST "$BASE/api/tags/" \
    -H "Authorization: Token $TOKEN" \
    -H "Content-Type: application/json" \
    -d "{\"name\": \"$tag\"}"
end

Tag prompt

The tag prompt is editable in the paperless-gpt UI (Settings → Prompts). With CREATE_NEW_TAGS=true, update it to allow new tag suggestions:

I will provide you with the content and the title of a document.
Your task is to select appropriate tags for the document.
Prefer tags from the provided list when they fit. You may also suggest new
tags not in the list if none of the existing tags adequately describe the document.
Respond only with the selected tags as a comma-separated list.
New tags should be short, lowercase, and general enough to apply to multiple documents.

<available_tags>
{{.AvailableTags | join ", "}}
</available_tags>

<title>{{.Title}}</title>
<content>{{.Content}}</content>

Performance notes

  • Processing time: ~30s–6min per document depending on page count and whether vision OCR runs. Background auto-processing has no timeout; the UI endpoint times out at 60s.
  • PDF_SKIP_EXISTING_OCR=true: Paperless+tesseract already runs OCR on ingest, so paperless-gpt skips re-running it on documents that already have a text layer.
  • TOKEN_LIMIT=4000: Truncates document content to ~3000 words before sending to the LLM. Raised from the original 1000 to feed more document content for better title/tag/date accuracy while still preventing context overflow.

Troubleshooting

connection refused on startup — paperless-gpt started before paperless-web was ready. Normal; it retries on paperless-gpt’s own interval and recovers automatically.

Token not reaching container — verify the sops template path matches the EnvironmentFile path in the systemd service, and that --env=PAPERLESS_API_TOKEN (without a value) is in extraOptions.

Processing very slow / timing out — check that Ollama is using the GPU: nvidia-smi on mokou should show nonzero GPU utilisation during inference. If not, see Ollama — The CUDA SM 6.1 problem.