Docker Compose
Manual Docker Compose setup for SurfSense
Setup
git clone https://github.com/MODSetter/SurfSense.git
cd SurfSense/docker
cp .env.example .env
# Edit .env, at minimum set SECRET_KEY
docker compose up -d
After starting, access SurfSense at:
- SurfSense: http://localhost:3929
- Backend API: http://localhost:3929/api/v1
- Zero sync: ws://localhost:3929/zero
Configuration
All configuration lives in a single docker/.env file (or surfsense/.env if you used the install script). Copy .env.example to .env and edit the values you need.
Required
| Variable | Description |
|---|---|
| SECRET_KEY | JWT secret key. Generate with: openssl rand -base64 32. Auto-generated by the install script. |
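A minimal sketch of setting SECRET_KEY without opening an editor. The helper name set_env_var is hypothetical and not part of SurfSense; it rewrites the file with printf rather than sed, so the /, + and = characters that base64 output can contain need no escaping.

```shell
# Set KEY=VALUE in an env file: replace an existing assignment or append a new one.
# set_env_var is a hypothetical helper, not something shipped with SurfSense.
set_env_var() {
  file=$1 key=$2 value=$3
  [ -f "$file" ] || : > "$file"
  tmp=$(mktemp) || return 1
  # Copy every line except an existing assignment of this key.
  grep -v "^${key}=" "$file" > "$tmp" || true
  printf '%s=%s\n' "$key" "$value" >> "$tmp"
  mv "$tmp" "$file"
}

# Example, run from SurfSense/docker after copying .env.example to .env:
# set_env_var .env SECRET_KEY "$(openssl rand -base64 32)"
```

The new assignment is appended at the end of the file, and commented-out lines such as # SECRET_KEY=... are left untouched.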
Core Settings
| Variable | Description | Default |
|---|---|---|
| SURFSENSE_VERSION | Image tag to deploy. Use latest, a clean version (e.g. 0.0.14), or a specific build (e.g. 0.0.14.1) | latest |
| SURFSENSE_VARIANT | Backend image variant. Leave empty for CPU, set cuda for CUDA 12.8, or cuda126 for CUDA 12.6. | (empty) |
| AUTH_TYPE | Authentication method: LOCAL (email/password) or GOOGLE (OAuth) | LOCAL |
| ETL_SERVICE | Document parsing: DOCLING (local), LITEPARSE (local LiteParse), UNSTRUCTURED, or LLAMACLOUD | DOCLING |
| EMBEDDING_MODEL | Embedding model for vector search | sentence-transformers/all-MiniLM-L6-v2 |
| TTS_SERVICE | Text-to-speech provider for podcasts | local/kokoro |
| STT_SERVICE | Speech-to-text provider for audio files | local/base |
| REGISTRATION_ENABLED | Allow new user registrations | TRUE |
Image Variants
SurfSense publishes CPU and CUDA backend image variants. The frontend image is not variant-specific.
| Backend tag | Use case | SURFSENSE_VARIANT |
|---|---|---|
| :latest | CPU-only default | (empty) |
| :latest-cuda | NVIDIA CUDA 12.8 backend image | cuda |
| :latest-cuda126 | NVIDIA CUDA 12.6 backend image for older driver stacks | cuda126 |
All backend variants are published for linux/amd64 and linux/arm64. CUDA on linux/arm64 is best-effort.
GPU acceleration needs two settings: SURFSENSE_VARIANT selects the CUDA image, and COMPOSE_FILE enables the GPU device overlay. The host must have the NVIDIA Container Toolkit installed.
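As a rough illustration of how the two settings fit together, the sketch below prints the .env lines for a chosen variant. The function name gpu_env_lines and its platform argument are hypothetical. COMPOSE_FILE is a path list, and Docker Compose separates its entries with : on Linux and macOS and with ; on Windows, which is the only platform difference handled here.

```shell
# Print the .env lines for a GPU setup (gpu_env_lines is a hypothetical helper).
#   $1: variant, either cuda (CUDA 12.8) or cuda126 (CUDA 12.6)
#   $2: platform; "windows" selects ';' as the list separator, anything else ':'
gpu_env_lines() {
  variant=$1 platform=${2:-linux}
  case "$variant" in
    cuda|cuda126) ;;
    *) echo "unknown variant: $variant" >&2; return 1 ;;
  esac
  sep=":"
  if [ "$platform" = "windows" ]; then
    sep=";"
  fi
  printf 'SURFSENSE_VARIANT=%s\n' "$variant"
  printf 'COMPOSE_FILE=docker-compose.yml%sdocker-compose.gpu.yml\n' "$sep"
  printf 'SURFSENSE_GPU_COUNT=1\n'
}

# Example: review the output, then append it to .env
# gpu_env_lines cuda linux >> .env
```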
NVIDIA GPU Acceleration
For most NVIDIA systems, add these values to .env to use the CUDA 12.8 image:
SURFSENSE_VARIANT=cuda
COMPOSE_FILE=docker-compose.yml:docker-compose.gpu.yml
SURFSENSE_GPU_COUNT=1
Use SURFSENSE_VARIANT=cuda126 for older NVIDIA driver stacks or older GPUs that need the CUDA 12.6 fallback image.
On Windows, use ; instead of : in COMPOSE_FILE inside .env:
COMPOSE_FILE=docker-compose.yml;docker-compose.gpu.yml
To switch variants later, edit SURFSENSE_VARIANT and COMPOSE_FILE in .env, then run:
docker compose pull
docker compose up -d --wait
Automatic Updates
Manual Docker Compose installs do not start Watchtower automatically. To enable external automatic updates, run Watchtower separately:
docker run -d --name watchtower \
  --restart unless-stopped \
  -v /var/run/docker.sock:/var/run/docker.sock \
  nickfedor/watchtower \
  --label-enable \
  --interval 86400
SurfSense containers are labeled for Watchtower, so --label-enable limits updates to the SurfSense services.
Public URL and Ports
| Variable | Description | Default |
|---|---|---|
| SURFSENSE_PUBLIC_URL | Public origin used by the frontend, backend OAuth callbacks, and Zero browser URL | http://localhost:3929 |
| SURFSENSE_SITE_ADDRESS | Caddy site address. :80 means local plain HTTP; a hostname enables automatic HTTPS | :80 |
| LISTEN_HTTP_PORT | Host port mapped to Caddy's HTTP listener | 3929 |
| LISTEN_HTTPS_PORT | Host port mapped to Caddy's HTTPS listener for domain mode | 443 |
SurfSense includes Caddy by default. The frontend, backend, and
zero-cache containers are internal-only in the production compose file; the
browser reaches them through Caddy path routing.
Custom Domain / Automatic HTTPS
For a real domain, point DNS at the Docker host and set:
SURFSENSE_SITE_ADDRESS=surf.example.com
LISTEN_HTTP_PORT=80
LISTEN_HTTPS_PORT=443
CERT_EMAIL=you@example.com
SURFSENSE_PUBLIC_URL=https://surf.example.com
Caddy will issue and renew Let's Encrypt certificates automatically. Ports 80 and 443 must be reachable from the internet for the default HTTP-01 challenge.
| Variable | Description |
|---|---|
| CERT_EMAIL | Optional ACME contact email |
| CERT_ACME_CA | ACME directory URL; use Let's Encrypt staging when testing cert issuance |
| CERT_ACME_DNS | DNS-01 challenge config; requires the custom Caddy build |
| TRUSTED_PROXIES | CIDR ranges trusted for forwarded client IP headers |
| SURFSENSE_MAX_BODY_SIZE | Upload limit enforced at the proxy |
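SURFSENSE_PUBLIC_URL has to agree with the site address and ports, and a mismatch is an easy mistake when moving from local mode to a domain. The sketch below derives the value the settings above imply. The function name expected_public_url is hypothetical, and it assumes domain mode uses the standard HTTPS port 443, so no port appears in the URL.

```shell
# Derive the SURFSENSE_PUBLIC_URL implied by the Caddy settings.
# expected_public_url is a hypothetical helper; domain mode is assumed to use port 443.
#   $1: SURFSENSE_SITE_ADDRESS (":80" for local plain HTTP, or a hostname)
#   $2: LISTEN_HTTP_PORT (only used in local mode; defaults to 3929)
expected_public_url() {
  site=${1:-:80} port=${2:-3929}
  case "$site" in
    :*) printf 'http://localhost:%s\n' "$port" ;;   # local mode: plain HTTP on the host port
    *)  printf 'https://%s\n' "$site" ;;            # domain mode: automatic HTTPS
  esac
}

# Example: compare against the value in .env
# expected_public_url surf.example.com
```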
Bring Your Own Proxy
If you already run nginx, Traefik, Cloudflare Tunnel, or another ingress, you
can comment out the proxy service and route traffic to the internal services
with the same path contract:
| Public path | Upstream |
|---|---|
| /auth/* | backend:8000 |
| /api/v1/* | backend:8000 |
| /zero/* | zero-cache:4848 |
| /* | frontend:3000 |
Alternative proxies must preserve WebSocket upgrades for /zero, avoid
buffering streaming responses, allow long-running requests, and support large
uploads. For DNS-01 or wildcard certificates with Caddy, build
docker/proxy/Dockerfile and set CERT_ACME_DNS for your DNS provider.
Zero-cache (Real-Time Sync)
Defaults work out of the box. Change ZERO_ADMIN_PASSWORD for security in production.
| Variable | Description | Default |
|---|---|---|
| ZERO_ADMIN_PASSWORD | Password for the zero-cache admin UI and /statz endpoint | surfsense-zero-admin |
| ZERO_UPSTREAM_DB | PostgreSQL connection URL for replication (must be a direct connection, not via pgbouncer) | (built from DB_* vars) |
| ZERO_CVR_DB | PostgreSQL connection URL for client view records | (built from DB_* vars) |
| ZERO_CHANGE_DB | PostgreSQL connection URL for replication log entries | (built from DB_* vars) |
| ZERO_APP_PUBLICATIONS | PostgreSQL publication restricting which tables are replicated (created by migration 116, verified by the migrations service before zero-cache starts) | zero_publication |
| ZERO_NUM_SYNC_WORKERS | Number of view-sync worker processes. Must be ≤ connection pool sizes | 4 |
| ZERO_UPSTREAM_MAX_CONNS | Max connections to upstream PostgreSQL for mutations | 20 |
| ZERO_CVR_MAX_CONNS | Max connections to the CVR database | 30 |
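The worker-count constraint in the table can be checked before restarting the stack. This is a small sketch, assuming the rule is simply that ZERO_NUM_SYNC_WORKERS must not exceed either pool size; the function name check_zero_pools is hypothetical, and its defaults mirror the table.

```shell
# Check that the sync worker count fits both connection pools.
# check_zero_pools is a hypothetical helper; defaults mirror the documented ones.
#   $1: ZERO_NUM_SYNC_WORKERS, $2: ZERO_UPSTREAM_MAX_CONNS, $3: ZERO_CVR_MAX_CONNS
check_zero_pools() {
  workers=${1:-4} upstream=${2:-20} cvr=${3:-30}
  if [ "$workers" -gt "$upstream" ] || [ "$workers" -gt "$cvr" ]; then
    echo "ZERO_NUM_SYNC_WORKERS=$workers exceeds a pool size (upstream=$upstream, cvr=$cvr)" >&2
    return 1
  fi
  echo "ok: $workers workers fit pools upstream=$upstream cvr=$cvr"
}

# Example: check_zero_pools 8 20 30
```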
Database
Defaults work out of the box. Change the database credentials (at least DB_PASSWORD) before deploying to production.
| Variable | Description | Default |
|---|---|---|
| DB_USER | PostgreSQL username | surfsense |
| DB_PASSWORD | PostgreSQL password | surfsense |
| DB_NAME | PostgreSQL database name | surfsense |
| DB_HOST | PostgreSQL host | db |
| DB_PORT | PostgreSQL port | 5432 |
| DB_SSLMODE | SSL mode: disable, require, verify-ca, verify-full | disable |
| DATABASE_URL | Full connection URL override. Use for managed databases (RDS, Supabase, etc.) | (built from above) |
Authentication
| Variable | Description |
|---|---|
| GOOGLE_OAUTH_CLIENT_ID | Google OAuth client ID (required if AUTH_TYPE=GOOGLE) |
| GOOGLE_OAUTH_CLIENT_SECRET | Google OAuth client secret (required if AUTH_TYPE=GOOGLE) |
Create credentials at the Google Cloud Console.
External API Keys
| Variable | Description |
|---|---|
| UNSTRUCTURED_API_KEY | Unstructured.io API key (required if ETL_SERVICE=UNSTRUCTURED) |
| LLAMA_CLOUD_API_KEY | LlamaCloud API key (required if ETL_SERVICE=LLAMACLOUD) |
| (LiteParse) | For ETL_SERVICE=LITEPARSE, no cloud API key; backend uses the liteparse package (included in pyproject.toml) |
Connector OAuth Keys
Uncomment the connectors you want to use. Redirect URIs follow the single-origin
pattern ${SURFSENSE_PUBLIC_URL}/api/v1/auth/<connector>/connector/callback.
For local Docker defaults, that means
http://localhost:3929/api/v1/auth/<connector>/connector/callback.
| Connector | Variables |
|---|---|
| Google Drive / Gmail / Calendar | GOOGLE_DRIVE_REDIRECT_URI, GOOGLE_GMAIL_REDIRECT_URI, GOOGLE_CALENDAR_REDIRECT_URI |
| Notion | NOTION_CLIENT_ID, NOTION_CLIENT_SECRET, NOTION_REDIRECT_URI |
| Slack | SLACK_CLIENT_ID, SLACK_CLIENT_SECRET, SLACK_REDIRECT_URI |
| Discord | DISCORD_CLIENT_ID, DISCORD_CLIENT_SECRET, DISCORD_BOT_TOKEN, DISCORD_REDIRECT_URI |
| Atlassian (Jira & Confluence) | ATLASSIAN_CLIENT_ID, ATLASSIAN_CLIENT_SECRET, JIRA_REDIRECT_URI, CONFLUENCE_REDIRECT_URI |
| Linear | LINEAR_CLIENT_ID, LINEAR_CLIENT_SECRET, LINEAR_REDIRECT_URI |
| ClickUp | CLICKUP_CLIENT_ID, CLICKUP_CLIENT_SECRET, CLICKUP_REDIRECT_URI |
| Airtable | AIRTABLE_CLIENT_ID, AIRTABLE_CLIENT_SECRET, AIRTABLE_REDIRECT_URI |
| Microsoft (Teams & OneDrive) | MICROSOFT_CLIENT_ID, MICROSOFT_CLIENT_SECRET, TEAMS_REDIRECT_URI, ONEDRIVE_REDIRECT_URI |
| Dropbox | DROPBOX_APP_KEY, DROPBOX_APP_SECRET, DROPBOX_REDIRECT_URI |
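The redirect URI pattern described above can be generated rather than typed by hand. In this sketch the function name connector_redirect_uri is hypothetical, and the connector segment (notion in the example) is an assumed slug; check .env.example for the exact value each connector expects.

```shell
# Build a connector redirect URI from the public origin and a connector name.
# connector_redirect_uri is a hypothetical helper; the connector slug is an assumption.
#   $1: SURFSENSE_PUBLIC_URL, $2: connector path segment
connector_redirect_uri() {
  base=${1%/}   # drop a single trailing slash so the path is not doubled
  printf '%s/api/v1/auth/%s/connector/callback\n' "$base" "$2"
}

# Example with the local Docker default origin:
# connector_redirect_uri http://localhost:3929 notion
```

Register the same URI with the provider's OAuth app, since most providers reject callbacks that do not match exactly.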
Messaging Channels
Configure these in the same docker/.env file when you want users to chat with
SurfSense from external apps. See Messaging Channels
for full setup.
| Channel | Variables |
|---|---|
| Telegram | TELEGRAM_SHARED_BOT_TOKEN, TELEGRAM_SHARED_BOT_USERNAME, TELEGRAM_WEBHOOK_SECRET, GATEWAY_BASE_URL, GATEWAY_TELEGRAM_INTAKE_MODE |
| WhatsApp | GATEWAY_WHATSAPP_INTAKE_MODE, WHATSAPP_SHARED_BUSINESS_TOKEN, WHATSAPP_SHARED_PHONE_NUMBER_ID, WHATSAPP_SHARED_DISPLAY_PHONE_NUMBER, WHATSAPP_SHARED_WABA_ID, WHATSAPP_WEBHOOK_VERIFY_TOKEN, WHATSAPP_WEBHOOK_APP_SECRET |
| Slack | SLACK_CLIENT_ID, SLACK_CLIENT_SECRET, GATEWAY_SLACK_ENABLED, GATEWAY_SLACK_SIGNING_SECRET, GATEWAY_SLACK_REDIRECT_URI |
| Discord | DISCORD_CLIENT_ID, DISCORD_CLIENT_SECRET, DISCORD_BOT_TOKEN, GATEWAY_DISCORD_ENABLED, GATEWAY_DISCORD_REDIRECT_URI |
Observability (optional)
| Variable | Description |
|---|---|
| LANGSMITH_TRACING | Enable LangSmith tracing (true / false) |
| LANGSMITH_ENDPOINT | LangSmith API endpoint |
| LANGSMITH_API_KEY | LangSmith API key |
| LANGSMITH_PROJECT | LangSmith project name |
Advanced (optional)
| Variable | Description | Default |
|---|---|---|
| SCHEDULE_CHECKER_INTERVAL | How often to check for scheduled connector tasks (e.g. 5m, 1h) | 5m |
| RERANKERS_ENABLED | Enable document reranking for improved search | FALSE |
| RERANKERS_MODEL_NAME | Reranker model name (e.g. ms-marco-MiniLM-L-12-v2) | |
| RERANKERS_MODEL_TYPE | Reranker model type (e.g. flashrank) | |
| PAGES_LIMIT | Max pages per user for ETL services | unlimited |
Docker Services
| Service | Description |
|---|---|
| proxy | Caddy reverse proxy; the only public ingress in production Docker |
| db | PostgreSQL with pgvector extension |
| migrations | Short-lived: runs alembic upgrade head and verifies zero_publication, then exits |
| redis | Message broker for Celery |
| searxng | Local privacy-respecting search backend |
| backend | FastAPI application server |
| celery_worker | Background task processing (document indexing, etc.) |
| celery_beat | Periodic task scheduler (connector sync) |
| zero-cache | Rocicorp Zero real-time sync (replicates Postgres to clients) |
| frontend | Next.js web application, internal behind Caddy |
All services start automatically with docker compose up -d.
How startup ordering works
Schema migrations run as a dedicated migrations service that exits 0 on
success and non-zero on failure. Every other backend-image service gates on
it via condition: service_completed_successfully:
db (healthy) ──▶ migrations (alembic upgrade head + verify zero_publication)
                    │
                    ├── exit 0 ─▶ backend ──▶ frontend
                    │             celery_worker
                    │             celery_beat
                    │             zero-cache ──▶ frontend
                    │
                    └── exit ≠ 0 ─▶ compose halts the rest of the stack
This guarantees zero-cache only starts after zero_publication exists in
Postgres. Before this design, a silent migration failure would leave
zero-cache crash-looping with Unknown or invalid publications. Specified: [zero_publication]. Found: [].
Readiness vs liveness
The backend exposes two endpoints:
- GET /health: lightweight liveness probe (always returns 200 if the process is up).
- GET /ready: readiness probe that confirms zero_publication exists. Returns 503 if not. The compose backend.healthcheck uses /ready so the container only reports healthy once the schema is actually usable by zero-cache.
You can also monitor startup progress with docker compose ps (look for
(health: starting) → (healthy)). The install script polls these states
automatically and times out after 5 minutes if the stack does not converge.
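For manual installs, a similar wait can be scripted. The sketch below is not the install script's implementation; the function names wait_until and backend_healthy are hypothetical. It retries a probe until it succeeds or a timeout passes, and the example probe looks for the (healthy) marker in docker compose ps output.

```shell
# Retry a probe command until it succeeds or the timeout elapses.
# wait_until and backend_healthy are hypothetical helpers, not part of SurfSense.
#   $1: timeout in seconds, $2: delay between attempts in seconds, rest: probe command
wait_until() {
  timeout=$1 delay=$2
  shift 2
  elapsed=0
  until "$@"; do
    elapsed=$((elapsed + delay))
    if [ "$elapsed" -gt "$timeout" ]; then
      return 1   # gave up: the probe never succeeded within the timeout
    fi
    sleep "$delay"
  done
  return 0
}

# Probe: succeeds once the backend container reports (healthy).
backend_healthy() {
  docker compose ps backend | grep -q '(healthy)'
}

# Example: wait up to 5 minutes, then show logs on failure.
# wait_until 300 5 backend_healthy || docker compose logs migrations backend
```

Passing the probe as a command keeps the loop reusable for other services, for example a probe that checks zero-cache instead.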
Useful Commands
# View logs (all services)
docker compose logs -f
# View logs for a specific service
docker compose logs -f backend
# Stop all services
docker compose down
# Restart a specific service
docker compose restart backend
# Stop and remove all containers + volumes (destructive!)
docker compose down -v
Troubleshooting
- Port already in use: Change LISTEN_HTTP_PORT in .env and restart. In domain mode, use ports 80 and 443 so Caddy can complete certificate issuance.
- Permission errors on Linux: You may need to prefix docker commands with sudo.
- Real-time updates not working: Open DevTools → Console and check for WebSocket errors. In production Docker the expected URL is ${SURFSENSE_PUBLIC_URL}/zero.
- Line ending issues on Windows: Run git config --global core.autocrlf true before cloning.
Migration service exited non-zero
The migrations service exits non-zero in two cases:
- alembic upgrade head failed (timeout or SQL error).
- alembic succeeded but zero_publication is still missing from pg_publication.
Inspect the logs and the alembic state:
docker compose logs migrations
docker compose exec db psql -U surfsense -d surfsense \
  -c 'SELECT * FROM alembic_version;'
docker compose exec db psql -U surfsense -d surfsense \
  -c 'SELECT pubname FROM pg_publication;'
The default migration timeout is 900 seconds. Slow disks (Windows / WSL2)
may need more. Set MIGRATION_TIMEOUT in .env to increase it.
Zero-cache stuck on Unknown or invalid publications
Symptom (in docker compose logs zero-cache):
Error: Unknown or invalid publications. Specified: [zero_publication]. Found: []
This means zero-cache started before zero_publication was created or the
publication does not match SurfSense's canonical Zero shape. With the current
compose files this should be impossible: the migrations service blocks
zero-cache from starting and verifies the publication before exiting
successfully. If you see it, your stack predates the fix or you brought up
zero-cache manually with docker compose up zero-cache before the migrations
service ran.
Recovery:
docker compose down
docker volume rm surfsense-zero-cache # wipe half-built SQLite replica
docker compose up -d # migrations runs first, then zero-cache
Zero-cache crashes with _zero.tableMetadata errors
This indicates a half-initialized SQLite replica left behind by a previous
crash. Zero's own event triggers and ZERO_AUTO_RESET handle schema and
replication halts automatically. If the local SQLite replica is wedged, run the
recovery commands above to wipe surfsense-zero-cache; zero-cache will
re-sync from Postgres on the next start.
Ensuring wal_level = logical
Logical replication is required by zero-cache. The bundled
docker/postgresql.conf sets wal_level = logical automatically. If you
swap in your own config or use a managed Postgres, confirm with:
docker compose exec db psql -U surfsense -d surfsense \
  -c "SHOW wal_level;"
Using docker-compose.deps-only.yml
docker-compose.deps-only.yml runs only the dependencies (Postgres, Redis,
SearXNG, zero-cache) on Docker while the backend and frontend run on the
host. Because there is no backend container in this stack, there is no
migrations service either, and you must run alembic on the host before
bringing the stack up:
cd surfsense_backend
uv run alembic upgrade head
cd ../docker
docker compose -f docker-compose.deps-only.yml up -d
If you skip the alembic step, zero-cache will crash-loop with Unknown or invalid publications. Specified: [zero_publication].
