Generating PDFs on a server is one of those tasks that sounds simple until you actually sit down to do it. HTML-to-PDF rendering drifts between browsers, LibreOffice headless mode is finicky to install, and most SaaS solutions charge per page once you hit volume. Gotenberg solves this cleanly: a single Docker container that bundles headless Chromium, LibreOffice, QPDF, pdfcpu, PDFtk, and ExifTool, then exposes all of their capabilities through a straightforward HTTP API. You POST files and parameters, you receive a PDF. No install scripts, no dependency hell, no per-page billing.
With 12,000+ GitHub stars and production deployments across thousands of companies, Gotenberg is the de-facto standard for self-hosted PDF generation pipelines.
What Does This Thing Actually Do?
Gotenberg is a multipart/form-data API built in Go. Each route accepts one or more files alongside form parameters, runs the appropriate backend engine, and returns the resulting PDF. The entire toolchain – Chromium, LibreOffice, and all the PDF utilities – is baked into the image so nothing needs to be installed on the host.
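The request shape described above can be sketched with the Python standard library. This is a minimal illustration rather than an official client: `build_multipart_request` is a hypothetical helper name, and the `files` field name and the route are taken from the cURL examples elsewhere in this article.

```python
import uuid
import urllib.request


def build_multipart_request(url, fields, files):
    """Build (but do not send) a multipart/form-data POST request.

    fields: dict mapping form parameter names to values
    files:  list of (filename, content_bytes) tuples, sent under the "files" field
    """
    boundary = uuid.uuid4().hex
    body = bytearray()

    # Plain form parameters such as url, paperWidth, or marginTop.
    for name, value in fields.items():
        header = f'--{boundary}\r\nContent-Disposition: form-data; name="{name}"\r\n\r\n'
        body += header.encode("utf-8") + str(value).encode("utf-8") + b"\r\n"

    # Uploaded documents and assets.
    for filename, content in files:
        header = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="files"; filename="{filename}"\r\n'
            "Content-Type: application/octet-stream\r\n\r\n"
        )
        body += header.encode("utf-8") + content + b"\r\n"

    # Closing boundary marks the end of the multipart body.
    body += f"--{boundary}--\r\n".encode("utf-8")

    return urllib.request.Request(
        url,
        data=bytes(body),
        method="POST",
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
    )


# Roughly equivalent to: curl --form url=https://example.com <route>
request = build_multipart_request(
    "http://localhost:3000/forms/chromium/convert/url",
    fields={"url": "https://example.com"},
    files=[],
)

# Sending it requires a running Gotenberg instance:
# with urllib.request.urlopen(request) as response:
#     pdf_bytes = response.read()
```

Building the request separately from sending it keeps the sketch easy to inspect and test without a running container.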
Key capabilities:
- HTML to PDF via Chromium – Renders HTML exactly as a browser would, including JavaScript execution, web fonts, CSS Grid, animations, and @media print rules
- URL to PDF – Fetches any accessible URL and renders it to PDF with a single API call
- Markdown to PDF – Converts Markdown files through an HTML wrapper before rendering
- Office to PDF via LibreOffice – Converts .docx, .xlsx, .pptx, .odt, .ods, .odp, and 100+ other formats
- Header and Footer Templates – Injects per-page HTML header and footer files with dynamic classes for page numbers, totals, date, and title
- PDF Metadata Editing – Reads and writes XMP metadata (Author, Title, Subject, Keywords, Copyright, and more) via ExifTool
- PDF Merge and Split – Combines multiple PDFs or extracts page ranges using QPDF, pdfcpu, or PDFtk
- Form Field Flattening – Converts interactive AcroForm fields to static content
- Watermarking and Stamping – Overlays text or image watermarks on every page
- Encryption – Applies user and owner passwords with permission controls
- PDF/A and PDF/UA – Produces archival-compliant and accessibility-tagged PDFs
- Screenshots – Captures full-page or viewport screenshots of URLs and HTML files
- Webhook Support – Sends completed PDFs to a callback URL asynchronously for long-running jobs
- Prometheus Metrics – Exposes conversion statistics for monitoring
Docker Compose Setup
Gotenberg needs no external services. A single container is all that is required for most deployments:
```yaml
services:
  gotenberg:
    image: gotenberg/gotenberg:8
    container_name: gotenberg
    restart: unless-stopped
    ports:
      - "127.0.0.1:3000:3000"
    command:
      - "gotenberg"
      - "--api-timeout=60s"
      - "--chromium-auto-start=true"
      - "--chromium-max-concurrency=4"
      - "--chromium-restart-after=100"
      - "--libreoffice-auto-start=true"
      - "--libreoffice-restart-after=10"
      - "--log-level=info"
    networks:
      - gotenberg-network

networks:
  gotenberg-network:
```
Binding to 127.0.0.1:3000 is strongly recommended. Authentication is disabled by default (the only built-in option is HTTP Basic Auth, covered under API_ENABLE_BASIC_AUTH), so exposing port 3000 to the public internet allows anyone to trigger conversions. Let a reverse proxy handle TLS and access control in front of this binding.
Installation Steps
1. System Requirements
Chromium is the heaviest component. Plan for:
- Docker and Docker Compose installed
- 1 GB RAM minimum for simple HTML rendering, 2–4 GB for concurrent workloads
- 2 vCPUs recommended; Chromium and LibreOffice are CPU-bound during conversion
- The image is ~1 GB on disk due to the bundled Chromium and LibreOffice binaries
2. Create the Project Directory
```bash
mkdir -p ~/docker/gotenberg
cd ~/docker/gotenberg
```
3. Save the Compose File
Create a docker-compose.yml with the configuration from above.
4. Pull the Image
```bash
docker compose pull
```
5. Start the Service
```bash
docker compose up -d
```
6. Verify
```bash
curl -s http://localhost:3000/health
docker compose logs -f gotenberg
```
The health endpoint returns HTTP 200 when the service is ready. Chromium and LibreOffice will appear in the logs as they initialise when auto-start is enabled.
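In a deployment script it can help to wait for that HTTP 200 before sending the first job. The following is a minimal sketch, assuming the service is reachable at http://localhost:3000; `is_healthy` and `wait_until_healthy` are hypothetical helper names, and the retry count and delay are arbitrary.

```python
import time
import urllib.error
import urllib.request


def is_healthy(url="http://localhost:3000/health", timeout=2.0):
    """Return True if the health endpoint answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx status: not ready yet.
        return False


def wait_until_healthy(probe=is_healthy, attempts=30, delay=1.0, sleep=time.sleep):
    """Call probe() until it returns True or the attempts run out."""
    for attempt in range(attempts):
        if probe():
            return True
        if attempt < attempts - 1:
            sleep(delay)  # pause between attempts, but not after the last one
    return False
```

Calling `wait_until_healthy()` before the first conversion avoids failures while Chromium and LibreOffice are still starting. The probe and sleep functions are injectable so the loop can be exercised without a running container.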
7. Your First Conversion
```bash
curl -X POST http://localhost:3000/forms/chromium/convert/url \
  --form url=https://example.com \
  -o example.pdf
```
Environment Variables Explained
Gotenberg is configured entirely through CLI flags. Every flag has a matching environment variable in SCREAMING_SNAKE_CASE, so you can pass configuration either way. The examples below use environment variables, which is cleaner in most compose setups.
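The naming rule described above can be expressed in a few lines. This is only a sketch of the flag-to-variable mapping as this article states it; `flag_to_env_name` and `flags_to_environment` are hypothetical helpers, and the resulting names should be checked against the configuration reference for your Gotenberg version.

```python
def flag_to_env_name(flag):
    """Convert a flag such as --chromium-max-concurrency to CHROMIUM_MAX_CONCURRENCY."""
    return flag.lstrip("-").replace("-", "_").upper()


def flags_to_environment(flags):
    """Turn ["--api-timeout=60s", ...] into {"API_TIMEOUT": "60s", ...}."""
    environment = {}
    for flag in flags:
        name, _, value = flag.partition("=")
        environment[flag_to_env_name(name)] = value
    return environment


print(flags_to_environment(["--api-timeout=60s", "--log-level=info"]))
# {'API_TIMEOUT': '60s', 'LOG_LEVEL': 'info'}
```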
API_TIMEOUT
Purpose: Maximum time a single conversion request may run before the server returns a timeout error.
Format: Duration string (e.g. 30s, 120s, 5m)
Default: 30s
```
API_TIMEOUT=60s
```
Raise this when generating long-running documents like 100-page reports or complex office file conversions. Also increase the corresponding timeout on any reverse proxy in front of Gotenberg.
API_BODY_LIMIT
Purpose: Maximum allowed size of a multipart/form-data request body.
Format: Byte count integer
Default: No limit
```
API_BODY_LIMIT=52428800
```
Setting a body limit protects the service from unexpectedly large uploads. 50–100 MB covers most office document and HTML asset use cases.
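The value 52428800 in the example is 50 MiB expressed in bytes. A small helper makes the arithmetic explicit; `mebibytes_to_bytes` is a hypothetical name used only for illustration.

```python
def mebibytes_to_bytes(mebibytes):
    """Convert a size in MiB to a plain byte count."""
    return mebibytes * 1024 * 1024


print(mebibytes_to_bytes(50))   # 52428800
print(mebibytes_to_bytes(100))  # 104857600
```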
API_ENABLE_BASIC_AUTH
Purpose: Enables HTTP Basic Authentication on all API routes.
Format: Boolean (true / false)
Default: false
```
API_ENABLE_BASIC_AUTH=true
```
When enabled, set credentials via GOTENBERG_API_BASIC_AUTH_USERNAME and GOTENBERG_API_BASIC_AUTH_PASSWORD. This is the only built-in auth mechanism; prefer a reverse proxy with a proper auth layer for more flexibility.
CHROMIUM_MAX_CONCURRENCY
Purpose: Number of HTML/URL/Markdown conversions that can run simultaneously inside a single Chromium instance.
Format: Integer
Default: 6
```
CHROMIUM_MAX_CONCURRENCY=4
```
Higher values increase throughput on multi-core hosts, but every concurrent conversion holds its own Chromium tab in memory. A safe starting point is 2x your vCPU count.
CHROMIUM_RESTART_AFTER
Purpose: Automatically restarts Chromium after this many conversions to prevent memory leaks.
Format: Integer
Default: 100
```
CHROMIUM_RESTART_AFTER=50
```
Lowering this value is useful on memory-constrained hosts where Chromium tends to grow over time. Set to 0 to disable automatic restarts.
CHROMIUM_AUTO_START
Purpose: Pre-warms Chromium at container startup instead of on the first request.
Format: Boolean
Default: false
```
CHROMIUM_AUTO_START=true
```
Enable this for production. Without it, the first request after startup pays the cold-start cost of launching Chromium, which can take several seconds.
CHROMIUM_DENY_LIST
Purpose: Regex pattern of URLs that Chromium is not allowed to navigate to or load resources from.
Format: Regex string
Default: ^file:(?!//\/tmp/).*
```
CHROMIUM_DENY_LIST=^file:(?!//\/tmp/).*
```
The default blocks access to local filesystem paths outside /tmp. Tighten this further in production by also blocking your internal network ranges.
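The behaviour of the default pattern can be checked locally before deploying a stricter one. The sketch below uses Python's `re` module purely for illustration; the regex engine inside Gotenberg may differ in details. `INTERNAL_RANGES` and `TIGHTENED_DENY_LIST` are hypothetical and deliberately incomplete (they match on URL prefixes only and do not cover every private range).

```python
import re

# Default deny list quoted in this article.
DEFAULT_DENY_LIST = r"^file:(?!//\/tmp/).*"

# Hypothetical, incomplete extension that also rejects a few internal address prefixes.
INTERNAL_RANGES = r"^https?://(?:10\.|192\.168\.|169\.254\.)"
TIGHTENED_DENY_LIST = f"(?:{DEFAULT_DENY_LIST})|(?:{INTERNAL_RANGES})"


def is_denied(url, pattern=DEFAULT_DENY_LIST):
    """Return True if the URL matches the deny-list pattern."""
    return re.search(pattern, url) is not None


print(is_denied("file:///etc/passwd"))                              # True
print(is_denied("file:///tmp/abc/index.html"))                      # False
print(is_denied("https://example.com"))                             # False
print(is_denied("http://192.168.1.10/admin", TIGHTENED_DENY_LIST))  # True
```

The negative lookahead is what lets file URLs under /tmp through while rejecting every other file: URL.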
CHROMIUM_DISABLE_JAVASCRIPT
Purpose: Globally disables JavaScript execution in Chromium for all conversions.
Format: Boolean
Default: false
```
CHROMIUM_DISABLE_JAVASCRIPT=true
```
Useful when processing untrusted HTML where JavaScript execution is a security risk. Disable only when your templates do not rely on JS for rendering.
LIBREOFFICE_AUTO_START
Purpose: Pre-warms LibreOffice at container startup.
Format: Boolean
Default: false
```
LIBREOFFICE_AUTO_START=true
```
LibreOffice takes longer than Chromium to cold-start. Enable this when you convert office files regularly so the first request is not penalised.
LIBREOFFICE_RESTART_AFTER
Purpose: Restarts LibreOffice after a fixed number of conversions.
Format: Integer
Default: 10
```
LIBREOFFICE_RESTART_AFTER=10
```
LibreOffice is more prone to memory drift than Chromium. The low default of 10 is intentional. Do not raise it significantly without monitoring memory over time.
LOG_LEVEL
Purpose: Controls verbosity of container logs.
Format: String: error, warn, info, debug
Default: info
```
LOG_LEVEL=info
```
WEBHOOK_DISABLE
Purpose: Disables the async webhook feature entirely if you only need synchronous responses.
Format: Boolean
Default: false
```
WEBHOOK_DISABLE=true
```
Volume Mounts Explained
Gotenberg is stateless and requires no persistent volumes for basic operation. Temporary files are written to the container’s /tmp directory during each job and cleaned up on completion.
tmpfs Scratch Mount (Recommended)
Purpose: Backs /tmp with RAM for faster I/O during conversion.
Mount Point: /tmp
```yaml
tmpfs:
  - /tmp:size=1g,mode=1777
```
Particularly useful on high-throughput setups where many conversions happen in parallel. Size it to cover your maximum expected concurrent job size.
Font Volume (Optional)
Purpose: Adds custom or licensed fonts that must be available to both Chromium and LibreOffice.
Mount Point: /usr/share/fonts/custom
```yaml
volumes:
  - ./fonts:/usr/share/fonts/custom:ro
```
After mounting, the font cache must be rebuilt inside the container. The cleanest approach is to build a custom image from a Dockerfile based on gotenberg/gotenberg:8 that copies in the fonts and runs fc-cache -fv, so the cache is already up to date when the container starts.
API Routes Reference
HTML to PDF
```
POST /forms/chromium/convert/html
```
The main document must be named index.html. Additional files (CSS, images, fonts) should be uploaded with the same request and referenced with relative paths.
```bash
curl -X POST http://localhost:3000/forms/chromium/convert/html \
  --form files=@index.html \
  --form files=@styles.css \
  --form files=@logo.png \
  --form paperWidth=8.27 \
  --form paperHeight=11.7 \
  --form marginTop=1.0 \
  --form marginBottom=1.0 \
  --form marginLeft=1.0 \
  --form marginRight=1.0 \
  --form printBackground=true \
  --form emulatedMediaType=print \
  -o report.pdf
```
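The paperWidth and paperHeight values in this request are consistent with A4 expressed in inches (210 mm × 297 mm). A small conversion sketch follows; `PAPER_SIZES_MM` and `paper_form_fields` are hypothetical names, and the assumption is that the route reads these values as inches.

```python
MM_PER_INCH = 25.4

# Common paper sizes in millimetres (width, height).
PAPER_SIZES_MM = {
    "A4": (210, 297),
    "A5": (148, 210),
    "Letter": (215.9, 279.4),
}


def paper_form_fields(name):
    """Return paperWidth/paperHeight form values in inches for a named paper size."""
    width_mm, height_mm = PAPER_SIZES_MM[name]
    return {
        "paperWidth": f"{width_mm / MM_PER_INCH:.2f}",
        "paperHeight": f"{height_mm / MM_PER_INCH:.2f}",
    }


print(paper_form_fields("A4"))      # {'paperWidth': '8.27', 'paperHeight': '11.69'}
print(paper_form_fields("Letter"))  # {'paperWidth': '8.50', 'paperHeight': '11.00'}
```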
URL to PDF
```bash
curl -X POST http://localhost:3000/forms/chromium/convert/url \
  --form url=https://example.com \
  --form paperWidth=8.5 \
  --form paperHeight=11 \
  -o page.pdf
```
Office Documents to PDF
```bash
curl -X POST http://localhost:3000/forms/libreoffice/convert \
  --form files=@contract.docx \
  -o contract.pdf
```
Merge PDFs
```bash
curl -X POST http://localhost:3000/forms/pdfengines/merge \
  --form files=@cover.pdf \
  --form files=@content.pdf \
  --form files=@appendix.pdf \
  -o full-report.pdf
```
Write PDF Metadata
```bash
curl -X POST http://localhost:3000/forms/pdfengines/metadata/write \
  --form files=@document.pdf \
  --form metadata='{"Author":"Jane Doe","Title":"Q2 Report","Keywords":"finance quarterly","Copyright":"2026 ACME Corp"}' \
  -o document-tagged.pdf
```
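Hand-writing the JSON for the metadata field becomes error-prone once values contain quotes or non-ASCII characters. One option is to let a JSON serialiser produce the string; the sketch below uses Python's `json` module, and `metadata_form_value` is a hypothetical helper name.

```python
import json


def metadata_form_value(**entries):
    """Serialise metadata entries into the JSON string sent as the metadata form field."""
    return json.dumps(entries, ensure_ascii=False)


value = metadata_form_value(
    Author="Jane Doe",
    Title='Q2 "Final" Report',  # embedded quotes are escaped by the serialiser
    Keywords="finance quarterly",
)
print(value)
```

The resulting string can be passed as the value of the metadata form field in place of the hand-written JSON.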
Read PDF Metadata
```bash
curl -X POST http://localhost:3000/forms/pdfengines/metadata/read \
  --form files=@document.pdf
```
Flatten Form Fields
```bash
curl -X POST http://localhost:3000/forms/pdfengines/flatten \
  --form files=@filled-form.pdf \
  -o archived-form.pdf
```
Header and Footer Templates
Gotenberg supports per-page HTML headers and footers for the Chromium HTML and URL routes. You upload them as header.html and footer.html alongside the main document. Each file must be a complete HTML document.
Chromium automatically injects values into any element that carries one of these classes:
- .pageNumber – current page number
- .totalPages – total page count
- .date – formatted print date
- .title – document <title> value
- .url – source URL of the document
Example header.html:
```html
<!DOCTYPE html>
<html>
<head>
  <style>
    * { -webkit-print-color-adjust: exact; }
    body {
      font-size: 9px;
      font-family: Arial, sans-serif;
      margin: 0 1cm;
      padding: 4px 0;
      display: flex;
      justify-content: space-between;
      align-items: center;
      border-bottom: 1px solid #ddd;
      color: #555;
    }
  </style>
</head>
<body>
  <span>ACME Corp – Confidential</span>
  <span class="date"></span>
</body>
</html>
```
Example footer.html:
```html
<!DOCTYPE html>
<html>
<head>
  <style>
    * { -webkit-print-color-adjust: exact; }
    body {
      font-size: 9px;
      font-family: Arial, sans-serif;
      margin: 0 1cm;
      padding: 4px 0;
      display: flex;
      justify-content: flex-end;
      color: #555;
    }
  </style>
</head>
<body>
  <span>Page <span class="pageNumber"></span> of <span class="totalPages"></span></span>
</body>
</html>
```
Sending them with a conversion request:
```bash
curl -X POST http://localhost:3000/forms/chromium/convert/html \
  --form files=@index.html \
  --form files=@header.html \
  --form files=@footer.html \
  --form marginTop=1.2 \
  --form marginBottom=1.2 \
  --form printBackground=true \
  -o report-with-headers.pdf
```
The page margins must be large enough to accommodate the header and footer height, otherwise they will overlap the content.
Important constraint: Headers and footers render in a separate Chromium context from the main document. Your page’s stylesheets do not apply, external fonts do not load, and JavaScript does not run. Embed all styles inline and use only system fonts in these files.
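Because of these constraints, it can be convenient to generate the footer programmatically so that it is always a complete document with inline styles only. A minimal sketch follows; `build_footer_html` is a hypothetical helper, and the styling mirrors the footer example in this section.

```python
import html


def build_footer_html(left_text="", font_size_px=9):
    """Return a complete footer.html document that uses inline styles and system fonts only."""
    return f"""<!DOCTYPE html>
<html>
<head>
<style>
  body {{ font-size: {font_size_px}px; font-family: Arial, sans-serif;
         margin: 0 1cm; display: flex; justify-content: space-between; }}
</style>
</head>
<body>
  <span>{html.escape(left_text)}</span>
  <span>Page <span class="pageNumber"></span> of <span class="totalPages"></span></span>
</body>
</html>
"""


footer = build_footer_html("ACME & Co, Confidential")
# Write it next to index.html before uploading:
# open("footer.html", "w", encoding="utf-8").write(footer)
```

Escaping the free text keeps characters such as & or < from breaking the markup, and the template contains no external stylesheet, font, or script references.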
Common Use Cases
Invoice and Report Generation
Render a Jinja2, Twig, or Blade HTML template server-side, then POST the resulting HTML to Gotenberg with a branded header (logo, company name) and a footer (page numbers, legal disclaimer). The output is a pixel-perfect PDF identical to what a print dialog would produce in Chrome.
Office Document Pipeline
Accept .docx files from end users, convert them to PDF via LibreOffice, then optionally stamp a watermark or write metadata, all in a single server-side workflow without any Office installation on the host.
Archival PDF/A Generation
```bash
curl -X POST http://localhost:3000/forms/chromium/convert/html \
  --form files=@index.html \
  --form pdfa=PDF/A-3b \
  --form pdfua=true \
  -o archive.pdf
```
Async PDF Generation with Webhooks
For long-running conversions, add the Gotenberg-Webhook-Url and Gotenberg-Webhook-Error-Url headers. The API returns 204 immediately and POSTs the result to your endpoint when it is ready, or the error to the error URL if the conversion fails:
```bash
curl -X POST http://localhost:3000/forms/chromium/convert/url \
  -H "Gotenberg-Webhook-Url: https://app.internal/pdf-ready" \
  -H "Gotenberg-Webhook-Error-Url: https://app.internal/pdf-failed" \
  --form url=https://myapp.com/report/42 \
  -o /dev/null
```
PDF Metadata Batch Update
Loop over a directory of PDFs and tag each with author and copyright before distribution:
```bash
# Make sure the output directory exists before writing to it.
mkdir -p ./tagged

for f in ./docs/*.pdf; do
  curl -s -X POST http://localhost:3000/forms/pdfengines/metadata/write \
    --form files=@"$f" \
    --form metadata='{"Author":"ACME Corp","Copyright":"2026"}' \
    -o "./tagged/$(basename "$f")"
done
```
Filling AcroForm Fields (pdftk Sidecar)
Gotenberg can flatten an already-filled PDF but cannot populate form fields from JSON. Pair it with KatSick/pdftk-as-a-service for the complete workflow: fill fields via pdftk, then flatten and apply metadata via Gotenberg:
```yaml
services:
  gotenberg:
    image: gotenberg/gotenberg:8
    ports:
      - "127.0.0.1:3000:3000"
    networks:
      - pdf-network

  pdftk-api:
    image: katsick/pdftkgolangservice:1.1.0
    ports:
      - "127.0.0.1:8080:8080"
    networks:
      - pdf-network

networks:
  pdf-network:
```
Useful Links
- Official Website: https://gotenberg.dev/
- Documentation: https://gotenberg.dev/docs
- HTML to PDF Docs: https://gotenberg.dev/docs/convert-with-chromium/convert-html-to-pdf
- Configuration Reference: https://gotenberg.dev/docs/configuration
- GitHub Repository: https://github.com/gotenberg/gotenberg
- Docker Hub: https://hub.docker.com/r/gotenberg/gotenberg
- KatSick pdftk-as-a-service: https://github.com/KatSick/pdftk-as-a-service
Conclusion
Gotenberg does one thing extremely well: it takes documents and turns them into PDFs without you touching a single system dependency. Chromium handles everything that needs to look exactly like a browser – invoices, dashboards, branded reports. LibreOffice handles the office format zoo. The PDF engine layer handles post-processing: metadata tagging, merging, archival compliance, flattening, and encryption.
The header and footer template system is the feature that elevates Gotenberg above simpler tools. Consistent branding on every page, with dynamic page numbers and dates, takes nothing more than uploading a pair of HTML files in a single cURL command. Combined with the webhook support for async jobs and the Prometheus endpoint for monitoring, Gotenberg fits neatly into any production-grade document generation pipeline.
For the one thing it does not do – filling AcroForm fields from JSON – the lightweight pdftk-as-a-service sidecar covers the gap cleanly. Two containers, one compose file, every PDF workflow covered.
FAQ
Which self-hosted Docker tool is best for filling PDF form fields via API?
KatSick/pdftk-as-a-service is the most straightforward option. POST a PDF template and a JSON map of field names to values to /fill-pdf and receive the completed PDF. It wraps PDFtk in a Go/Gin REST service and requires no configuration.
What is Stirling-PDF?
Stirling-PDF is an open-source self-hosted PDF platform with 70+ tools including merge, split, OCR, convert, redact, watermark, and metadata editing. It provides both a React web UI and a full REST API at /api/v1/, backed by LibreOffice, Tesseract, and QPDF.
Does Stirling-PDF support programmatic form filling?
Not yet via the API. Form filling through the browser UI works manually. A dedicated API endpoint is tracked in GitHub issue #3569, open as of May 2025. For automated form filling today, use KatSick/pdftk-as-a-service.
How do I update PDF metadata with Stirling-PDF?
POST to /api/v1/misc/update-metadata with the PDF as fileInput and individual fields (title, author, subject, keywords, creator, producer) as multipart form values. The response is the updated PDF.
What is WeasyPrint Docker and when should I use it?
WeasyPrint Docker is a Python aiohttp service that converts HTML to PDF using WeasyPrint’s CSS Paged Media engine. Use it when your HTML and CSS are under your control, you need headers and footers defined in CSS @page rules, and you do not need JavaScript execution during rendering.
What is CSS Paged Media and why does it matter for headers and footers?
CSS Paged Media is a W3C specification that extends CSS to define how a document is paginated for print. The @page rule lets you declare margin boxes (@top-center, @bottom-right, etc.) that hold running headers and footers. WeasyPrint implements this specification, so headers and footers share the document’s full CSS context and can use the same fonts and colours as the main content.
How do I define a page number in a WeasyPrint header?
Use CSS counters inside a @page margin box: content: "Page " counter(page) " of " counter(pages);. WeasyPrint substitutes the correct values at render time. No JavaScript or separate file upload is needed.
What is the difference between WeasyPrint headers and Gotenberg headers?
WeasyPrint defines headers and footers in CSS @page rules that share the document’s full stylesheet. Gotenberg uses separate uploaded header.html and footer.html files rendered in an isolated Chromium context, where the main document’s CSS does not apply and JavaScript does not run.
What does torfs-ict/docker-pdftk-webservice do?
It is a PHP/Symfony webservice that merges PDFs via a POST to /merge with multipart file uploads. Only merge is currently implemented; the project acknowledges that other features are planned but not shipped. It is largely inactive and not recommended for new projects.
Should I use wkhtmltopdf Docker for new projects?
No. The upstream wkhtmltopdf binary is abandoned, has rendering bugs with modern CSS, and is no longer maintained. Use WeasyPrint for CSS Paged Media rendering or Gotenberg (Day 48) for Chromium-based rendering instead.
What are the three Stirling-PDF Docker image variants?
latest is the standard image for most PDF tools. latest-fat adds LibreOffice, extra fonts, and Calibre for Office conversion and highest-quality output. latest-ultra-lite strips it down to core operations only for resource-constrained environments.
Why do I need the fat image for HTML to PDF in Stirling-PDF?
HTML-to-PDF conversion in Stirling-PDF goes through LibreOffice, which is only present in the latest-fat image. The INSTALL_BOOK_AND_ADVANCED_HTML_OPS=true environment variable must also be set to install Calibre at startup.
Does Stirling-PDF have a web UI?
Yes. The React-based web UI is the primary interface and is accessible at port 8080. Every tool is available through the UI with no code required. The same operations are also available via the REST API for automation.
How do I authenticate Stirling-PDF API requests?
When login is enabled, every API request must include -H "X-API-KEY: your-key". Generate the key from the user settings page in the web UI after logging in with the initial admin credentials.
What port does each service use?
Stirling-PDF listens on 8080. KatSick/pdftk-as-a-service listens on 8080 inside the container (map to a different host port if running both). WeasyPrint Docker listens on 5000. torfs-ict listens on 80.
Does KatSick/pdftk-as-a-service support any operations besides form filling?
No. The container exposes a single endpoint, POST /fill-pdf. It does not support metadata editing, merging, splitting, or any other PDF operation. For anything beyond AcroForm field filling, use Stirling-PDF or another tool.
How do I merge PDFs via the Stirling-PDF API?
POST multiple files to /api/v1/general/merge-pdfs using repeated -F "fileInput=@file.pdf" fields. The files are merged in the order they are sent.
Can Stirling-PDF convert Word documents to PDF?
Yes, via POST /api/v1/convert/file/pdf with the latest-fat image. LibreOffice handles .docx, .xlsx, .pptx, .odt, and 100+ other Office formats.
How does OCR work in Stirling-PDF?
Tesseract OCR adds a searchable text layer to image-only or scanned PDFs via POST /api/v1/misc/ocr-pdf. English is bundled. Additional language packs are .traineddata files placed in the /usr/share/tessdata volume mount.
Can I add a watermark via the Stirling-PDF API?
Yes. POST /api/v1/misc/add-watermark accepts watermarkType (text or image), watermarkText, fontSize, rotation, and opacity as multipart fields.
Does Stirling-PDF store uploaded files?
No. Files are processed in memory and deleted immediately after each operation. Nothing is retained between requests. Sensitive documents do not persist on the server.
Which volume is most important to back up in Stirling-PDF?
The /configs volume. It contains settings.yml and the embedded H2 database holding user accounts, API keys, and audit logs. Losing it when login is enabled means losing all user access.
How do I control JVM memory in Stirling-PDF?
Set JAVA_TOOL_OPTIONS="-Xms512m -Xmx4g" to define initial and maximum JVM heap. Always leave memory for the OS and LibreOffice below the Docker container’s memory limit.
Can I redact text programmatically?
Yes. Stirling-PDF’s POST /api/v1/security/auto-redact endpoint accepts search terms or patterns and blacks them out in the output PDF, making the redaction permanent and unrecoverable.
What is the pipeline feature in Stirling-PDF?
Pipelines are no-code automation chains built in the web UI and saved to the /pipeline volume. You link multiple PDF operations in sequence – merge, then watermark, then compress – without writing any code. Pipelines can also be triggered programmatically.
How many languages does the Stirling-PDF UI support?
40+ languages including English, German, French, Spanish, Dutch, Italian, Portuguese, Chinese, Japanese, Korean, Arabic, Russian, and many more. Set the default with SYSTEM_DEFAULTLOCALE.
Can I run all these services together on one server?
Yes. Add each service to the same Docker Compose file with a shared network. Use different host port mappings to avoid conflicts. Stirling-PDF (fat) needs 4+ GB RAM; KatSick and WeasyPrint are each under 200 MB and negligible in comparison.
What is a good minimum server spec for this PDF stack?
For Stirling-PDF fat plus KatSick and WeasyPrint: 4 GB RAM, 2 vCPUs, and 10 GB disk. For heavy OCR or simultaneous Office conversions, 8 GB RAM and 4 vCPUs is more comfortable.
What license is Stirling-PDF under?
Stirling-PDF uses an open-core model. The community edition is free and open source. Enterprise features such as SSO and audit logging require a commercial license. See the LICENSE file in the GitHub repository for the full terms.
Is Stirling-PDF actively maintained?
Very much so. With over 79,000 GitHub stars and 6,900 forks, it is the most-starred self-hosted PDF project. Issues are responded to quickly and releases are frequent.
