Day 47: FFmpeg-API – HTTP-Driven Media Processing in a Single Container – 7 Days of Docker

16. May 2026
FFmpeg is the Swiss Army knife of media processing, but wiring it into a web application is rarely fun. You either install the binary on every worker, juggle temp directories, sanitise shell arguments, and hope nothing escapes your wrapper – or you reach for a heavyweight SaaS and pay per minute of video. Aureum-Cloud/FFmpeg-API takes the middle road: it bundles FFmpeg into a tiny stateless HTTP API written in Go, accepts media files from S3, HTTP URLs or Base64, runs your command pipeline, and returns the results in the format you choose. Drop the container next to your application and call it like any other microservice.

What Does This Thing Actually Do?

FFmpeg-API is a Docker-ready HTTP service that exposes a single endpoint capable of running arbitrary FFmpeg pipelines. You POST a JSON body describing the inputs, the FFmpeg argument arrays, and where the output should go – the service handles fetching, processing, and returning the results.

The image bundles FFmpeg 8.0 and ships as a scratch-based Go binary, so the runtime footprint is tiny. Headline features include:

  • Single Endpoint – Everything happens through POST /v1/process with a JSON request body
  • Three Input Sources – Pull files from S3 buckets, fetch them over HTTP or HTTPS, or decode Base64 strings directly in the request
  • Three Output Modes – Upload back to S3, return Base64 in the JSON response, or stream the first output file inline as binary
  • Multi-Step Pipelines – Pass an array of FFmpeg argument lists; each command sees the outputs of the previous step
  • Per-Request S3 Config – Every request can target a different S3 endpoint, region, or set of credentials – great for multi-tenant pipelines
  • S3-Compatible Provider Support – Works with AWS S3, MinIO, DigitalOcean Spaces, Cloudflare R2, Backblaze B2, or any other S3 API
  • Stateless Runtime – No database, no shared filesystem, no session storage. Each request creates a temporary job directory under /tmp/<uuid> and wipes it on completion
  • Parallel Input Fetching – Multiple inputs are downloaded concurrently inside a request, which keeps long pipelines snappy
  • Tiny Image – Built on scratch with the FFmpeg 8.0 binaries copied in. No shell and no package manager, which leaves very little attack surface inside the container
  • Horizontal Scaling – Because requests are independent, you can run any number of replicas behind a load balancer or in Kubernetes
  • MIT Licensed – Permissive license, no commercial strings attached

The use cases range from podcast trimming and thumbnail generation to full transcoding pipelines for user-generated video platforms.

Docker Compose Setup

The image needs nothing beyond the port mapping to run, but a sensible compose file adds resource limits and a tmpfs mount for fast scratch I/O.

Because the image is built on scratch there is no shell to run a command-based healthcheck. If you need one, place the service behind a reverse proxy or sidecar that can curl the endpoint with a small payload.

Installation Steps

1. System Requirements

FFmpeg-API itself is lightweight, but media transcoding is hungry for CPU and disk I/O. Plan ahead for the workloads you actually run:

  • Docker and Docker Compose installed
  • 1 GB RAM minimum for trivial format conversions, 4 GB+ for video transcoding
  • 2 vCPUs minimum, more for parallel jobs or H.264/H.265 encoding
  • Generous /tmp space – the largest input plus all intermediate and output files must fit on the working filesystem during a job

2. Create the Project Directory

Create an empty working directory for the compose file and change into it.

3. Save the Compose File

Drop the docker-compose.yml from the previous section into the directory.

4. Pull the Image

Run docker compose pull to download ghcr.io/aureum-cloud/ffmpeg-api:latest.

5. Launch the Service

Start the container in the background with docker compose up -d.

6. Verify the Container is Running

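Follow the logs with docker compose logs -f ffmpeg-api. A line such as “server listening on :8080” should appear shortly after the container starts, because the service listens immediately and does not wait for a first request.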

7. Run Your First Conversion

Strip the audio track from a remote MP4 and get it back as Base64 in one call:

The JSON response contains a results object keyed by output filename. Each entry holds the URL (when uploading to S3) and/or the Base64 payload.
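
As a rough sketch of that first call from Python, the snippet below builds a request that drops the audio track with -an and asks for Base64 output, then decodes the results object described above. The "url" key inside the input entry and the localhost address are assumptions for illustration; only the endpoint path, the top-level keys, output.base64 and the results shape come from this article.

```python
import base64
import json
import urllib.request

# Assumes port 8080 of the container is published on localhost.
API_URL = "http://localhost:8080/v1/process"


def build_strip_audio_request(source_url: str) -> dict:
    """Build a request body that removes the audio track and returns Base64."""
    return {
        # "url" as the field name for HTTP(S) sources is an assumption.
        "input": {"in.mp4": {"url": source_url}},
        # -an drops all audio streams, -c:v copy keeps the video untouched.
        "commands": [["-i", "in.mp4", "-an", "-c:v", "copy", "out.mp4"]],
        "output": {"base64": True},
    }


def decode_results(response_body: dict) -> dict:
    """Map each output filename to its decoded bytes."""
    return {
        name: base64.b64decode(entry["base64"])
        for name, entry in response_body["results"].items()
        if entry.get("base64")
    }


def process(body: dict, timeout: float = 300.0) -> dict:
    """POST the body to the service and parse the JSON response."""
    request = urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

With the container running, decode_results(process(build_strip_audio_request(url))) would give a dictionary of filename to bytes that can be written to disk.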

Environment Variables Explained

FFmpeg-API is unusual in that it has no application-level environment variables. Everything – S3 credentials, region, bucket, output mode – is supplied per request in the JSON body. The variables below are still useful at the Docker layer for tuning runtime behaviour.

GOMAXPROCS

Purpose: Caps the number of OS threads that can execute Go code at the same time. Useful on a host with many cores when you only want this service to use a slice of them.

Format: Integer

Default: Number of available CPUs

This affects parallel input fetching and HTTP request handling. FFmpeg itself is a separate process that picks up CPUs through its own -threads flag.

TZ

Purpose: Sets the container’s timezone so log timestamps match the rest of your stack.

Format: IANA timezone string

Example: Europe/Berlin, America/New_York, UTC

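The image is scratch-based, so it may not ship the IANA timezone database (typically found under /usr/share/zoneinfo). If a non-UTC TZ value seems to have no effect, that is the likely cause. Stick to UTC if you do not need local time in logs.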

HTTP_PROXY / HTTPS_PROXY (Optional)

Purpose: Routes outbound HTTP traffic – the input fetcher and S3 SDK both honour these variables – through a corporate proxy.

Format: Full proxy URL with optional credentials

Always pair HTTPS_PROXY with a NO_PROXY list that includes your S3 endpoint and any internal services, so that traffic to them bypasses the proxy instead of being routed out and back in.

AWS_REGION (Optional)

Purpose: The AWS Go SDK falls back to this variable when constructing default config objects. Even though the JSON request usually carries the region, setting a default keeps the SDK quiet on cold starts.

Format: AWS region string

Per-request s3Config.region always wins over this default.

TMPDIR (Optional)

Purpose: Overrides the directory used to create per-job working folders. Each job gets a UUID-named subdirectory and is cleaned up automatically.

Format: Absolute filesystem path

Default: /tmp

Pointing this at a tmpfs or fast NVMe volume is the single biggest performance lever for high-throughput transcoding.

Volume Mounts Explained

The container is stateless and does not require any persistent volume. There are still two mount strategies worth knowing about:

tmpfs Scratch Mount

Purpose: Backs the per-job temporary directories with RAM for maximum throughput.

Mount Point: /tmp

Size the tmpfs to fit your largest concurrent job. A 2 GB tmpfs comfortably handles short videos; bump to 8 GB or higher if you process long-form content. Remember tmpfs consumes host RAM.

Persistent Disk Volume (Optional)

Purpose: Use a named volume or bind mount for jobs when RAM is tight.

Mount Point: Whatever you set TMPDIR to

Helpful if you have lots of cheap disk space but limited memory. Performance is bound by the underlying storage’s IOPS rather than RAM speed.

Request Anatomy

Every call to /v1/process follows the same shape:
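
A minimal sketch of that shape as a Python dictionary, together with a small client-side check. The four top-level keys are the ones named in the FAQ; every nested field name except useSSL, output.base64 and the temporary flag is an assumption, and the check is a convenience for callers, not the service's own validation.

```python
ALLOWED_KEYS = {"s3Config", "input", "commands", "output"}

# Illustrative skeleton; nested field names are assumptions, not confirmed API.
REQUEST_SKELETON = {
    "s3Config": {"endpoint": "s3.example.com", "region": "us-east-1", "useSSL": True},
    "input": {"in.mp4": {"s3": "s3://my-bucket/raw/in.mp4"}},
    "commands": [["-i", "in.mp4", "out.mp4"]],
    "output": {"base64": True},
}


def validate_request(body: dict) -> list[str]:
    """Return a list of problems found in the body (empty when it looks fine)."""
    problems = []
    unknown = set(body) - ALLOWED_KEYS
    if unknown:
        problems.append(f"unknown top-level keys: {sorted(unknown)}")
    # Treating these three keys as required is an assumption of this sketch.
    for key in ("input", "commands", "output"):
        if key not in body:
            problems.append(f"missing key: {key}")
    commands = body.get("commands", [])
    if not isinstance(commands, list) or not all(
        isinstance(cmd, list) and all(isinstance(arg, str) for arg in cmd)
        for cmd in commands
    ):
        problems.append("commands must be a list of string lists")
    return problems
```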

Inputs

A map of filename to source. Filenames inside the map are exactly what FFmpeg will see on disk.

The temporary flag declares a placeholder name that intermediate FFmpeg steps can write to without it being treated as an output to upload.
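
The helpers below sketch how an input map covering all source types might be assembled in Python. The per-entry keys "s3", "url" and "base64" are assumed names chosen for illustration; only the idea of a filename-to-source map and the temporary flag are taken from the text above.

```python
import base64


def input_from_bytes(data: bytes) -> dict:
    """Embed raw bytes in the request as a Base64 string (assumed key: base64)."""
    return {"base64": base64.b64encode(data).decode("ascii")}


def input_from_location(location: str) -> dict:
    """Pick the source type from the URL scheme (assumed keys: s3, url)."""
    if location.startswith("s3://"):
        return {"s3": location}
    if location.startswith(("http://", "https://")):
        return {"url": location}
    raise ValueError(f"unsupported input location: {location!r}")


# The map keys are the filenames FFmpeg will see in the job directory.
inputs = {
    "video.mp4": input_from_location("s3://my-bucket/raw/video.mp4"),
    "logo.png": input_from_location("https://example.com/logo.png"),
    "intro.wav": input_from_bytes(b"RIFF"),
    # Placeholder for an intermediate file that should not be uploaded.
    "palette.png": {"temporary": True},
}
```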

Commands

An array of FFmpeg argument arrays. Each entry is executed sequentially in the same working directory, so step N can consume files written by step N-1.

The service automatically prepends -hide_banner to keep logs tidy.
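
To make the chaining concrete, here is a small Python sketch of a two-step commands array in which the second step reads the file written by the first. The with_hide_banner helper only mirrors the behaviour described in this article (one leading -hide_banner, duplicates removed); it is not the service's actual code, and the filenames are made up.

```python
def with_hide_banner(args: list[str]) -> list[str]:
    """Return the arguments with exactly one leading -hide_banner."""
    return ["-hide_banner"] + [arg for arg in args if arg != "-hide_banner"]


commands = [
    # Step 1: scale to 1280 px wide, keeping the aspect ratio (even height).
    ["-i", "in.mp4", "-vf", "scale=1280:-2", "scaled.mp4"],
    # Step 2: remux the result of step 1 without re-encoding.
    ["-i", "scaled.mp4", "-c", "copy", "-movflags", "+faststart", "out.mp4"],
]

# What each step would look like once the flag has been added.
effective = [with_hide_banner(cmd) for cmd in commands]
```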

Output

One of three modes: S3 upload, Base64 in the response, or inline streaming.

When inlineContentType is set, the first output file is streamed back with that Content-Type header and no JSON envelope.
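
A client has to cope with two response shapes: the JSON envelope, or raw bytes when inline streaming is requested. The Python sketch below shows one way to tell them apart by Content-Type. It assumes the JSON envelope is served as application/json, and the "s3" key for the upload target is a hypothetical name; base64 and inlineContentType are the fields mentioned in this article.

```python
import json

# Three output objects, one per mode.
OUTPUT_S3 = {"s3": "s3://my-bucket/processed/"}  # hypothetical key name
OUTPUT_BASE64 = {"base64": True}
OUTPUT_INLINE = {"inlineContentType": "image/jpeg"}


def parse_response(content_type: str, payload: bytes):
    """Return ("json", dict) for the JSON envelope, else ("inline", bytes)."""
    mime = content_type.split(";")[0].strip().lower()
    if mime == "application/json":
        return "json", json.loads(payload)
    return "inline", payload
```

Note that an inline request for a JSON-typed output would be ambiguous under this heuristic, so callers that know which mode they asked for can simply branch on that instead.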

S3 Configuration

S3 access is configured per request via the s3Config object. The same shape works for AWS S3 and any S3-compatible provider.

The endpoint field must not include a bucket name – the bucket is taken from the s3://bucket/key path in your inputs and outputs. useSSL defaults to true when omitted.
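
The two rules above (no bucket in the endpoint, bucket taken from the s3:// path) can be sketched in Python as follows. The credential field names accessKey and secretKey are assumptions, and the check only catches path-style mistakes such as https://host/bucket; it cannot detect a bucket embedded in the hostname.

```python
from urllib.parse import urlparse


def split_s3_url(url: str) -> tuple[str, str]:
    """Split s3://bucket/key into (bucket, key)."""
    parsed = urlparse(url)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an s3://bucket/key URL: {url!r}")
    return parsed.netloc, parsed.path.lstrip("/")


def make_s3_config(endpoint: str, region: str, access_key: str,
                   secret_key: str, use_ssl: bool = True) -> dict:
    """Build an s3Config object, rejecting endpoints that carry a path."""
    parsed = urlparse(endpoint if "://" in endpoint else f"//{endpoint}")
    if parsed.path.strip("/"):
        raise ValueError("endpoint must not include a bucket name or path")
    return {
        "endpoint": endpoint,
        "region": region,
        "accessKey": access_key,   # assumed field name
        "secretKey": secret_key,   # assumed field name
        "useSSL": use_ssl,         # the service defaults to true when omitted
    }
```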

MinIO Example
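
A possible s3Config for a MinIO instance on the same Docker network, written as a Python dictionary. The hostname minio, port 9000 and the credential field names are assumptions for illustration; us-east-1 is MinIO's default region, and useSSL is switched off because this sketch assumes plain HTTP inside the network.

```python
import os

MINIO_S3_CONFIG = {
    "endpoint": "minio:9000",  # assumed service name and port, no bucket
    "region": "us-east-1",     # MinIO's default region
    # Credentials are read from the caller's environment; field names assumed.
    "accessKey": os.environ.get("MINIO_ACCESS_KEY", "minioadmin"),
    "secretKey": os.environ.get("MINIO_SECRET_KEY", "minioadmin"),
    "useSSL": False,           # plain HTTP inside the Docker network
}
```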

Cloudflare R2 Example
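
For Cloudflare R2, the S3 endpoint is the account-specific hostname <account_id>.r2.cloudflarestorage.com and the region is conventionally auto. The Python helper below builds an s3Config on that basis; the credential field names are assumptions, and the account ID in the usage line is a placeholder.

```python
def r2_s3_config(account_id: str, access_key: str, secret_key: str) -> dict:
    """Build an s3Config object for a Cloudflare R2 account."""
    return {
        "endpoint": f"{account_id}.r2.cloudflarestorage.com",  # no bucket here
        "region": "auto",
        "accessKey": access_key,   # assumed field name
        "secretKey": secret_key,   # assumed field name
        "useSSL": True,
    }


config = r2_s3_config("0123456789abcdef", "R2_KEY_ID", "R2_SECRET")
```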

Common Use Cases

Audio Extraction

Pull an MP3 out of any video file – useful for podcast pipelines that ingest screen recordings or webinar exports.
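
A hedged Python sketch of such a request. The FFmpeg flags are standard (-vn drops video, -q:a selects a VBR quality level), but the sketch assumes the bundled build includes the libmp3lame encoder, and the "url" input key is an assumed field name.

```python
def audio_extraction_request(source_url: str) -> dict:
    """Build a request that extracts the audio of a video as MP3."""
    return {
        "input": {"in.mp4": {"url": source_url}},  # "url" key is assumed
        "commands": [[
            "-i", "in.mp4",
            "-vn",                 # ignore the video stream
            "-c:a", "libmp3lame",  # assumes LAME is compiled into the image
            "-q:a", "2",           # VBR quality, lower is better
            "out.mp3",
        ]],
        "output": {"base64": True},
    }
```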

Thumbnail Generation

Grab a single frame at the 5-second mark and store it on S3 for use as a video poster image.
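
One way this could look in Python. Placing -ss before -i makes FFmpeg seek in the input before decoding, and -frames:v 1 stops after one frame. The "s3" keys on the input entry and the output object are assumed names, and s3_config is whatever s3Config object your provider needs.

```python
def thumbnail_request(s3_config: dict, source: str, dest_prefix: str,
                      at_seconds: int = 5) -> dict:
    """Build a request that stores one JPEG frame of the video on S3."""
    return {
        "s3Config": s3_config,
        "input": {"in.mp4": {"s3": source}},  # "s3" key is assumed
        "commands": [[
            "-ss", str(at_seconds),  # seek before opening the input
            "-i", "in.mp4",
            "-frames:v", "1",        # write exactly one video frame
            "-q:v", "2",             # high JPEG quality
            "poster.jpg",
        ]],
        "output": {"s3": dest_prefix},  # hypothetical key for the upload prefix
    }
```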

Watermarking and Re-Encoding

Overlay a logo and transcode to web-friendly H.264 in two pipeline steps.
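
A two-step sketch in Python: the first command burns the logo into a lossless intermediate file, the second produces the final H.264 MP4. It assumes libx264 and the native aac encoder are available in the bundled build, uses assumed "url" input keys, and marks the intermediate file with the temporary flag so it is not returned.

```python
def watermark_request(video_url: str, logo_url: str) -> dict:
    """Build a two-step overlay and transcode pipeline."""
    overlay = [
        "-i", "in.mp4", "-i", "logo.png",
        # Place the logo 20 px from the bottom-right corner.
        "-filter_complex", "overlay=W-w-20:H-h-20",
        # Lossless x264 keeps the intermediate from losing quality.
        "-c:v", "libx264", "-qp", "0", "-preset", "ultrafast",
        "-c:a", "copy",
        "marked.mkv",
    ]
    transcode = [
        "-i", "marked.mkv",
        "-c:v", "libx264", "-crf", "23", "-preset", "medium",
        "-pix_fmt", "yuv420p",      # widest player compatibility
        "-c:a", "aac", "-b:a", "128k",
        "-movflags", "+faststart",  # move the index to the front for streaming
        "out.mp4",
    ]
    return {
        "input": {
            "in.mp4": {"url": video_url},   # "url" key is assumed
            "logo.png": {"url": logo_url},
            "marked.mkv": {"temporary": True},
        },
        "commands": [overlay, transcode],
        "output": {"base64": True},
    }
```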

GIF Generation

Turn a clip into an animated GIF for marketing pages with a high-quality two-pass palette.
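
The usual two-pass approach uses FFmpeg's palettegen filter to compute a palette and paletteuse to apply it. The Python sketch below expresses that as two pipeline steps; the frame rate, width and the "url" input key are illustrative assumptions, and palette.png is declared temporary so it stays out of the results.

```python
def gif_request(clip_url: str, fps: int = 12, width: int = 480) -> dict:
    """Build a two-pass palette GIF pipeline."""
    filters = f"fps={fps},scale={width}:-1:flags=lanczos"
    return {
        "input": {
            "clip.mp4": {"url": clip_url},  # "url" key is assumed
            "palette.png": {"temporary": True},
        },
        "commands": [
            # Pass 1: analyse the clip and write an optimised 256-colour palette.
            ["-i", "clip.mp4", "-vf", f"{filters},palettegen", "palette.png"],
            # Pass 2: apply the same scaling and map colours through the palette.
            ["-i", "clip.mp4", "-i", "palette.png",
             "-filter_complex", f"{filters}[x];[x][1:v]paletteuse",
             "out.gif"],
        ],
        "output": {"base64": True},
    }
```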

Format Conversion Microservice

Drop the container behind your API gateway and let upstream services POST whatever they need converted. The stateless nature means scaling is as simple as raising the replica count.

Batch Pipelines from Workflow Engines

Tools like n8n, Make.com, and Apache Airflow can call FFmpeg-API directly with no special integration. Pair the service with a queue (RabbitMQ, SQS, BullMQ) to throttle concurrency when CPU capacity, GPUs in a custom build, or licensing matter.

Conclusion

FFmpeg-API turns FFmpeg into something you can call instead of something you have to install. The boundary is JSON in, JSON or binary out – no shared filesystems, no shell-quoting traps in your own code, no native dependencies leaking into your application.

For teams already running on object storage, the per-request S3 configuration is the killer feature. Your application uploads raw assets, sends a single HTTP call, and receives a finished file at a known S3 path moments later. Pair it with a queue for backpressure and the same container scales from a side-project podcast workflow to a full multi-tenant transcoding pipeline.

One container, one endpoint, every FFmpeg trick you ever wanted – exactly the kind of focused tool that earns a permanent slot in a self-hosted media stack.

FAQ

What is FFmpeg-API?

It is an HTTP API written in Go that wraps FFmpeg. You POST a JSON body describing inputs, an FFmpeg command pipeline, and an output target, and the service returns the processed media via S3, Base64, or inline streaming.

Who maintains the project?

The project is published by the Aureum-Cloud organization on GitHub at github.com/Aureum-Cloud/FFmpeg-API and released under the MIT License.

What is the container image?

The image is published to GitHub Container Registry as ghcr.io/aureum-cloud/ffmpeg-api:latest.

What port does the API run on?

Port 8080 is hard-coded in the binary. Map it to whatever host port you prefer or keep it internal-only on a Docker network.

What is the API endpoint?

There is a single endpoint: POST /v1/process. The request body is JSON with four top-level keys: s3Config, input, commands, and output.

Which input sources are supported?

S3 objects (s3://bucket/key), arbitrary HTTP/HTTPS URLs, Base64-encoded strings, and a temporary placeholder for intermediate files generated between pipeline steps.

Which output modes are supported?

S3 upload to a configurable prefix, Base64 encoding in the JSON response, or inline streaming where the first output file is returned directly with a custom Content-Type.

Does it support multi-step pipelines?

Yes. The commands field is an array of FFmpeg argument arrays executed sequentially in the same working directory, so each step can consume the output of the previous one.

Can I use S3-compatible storage like MinIO or R2?

Yes. The s3Config object accepts a custom endpoint, region, credentials, and SSL flag, so AWS S3, MinIO, DigitalOcean Spaces, Cloudflare R2, Backblaze B2, and similar providers all work.

Does the endpoint URL include the bucket?

No. The endpoint in s3Config must not include a bucket name. The bucket is parsed from the s3://bucket/key URL in your input or output paths.

Are S3 credentials passed via environment variables?

No. Credentials are supplied per request inside the JSON body. This makes the service multi-tenant by design – different requests can target different buckets and accounts.

Does the image require any environment variables to start?

None. The container starts with no configuration and listens on port 8080 immediately. Optional variables like GOMAXPROCS, TZ, or HTTPS_PROXY are only useful for runtime tuning.

What FFmpeg version is bundled?

The Dockerfile copies binaries from jrottenberg/ffmpeg:8.0-scratch, which provides FFmpeg 8.0. Future image tags may bump that version, so check the upstream Dockerfile if you depend on specific codecs.

Is the image safe to expose publicly?

Not by itself. There is no built-in authentication or rate limiting. Always run it behind a reverse proxy or API gateway that enforces those controls when exposing it beyond an internal network.

How does authentication work?

It does not – the API is unauthenticated. Add authentication at a layer in front of the service (Nginx basic auth, an API gateway, a JWT-aware reverse proxy, or a service mesh).

Can the same FFmpeg arguments be passed straight through?

Yes. Each command is an array of strings passed directly to FFmpeg. The service prepends -hide_banner and removes any duplicate of that flag, but otherwise leaves your arguments alone.

Where do temporary files live?

Each request creates a UUID-named directory under the system temp directory (/tmp by default). Files are written, processed, returned, and then the entire directory is deleted on completion.

How can I speed up large jobs?

Mount /tmp as a tmpfs to keep intermediate files in RAM, raise CPU limits, and use FFmpeg’s own -threads flag inside your commands. Multi-step pipelines also benefit from passing -c copy on the final mux step when re-encoding is not needed.

How big can input files be?

There is no hard-coded limit. The practical limit is the available space in the container’s working directory plus any reverse proxy body-size limits if you use Base64 inputs, which are also roughly a third larger than the raw file. Fetching from S3 or HTTP(S) keeps the input out of the request body and avoids those proxy limits.

Can I run multiple jobs in parallel?

Yes. The HTTP server handles concurrent requests, and each gets its own job directory and FFmpeg process. CPU and disk I/O are usually the limiting factors, not the service itself.

How do I scale horizontally?

Run multiple replicas behind a load balancer or Kubernetes Service. Because the service is stateless and stores nothing locally beyond per-request job directories, no session affinity or shared storage is required.

What does the JSON response look like?

The response is { "results": { "filename": { "url": "...", "base64": "..." } } }. URLs are populated when uploading to S3, Base64 strings are populated when output.base64 is true.

What happens if a command fails?

The service returns HTTP 500 with the FFmpeg stderr output included in the response body. The job directory is still cleaned up so failures do not leave files behind.
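
On the client side, that means a failed job surfaces as an HTTP error whose body carries the FFmpeg diagnostics. A minimal Python sketch using only the standard library is shown below; FFmpegJobError and call_api are hypothetical names, and the exact layout of the error body is not specified here, so the sketch simply passes it through as text.

```python
import json
import urllib.error
import urllib.request


class FFmpegJobError(RuntimeError):
    """Raised when the service answers with an HTTP error status."""


def call_api(url: str, body: dict, opener=urllib.request.urlopen,
             timeout: float = 300.0) -> bytes:
    """POST the body and return the raw response, or raise FFmpegJobError."""
    request = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with opener(request, timeout=timeout) as response:
            return response.read()
    except urllib.error.HTTPError as err:
        # The error body is expected to contain FFmpeg's stderr output.
        detail = err.read().decode("utf-8", errors="replace")
        raise FFmpegJobError(f"HTTP {err.code}: {detail}") from err
```

The opener parameter exists only so the function can be exercised without a running service; in normal use the default urlopen is fine.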

Is GPU acceleration supported?

The published image uses CPU FFmpeg builds. For GPU-accelerated transcoding (NVENC, VAAPI), you would need to build a custom image based on a GPU-enabled FFmpeg variant and pass the device into the container.

How do I view container logs?

Use docker compose logs -f ffmpeg-api. The service logs each job, every FFmpeg step, and the final timing and output count.

Does it support webhooks or async jobs?

No. The API is fully synchronous – the HTTP request stays open until processing finishes. For async workflows, queue the requests in your own job runner (RabbitMQ, SQS, BullMQ) and call FFmpeg-API from the worker.
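
As a stand-in for a real queue, the Python sketch below uses a thread pool from the standard library to cap how many synchronous calls are in flight at once. It is not a replacement for RabbitMQ, SQS or BullMQ, only an illustration of the worker side; run_jobs and the send callable are hypothetical names, and send would be whatever function POSTs one request body to /v1/process.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable


def run_jobs(bodies: Iterable[dict], send: Callable[[dict], object],
             max_parallel: int = 2) -> list:
    """Send each body through `send`, at most max_parallel at a time.

    Results come back in the same order as the input bodies.
    """
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        return list(pool.map(send, bodies))
```

Keeping max_parallel close to the number of CPU cores available to the container is a reasonable starting point, since each request spawns its own FFmpeg process.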

Why do my long jobs time out?

Reverse proxies and load balancers usually enforce request timeouts of 30–120 seconds. Increase proxy_read_timeout (Nginx) or the equivalent setting on your gateway, or queue the work asynchronously.

Can I run it on ARM (Raspberry Pi, Apple Silicon servers)?

Compatibility depends on the architectures the maintainers publish. Check the available tags on GHCR; if only amd64 is shipped, you can build a custom image from the source repository on your target architecture.

How does it compare to a SaaS like Mux or Cloudflare Stream?

SaaS services bundle hosting, CDN delivery, adaptive bitrate ladders, and analytics. FFmpeg-API is a focused processing primitive – it does the encoding step and nothing else. Combine it with object storage and a CDN to assemble a comparable pipeline at lower cost.

Is there a Helm chart or Kubernetes manifest?

Not provided upstream, but the container is so simple that a basic Deployment plus Service works out of the box. Mount an emptyDir volume with medium: Memory on /tmp for tmpfs-backed scratch storage.

What language is the service written in?

Go. The repository is roughly 93% Go and 7% Dockerfile, and the binary is built statically and shipped from a scratch base image with the FFmpeg binaries layered in.

Alexander