Docker deployment checklist for Open-AutoGLM

A containerization checklist for reliable Open-AutoGLM deployments.

Docker can simplify reproducible deployments, but it also hides GPU and device access details that matter for Open-AutoGLM. This checklist focuses on what to validate before you ship a containerized setup.

TODO: replace with real Docker deployment diagram

When Docker is a good fit

Use Docker if you want:

  • A consistent runtime across machines.
  • Clear dependency isolation.
  • Easier deployment to servers for endpoint hosting (Option A).

Avoid Docker if you need direct device access without extra layers, or while you are still in the early debugging stages.

Preflight checklist

Before you build an image:

  • Confirm which runtime you use (vLLM or SGLang).
  • Confirm which model weights are mounted or downloaded.
  • Document GPU driver versions on the host.
  • Decide whether the container is endpoint hosting or client-only.
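
The preflight facts above can be captured by a small host-side script run before each build. This is a sketch: the RUNTIME and MODEL_DIR variables and the build-info.txt file name are illustrative, not part of Open-AutoGLM.

```shell
#!/usr/bin/env sh
# Record host facts needed to reproduce this build later.
RUNTIME="${RUNTIME:-vllm}"            # vllm or sglang
MODEL_DIR="${MODEL_DIR:-/srv/models}" # where weights are mounted or downloaded

if command -v nvidia-smi >/dev/null 2>&1; then
  DRIVER="$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)"
else
  DRIVER="unknown"                    # no NVIDIA driver visible on this host
fi

{
  echo "runtime=${RUNTIME}"
  echo "model_dir=${MODEL_DIR}"
  echo "gpu_driver=${DRIVER}"
  echo "recorded_at=$(date -u +%Y-%m-%dT%H:%M:%SZ)"
} > build-info.txt

cat build-info.txt
```

Commit build-info.txt alongside the image metadata so you can later answer "which driver and runtime did this image ship against?".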

Base image and dependencies

Use official base images when possible. Avoid obscure or unmaintained images. Document every dependency in the Dockerfile so it can be rebuilt later.

TODO: add official Docker guidance if the Open-AutoGLM README includes it.
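
Until official guidance lands, a minimal Dockerfile might look like the sketch below. The base image tag, port, and model path are assumptions, not Open-AutoGLM recommendations; pin the exact versions you actually validated.

```dockerfile
# Sketch only — substitute the versions you validated.
FROM vllm/vllm-openai:v0.6.3

# Weights stay outside the image; mount them at /models at run time.
# Any extra dependency goes here, version-pinned, so the image is rebuildable:
# RUN pip install --no-cache-dir some-extra-package==1.2.3

EXPOSE 8000
# The base image's entrypoint already starts the OpenAI-compatible server;
# pass the mounted model path as an argument at `docker run` time.
```
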

Image size and caching

Large images slow down deployments. To reduce size:

  • Clean package caches after installs.
  • Use multi‑stage builds where possible.
  • Keep model weights outside the image and mount them as volumes.

This keeps rebuilds fast and reduces storage costs.
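
The three size rules above can be combined in a multi-stage sketch; the package name and paths are placeholders.

```dockerfile
# Stage 1: build wheels in a throwaway stage.
FROM python:3.11-slim AS build
RUN pip wheel --no-cache-dir --wheel-dir /wheels "example-dep==1.0"  # placeholder dep

# Stage 2: copy only the built wheels forward, then discard them after install.
FROM python:3.11-slim
COPY --from=build /wheels /wheels
RUN pip install --no-cache-dir /wheels/*.whl && rm -rf /wheels

# Model weights are NOT copied into a layer; mount them instead:
#   docker run -v /srv/models:/models:ro ...
```
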

GPU access

If you run locally inside Docker (Option B):

  • Ensure NVIDIA Container Toolkit is installed.
  • Verify nvidia-smi runs inside the container.
  • Mount the model directory read‑only unless you expect writes.
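
These checks can be run as one-liners on the host; the CUDA image tag, paths, and the my-endpoint image name are examples.

```shell
# Confirm the NVIDIA Container Toolkit is wired up:
docker run --rm --gpus all nvidia/cuda:12.4.1-base-ubuntu22.04 nvidia-smi

# Run the endpoint image with weights mounted read-only:
docker run --rm --gpus all \
  -v /srv/models/autoglm:/models:ro \
  -p 8000:8000 \
  my-endpoint:latest
```

If nvidia-smi fails inside the container but works on the host, the toolkit or the `--gpus` flag is the usual culprit.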

Endpoint hosting

If your container is used as a model service endpoint:

  • Expose only required ports.
  • Use HTTPS termination at the edge.
  • Add basic auth or token‑based access to the endpoint clients reach via --base-url.
  • Log requests and response times for debugging.
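
A quick smoke test of the protections above. The URL, port, and token variable are placeholders, and /v1/models assumes an OpenAI-compatible server.

```shell
curl -fsS -H "Authorization: Bearer ${API_TOKEN}" \
  http://localhost:8000/v1/models
```

An unauthenticated request to the same URL should fail; if it succeeds, your auth layer is not actually in front of the endpoint.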

Network policies

If you run in a shared environment:

  • Restrict inbound traffic to known IPs.
  • Block outbound traffic unless required.
  • Separate endpoint and UI agent containers with clear network boundaries.

These policies reduce unintended exposure.
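
One way to express the endpoint/agent boundary is a Compose file with an internal network; service and network names here are examples.

```yaml
services:
  endpoint:
    image: my-endpoint:latest
    networks: [model-net]           # reachable only from model-net
  agent:
    image: my-agent:latest
    networks: [model-net, default]  # talks to the endpoint and the outside
networks:
  model-net:
    internal: true                  # no external connectivity on this network
```
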

ADB and device access

If the container must access a physical device:

  • Mount the ADB socket or device USB bus (host‑specific).
  • Prefer USB passthrough only on development machines; avoid container device access in production.

When possible, separate the endpoint container from the client UI agent.
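
If you must pass a USB-attached device through on a development machine, a host-specific invocation might look like the line below; depending on the host you may also need udev rules or extra privileges, and the image name is a placeholder.

```shell
docker run --rm --device=/dev/bus/usb my-agent:latest adb devices
```
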

Environment variables and secrets

Do not bake secrets into the image. Pass them at runtime via environment variables or secret stores.
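
For example, keep tokens in a runtime env file that never enters the build context; the file and variable names are examples, and the file belongs in both .dockerignore and .gitignore.

```shell
# endpoint.env (not committed, not in the build context):
#   API_TOKEN=...

docker run --rm --env-file ./endpoint.env my-endpoint:latest
```
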

Volume strategy

Use volumes for:

  • Model files (read‑only mount preferred).
  • Logs and traces.
  • Temporary cache directories if required by the runtime.

Avoid writing large artifacts inside container layers.

Health checks

Add simple health checks:

  • “Model loaded” endpoint returns 200.
  • ADB connectivity (if needed) returns expected output.
  • Memory usage stays below threshold.
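
The first check can be wired into the image itself; the /health path and port are assumptions about your server.

```dockerfile
HEALTHCHECK --interval=30s --timeout=5s --retries=3 \
  CMD curl -fsS http://localhost:8000/health || exit 1
```

Orchestrators and `docker ps` will then surface the container as unhealthy instead of silently serving a half-loaded model.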

Observability

At minimum, log:

  • Model version and runtime.
  • Request timestamps and durations.
  • Error messages and stack traces.

These logs make it easier to compare agent performance across deployments.
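
A minimal structured log line covering these fields can be emitted from a shell wrapper; the field names are suggestions, not an Open-AutoGLM format.

```shell
# Emit one JSON log line per request: timestamp, model, duration, status.
log_request() {
  # $1=model  $2=duration_ms  $3=status
  printf '{"ts":"%s","model":"%s","duration_ms":%s,"status":"%s"}\n' \
    "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$1" "$2" "$3"
}

log_request "autoglm-endpoint" 1234 ok
```

One JSON object per line keeps the logs grep-able and trivially parseable by jq or a log shipper.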

CI considerations

If you build images in CI:

  • Pin dependency versions for reproducibility.
  • Run a basic smoke test that verifies the entry point.
  • Store build metadata (image tag, git commit, model version).
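
Those three items might be scripted as below; the image name is a placeholder, and the smoke test assumes the entry point accepts --help.

```shell
GIT_SHA="$(git rev-parse --short HEAD)"
docker build \
  --label org.opencontainers.image.revision="${GIT_SHA}" \
  -t my-endpoint:"${GIT_SHA}" .

# Smoke test: the entry point must at least start and exit cleanly.
docker run --rm my-endpoint:"${GIT_SHA}" --help
```
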

Rollback plan

Keep a previous container image tagged and ready. If a new build fails, roll back quickly to maintain uptime.
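
One lightweight pattern is to keep a previous tag and retag on failure; the tags and the deploy command are examples.

```shell
# Before deploying a new build:
docker tag my-endpoint:latest my-endpoint:previous

# If the new build fails, roll back:
docker tag my-endpoint:previous my-endpoint:latest
docker compose up -d endpoint    # or whatever your deploy step is
```
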

Safety note

Do not expose endpoints publicly without authentication. Model endpoints can be misused, and public endpoints are not recommended for early tests.

Checklist summary

  • Official base image and documented dependencies
  • GPU access validated (if local)
  • Ports, auth, and TLS configured (if endpoint)
  • Logs and health checks in place
  • Rollback image available
