Model service endpoints: connect and secure --base-url

How to connect Open-AutoGLM to a model service endpoint and secure access.

Model service endpoints are the recommended way to run Open-AutoGLM without local GPU hardware. This guide covers how to connect via --base-url and how to keep the endpoint secure.

Why endpoints are recommended

Endpoints reduce setup time and allow you to focus on evaluating the agent:

No local GPU required
Centralized updates
Easier to scale and monitor

Connecting with --base-url

Check the official README for the exact flags. The example below shows only the general shape of a typical command:

# Placeholder command: confirm the module name and flags in the official README
python -m openautoglm.run --base-url https://your-endpoint.example.com

If authentication is required, follow the endpoint provider’s instructions.
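
If you launch runs from a script, it can help to keep the endpoint URL out of the source and read it from the environment. The sketch below only assembles the placeholder command shown above. The environment variable name AUTOGLM_BASE_URL is an assumption made for this example, and the module name and flag are copied from the placeholder, not from the official README.

```python
import os
import sys

# Hypothetical variable name chosen for this example.
BASE_URL_ENV_VAR = "AUTOGLM_BASE_URL"


def build_command(base_url: str) -> list[str]:
    """Return an argv list mirroring the placeholder command in this guide."""
    base_url = base_url.strip().rstrip("/")
    if not base_url:
        raise ValueError("base URL is empty")
    # Module name and flag come from the placeholder command; verify them
    # against the official README before relying on this.
    return [sys.executable, "-m", "openautoglm.run", "--base-url", base_url]


if __name__ == "__main__":
    url = os.environ.get(BASE_URL_ENV_VAR, "https://your-endpoint.example.com")
    print(" ".join(build_command(url)))
```

Returning a list instead of a single string means the command can be passed to subprocess.run without shell quoting concerns.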

Selecting an endpoint

Prefer endpoints that provide:

Versioned model metadata
Stable latency
Clear access controls

Document the endpoint version to keep evaluations reproducible.
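
One way to document the endpoint version is to write a small record next to each evaluation run. This is a minimal sketch: the field names are assumptions, and the version string is whatever your provider reports.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class EndpointRecord:
    base_url: str
    model_version: str  # as reported by the endpoint provider
    recorded_at: str    # ISO 8601 timestamp, UTC


def make_record(base_url: str, model_version: str,
                now: Optional[datetime] = None) -> EndpointRecord:
    """Capture which endpoint and model version a run used."""
    now = now or datetime.now(timezone.utc)
    return EndpointRecord(base_url, model_version, now.isoformat())


def to_json(record: EndpointRecord) -> str:
    """Serialize the record so it can be stored with the run's results."""
    return json.dumps(asdict(record), indent=2, sort_keys=True)
```

Storing this record alongside the results lets you check later whether two runs used the same model version before you compare them.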

Security basics

Never expose endpoints publicly without protection. Minimum safeguards:

HTTPS/TLS
Token‑based authentication
IP allowlists (if available)
Rate limiting
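
Most of these safeguards are configured on the server side, but a client can still refuse to talk to an obviously unsafe URL. The following sketch checks only what is visible in the URL itself. The function name and the wording of the messages are illustrative.

```python
from urllib.parse import urlparse


def check_endpoint_url(base_url: str) -> list[str]:
    """Return a list of problems found in the URL; empty means none found."""
    problems = []
    parsed = urlparse(base_url)
    if parsed.scheme != "https":
        problems.append("endpoint does not use HTTPS")
    if parsed.username or parsed.password:
        problems.append("credentials are embedded in the URL")
    if not parsed.hostname:
        problems.append("URL has no host")
    return problems
```

An empty list does not prove the endpoint is secure. It only rules out plain HTTP and credentials in the URL, which tend to leak into shell history and logs.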

Authentication patterns

Common patterns include:

Static API tokens
OAuth or short‑lived tokens
Signed requests with custom headers

Never let tokens appear in public logs or client‑side code.
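
As a sketch of the static token pattern, the code below reads a token from the environment, builds a request header, and redacts the token before anything is logged. The variable name AUTOGLM_API_TOKEN and the Bearer scheme are assumptions. Your endpoint provider may require a different header.

```python
import os
from typing import Mapping

# Hypothetical variable name chosen for this example.
TOKEN_ENV_VAR = "AUTOGLM_API_TOKEN"


def load_token(env: Mapping[str, str] = os.environ) -> str:
    """Read the token from the environment instead of hard-coding it."""
    token = env.get(TOKEN_ENV_VAR, "").strip()
    if not token:
        raise RuntimeError(f"{TOKEN_ENV_VAR} is not set")
    return token


def auth_headers(token: str) -> dict[str, str]:
    """Build request headers. The Bearer scheme is an assumption."""
    return {"Authorization": f"Bearer {token}"}


def redact(token: str, keep: int = 4) -> str:
    """Mask all but the last `keep` characters so the value is safe to log."""
    if len(token) <= keep:
        return "*" * len(token)
    return "*" * (len(token) - keep) + token[-keep:]
```

Logging redact(token) instead of the token still shows which credential was used without exposing it.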

Token rotation

Rotate credentials regularly:

Revoke unused tokens.
Update stored secrets in your CI or runtime.
Document rotation dates for audit purposes.
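
If you already record rotation dates, a short script can flag tokens that are overdue. This is a minimal sketch. The 90-day limit is an arbitrary example, not a recommendation from the Open-AutoGLM project, and the token names are made up.

```python
from datetime import date, timedelta


def tokens_due(rotated_on: dict[str, date], today: date,
               max_age_days: int = 90) -> list[str]:
    """Return the names of tokens last rotated max_age_days or more ago."""
    limit = timedelta(days=max_age_days)
    return sorted(name for name, rotated in rotated_on.items()
                  if today - rotated >= limit)


if __name__ == "__main__":
    # Example rotation log: token name mapped to last rotation date.
    rotation_log = {
        "ci-runner": date(2025, 1, 1),
        "laptop": date(2025, 3, 15),
    }
    print(tokens_due(rotation_log, today=date(2025, 4, 1)))
```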

Logging and monitoring

Track:

Request counts and latency
Model version and configuration
Error rates and timeouts

These metrics help you debug and compare runs.
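
The metrics above can be collected with a small in-process accumulator, sketched here with hypothetical names. A production setup would more likely export them to an existing monitoring system.

```python
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class RunMetrics:
    model_version: str
    latencies_ms: list[float] = field(default_factory=list)
    errors: int = 0
    timeouts: int = 0

    def record(self, latency_ms: float, ok: bool = True,
               timed_out: bool = False) -> None:
        """Record one request; a timeout is counted separately from an error."""
        self.latencies_ms.append(latency_ms)
        if timed_out:
            self.timeouts += 1
        elif not ok:
            self.errors += 1

    def summary(self) -> dict:
        """Summarize the run so it can be compared with other runs."""
        n = len(self.latencies_ms)
        failed = self.errors + self.timeouts
        return {
            "model_version": self.model_version,
            "requests": n,
            "mean_latency_ms": mean(self.latencies_ms) if n else None,
            "error_rate": failed / n if n else None,
            "timeouts": self.timeouts,
        }
```

Keeping the model version inside the summary makes it harder to compare latency or error rates across runs that used different models by accident.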

Access logs

Keep access logs separate from model output logs. This makes it easier to detect suspicious activity without exposing sensitive data.
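
With Python's standard logging module, this separation can be done with two named loggers that write to different destinations. The sketch uses in-memory streams as stand-ins for two separate files, and the logger names are illustrative.

```python
import io
import logging


def make_logger(name: str, stream) -> logging.Logger:
    """Create a logger that writes only to its own stream."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep records out of the root logger
    if not logger.handlers:
        handler = logging.StreamHandler(stream)
        handler.setFormatter(logging.Formatter("%(name)s %(message)s"))
        logger.addHandler(handler)
    return logger


# In-memory stand-ins for two separate log files.
access_stream = io.StringIO()
output_stream = io.StringIO()

access_log = make_logger("endpoint.access", access_stream)
output_log = make_logger("endpoint.output", output_stream)

access_log.info("request status=200 latency_ms=812")
output_log.info("model response stored for run 1")
```

In practice you would replace the streams with logging.FileHandler instances pointing at different files, each with its own retention and access permissions.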

Handling failures

If the endpoint is down:

Fail fast and notify the user.
Provide a retry plan with backoff.
Allow switching to a backup endpoint.
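
The three steps above can be sketched as one helper: retry each endpoint with exponential backoff, move on to the backup when the primary keeps failing, and raise a clear error when nothing works. The send callable is a placeholder for whatever function actually issues the request, and the retry count and delays are example values.

```python
import time
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def call_with_fallback(endpoints: Sequence[str],
                       send: Callable[[str], T],
                       retries: int = 3,
                       base_delay: float = 0.5,
                       sleep: Callable[[float], None] = time.sleep) -> T:
    """Try each endpoint in order, retrying with exponential backoff."""
    last_error = None
    for url in endpoints:
        for attempt in range(retries):
            try:
                return send(url)
            except ConnectionError as exc:
                last_error = exc
                # Wait 0.5s, 1s, 2s, ... but not after the final attempt.
                if attempt < retries - 1:
                    sleep(base_delay * 2 ** attempt)
    # Nothing worked: fail with a message the user can act on.
    raise RuntimeError(f"all endpoints failed: {last_error}") from last_error
```

Passing sleep as a parameter keeps the helper easy to test without real delays. Only ConnectionError is retried here, so other exceptions surface immediately.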

Pre‑flight testing

Before a full evaluation run:

Ping the endpoint and confirm a 200 response.
Test one low‑risk action.
Verify logs capture the request and response time.
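
The first check can be scripted with the standard library. This sketch sends a plain GET to the URL you give it and reports the status and elapsed time. Whether your endpoint answers 200 on its base URL or on a dedicated health path is an assumption to confirm with your provider.

```python
import time
import urllib.error
import urllib.request


def preflight(url: str, timeout: float = 5.0,
              opener=urllib.request.urlopen) -> dict:
    """Send one GET request and report status code and elapsed seconds."""
    start = time.monotonic()
    error = None
    try:
        with opener(url, timeout=timeout) as resp:
            status = resp.status
    except urllib.error.HTTPError as exc:
        status = exc.code  # the server answered, but not with success
    except OSError as exc:
        # Covers URLError, refused connections, and timeouts.
        status, error = None, str(exc)
    return {
        "ok": status == 200,
        "status": status,
        "error": error,
        "elapsed_s": time.monotonic() - start,
    }


if __name__ == "__main__":
    print(preflight("https://your-endpoint.example.com"))
```

The opener parameter exists so the check can be tested without network access. In normal use, leave it at its default.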

Safety note

Endpoint access is a privilege. Treat endpoint URLs and tokens as sensitive, and keep them out of public logs.

Next steps

For local fallback, see low‑VRAM options.
For Docker deployment, see Docker deployment checklist.
