Best practices: preventing destructive actions

Safety techniques for confirmations, dry runs, and guardrails in Open-AutoGLM.

Open-AutoGLM can tap, type, and submit forms. That power requires guardrails. This tutorial collects best practices for preventing destructive actions during evaluation.

What counts as destructive?

Examples include:

Deleting accounts or data.
Changing passwords or security settings.
Submitting payments or purchases.
Sending messages on behalf of a user.

If the action cannot be easily undone, treat it as destructive.

Classify risk levels

Create a simple risk scale to decide how strict your guardrails should be:

Low: navigation, reading, opening menus.
Medium: toggling non-security settings, editing profiles.
High: payments, deletions, password or security changes.

Require stronger approvals as risk increases.
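
The scale above can be sketched as a small keyword-based classifier. This is only an illustration: the keyword lists, the `classify` and `required_approval` names, and the mapping from risk level to approval are assumptions made for this example, not part of Open-AutoGLM.

```python
from enum import IntEnum


class Risk(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


# Hypothetical keyword lists; tune them for the apps you test.
HIGH_KEYWORDS = ("pay", "purchase", "delete", "remove", "password")
MEDIUM_KEYWORDS = ("toggle", "edit", "save", "update")


def classify(action_description: str) -> Risk:
    """Return a risk level for a plain-text description of an action."""
    text = action_description.lower()
    # Check the highest level first so a risky word always wins.
    if any(word in text for word in HIGH_KEYWORDS):
        return Risk.HIGH
    if any(word in text for word in MEDIUM_KEYWORDS):
        return Risk.MEDIUM
    return Risk.LOW


def required_approval(risk: Risk) -> str:
    """Map a risk level to the approval a run should require (assumed mapping)."""
    return {
        Risk.LOW: "none",
        Risk.MEDIUM: "read-back",
        Risk.HIGH: "human confirmation",
    }[risk]
```

Substring matching is deliberately crude. It can over-classify (for example, "repay" matches "pay"), which errs on the side of caution.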

Core safety patterns

1) Confirmation before risky actions

Require a human confirmation step before any destructive action. The agent should explicitly ask:

I am about to submit a payment. Do you want me to proceed? (yes/no)
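
A confirmation gate along these lines might look like the sketch below. The `confirm` and `run_if_confirmed` helpers and the `execute` callable are hypothetical names for this example. They assume your harness can wrap each risky action in a function, and they do not describe an Open-AutoGLM API.

```python
from typing import Callable


def confirm(description: str, ask: Callable[[str], str] = input) -> bool:
    """Ask a human to approve an action; anything other than 'yes' declines."""
    answer = ask(f"I am about to {description}. Do you want me to proceed? (yes/no) ")
    return answer.strip().lower() == "yes"


def run_if_confirmed(description: str, execute: Callable[[], None],
                     ask: Callable[[str], str] = input) -> bool:
    """Run execute() only after explicit approval. Returns True if it ran."""
    if not confirm(description, ask):
        return False
    execute()
    return True
```

Treating every answer except an explicit "yes" as a refusal keeps the default safe when the operator mistypes or presses Enter by accident.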

2) Dry-run mode

In dry-run mode, the agent lists its intended actions without executing them. This is ideal for early evaluation:

Plan: tap Settings > Billing > Cancel Subscription.
Awaiting confirmation to execute.
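
One way to get this behavior is to make dry run the default and require an explicit opt-in to execute. The sketch below assumes each step can be represented by a label plus a callable. `Step` and `run_steps` are hypothetical names for this example and are not taken from Open-AutoGLM.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Step:
    target: str                   # label of the UI element to tap
    execute: Callable[[], None]   # hypothetical callable that performs the tap


def run_steps(steps: List[Step], dry_run: bool = True) -> str:
    """Describe the plan in dry-run mode; otherwise execute each step in order."""
    if dry_run:
        path = " > ".join(step.target for step in steps)
        return f"Plan: tap {path}.\nAwaiting confirmation to execute."
    for step in steps:
        step.execute()
    return f"Executed {len(steps)} step(s)."
```

Because `dry_run` defaults to `True`, forgetting the flag produces a plan rather than a real tap.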

3) Safety scope

Limit which screens the agent can interact with. If the agent navigates outside allowed areas, stop the run.
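
A scope check can be as simple as an allowlist that is consulted before every action. In the sketch below, the screen names, `ALLOWED_SCREENS`, `ScopeViolation`, and `check_scope` are all hypothetical. How you identify the current screen (for example, by foreground package or activity name) depends on your setup.

```python
from typing import AbstractSet


class ScopeViolation(RuntimeError):
    """Raised when the agent reaches a screen outside the allowed set."""


# Hypothetical screen identifiers; replace with whatever your harness reports.
ALLOWED_SCREENS = frozenset({"Home", "Settings", "Profile"})


def check_scope(current_screen: str,
                allowed: AbstractSet[str] = ALLOWED_SCREENS) -> None:
    """Raise ScopeViolation if the current screen is not on the allowlist."""
    if current_screen not in allowed:
        raise ScopeViolation(
            f"Screen {current_screen!r} is outside the allowed scope; stopping the run."
        )
```

Raising an exception, rather than returning a flag, makes it harder for calling code to ignore the violation and continue.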

4) Read-back confirmations

Ask the agent to read back the action before executing:

I am about to tap "Delete account". Confirm yes/no.

This catches misread UI elements before the tap is executed.

Audit logging

Always log:

The prompt used
The target app and version
Each action taken
Screenshots before and after

This helps you reproduce and review any unexpected behavior.
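
As a rough sketch, the fields listed above fit in one JSON object per action, appended to a JSON Lines file. The `log_action` helper, its parameter names, and the record layout are assumptions for illustration. Screenshots are stored as file paths, on the assumption that your harness saves the images separately.

```python
import json
import time
from pathlib import Path
from typing import Optional


def log_action(log_path: Path, *, prompt: str, app: str, app_version: str,
               action: str, screenshot_before: Optional[str] = None,
               screenshot_after: Optional[str] = None) -> dict:
    """Append one audit record to a JSON Lines file and return the record."""
    record = {
        "timestamp": time.time(),          # seconds since the epoch
        "prompt": prompt,
        "app": app,
        "app_version": app_version,
        "action": action,
        "screenshot_before": screenshot_before,
        "screenshot_after": screenshot_after,
    }
    # Append mode keeps earlier records intact if a run crashes midway.
    with log_path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(record, ensure_ascii=False) + "\n")
    return record
```

One record per line means a partially written log is still readable up to the last complete action.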

Safer testing environments

Whenever possible:

Use test devices and test accounts.
Disable payment methods.
Use sandbox or staging environments.

This reduces the blast radius of mistakes.

Human-in-the-loop checkpoints

Use checkpoints at:

Login and verification screens
Payment screens
Account deletion flows

See the human-in-the-loop guide for patterns.

Example policy template

The agent must:
1) Ask before tapping any button labeled Delete, Remove, or Purchase.
2) Stop if a payment form is detected.
3) Require a human to enter credentials or OTP codes.
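
The same policy can also be enforced in code rather than only in the prompt. The sketch below is one possible encoding. The `decide` function, its return values, and the `payment_form_detected` and `needs_credentials` flags are hypothetical, and how those flags get set is left to your harness.

```python
import re

# Rule 1: labels that require a confirmation before tapping.
CONFIRM_LABELS = re.compile(r"\b(delete|remove|purchase)\b", re.IGNORECASE)


def decide(button_label: str, payment_form_detected: bool = False,
           needs_credentials: bool = False) -> str:
    """Return 'stop', 'handoff', 'ask', or 'allow' for the next action."""
    if payment_form_detected:      # Rule 2: stop on payment forms.
        return "stop"
    if needs_credentials:          # Rule 3: a human enters credentials or OTP codes.
        return "handoff"
    if CONFIRM_LABELS.search(button_label):
        return "ask"
    return "allow"
```

The checks run from most to least restrictive, so a Delete button on a payment form results in "stop" rather than "ask".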

Common mistakes

Letting the agent proceed after a warning modal appears.
Running on personal accounts with real payment methods.
Failing to monitor logs during early tests.

Checklist

Dry run for first attempt
Confirmation prompts enabled
Test accounts and devices only
Action logs and screenshots captured

Review gates

Before shipping a workflow:

Have a second reviewer validate the prompts.
Run the flow on a sandbox account.
Confirm that destructive actions are blocked by default.

Next steps

Review the ADB troubleshooting guide to keep device access stable.
Check the model service endpoint guide for safe endpoint use.
