Best practices: preventing destructive actions

Safety techniques for confirmations, dry runs, and guardrails in Open-AutoGLM.

Open-AutoGLM can tap, type, and submit forms. That power requires guardrails. This tutorial collects best practices for preventing destructive actions during evaluation.


What counts as destructive?

Examples include:

  • Deleting accounts or data.
  • Changing passwords or security settings.
  • Submitting payments or purchases.
  • Sending messages on behalf of a user.

If the action cannot be easily undone, treat it as destructive.

Classify risk levels

Create a simple risk scale to decide how strict your guardrails should be:

  • Low: navigation, reading, opening menus.
  • Medium: toggling settings, editing profiles.
  • High: payments, deletions, account changes.

Require stronger approvals as risk increases.
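
One way to make the risk scale concrete is a small classifier that maps an action description to a level. The sketch below is illustrative only: the keyword lists, the `classify` helper, and the approval mapping are assumptions for this example, not part of Open-AutoGLM.

```python
from enum import IntEnum


class Risk(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


# Hypothetical keyword lists; tune them for the apps you test.
HIGH_KEYWORDS = ("pay", "purchase", "delete", "remove", "password")
MEDIUM_KEYWORDS = ("edit", "save", "update", "enable", "disable")

# Example approval policy: stricter as risk increases (an assumption).
APPROVAL = {
    Risk.LOW: "none",
    Risk.MEDIUM: "log and review",
    Risk.HIGH: "human confirmation",
}


def classify(action_description: str) -> Risk:
    """Return the highest risk level whose keywords appear in the text."""
    text = action_description.lower()
    if any(word in text for word in HIGH_KEYWORDS):
        return Risk.HIGH
    if any(word in text for word in MEDIUM_KEYWORDS):
        return Risk.MEDIUM
    return Risk.LOW


print(classify("Tap Delete account").name)  # HIGH
```

Substring matching is crude and will misclassify some labels, so treat it as a starting point and default to the higher level when unsure.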

Core safety patterns

1) Confirmation before risky actions

Require a human confirmation step before any destructive action. The agent should explicitly ask:

I am about to submit a payment. Do you want me to proceed? (yes/no)
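
A minimal sketch of such a confirmation gate follows. The `confirm` and `run_guarded` helpers are hypothetical names for this example; the `ask` parameter defaults to Python's built-in `input` so the gate can be tested without a terminal.

```python
from typing import Callable


def confirm(description: str, ask: Callable[[str], str] = input) -> bool:
    """Return True only when the human answers exactly 'yes'."""
    answer = ask(f"I am about to {description}. Do you want me to proceed? (yes/no) ")
    return answer.strip().lower() == "yes"


def run_guarded(
    description: str,
    action: Callable[[], None],
    ask: Callable[[str], str] = input,
) -> bool:
    """Run the action only after confirmation; report whether it ran."""
    if not confirm(description, ask):
        return False
    action()
    return True
```

Anything other than an explicit "yes" (including "y" or an empty reply) counts as a refusal, so the safe outcome is the default.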

2) Dry-run mode

In dry-run mode, the agent lists its intended actions without executing them. This is ideal for early evaluation:

Plan: tap Settings > Billing > Cancel Subscription.
Awaiting confirmation to execute.
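
The dry-run idea can be sketched as a wrapper that records every tap and only executes it when dry run is switched off. `Runner` and its methods are hypothetical names for this example, not an Open-AutoGLM API.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Runner:
    dry_run: bool = True  # safe default: plan only, execute nothing
    plan: List[str] = field(default_factory=list)

    def tap(self, target: str, action: Callable[[], None]) -> None:
        """Record the tap; perform it only when not in dry-run mode."""
        self.plan.append(target)
        if not self.dry_run:
            action()

    def report(self) -> str:
        """Summarize the recorded plan in the format shown above."""
        return (
            "Plan: tap " + " > ".join(self.plan) + ".\n"
            "Awaiting confirmation to execute."
        )


runner = Runner()
for target in ["Settings", "Billing", "Cancel Subscription"]:
    runner.tap(target, lambda: None)  # placeholder for a real device action
print(runner.report())
```

Making `dry_run=True` the default means a forgotten flag results in a plan, not an executed action.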

3) Safety scope

Limit which screens the agent can interact with. If the agent navigates outside allowed areas, stop the run.
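
A scope check can be as simple as an allowlist of screen names. In the sketch below, the screen names and the `check_scope` helper are assumptions for illustration; how you identify the current screen depends on your setup.

```python
from typing import FrozenSet

# Hypothetical allowlist; replace with the screens your test plan covers.
ALLOWED_SCREENS: FrozenSet[str] = frozenset({"Home", "Settings", "Profile"})


class OutOfScopeError(RuntimeError):
    """Raised when the agent reaches a screen outside the allowed set."""


def check_scope(current_screen: str, allowed: FrozenSet[str] = ALLOWED_SCREENS) -> None:
    """Stop the run by raising if the screen is not on the allowlist."""
    if current_screen not in allowed:
        raise OutOfScopeError(f"Screen {current_screen!r} is outside the allowed scope")
```

Calling `check_scope` before every action lets the run halt at the first out-of-scope screen instead of after something has already been tapped.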

4) Read-back confirmations

Ask the agent to read back the action before executing:

I am about to tap "Delete account". Confirm yes/no.

This catches misread UI elements.
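
One possible extension is to build the read-back from the label actually found on screen and flag any difference from the planned label. The `read_back` helper and the comparison rule are assumptions for this sketch.

```python
def read_back(planned_label: str, observed_label: str) -> str:
    """Build the read-back prompt from the label seen on screen.

    If the observed label differs from the planned one, prepend a warning
    so the reviewer notices the mismatch before answering.
    """
    prompt = f'I am about to tap "{observed_label}". Confirm yes/no.'
    if observed_label.strip().lower() != planned_label.strip().lower():
        prompt = (
            f'Warning: planned "{planned_label}" but found "{observed_label}". '
            + prompt
        )
    return prompt


print(read_back("Deactivate account", "Delete account"))
```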

Audit logging

Always log:

  • The prompt used
  • The target app and version
  • Each action taken
  • Screenshots before and after

This helps you reproduce and review any unexpected behavior.
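
As a rough sketch, the fields above can be captured as one JSON object per action, appended to a log file. The `AuditRecord` fields, the helper names, and the file layout are assumptions for this example; screenshots are stored as file paths.

```python
import json
import time
from dataclasses import asdict, dataclass, field


@dataclass
class AuditRecord:
    prompt: str
    app: str
    app_version: str
    action: str
    screenshot_before: str  # path to the image captured before the action
    screenshot_after: str   # path to the image captured after the action
    timestamp: float = field(default_factory=time.time)


def to_json_line(record: AuditRecord) -> str:
    """Serialize one record as a single newline-terminated JSON line."""
    return json.dumps(asdict(record)) + "\n"


def append_record(path: str, record: AuditRecord) -> None:
    """Append the record to a log file, one JSON object per line."""
    with open(path, "a", encoding="utf-8") as log:
        log.write(to_json_line(record))
```

An append-only file with one record per line is easy to search and to replay in order when reviewing a run.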

Safer testing environments

Whenever possible:

  • Use test devices and test accounts.
  • Disable payment methods.
  • Use sandbox or staging environments.

This reduces the blast radius of mistakes.

Human-in-the-loop checkpoints

Use checkpoints at:

  • Login and verification screens
  • Payment screens
  • Account deletion flows

See the human‑in‑the‑loop guide for patterns.

Example policy template

The agent must:
1) Ask before tapping any button labeled Delete, Remove, or Purchase.
2) Stop if a payment form is detected.
3) Require a human to enter credentials or OTP codes.
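
The three rules above could be encoded as a single decision function. This is a sketch under stated assumptions: `ScreenState`, `Decision`, and `decide` are hypothetical names, and detecting a payment form or a credential prompt is left to the caller.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    ASK = "ask"          # rule 1: ask before tapping
    STOP = "stop"        # rule 2: payment form detected
    HANDOFF = "handoff"  # rule 3: a human enters credentials or OTP codes


ASK_LABELS = {"delete", "remove", "purchase"}


@dataclass
class ScreenState:
    button_label: str = ""
    has_payment_form: bool = False   # supplied by your own detection logic
    needs_credentials: bool = False  # login, password, or OTP prompt


def decide(state: ScreenState) -> Decision:
    """Apply the policy rules, strictest first."""
    if state.has_payment_form:
        return Decision.STOP
    if state.needs_credentials:
        return Decision.HANDOFF
    if set(state.button_label.lower().split()) & ASK_LABELS:
        return Decision.ASK
    return Decision.ALLOW


print(decide(ScreenState(button_label="Delete account")).value)  # ask
```

Checking the strictest rules first means a Delete button on a payment screen still stops the run instead of merely prompting.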

Common mistakes

  • Letting the agent proceed after a warning modal appears.
  • Running on personal accounts with real payment methods.
  • Failing to monitor logs during early tests.

Checklist

  • Dry run for first attempt
  • Confirmation prompts enabled
  • Test accounts and devices only
  • Action logs and screenshots captured

Review gates

Before shipping a workflow:

  • Have a second reviewer validate the prompts.
  • Run the flow on a sandbox account.
  • Confirm that destructive actions are blocked by default.
