Human-in-the-loop design: handling login & verification codes safely

Patterns for safe approvals and login checkpoints in Open-AutoGLM workflows.

Login flows and verification codes are the most sensitive part of phone‑agent automation. This tutorial shows how to design human‑in‑the‑loop checkpoints so Open-AutoGLM can assist without bypassing security controls or violating an app's terms of service.

Why human‑in‑the‑loop matters

Login screens are designed for humans. Automating them end‑to‑end increases security risk and can violate app policies. A safer approach is to let the agent navigate to the login step, then pause and wait for human confirmation or manual input.

This protects user accounts and provides a clear audit trail of actions.

Design goals

Aim for:

  • Explicit checkpoints before any credential entry.
  • Clear prompts that explain what the agent will do next.
  • Manual entry for passwords and verification codes.
  • Auditable logs for every sensitive step.

A typical checkpoint loop looks like this:

  1. Agent navigates to the login screen.
  2. Agent requests confirmation to proceed.
  3. Human enters credentials or OTP.
  4. Agent resumes after confirmation.

This simple loop is safer than fully automated login.
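
The loop above can be sketched as a small function that takes its side effects as callbacks. This is a minimal illustration, not Open-AutoGLM's API: `run_login_checkpoint`, `Step`, `CheckpointResult`, and the four callbacks are hypothetical names that you would wire to your own agent and approval UI.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, List


class Step(Enum):
    NAVIGATE = "navigate_to_login"
    REQUEST_CONFIRMATION = "request_confirmation"
    HUMAN_INPUT = "human_enters_credentials"
    RESUME = "agent_resumes"


@dataclass
class CheckpointResult:
    completed: bool
    history: List[str] = field(default_factory=list)


def run_login_checkpoint(
    navigate: Callable[[], None],            # hypothetical: agent opens the login screen
    confirm: Callable[[str], bool],          # hypothetical: human approves or rejects
    wait_for_human_input: Callable[[], bool],  # hypothetical: True once the human has typed
    resume: Callable[[], None],              # hypothetical: agent continues the task
) -> CheckpointResult:
    """Run the four-step loop; stop as soon as the human declines."""
    history: List[str] = []

    navigate()
    history.append(Step.NAVIGATE.value)

    history.append(Step.REQUEST_CONFIRMATION.value)
    if not confirm("Agent is at the login screen. Proceed to credential entry?"):
        return CheckpointResult(False, history)

    # The agent never receives the credentials; it only learns that entry finished.
    history.append(Step.HUMAN_INPUT.value)
    if not wait_for_human_input():
        return CheckpointResult(False, history)

    resume()
    history.append(Step.RESUME.value)
    return CheckpointResult(True, history)
```

Passing the callbacks in keeps the loop testable without a device, and the returned history doubles as a simple record of which steps were reached.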

UI design for approvals

If you build a wrapper UI around the agent, include:

  • A clear Approve / Reject control.
  • A short explanation of the next action.
  • A screenshot or element highlight of the target button.

This reduces mistakes during login flows.

Example prompts (template)

Use explicit prompts so the agent does not proceed automatically:

You are at the login screen. Ask for confirmation before typing any credentials.
Do not attempt to bypass verification prompts.

If the agent requests input, provide it manually and confirm the next step.

Verification codes and 2FA

Never automate retrieval of verification codes. Instead:

  • Ask the human to input the code.
  • Verify the correct field is focused.
  • Require a confirmation before the agent taps “Submit”.

This pattern prevents accidental lockouts and respects security policies.
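
One way to enforce these three checks is a guard that refuses to tap "Submit" unless each condition holds. The sketch below assumes your wrapper can report which field is focused; `submit_after_manual_code`, `CheckpointBlocked`, and the field identifiers are hypothetical and not part of any real library.

```python
from typing import Callable


class CheckpointBlocked(Exception):
    """Raised when a precondition for submitting the code is not met."""


def submit_after_manual_code(
    focused_field_id: str,
    expected_field_id: str,
    human_entered_code: Callable[[], bool],   # hypothetical: True once the human typed the code
    confirm_submit: Callable[[str], bool],    # hypothetical: human approves the final tap
    tap_submit: Callable[[], None],           # hypothetical: agent taps the Submit button
) -> None:
    # 1. Verify the correct field is focused before the human types anything.
    if focused_field_id != expected_field_id:
        raise CheckpointBlocked(
            f"Expected focus on {expected_field_id!r}, found {focused_field_id!r}"
        )
    # 2. The human enters the code on the device; the agent never reads it.
    if not human_entered_code():
        raise CheckpointBlocked("Human did not enter a verification code")
    # 3. Require explicit confirmation before the agent taps Submit.
    if not confirm_submit("Verification code entered. Tap Submit?"):
        raise CheckpointBlocked("Human rejected submission")
    tap_submit()
```

Raising an exception instead of returning a flag makes it hard for calling code to ignore a failed check by accident.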

Handling timeouts and retries

Login steps often time out. Give each step an explicit time window and decide in advance what happens when it lapses:

  • If the code expires, reset the step and request a new code.
  • If the agent loses context, reload the screen before continuing.

Avoid “best guess” actions when the state is unclear.
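
An explicit time window can be as simple as a polling loop with a deadline. The helper below is a sketch under assumptions: `wait_within_window` and the `is_done` callback are hypothetical, and the clock and sleep functions are injectable so the behavior can be checked without real waiting.

```python
import time
from typing import Callable


def wait_within_window(
    is_done: Callable[[], bool],        # hypothetical: True once the human finished the step
    window_seconds: float,
    poll_interval: float = 1.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Return True if the step completes inside the window, False if it lapses."""
    deadline = clock() + window_seconds
    while clock() < deadline:
        if is_done():
            return True
        sleep(poll_interval)
    # Window lapsed: the caller should reset the step rather than guess.
    return False
```

When the function returns False, the caller resets the step and asks the human to request a new code instead of letting the agent continue on stale state.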

Session management

Reduce repetitive logins by:

  • Using test accounts with longer sessions.
  • Avoiding logout flows unless required.
  • Detecting when a session expires and adding a new checkpoint before logging in again.

Audit notes

Record:

  • Which user account was used.
  • When the login occurred.
  • Which human confirmed the action (if relevant).

These notes help teams reproduce or review results.
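
The three items above map naturally onto a small record written as one JSON line per login. This is an illustrative sketch: `LoginAuditRecord`, `make_audit_record`, and `to_json_line` are hypothetical names, and `account_label` is assumed to be a non-secret label for a test account, never a password or code.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class LoginAuditRecord:
    account_label: str                  # which account was used (label only, no secrets)
    logged_in_at: str                   # when the login occurred, ISO 8601 in UTC
    confirmed_by: Optional[str] = None  # which human confirmed, if relevant


def make_audit_record(
    account_label: str,
    confirmed_by: Optional[str] = None,
    now: Optional[datetime] = None,
) -> LoginAuditRecord:
    # Default to the current UTC time; tests can pass a fixed timestamp.
    now = now or datetime.now(timezone.utc)
    return LoginAuditRecord(account_label, now.isoformat(), confirmed_by)


def to_json_line(record: LoginAuditRecord) -> str:
    """Serialize one record as a single JSON line for an append-only log."""
    return json.dumps(asdict(record), sort_keys=True)
```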

Redact sensitive data

If you store logs or screenshots:

  • Blur or redact usernames and emails.
  • Avoid recording OTP codes.
  • Store sensitive logs in restricted locations.

This keeps evaluations compliant with internal policies.
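
For text logs, a redaction pass before writing can cover the first two points. The sketch below uses deliberately simple patterns that are assumptions, not a complete solution: the email pattern is approximate, and the code pattern treats any standalone run of 4 to 8 digits as a possible OTP, so it will also mask unrelated numbers such as years. Screenshots need separate image-level redaction.

```python
import re

# Approximate email matcher; good enough for log scrubbing, not for validation.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# Assumption: OTPs are standalone runs of 4 to 8 digits.
OTP_RE = re.compile(r"(?<!\d)\d{4,8}(?!\d)")


def redact(text: str) -> str:
    """Mask emails and OTP-like digit runs before a line is stored."""
    text = EMAIL_RE.sub("[REDACTED_EMAIL]", text)
    return OTP_RE.sub("[REDACTED_CODE]", text)
```

Over-redacting is usually the safer failure mode here: a masked year costs little, while a leaked code or email address may not be recoverable.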

Handling captchas

If a captcha appears, the agent should stop and request human input. Do not attempt to automate captcha solving.

Safety checklist

  • Confirmation prompt before credentials
  • Manual entry for passwords and OTPs
  • Logs for all sensitive steps
  • Captcha and lockout handling

Checklist for reviewers

  • Was a human present at each checkpoint?
  • Were credentials entered manually?
  • Was any sensitive data logged unintentionally?

Example scenarios (not endorsements)

  • QA team testing a new onboarding flow.
  • Researcher validating a multi‑step login sequence.
  • Accessibility tester verifying a localized login UI.

What not to do

  • Do not bypass OTP or verification steps.
  • Do not attempt to scrape or auto‑read authentication codes.
  • Do not store credentials in prompts or logs.

Runbook template

Use a runbook to standardize safe logins:

  1. Navigate to login screen.
  2. Pause and request confirmation.
  3. Human enters credentials.
  4. Agent confirms field focus and waits.
  5. Human approves final submission.

This reduces variance between evaluators.
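
If the runbook is stored as data, a reviewer or CI job can check that every sensitive step is assigned to a human. The structure below is a hypothetical sketch: `RunbookStep`, `SAFE_LOGIN_RUNBOOK`, and `find_violations` are illustrative names, and marking steps 3 and 5 as sensitive is an assumption based on the list above.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class RunbookStep:
    description: str
    actor: str               # "agent" or "human"
    sensitive: bool = False  # True for credential entry and final approval


SAFE_LOGIN_RUNBOOK: List[RunbookStep] = [
    RunbookStep("Navigate to login screen", "agent"),
    RunbookStep("Pause and request confirmation", "agent"),
    RunbookStep("Enter credentials", "human", sensitive=True),
    RunbookStep("Confirm field focus and wait", "agent"),
    RunbookStep("Approve final submission", "human", sensitive=True),
]


def find_violations(steps: List[RunbookStep]) -> List[str]:
    """Return descriptions of sensitive steps that are not assigned to a human."""
    return [s.description for s in steps if s.sensitive and s.actor != "human"]
```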
