How to Monitor Your AI Agent: Dashboard, Logs, and Alerts
Deploying an AI agent is the beginning, not the end. Once your agent is live and handling real conversations, monitoring becomes the most important part of your operational workflow. A monitored agent stays healthy. An unmonitored agent drifts, breaks, and burns through credits without you knowing.
Good monitoring answers three questions at all times:
- Is my agent working? (Availability)
- Is my agent working well? (Quality)
- Is my agent costing what I expect? (Economics)
This guide covers everything you need to build a robust monitoring practice for your AI agents on EZClaws: the dashboard tools available to you, what metrics to track, how to interpret logs, when to set up alerts, and how to build monitoring into your routine.
The EZClaws Dashboard
The EZClaws dashboard is your primary monitoring tool. It provides real-time visibility into all your agents from a single view.
Agent Overview
The main dashboard shows agent cards with at-a-glance status:
- Green indicator - Agent is running normally
- Red indicator - Agent has encountered an error
- Gray indicator - Agent is stopped
- Yellow/Blue indicator - Agent is being created or starting up
Each card also shows the gateway URL, model provider, and last activity timestamp. This is your first stop for a quick health check.
Real-Time Updates
The dashboard uses real-time data from Convex, which means status changes appear instantly. If your agent goes from Running to Error, you see it the moment it happens without refreshing the page. This is especially valuable during deployments and troubleshooting.
Key Metrics to Track
Uptime
What it measures: The percentage of time your agent is in the "Running" state.
Target: 99%+ for production agents.
How to track: The EZClaws dashboard shows current status. Note any downtime periods in a simple log (even a spreadsheet works). Over time, you will understand your agent's reliability pattern.
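If you log downtime periods manually, the uptime calculation itself is simple. A minimal sketch (the figures are illustrative, not from your dashboard):

```python
def uptime_percent(period_minutes: int, downtime_minutes: float) -> float:
    """Return uptime as a percentage of the measurement period."""
    if period_minutes <= 0:
        raise ValueError("period_minutes must be positive")
    return round(100 * (period_minutes - downtime_minutes) / period_minutes, 2)

# A 30-day month has 43,200 minutes; 43 minutes of downtime is roughly 99.9%.
print(uptime_percent(43_200, 43))   # 99.9
```

Even this rough arithmetic, applied monthly, tells you whether you are actually hitting a 99%+ target or just assuming you are.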
What causes downtime:
- Model provider outages (OpenAI, Anthropic, Google having issues)
- Railway infrastructure issues (rare but possible)
- API key expiration or billing failure
- Agent configuration errors after changes
- Usage credit exhaustion
Response Time
What it measures: How long your agent takes to respond to a message.
Target: Under 3 seconds for chat-based agents, under 10 seconds for complex agents with multiple skill chains.
What affects response time:
- Model provider latency (biggest factor): GPT-4 is slower than GPT-4o-mini. Claude Opus is slower than Claude Haiku.
- Conversation context length: More context = more tokens = longer processing.
- Skill execution time: Skills that call external APIs add latency.
- Infrastructure latency: Minimal with EZClaws/Railway but can fluctuate.
Optimization: If response time is too high, consider using a faster model, reducing context window size, or optimizing skill API calls. See the configuration guide for tuning options.
Credit Consumption
What it measures: How many usage credits your agent consumes per day, week, and month.
Target: Predictable and within your plan's allocation.
How to track: The billing section shows credit consumption in real time. Track the daily burn rate to predict when you will approach your limit.
What affects consumption:
- Message volume (more conversations = more credits)
- Model choice (larger models cost more per request)
- Context window size (more context = more tokens per request)
- Response length (longer responses = more output tokens)
- Skill activity (some skills consume additional resources)
Budget math: If you consume 100 credits per day and your plan includes 3,000 credits per month, you will run out on day 30. But if a viral moment drives traffic to 300 credits per day, you will run out on day 10. Monitor the trend, not just the current number.
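The budget math above reduces to a one-line runway calculation worth running whenever the burn rate shifts:

```python
def days_of_runway(credits_remaining: float, daily_burn: float) -> float:
    """Estimate days until credits run out at the current burn rate."""
    if daily_burn <= 0:
        return float("inf")   # no consumption means credits never run out
    return credits_remaining / daily_burn

# 3,000-credit plan: fine at 100/day, but a traffic spike changes the math fast.
print(days_of_runway(3000, 100))  # 30.0
print(days_of_runway(3000, 300))  # 10.0
```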
Visit /pricing for plan details and credit allocations.
Error Rate
What it measures: The percentage of requests that result in an error instead of a successful response.
Target: Under 1% for a well-configured agent.
Types of errors:
- Model errors: The AI model provider returns an error (rate limit, server error, timeout)
- Skill errors: A skill fails to execute (external API down, invalid data)
- Configuration errors: Misconfigured settings cause processing failures
- Input errors: Malformed or unexpected input that the agent cannot process
How to track: Check the agent event log on the detail page. Look for error-type events and note patterns.
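If you tally error events against total requests from the event log, checking against the 1% target is straightforward (a sketch; the counts are illustrative):

```python
def error_rate(total_requests: int, error_count: int) -> float:
    """Error rate as a percentage of total requests."""
    if total_requests == 0:
        return 0.0
    return 100 * error_count / total_requests

def needs_attention(total_requests: int, error_count: int, threshold: float = 1.0) -> bool:
    """True when the error rate exceeds the target (under 1% by default)."""
    return error_rate(total_requests, error_count) > threshold

print(needs_attention(5000, 30))  # 0.6% - within target
print(needs_attention(5000, 75))  # 1.5% - investigate
```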
Conversation Quality
What it measures: How well your agent handles conversations. This is the hardest metric to quantify but arguably the most important.
How to assess:
- Sample conversations regularly: Read 5-10% of conversations weekly. Look for incorrect information, inappropriate responses, poor formatting, or missed escalation opportunities.
- Track escalation rate: If your agent escalates too many conversations (>40%), the system prompt or skills may need improvement. If it escalates too few (<5%), it might be handling issues it should not.
- User feedback: If you have a feedback mechanism, track satisfaction scores over time.
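The escalation-rate bands above can be encoded as a simple health check you run against weekly conversation counts (the band thresholds follow the rough guidance in this guide, not a platform-enforced rule):

```python
def escalation_health(total_conversations: int, escalated: int) -> str:
    """Classify the escalation rate against the rough bands described above."""
    if total_conversations == 0:
        return "no data"
    rate = escalated / total_conversations
    if rate > 0.40:
        return "too high - review system prompt and skills"
    if rate < 0.05:
        return "too low - agent may be handling issues it should escalate"
    return "healthy"

print(escalation_health(200, 30))   # 15% falls in the healthy band
```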
Reading Agent Logs
The Event Log
Each agent has an event log accessible from the agent detail page. Events are timestamped and include:
- Deployment events: Agent created, started, stopped, restarted
- Status changes: Running, Error, Stopped
- Error events: Detailed error messages with context
- Skill events: Skill installations, executions, and failures
- Configuration changes: Settings updates
Common Log Patterns
Healthy agent:
[12:00] Agent started - Status: Running
[12:01] Skill installed: knowledge-base
[12:05] No errors reported
API key issue:
[12:00] Agent started - Status: Running
[12:01] Error: Authentication failed - Invalid API key
[12:01] Status changed: Running -> Error
Fix: Update the API key in agent settings. See the API keys guide.
Credit exhaustion:
[12:00] Warning: Usage credits at 90%
[14:00] Warning: Usage credits at 95%
[16:00] Error: Usage credits exhausted
[16:00] Status changed: Running -> Stopped
Fix: Purchase additional credits or wait for the billing cycle reset.
Model provider outage:
[12:00] Error: Model provider returned 503 (Service Unavailable)
[12:01] Error: Model provider returned 503 (Service Unavailable)
[12:05] Request succeeded - provider recovered
Fix: Usually resolves on its own. If persistent, check the model provider's status page.
Skill error:
[12:00] Skill 'order-lookup' error: External API timeout after 30s
[12:00] Skill 'order-lookup' skipped - fallback to model-only response
Fix: Check the external API that the skill depends on. The agent will continue to respond without the skill's data, but responses may be less complete.
For comprehensive troubleshooting steps, see the troubleshooting guide.
Setting Up Alerts
Notification Preferences
Configure alerts in the settings section of your EZClaws dashboard:
Agent Status Alerts - Get notified when an agent changes from Running to Error or Stopped. This is the most important alert to enable.
Credit Threshold Alerts - Get notified when your usage credits drop below a percentage (e.g., 20% remaining). This gives you time to purchase more credits before your agent stops.
Activity Spike Alerts - Get notified when agent activity is significantly above normal. This could indicate viral traffic, a misconfigured integration, or abuse.
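The credit-threshold logic is worth understanding even though EZClaws evaluates it for you. A sketch of the check behind that alert (message wording and numbers are illustrative):

```python
def credit_alert(remaining: float, allocation: float, threshold_pct: float = 20.0):
    """Return an alert message when remaining credits fall below the threshold."""
    if allocation <= 0:
        raise ValueError("allocation must be positive")
    pct_left = 100 * remaining / allocation
    if pct_left < threshold_pct:
        return f"Credits at {pct_left:.0f}% - top up before the agent stops"
    return None

print(credit_alert(450, 3000))   # 15% left, so the alert fires
print(credit_alert(1500, 3000))  # 50% left, no alert
```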
Alert Best Practices
- Do not alert on everything. Too many alerts cause alert fatigue, and you start ignoring them. Focus on actionable alerts.
- Set appropriate thresholds. Credit warnings at 20% are useful. Credit warnings at 80% are noise.
- Include context in alerts. When you receive an alert, you should know what to do without needing to investigate further.
- Test your alerts. Intentionally trigger an alert condition to verify it works. Stop an agent and confirm you receive the notification.
Building a Monitoring Routine
Daily Quick Check (30 seconds)
Open the dashboard once a day and verify:
- All agents show green (Running) status
- Credit consumption is within expected range
- No error events in the last 24 hours
This takes 30 seconds and catches most issues before they become problems.
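The three daily checks can also be expressed as one function over agent status data. This is a sketch only: the dictionary field names (`status`, `errors_24h`, `credits_today`) are hypothetical stand-ins for whatever export or notes you keep, not an EZClaws API:

```python
def daily_check(agents, expected_daily_credits: float):
    """Flag daily-check issues from a list of agent status dicts (field names hypothetical)."""
    issues = []
    for a in agents:
        if a["status"] != "Running":
            issues.append(f'{a["name"]}: status is {a["status"]}')
        if a["errors_24h"] > 0:
            issues.append(f'{a["name"]}: {a["errors_24h"]} error(s) in last 24h')
    total_burn = sum(a["credits_today"] for a in agents)
    if total_burn > expected_daily_credits:
        issues.append(f"credit burn {total_burn} exceeds expected {expected_daily_credits}")
    return issues

agents = [
    {"name": "Support Agent", "status": "Running", "errors_24h": 0, "credits_today": 80},
    {"name": "Sales Bot", "status": "Error", "errors_24h": 3, "credits_today": 10},
]
for issue in daily_check(agents, expected_daily_credits=150):
    print(issue)
```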
Weekly Review (10 minutes)
Once a week, do a deeper review:
- Check usage trends - Is credit consumption increasing, decreasing, or stable? Is the trend expected?
- Review event logs - Scan for any warnings or errors that occurred during the week.
- Sample conversations - Read 5-10 conversations to assess quality. Look for:
- Incorrect information
- Responses that are too long or too short
- Missed escalation opportunities
- Tone or formatting issues
- Compare response times - Are responses getting slower? This could indicate context window bloat or skill latency.
- Check model provider dashboards - Review API usage and costs directly on the provider's dashboard.
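For the usage-trend check, comparing this week's average burn to last week's is enough to label the trend. A sketch with made-up daily figures:

```python
def weekly_trend(daily_credits, tolerance=0.10):
    """Compare this week's average burn to last week's (two 7-day windows)."""
    if len(daily_credits) < 14:
        raise ValueError("need at least 14 days of data")
    last_week = sum(daily_credits[-14:-7]) / 7
    this_week = sum(daily_credits[-7:]) / 7
    if last_week == 0:
        return "no baseline"
    change = (this_week - last_week) / last_week
    if change > tolerance:
        return f"increasing ({change:+.0%})"
    if change < -tolerance:
        return f"decreasing ({change:+.0%})"
    return "stable"

history = [100, 105, 98, 102, 99, 101, 100, 130, 140, 135, 145, 150, 138, 142]
print(weekly_trend(history))  # burn jumped roughly 39% week over week
```

A 10% tolerance keeps normal day-to-day noise from reading as a trend; tune it to your traffic.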
Monthly Review (30 minutes)
Once a month, do a comprehensive review:
- Calculate ROI - Use the framework from our ROI guide to assess value delivered.
- Review system prompt - Based on conversation samples, does the system prompt need updating? Common adjustments include refining tone, adding new knowledge, and improving edge case handling.
- Evaluate model choice - Would a different model provide better quality or lower cost? See the model comparison.
- Assess skill performance - Are installed skills being triggered and providing value? Are there new skills in the marketplace that could help?
- Plan optimizations - Based on the monthly data, identify 1-2 specific improvements to make.
Monitoring Multiple Agents
If you run multiple agents, monitoring scales differently.
Dashboard Organization
Use descriptive display names that let you quickly identify each agent's purpose:
- "Support Agent - Website Chat"
- "Support Agent - Telegram"
- "Community Bot - Discord"
- "Sales Assistant - WhatsApp"
Prioritize by Impact
Not all agents deserve equal monitoring attention. Prioritize:
- Customer-facing production agents - Monitor daily, review weekly
- Internal team agents - Monitor weekly
- Development and testing agents - Monitor when actively developing
Aggregate Metrics
Track total credit consumption across all agents to manage your overall budget. Individual agent tracking helps identify which agents consume the most resources and whether the cost is justified by the value delivered.
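If you export or note per-agent usage, ranking agents by consumption takes a few lines (the event shape here is a hypothetical example, not an EZClaws export format):

```python
from collections import Counter

def credit_breakdown(usage_events):
    """Total credits per agent, sorted with the biggest consumer first."""
    totals = Counter()
    for event in usage_events:
        totals[event["agent"]] += event["credits"]
    return totals.most_common()

events = [
    {"agent": "Support Agent - Website Chat", "credits": 60},
    {"agent": "Community Bot - Discord", "credits": 15},
    {"agent": "Support Agent - Website Chat", "credits": 45},
]
print(credit_breakdown(events))
```

The top of that list is where a cost-versus-value review pays off first.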
What to Do When Things Go Wrong
Agent Stops Responding
- Check dashboard status. Is it Running?
- If Error, check event logs for the error message.
- If Running but not responding, check your model provider's API status.
- Try restarting the agent from the dashboard.
- If the issue persists, check the troubleshooting guide.
Unexpected High Credit Usage
- Check if there is a legitimate traffic spike.
- Review agent event logs for unusual activity.
- If suspicious, stop the agent temporarily while you investigate.
- Check if a misconfigured skill or integration is generating excessive requests.
- Consider adding rate limiting in your configuration.
Quality Degradation
If your agent's response quality drops:
- Sample recent conversations and identify specific problems.
- Check if the model provider updated their model (this can change behavior).
- Review recent configuration changes that might have caused the issue.
- Revisit the system prompt and check for ambiguities or missing instructions.
- Verify skills are functioning correctly and providing accurate data.
Model Provider Outage
- Check the provider's status page (status.openai.com, status.anthropic.com).
- If the outage is brief, wait for recovery. Your agent will resume automatically.
- For extended outages, consider temporarily switching to a different model provider.
- Communicate with users if the outage significantly impacts service.
Advanced Monitoring
Custom Metrics
For advanced users, consider tracking custom metrics:
- Conversations per hour by channel - Understand traffic patterns
- Average messages per conversation - Indicates conversation complexity
- Skill trigger frequency - Which skills are most used
- Escalation reasons - What types of issues require human help
- First-response vs follow-up ratio - How many conversations are one-shot vs multi-turn
Automated Quality Checks
For high-volume agents, manual conversation sampling may not be sufficient. Consider:
- Running a separate small model to evaluate your agent's response quality
- Setting up keyword alerts for known problem phrases
- Tracking user repeated-question patterns (a user asking the same question twice usually means the first answer was inadequate)
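The repeated-question signal in particular is easy to automate if you have conversation transcripts. A naive sketch (real transcripts would need better normalization than lowercasing and whitespace collapsing):

```python
def repeated_questions(messages, min_repeats=2):
    """Find user messages repeated within a conversation (naive normalization)."""
    counts = {}
    for msg in messages:
        key = " ".join(msg.lower().split())  # collapse case and extra whitespace
        counts[key] = counts.get(key, 0) + 1
    return [q for q, n in counts.items() if n >= min_repeats]

convo = [
    "Where is my order?",
    "I ordered last week",
    "where is my order?",
]
print(repeated_questions(convo))  # ['where is my order?']
```

A user asking the same thing twice is one of the cheapest quality signals available: no model evaluation required.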
Conclusion
Monitoring is not glamorous, but it is the difference between an AI agent that reliably delivers value and one that silently fails or wastes money. The good news is that with the EZClaws real-time dashboard and a consistent monitoring routine, it takes minimal time to keep your agents healthy.
Start with the daily 30-second check. Build up to weekly and monthly reviews as your agent operation matures. Enable alerts for the critical conditions. And never stop sampling conversations. The numbers tell you how your agent is performing. The conversations tell you how your users are experiencing it.
For more on running effective AI agents, check our deployment tutorial, configuration guide, and troubleshooting guide.
Frequently Asked Questions
How do I know if my agent is working?
Check three things: status is 'Running' on the EZClaws dashboard, usage credits are being consumed (indicating the agent is processing requests), and the agent event log shows no error events. For deeper verification, send a test message and confirm a proper response.
What should I do when my agent shows an Error status?
Check the agent event log on the detail page for error messages. Common causes include an invalid or expired API key, exhausted usage credits, or a model provider outage. Fix the underlying issue and restart the agent from the dashboard. See our troubleshooting guide for specific error resolution steps.
How often should I monitor my agent?
Daily quick checks (30 seconds) to verify the agent is running and credits look normal. Weekly deeper reviews (10 minutes) to check usage trends, sample conversations, and review any logged events. Monthly comprehensive reviews to evaluate performance against goals and optimize configuration.
Can I get alerted when something goes wrong?
Yes. EZClaws supports notification preferences in the settings section. You can configure alerts for agent status changes (running to error), low credit warnings, and unusual activity spikes. Email notifications ensure you catch issues even when you are not looking at the dashboard.
What metrics should I track?
The five most important metrics are: uptime (percentage of time the agent is running), response time (how fast it replies), credit consumption rate (cost tracking), error rate (percentage of failed requests), and conversation quality (how well it handles conversations).