Lab 05: Azure Monitor, Alerts, Backup, and Log Analytics
Estimated time: 50–65 minutes | Difficulty: ⭐⭐⭐☆☆ | Environment: Azure free account (Azure CLI + portal)
Prerequisites
az account show --query "{name:name, id:id}" -o table
Lab Objectives
- Create a Log Analytics workspace
- Enable diagnostic settings on a VM to stream logs and metrics
- Write and run KQL queries
- Create a metric alert with an email action group
- Configure VM backup in a Recovery Services vault
- Perform a file-level restore from a backup
Step 1: Create Resource Group and Supporting Resources
RG="rg-az104-lab05"
LOCATION="eastus"
az group create --name $RG --location $LOCATION
# Create Log Analytics Workspace
az monitor log-analytics workspace create \
--resource-group $RG \
--workspace-name "law-lab05" \
--location $LOCATION \
--retention-time 30
LAW_ID=$(az monitor log-analytics workspace show \
--resource-group $RG \
--workspace-name "law-lab05" \
--query id -o tsv)
echo "Log Analytics Workspace ID: $LAW_ID"
Step 2: Deploy a VM to Monitor
az vm create \
--resource-group $RG \
--name "vm-monitor" \
--image Ubuntu2204 \
--admin-username azureuser \
--admin-password "P@ssw0rd!Azure104" \
--size Standard_B2s \
--location $LOCATION
VM_ID=$(az vm show \
--resource-group $RG \
--name "vm-monitor" \
--query id -o tsv)
echo "VM ID: $VM_ID"
Step 3: Enable Diagnostic Settings
Route VM metrics and logs to the Log Analytics workspace:
az monitor diagnostic-settings create \
--name "diag-vm-to-law" \
--resource "$VM_ID" \
--workspace "$LAW_ID" \
--metrics '[{"category": "AllMetrics", "enabled": true, "retentionPolicy": {"enabled": false, "days": 0}}]'
Install the Azure Monitor Agent (AMA) on the VM to collect OS-level performance counters:
az vm extension set \
--resource-group $RG \
--vm-name "vm-monitor" \
--name AzureMonitorLinuxAgent \
--publisher Microsoft.Azure.Monitor \
--version 1.0 \
--enable-auto-upgrade true
⚠️ Tricky spot: Diagnostic Settings alone send Azure platform metrics (host-level CPU, network). To get OS-level metrics (memory, disk free space, custom app counters), you also need the Azure Monitor Agent installed inside the VM. These are separate data streams.
Step 4: Create an Action Group for Alerts
# Create an action group with email notification
az monitor action-group create \
--resource-group $RG \
--name "ag-lab05" \
--short-name "lab05" \
--action email "lab-admin" "your-email@example.com"
AG_ID=$(az monitor action-group show \
--resource-group $RG \
--name "ag-lab05" \
--query id -o tsv)
echo "Action Group ID: $AG_ID"
Step 5: Create Metric Alerts
# Alert: CPU > 85% for 5 minutes
az monitor metrics alert create \
--name "alert-high-cpu" \
--resource-group $RG \
--scopes "$VM_ID" \
--condition "avg Percentage CPU > 85" \
--window-size 5m \
--evaluation-frequency 1m \
--action "$AG_ID" \
--severity 2 \
--description "CPU exceeded 85% for 5 minutes"
# Alert: Available memory < 500 MB
az monitor metrics alert create \
--name "alert-low-memory" \
--resource-group $RG \
--scopes "$VM_ID" \
--condition "avg Available Memory Bytes < 524288000" \
--window-size 5m \
--evaluation-frequency 1m \
--action "$AG_ID" \
--severity 3 \
--description "Available memory below 500 MB"
# Verify alerts
az monitor metrics alert list \
--resource-group $RG \
--query "[].{name:name, severity:severity, condition:criteria.allOf[0].metricName}" \
-o table
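The low-memory threshold above is given in bytes. A minimal sketch of the arithmetic, assuming binary megabytes (1 MB = 1024 × 1024 bytes); the helper name mb_to_bytes is hypothetical and not part of the Azure CLI:

```shell
# Hypothetical helper (not an Azure CLI command): convert binary megabytes to bytes.
mb_to_bytes() {
  local mb="$1"
  echo $(( mb * 1024 * 1024 ))
}

# 500 MB expressed in bytes, matching the value used in alert-low-memory.
THRESHOLD_BYTES=$(mb_to_bytes 500)
echo "$THRESHOLD_BYTES"   # 524288000
```

Computing the value this way keeps the intended unit visible next to the number passed to --condition.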
⚠️ Tricky spot: --window-size is the period over which the metric is aggregated, and --evaluation-frequency is how often Azure evaluates the rule. window-size >= evaluation-frequency is required. With a frequency of 1m and a window of 5m, Azure checks every minute whether the average over the preceding 5 minutes exceeded the threshold.
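The window/frequency rule can be checked locally before calling the CLI. This is only a sketch: the helper names to_minutes and window_covers_frequency are hypothetical, and it assumes single-unit durations such as 5m or 1h rather than compound values like 1h30m.

```shell
# Hypothetical helper: convert a single-unit duration (e.g. 5m, 1h) to minutes.
to_minutes() {
  local value="$1"
  case "$value" in
    *m) echo "${value%m}" ;;
    *h) echo $(( ${value%h} * 60 )) ;;
    *)  echo "unsupported duration: $value" >&2; return 1 ;;
  esac
}

# Exit status 0 when window-size >= evaluation-frequency, 1 when it is smaller,
# 2 when either duration cannot be parsed.
window_covers_frequency() {
  local window freq
  window=$(to_minutes "$1") || return 2
  freq=$(to_minutes "$2") || return 2
  [ "$window" -ge "$freq" ]
}

window_covers_frequency 5m 1m && echo "5m/1m: ok"
window_covers_frequency 1m 5m || echo "1m/5m: rejected"
```

The first pair matches the values used in this step; the second is the combination that the alert rule would not accept.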
Step 6: Create an Activity Log Alert
Activity log alerts fire on Azure management operations:
SUBSCRIPTION_ID=$(az account show --query id -o tsv)
# Alert when any VM in the subscription is deleted
az monitor activity-log alert create \
--name "alert-vm-deleted" \
--resource-group $RG \
--scope "/subscriptions/$SUBSCRIPTION_ID" \
--condition "category=Administrative and operationName=Microsoft.Compute/virtualMachines/delete and status=Succeeded" \
--action-group "$AG_ID" \
--description "Alert when a VM is deleted anywhere in the subscription"
⚠️ Tricky spot: Activity log alerts are usually scoped to a subscription or resource group so that they catch operations on any resource inside it; the alert above covers the whole subscription. Metric alerts usually target specific resources, as in Step 5.
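The scopes used in Steps 5 and 6 are prefixes of one another: a full resource ID begins with its resource group scope, which begins with its subscription scope. A small sketch of deriving the broader scopes from a resource ID; the helper names are hypothetical and the subscription GUID is a placeholder:

```shell
# Hypothetical helpers: derive broader scopes from a full ARM resource ID.
subscription_scope() {
  # Keep "/subscriptions/<id>" and drop everything after it.
  echo "$1" | cut -d/ -f1-3
}

resource_group_scope() {
  # Keep "/subscriptions/<id>/resourceGroups/<rg>".
  echo "$1" | cut -d/ -f1-5
}

# Example ID with a placeholder subscription GUID (not a real subscription).
EXAMPLE_VM_ID="/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/rg-az104-lab05/providers/Microsoft.Compute/virtualMachines/vm-monitor"

subscription_scope "$EXAMPLE_VM_ID"
resource_group_scope "$EXAMPLE_VM_ID"
```

In the lab itself, the same functions could be applied to "$VM_ID" instead of the placeholder.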
Step 7: Query Logs with KQL
Wait 10–15 minutes for some activity data to flow into the workspace, then:
# Reminder of where to run the queries in the portal
echo "Open portal → Log Analytics Workspaces → law-lab05 → Logs"
Run these KQL queries in the Log Analytics workspace portal UI:
// Query 1: Recent Azure activity in this resource group
AzureActivity
| where TimeGenerated > ago(1h)
| where ResourceGroup == "rg-az104-lab05"
| project TimeGenerated, OperationNameValue, ActivityStatusValue, Caller
| order by TimeGenerated desc
// Query 2: Count operations by type
AzureActivity
| where TimeGenerated > ago(24h)
| summarize count() by OperationNameValue
| top 10 by count_ desc
// Query 3: VM heartbeat check
Heartbeat
| where TimeGenerated > ago(30m)
| summarize LastHeartbeat = max(TimeGenerated) by Computer
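⚠️ Tricky spot: Data may take 5–10 minutes to appear in Log Analytics after enabling diagnostic settings. If queries return empty results, wait and retry. If AzureActivity stays empty, check that the subscription's activity log is being exported to this workspace; if Heartbeat stays empty, check that the agent is associated with a data collection rule that sends data to it.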
Step 8: Configure VM Backup
# Create a Recovery Services Vault
az backup vault create \
--resource-group $RG \
--name "rsv-lab05" \
--location $LOCATION
# View available backup policies
az backup policy list \
--resource-group $RG \
--vault-name "rsv-lab05" \
--query "[].name" \
-o tsv
# Enable backup for the VM using DefaultPolicy
az backup protection enable-for-vm \
--resource-group $RG \
--vault-name "rsv-lab05" \
--vm "vm-monitor" \
--policy-name "DefaultPolicy"
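# Trigger an immediate backup (don't wait for scheduled)
# --retain-until takes a UTC date in dd-mm-yyyy format and must be in the future;
# it is computed here as 30 days from today (GNU date syntax, as in Cloud Shell)
RETAIN_UNTIL=$(date -u -d "+30 days" +%d-%m-%Y)
az backup protection backup-now \
--resource-group $RG \
--vault-name "rsv-lab05" \
--container-name "IaasVMContainer;iaasvmcontainerv2;rg-az104-lab05;vm-monitor" \
--item-name "VM;iaasvmcontainerv2;rg-az104-lab05;vm-monitor" \
--backup-management-type AzureIaasVM \
--retain-until "$RETAIN_UNTIL"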
Wait for the backup job to complete:
# Monitor backup job status
az backup job list \
--resource-group $RG \
--vault-name "rsv-lab05" \
--query "[0].{jobId:name, status:properties.status, operation:properties.operation}" \
-o table
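The status check above returns a single snapshot, so it has to be rerun until the job finishes. A minimal polling sketch follows. The helper name is hypothetical, and the status strings it matches (Completed, CompletedWithWarnings, Failed, Cancelled) are assumptions about what the job reports, so adjust them to what az backup job list shows.

```shell
# Hypothetical helper: rerun a status command until it prints a terminal state.
# Arguments: max attempts, delay in seconds, then the command to run.
wait_for_terminal_status() {
  local max_attempts="$1" delay="$2"
  shift 2
  local attempt=1 status
  while [ "$attempt" -le "$max_attempts" ]; do
    status="$("$@")"
    case "$status" in
      Completed|CompletedWithWarnings)
        echo "$status"; return 0 ;;
      Failed|Cancelled)
        echo "$status"; return 1 ;;
    esac
    sleep "$delay"
    attempt=$(( attempt + 1 ))
  done
  echo "TimedOut"
  return 2
}

# Intended use against the lab vault (assumes the newest job is listed first):
# wait_for_terminal_status 60 30 \
#   az backup job list --resource-group "$RG" --vault-name "rsv-lab05" \
#   --query "[0].properties.status" -o tsv
```

The function prints the final status and uses its exit code (0 success, 1 failure, 2 timeout) so that later steps can be chained with &&.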
Step 9: List Recovery Points
# List available recovery points (run after backup completes)
az backup recoverypoint list \
--resource-group $RG \
--vault-name "rsv-lab05" \
--container-name "IaasVMContainer;iaasvmcontainerv2;rg-az104-lab05;vm-monitor" \
--item-name "VM;iaasvmcontainerv2;rg-az104-lab05;vm-monitor" \
--backup-management-type AzureIaasVM \
--query "[].{name:name, time:properties.recoveryPointTime, type:properties.recoveryPointType}" \
-o table
⚠️ Tricky spot: Container name and item name use semicolon-separated compound identifiers. The container format is IaasVMContainer;iaasvmcontainerv2;<resource-group>;<vm-name>, and the item format is VM;iaasvmcontainerv2;<resource-group>;<vm-name>. Getting these wrong is a common reason backup CLI commands fail. You can get the exact values with az backup item list.
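To avoid retyping the compound identifiers, they can be assembled once from the resource group and VM name. This sketch only reproduces the format shown in this lab; the helper names are hypothetical, and az backup item list remains the authoritative source for the exact values.

```shell
# Hypothetical helpers: build the compound names in the format used in this lab.
backup_container_name() {
  echo "IaasVMContainer;iaasvmcontainerv2;$1;$2"
}

backup_item_name() {
  echo "VM;iaasvmcontainerv2;$1;$2"
}

CONTAINER_NAME=$(backup_container_name "rg-az104-lab05" "vm-monitor")
ITEM_NAME=$(backup_item_name "rg-az104-lab05" "vm-monitor")

# Always quote these variables: an unquoted semicolon ends the shell command.
echo "$CONTAINER_NAME"
echo "$ITEM_NAME"
```

The variables can then be passed as --container-name "$CONTAINER_NAME" and --item-name "$ITEM_NAME" in the backup commands.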
Step 10: Clean Up
# Disable backup protection first (required before deleting vault)
az backup protection disable \
--resource-group $RG \
--vault-name "rsv-lab05" \
--container-name "IaasVMContainer;iaasvmcontainerv2;rg-az104-lab05;vm-monitor" \
--item-name "VM;iaasvmcontainerv2;rg-az104-lab05;vm-monitor" \
--backup-management-type AzureIaasVM \
--delete-backup-data true \
--yes
# Then delete resource group
az group delete --name $RG --yes --no-wait
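⚠️ Tricky spot: You cannot delete a Recovery Services vault if it has active backup items or ASR replication. Always disable protection (with --delete-backup-data true) before deleting the vault. If soft delete is enabled on the vault, the deleted backup data is kept for 14 days and blocks vault deletion until it is purged.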
Lab Tricky Spots Summary
| Trap | Effect | Fix |
|---|---|---|
| Diagnostic Settings without AMA | Only host-level metrics; no OS metrics (memory, disk) | Install Azure Monitor Agent for full OS telemetry |
| Log data delay | KQL queries return empty immediately after setup | Wait 5–10 minutes for data ingestion |
| window-size < evaluation-frequency | Alert creation fails | Set window-size ≥ evaluation-frequency |
| Wrong backup container/item name format | Backup CLI commands fail | Use az backup item list to get exact names |
| Deleting vault with active items | Vault deletion blocked | Disable protection with --delete-backup-data true first |
Lab Takeaways
- Azure Monitor has two data types — metrics (automatic, 93 days) and logs (require workspace, 30 days default). Know which one you're querying.
- Three alert types — metric (threshold on numeric value), log query (KQL result), activity log (Azure management operation). Match the right alert type to the scenario.
- Recovery Services vault is required for VM backup and ASR. The vault must be in the same region as the VM.
- Backup soft delete means deleted backup data persists for 14 days. You cannot delete the vault until all backup data is purged.
- Test your DR — ASR test failover and backup restore drills should be scheduled regularly, not just done once.