Welcome to Azure Network Watcher

Welcome everyone! Today I'm going to walk you through Azure Network Watcher, which is Microsoft's comprehensive network monitoring and diagnostic service.

Think of Network Watcher as your network detective - it helps you understand what's happening in your Azure network infrastructure, diagnose connectivity issues, and monitor performance in real time.

By the end of this presentation, you'll know how to set up monitoring, configure diagnostics, troubleshoot network problems, and optimize your network performance using Network Watcher's powerful tools.

Understanding the Architecture

Let me start by showing you how Network Watcher fits into your Azure environment. This architecture diagram illustrates the complete ecosystem.

Network Watcher sits at the center, connecting to various diagnostic tools like Connection Troubleshoot, Next Hop analysis, and Packet Capture. Notice how it integrates with Log Analytics for data analysis and Storage Accounts for raw data storage.

The beauty of this architecture is that it provides both real-time diagnostics and historical analysis. Your VMs and network security groups feed data into the system, while tools like Traffic Analytics provide insights back to you.

This integrated approach means you get a complete picture of your network health from multiple perspectives.

Getting Started - Prerequisites

Before we dive into the cool features, let's get Network Watcher properly set up. The first step is enabling Network Watcher in your region.

Here's something important to know - Network Watcher is automatically enabled when you create or update a virtual network in a region, and by default it lands in a resource group called NetworkWatcherRG. I always recommend explicitly enabling it so it ends up in the resource group you actually want.

Notice we're specifying the location as 'eastus' - you'll need to do this for each region where you have resources. The great thing is that once it's enabled, it works across all your virtual networks in that region.

This command ensures Network Watcher is properly configured and ready to monitor your network infrastructure.
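Because Network Watcher has to be enabled region by region, a small loop saves some retyping. What follows is only a sketch: the region list and the DRY_RUN switch are assumptions for illustration, and by default it just prints the commands instead of calling Azure.

```shell
#!/bin/sh
# Dry-run sketch: print (or run) the enable command for several regions.
# REGIONS and DRY_RUN are illustrative assumptions, not part of the original deck.
RESOURCE_GROUP="myResourceGroup"
REGIONS="eastus westus2 westeurope"
DRY_RUN="${DRY_RUN:-1}"   # set DRY_RUN=0 to actually call the Azure CLI

enable_watcher() {
  region="$1"
  cmd="az network watcher configure --resource-group $RESOURCE_GROUP --locations $region --enabled true"
  if [ "$DRY_RUN" = "1" ]; then
    echo "$cmd"          # show what would be executed
  else
    $cmd                 # run the same command for real
  fi
}

for region in $REGIONS; do
  enable_watcher "$region"
done
```

Reviewing the printed commands first and then rerunning with DRY_RUN=0 is a cautious way to roll this out across subscriptions.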

Storage Account Setup

Now we need a place to store all our diagnostic data. This storage account will hold packet captures, flow logs, and other network diagnostic information.

I'm using Standard_LRS here because it's cost-effective for most scenarios. If you need higher availability, you could use Standard_GRS for geo-redundancy, but that comes with additional costs.

The StorageV2 kind gives us access to all the latest features including blob storage tiers, which is perfect for managing costs on older diagnostic data.

Remember, this storage account name needs to be globally unique across all of Azure, so choose something descriptive but unique to your organization.

Log Analytics Workspace

The Log Analytics workspace is where the magic happens for data analysis. This is where Network Watcher will send processed flow logs and where Traffic Analytics will generate insights.

I'm using the PerGB2018 pricing tier, which offers the best value for most organizations. It gives you 31 days of retention and pay-per-GB pricing. If you have predictable data volumes, you might consider commitment tiers for cost savings.

This workspace will become your central hub for querying network data, creating alerts, and building dashboards. You'll spend a lot of time here once everything is configured!

Flow Monitoring Pipeline

This sequence diagram shows exactly what happens when network traffic flows through your infrastructure. Let me walk you through each step.

When a user makes a network request, it first hits your Network Security Group. The NSG evaluates its security rules and makes a decision - allow or deny.

Here's where it gets interesting - Flow Logs capture this decision and simultaneously do two things: store the raw logs in your storage account and send processed data to Log Analytics.

Traffic Analytics then takes this processed data and generates insights like geo-mapping, security threat detection, and performance analytics. It's like having a network analyst working 24/7!

Basic Flow Log Configuration

Let's start with basic flow log configuration. This command creates a flow log for a specific Network Security Group.

Notice we need the full resource ID of the NSG - this ensures we're targeting exactly the right security group. The storage account parameter tells Network Watcher where to store the raw flow data.

This gives us basic logging, but we're going to enhance this in the next step with Traffic Analytics and version 2 logging for much richer data.

The location parameter should match where your NSG is deployed - this is important for data residency and performance.

Enhanced Flow Logs with Analytics

Now let's supercharge our flow logging! This enhanced configuration adds several powerful features.

Version 2 logs include additional fields like flow state, throughput information, and enhanced security details. The 30-day retention means we keep logs in storage for a month before automatic cleanup.

The real game-changer here is Traffic Analytics. When enabled, it provides geo-mapping showing where your traffic originates, identifies top talkers, and even detects potential security threats.

The workspace parameter connects this to our Log Analytics workspace, enabling powerful KQL queries and alerting capabilities.

Packet Capture Workflow

This diagram shows the complete packet capture workflow. Unlike flow logs which show decisions, packet captures give you the actual network packets for deep analysis.

You can trigger captures manually when troubleshooting specific issues, schedule them for regular monitoring, or even set up alert-based triggers for automatic capture when problems occur.

The captured data gets stored in your storage account, and then you can analyze it using tools like Wireshark, Microsoft Network Monitor, or custom scripts.

The key is using filters to capture only what you need - unfiltered captures can generate massive amounts of data very quickly!

Basic Packet Capture

Here's how to create a basic packet capture. This is your starting point for deep network analysis.

The command targets a specific VM, which needs the Network Watcher Agent extension. The portal installs the extension for you; from the CLI you may need to install it yourself if it's not already there. The capture will include all traffic flowing through that VM's network interfaces.

This basic capture relies on the defaults: full packets, a time limit of 18000 seconds (5 hours), and a session cap of 1 GB. While this gives you complete data, it can quickly consume storage space, so we'll add tighter controls in the next example.

Optimized Packet Capture

This is where packet capture becomes really powerful. Look at all these optimization settings!

The 5-minute time limit prevents runaway captures, while the 128 bytes per packet limit captures just the headers - this reduces file size by up to 90% while preserving the information you need for most troubleshooting.

The 100MB session limit provides an additional safety net. These limits work together to give you meaningful data without breaking your storage budget.

The storage path parameter lets you organize captures into folders - very helpful when you're running multiple captures for different issues.
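The limits are passed as raw seconds and bytes, so it helps to do the conversion once and keep it next to the command. Here is a minimal sketch of that arithmetic; the 1500-byte frame size used to estimate the truncation saving is an assumption (a typical full-size Ethernet payload), and real savings depend on your traffic mix.

```shell
#!/bin/sh
# Convert human-friendly limits into the seconds and bytes the capture expects.
TIME_LIMIT_MIN=5
SESSION_LIMIT_MB=100
BYTES_PER_PACKET=128
FULL_FRAME_BYTES=1500   # assumed typical full-size packet, for the estimate only

TIME_LIMIT_SEC=$((TIME_LIMIT_MIN * 60))
SESSION_LIMIT_BYTES=$((SESSION_LIMIT_MB * 1024 * 1024))
# Integer percentage saved when a full-size packet is cut to the header bytes.
SAVING_PCT=$(( (FULL_FRAME_BYTES - BYTES_PER_PACKET) * 100 / FULL_FRAME_BYTES ))

echo "time limit: ${TIME_LIMIT_SEC} seconds"
echo "session limit: ${SESSION_LIMIT_BYTES} bytes"
echo "estimated saving on full-size packets: ${SAVING_PCT}%"
```

Small packets such as bare ACKs are already close to 128 bytes, so the overall reduction on a real capture will usually be lower than the full-size estimate.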

Targeted Packet Filtering

This is where packet capture becomes surgical. Instead of capturing everything, we're using filters to target exactly what we need.

This filter captures only TCP traffic on port 80 from a specific local IP address. The asterisks for remote IP and port mean we'll capture traffic to any destination.

You can create multiple filters in the JSON array to capture different types of traffic. For example, you might capture both HTTP and HTTPS traffic, or focus on traffic between specific subnets.

Filtering is crucial for reducing noise and focusing on the actual problem you're trying to solve.
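To make the "multiple filters" idea concrete, here is a sketch of a filter array that covers both HTTP and HTTPS on the same local address. The field names are the ones from the slide; the IP address is the sample value used there, and the FILTERS variable name is just an illustration.

```shell
#!/bin/sh
# Two filter entries in one JSON array: HTTP (80) and HTTPS (443) on the same local IP.
# 10.0.0.4 is the sample address from the slide; replace it with your VM's address.
FILTERS='[
  {"protocol": "TCP", "localIPAddress": "10.0.0.4", "localPort": "80",  "remoteIPAddress": "*", "remotePort": "*"},
  {"protocol": "TCP", "localIPAddress": "10.0.0.4", "localPort": "443", "remoteIPAddress": "*", "remotePort": "*"}
]'

# Print the array so it can be reviewed before use.
printf '%s\n' "$FILTERS"

# The variable can then be handed to the capture command, for example:
#   az network watcher packet-capture create ... --filters "$FILTERS"
```

Keeping the filters in a variable also makes it easy to reuse the same definition across several captures.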

Connection Troubleshooting Process

This flowchart shows my systematic approach to connection troubleshooting. Network Watcher's connectivity tools make this process much more efficient.

We start by defining the source and destination, then run a connectivity check. Based on whether the connection succeeds or fails, we follow different diagnostic paths.

For successful connections, we analyze performance metrics. For failures, we systematically check NSG rules, route tables, and firewall configurations.

The key is being methodical - this approach helps you quickly identify the root cause instead of guessing at solutions.

Basic Connectivity Testing

Let's test basic connectivity between two points in your network. This command checks if traffic can flow from a source VM to a destination IP on a specific port.

The test happens at the network layer and shows you exactly where traffic might be getting blocked. You'll get back connectivity status, latency information, and a hop-by-hop analysis of the path.

Port 443 here means we're testing HTTPS connectivity. This is super useful for validating application connectivity or testing after configuration changes.

The diagnostic returns detailed information about each hop in the path, making it easy to identify where problems occur.

Advanced Connectivity Testing

This advanced version shows testing between two VMs using resource IDs instead of IP addresses. This is particularly useful because it automatically resolves to current IP addresses.

We're testing SQL Server connectivity on port 1433 here. The protocol specification ensures we're testing the exact type of traffic your application uses.

The IPv4 preference is useful in dual-stack environments where you want to force testing over a specific IP version.

Resource-based targeting is great for dynamic environments where IP addresses might change due to scaling or redeployment.

Network Routing Analysis

This diagram illustrates how Azure determines the next hop for network traffic. Understanding this is crucial for troubleshooting routing issues.

Azure picks the route whose address prefix is the longest match for the destination. Only when several routes share the same prefix does the route source break the tie: User Defined Routes win over BGP routes, which win over system routes.

Next hop types include Internet for external traffic, Virtual Appliance for traffic going through firewalls or routers, VnetLocal for traffic staying within the virtual network, and None when traffic should be dropped.

This systematic evaluation helps you predict and troubleshoot how traffic will flow through your network.
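If you want to reason about route selection offline, the logic can be modelled in a few lines of shell. This is a simplified sketch, not Azure's implementation: it applies longest-prefix match first and uses the route source (User, then BGP, then anything else) only as a tie-breaker. The route table, the source labels, and the function names are made up for illustration.

```shell
#!/bin/sh
# Simplified model of route selection: longest prefix first, then route source.
# Route lines are "prefix source next_hop"; the sample table below is invented.

# Convert a dotted-quad IPv4 address to an integer.
ip_to_int() {
  old_ifs=$IFS
  IFS=.
  set -- $1
  IFS=$old_ifs
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Succeed if IP $1 falls inside CIDR prefix $2 (for example 10.0.0.0/16).
in_prefix() {
  net=${2%/*}
  len=${2#*/}
  if [ "$len" -eq 0 ]; then return 0; fi
  mask=$(( (0xFFFFFFFF << (32 - len)) & 0xFFFFFFFF ))
  [ $(( $(ip_to_int "$1") & mask )) -eq $(( $(ip_to_int "$net") & mask )) ]
}

# Read routes on stdin and print the winning route for destination $1.
pick_route() {
  dest=$1
  best=""; best_len=-1; best_rank=9
  while read -r prefix src hop; do
    in_prefix "$dest" "$prefix" || continue
    plen=${prefix#*/}
    case $src in User) rank=0 ;; BGP) rank=1 ;; *) rank=2 ;; esac
    if [ "$plen" -gt "$best_len" ] || { [ "$plen" -eq "$best_len" ] && [ "$rank" -lt "$best_rank" ]; }; then
      best="$prefix $src $hop"; best_len=$plen; best_rank=$rank
    fi
  done
  echo "$best"
}

ROUTES='0.0.0.0/0 Default Internet
10.0.0.0/16 Default VnetLocal
10.0.1.0/24 User VirtualAppliance
0.0.0.0/0 User VirtualAppliance'

echo "$ROUTES" | pick_route 10.0.1.4   # the /24 user route is the most specific
echo "$ROUTES" | pick_route 10.0.2.5   # the /16 system route beats the user default route
echo "$ROUTES" | pick_route 8.8.8.8    # two /0 routes tie, so the user route wins
```

The second lookup is the one that tends to surprise people: a user-defined default route does not override a more specific system route.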

Network Topology Discovery

Network topology gives you a comprehensive view of your network architecture. This command discovers all the network resources and their relationships.

You'll see virtual networks, subnets, virtual machines, network security groups, and how they're all connected. This is invaluable for understanding your network layout and planning changes.

The topology view helps identify potential single points of failure, validates security boundaries, and ensures your network design matches your intended architecture.

Next Hop Route Analysis

Next hop analysis tells you exactly where traffic will go from a specific source to a destination. This is incredibly useful for troubleshooting routing problems.

The command shows the next hop type, the route table that made the decision, and the specific IP address traffic will be sent to. This helps validate that your custom routes are working as expected.

If you're having connectivity issues, this tool quickly shows whether it's a routing problem or something else like NSG rules or firewall configurations.

NSG Rule Evaluation Process

This diagram shows how Network Security Group rules are evaluated. Understanding this process is essential for security troubleshooting.

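For inbound traffic, the subnet-level NSG is evaluated first, then the network interface-level NSG; outbound traffic goes through them in the reverse order. Within each NSG, rules are processed by priority number, and lower numbers win - for custom rules, 100 is the highest priority and 4096 is the lowest.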

The first matching rule determines the action - allow or deny. This means rule order and priority are crucial for getting the security behavior you want.

Many connectivity issues are actually security rule problems, so understanding this evaluation process helps you troubleshoot faster.
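The first-match behaviour is easy to try out with a toy model. The sketch below sorts a rule list by priority number and stops at the first rule that matches the destination port. The three-column rule format and the sample rules are assumptions for illustration; real NSG rules also match on source, destination, protocol, and direction.

```shell
#!/bin/sh
# Toy model of NSG evaluation inside a single NSG.
# Rules arrive on stdin as "priority action port"; "*" matches any port.
evaluate_nsg() {
  port=$1
  # Lowest priority number is evaluated first; the first match decides.
  sort -n | while read -r priority action rule_port; do
    if [ "$rule_port" = "*" ] || [ "$rule_port" = "$port" ]; then
      echo "$action (priority $priority)"
      break
    fi
  done
}

# Sample rules, deliberately out of order. The 65500 entry stands in for the
# built-in deny-all default rule.
RULES='300 Deny *
100 Allow 443
200 Allow 22
65500 Deny *'

echo "$RULES" | evaluate_nsg 443    # matched by the priority 100 allow rule
echo "$RULES" | evaluate_nsg 3389   # falls through to the priority 300 deny rule
```

Remember that traffic has to be allowed by both the subnet NSG and the NIC NSG, so in practice you would run this kind of check once per NSG.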

Security Group Analysis

This command shows you the effective security rules for a specific VM, including both subnet and network interface level rules.

The output shows rule priorities, actions, and which NSG each rule comes from. This is invaluable for understanding the complete security posture of a VM.

Instead of manually checking multiple NSGs, this tool gives you a consolidated view of all rules that apply to your VM, making security auditing much easier.

Comprehensive Monitoring Pipeline

This diagram shows how all the monitoring pieces fit together in a comprehensive alerting and reporting system.

Network events flow into Azure Monitor, get processed in Log Analytics, and trigger alert rules that notify your team through multiple channels.

The system also supports custom dashboards, Power BI integration, and scheduled reports for proactive monitoring.

Action groups define who gets notified and how - email, SMS, webhooks, or even ITSM integrations for enterprise environments.

Setting Up Action Groups

Action groups are the foundation of your alerting system. They define who gets notified when alerts fire.

The short name is limited to 12 characters and appears in SMS and email notifications, so make it descriptive but concise.

This basic action group will be enhanced in the next steps with specific notification methods like email, SMS, and webhook integrations.

Email Notification Setup

Here we're adding email notifications to our action group. You can add multiple email addresses by repeating this command with different parameters.

The display name appears in the alert emails, so use descriptive names like "Network Admin" or "NOC Team" to make it clear who should respond.

You can also add SMS notifications using the same pattern: --add-action sms "Name" "Country Code" "Phone Number".

Performance Alert Configuration

This creates an alert for high network traffic. Notice we're monitoring Network In Total - this tracks incoming traffic to the VM.

The threshold is set to 1GB, but you'll want to adjust this based on your normal traffic patterns. The 15-minute window with 5-minute evaluation frequency provides good responsiveness without too many false positives.

Severity level 2 indicates a warning level alert. Use severity 0 for critical issues that need immediate response.

KQL Query for Traffic Analysis

Here's where Log Analytics becomes really powerful. This KQL query analyzes external traffic patterns to identify potential security threats.

We're looking at the last hour of external public traffic and summarizing total bytes by source IP. This helps identify potential data exfiltration or denial-of-service attacks.

The top 10 results give you the highest traffic sources, which you should investigate if they seem unusual for your environment.

Run queries like this regularly to establish baseline traffic patterns and quickly spot anomalies.

Security Analysis with KQL

This query focuses on blocked traffic analysis - essentially showing you what your NSG rules are protecting you from.

We're filtering for denied flows (FlowStatus_s == "D") and grouping by the NSG rule that blocked the traffic, destination port, and protocol.

This tells you which rules are most active and reveals common attack patterns by showing frequently blocked ports and protocols.

Use this information to validate that your security rules are working correctly and to identify potential threats.

Systematic Troubleshooting Approach

This decision matrix provides a structured approach to network troubleshooting. Different types of issues require different diagnostic tools.

Connectivity issues use the connection troubleshoot tool, performance problems need packet capture and metrics analysis, security issues require flow logs and NSG analysis, and routing problems need next hop and topology tools.

The key is matching the right tool to the type of problem you're investigating. This systematic approach saves time and leads to faster resolution.
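The decision matrix also fits in a tiny helper that a runbook or wiki page can carry around. This is only a sketch: the suggest_tool name and the issue-type keywords are assumptions, while the commands it prints are the ones used elsewhere in this deck.

```shell
#!/bin/sh
# Map an issue type to the Network Watcher command to start with.
suggest_tool() {
  case $1 in
    connectivity) echo "az network watcher test-connectivity" ;;
    performance)  echo "az network watcher packet-capture create" ;;
    security)     echo "az network watcher show-security-group-view" ;;  # pair with flow logs
    routing)      echo "az network watcher show-next-hop" ;;
    *)            echo "unknown issue type: $1" >&2; return 1 ;;
  esac
}

suggest_tool connectivity
suggest_tool routing
```

Returning a non-zero status for unknown types keeps the helper safe to use inside larger scripts.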

External Connectivity Testing

Sometimes you need to test connectivity to external services. This command tests HTTPS connectivity to Microsoft's website.

This validates outbound internet access, DNS resolution, and HTTPS connectivity all in one test. It's perfect for troubleshooting application connectivity issues.

The test shows you the complete path from your VM to the external destination, helping identify where connectivity might be failing.

Performance-Focused Packet Capture

For performance analysis, we need longer capture windows and focused filtering. This 10-minute capture focuses specifically on HTTPS traffic.

Longer capture windows help identify intermittent performance issues that might not show up in shorter captures.

By filtering for port 443, we're eliminating noise and focusing on potentially problematic HTTPS traffic patterns.

Best Practices and Cost Optimization

Let me wrap up with some key best practices I've learned from implementing Network Watcher in production environments.

Cost optimization is crucial - enable flow logs only on critical NSGs, use packet capture limits to prevent excessive storage usage, and implement data retention policies.

For security, remember that packet captures contain sensitive data, so secure your storage accounts properly and limit access to authorized personnel.

Consider automation using Logic Apps or Azure Functions to automatically respond to alerts and trigger diagnostics when issues occur.

These practices will help you get maximum value from Network Watcher while controlling costs and maintaining security.

🔍 Azure Network Watcher

Comprehensive Monitoring & Diagnostics

What You'll Learn Today

  • 🏗️ Architecture and Setup
  • 📊 Flow Monitoring & Analytics
  • 📦 Packet Capture Configuration
  • 🔧 Connection Troubleshooting
  • 🛡️ Security Analysis
  • 📈 Monitoring & Alerting

🏗️ Azure Network Watcher Architecture

graph TB
    subgraph "Azure Subscription"
        subgraph "Resource Group"
            NW[Network Watcher]
            LA[Log Analytics Workspace]
            SA[Storage Account]
        end
        subgraph "Virtual Network"
            VM1[Virtual Machine 1]
            VM2[Virtual Machine 2]
            NSG[Network Security Group]
            RT[Route Table]
        end
        subgraph "Monitoring Components"
            CT[Connection Troubleshoot]
            NT[Next Hop]
            SG[Security Group View]
            PT[Packet Capture]
            FL[Flow Logs]
            TA[Traffic Analytics]
        end
    end
    subgraph "External Services"
        AI[Application Insights]
        AM[Azure Monitor]
        PBI[Power BI]
    end
    NW --> CT
    NW --> NT
    NW --> SG
    NW --> PT
    NW --> FL
    NW --> TA
    FL --> LA
    TA --> LA
    PT --> SA
    LA --> AM
    AM --> AI
    AM --> PBI
    VM1 --> NSG
    VM2 --> NSG
    NSG --> FL
    RT --> NT

1. Enable Network Watcher

az network watcher configure \
  --resource-group myResourceGroup \
  --locations eastus \
  --enabled true
# Returns: Network Watcher enabled in eastus region

Key Parameters

  • --resource-group: Target resource group for Network Watcher
  • --locations: Azure region where Network Watcher will be enabled
  • --enabled true: Activates Network Watcher service
Important: Network Watcher must be enabled per region. Run this command for each region where you have resources.

2. Create Storage Account

az storage account create \
  --name networkwatcherstorage \
  --resource-group myResourceGroup \
  --location eastus \
  --sku Standard_LRS \
  --kind StorageV2
# Returns: Storage account created successfully

Storage Options

  • Standard_LRS: Cost-effective, locally redundant
  • Standard_GRS: Geo-redundant for high availability
  • Premium_LRS: High-performance SSD storage
  • StorageV2: Latest features and blob tiers

3. Create Log Analytics Workspace

az monitor log-analytics workspace create \
  --resource-group myResourceGroup \
  --workspace-name NetworkWatcherWorkspace \
  --location eastus \
  --sku PerGB2018
# Returns: Log Analytics workspace created with PerGB pricing

Pricing Tiers

  • PerGB2018: Pay-per-GB with 31-day retention
  • CapacityReservation: Commitment pricing (100GB, 200GB, etc.)
  • Free: 500MB daily limit, 7-day retention

🔄 Network Flow Monitoring Pipeline

sequenceDiagram
    participant User as User Traffic
    participant NSG as Network Security Group
    participant FL as Flow Logs
    participant SA as Storage Account
    participant LA as Log Analytics
    participant TA as Traffic Analytics
    User->>NSG: Network Request
    NSG->>NSG: Apply Security Rules
    NSG->>FL: Log Flow Decision
    FL->>SA: Store Raw Logs
    FL->>LA: Send Processed Data
    LA->>TA: Enable Analytics
    TA->>User: Generate Insights

4. Configure Basic Flow Logs

az network watcher flow-log create \
  --location eastus \
  --name myNSGFlowLog \
  --nsg /subscriptions/{subscription-id}/resourceGroups/myResourceGroup/providers/Microsoft.Network/networkSecurityGroups/myNSG \
  --storage-account networkwatcherstorage
# Returns: Flow log configuration created for NSG

Required Information

  • Full NSG Resource ID: Ensures exact targeting
  • Storage Account Name: Where raw logs are stored
  • Location: Must match NSG location

5. Enhanced Flow Logs with Analytics

az network watcher flow-log configure \
  --location eastus \
  --nsg /subscriptions/{subscription-id}/resourceGroups/myResourceGroup/providers/Microsoft.Network/networkSecurityGroups/myNSG \
  --storage-account networkwatcherstorage \
  --log-version 2 \
  --retention 30 \
  --traffic-analytics true \
  --workspace NetworkWatcherWorkspace
# Returns: Enhanced flow logging with Traffic Analytics enabled

Version 2 Enhancements

  • Flow state information
  • Throughput metrics
  • Enhanced security insights
  • Geo-mapping capabilities

📦 Packet Capture Workflow

graph LR
    subgraph "Packet Capture Workflow"
        A[Trigger Event] --> B[Start Capture]
        B --> C[Apply Filters]
        C --> D[Capture Packets]
        D --> E[Store in Storage]
        E --> F[Analyze with Tools]
    end
    subgraph "Capture Triggers"
        G[Manual Start]
        H[Scheduled Capture]
        I[Alert-based Trigger]
    end
    subgraph "Analysis Tools"
        J[Wireshark]
        K[Network Monitor]
        L[Custom Scripts]
    end
    G --> A
    H --> A
    I --> A
    F --> J
    F --> K
    F --> L

6. Basic Packet Capture

az network watcher packet-capture create \
  --name myPacketCapture \
  --vm myVM \
  --resource-group myResourceGroup \
  --location eastus \
  --storage-account networkwatcherstorage
# Returns: Packet capture session started on VM

Default Behavior

  • Captures all traffic through VM interfaces
  • Default limits are generous: 5 hours (18000 seconds) and 1 GB per session, with full packets captured
  • Needs a destination: a storage account, a local file path on the VM, or both
  • Requires Network Watcher extension on VM
Warning: Captures left at the default limits can consume significant storage space quickly!

7. Optimized Packet Capture

az network watcher packet-capture create \
  --name advancedPacketCapture \
  --vm myVM \
  --resource-group myResourceGroup \
  --location eastus \
  --storage-account networkwatcherstorage \
  --storage-path "/captures/advanced/" \
  --time-limit 300 \
  --capture-size 128 \
  --capture-limit 104857600
# Returns: Optimized capture with 5min limit, 128 bytes/packet, 100MB max

Optimization Benefits

  • Time Limit: Prevents runaway captures
  • Packet Truncation: Up to 90% storage reduction
  • Session Limit: Additional safety net
  • Organized Storage: Custom folder structure

8. Filtered Packet Capture

az network watcher packet-capture create \
  --name filteredCapture \
  --vm myVM \
  --resource-group myResourceGroup \
  --location eastus \
  --storage-account networkwatcherstorage \
  --filters '[{
    "protocol": "TCP",
    "localIPAddress": "10.0.0.4",
    "localPort": "80",
    "remoteIPAddress": "*",
    "remotePort": "*"
  }]'
# Returns: Filtered capture for TCP port 80 traffic from specific IP

Filter Options

  • protocol: TCP, UDP, Any
  • IP addresses: Specific IPs or wildcards (*)
  • Ports: Specific ports or ranges
  • Multiple filters: JSON array for complex scenarios

🔧 Connection Troubleshooting Process

graph TD
    subgraph "Connection Troubleshoot Process"
        A[Connection Issue Reported] --> B[Define Source and Destination]
        B --> C[Run Connectivity Check]
        C --> D{Connection Status}
        D -->|Success| E[Analyze Performance Metrics]
        D -->|Failed| F[Identify Failure Point]
        F --> G[Check NSG Rules]
        F --> H[Verify Route Tables]
        F --> I[Examine Firewall Rules]
        G --> J[Generate Remediation Plan]
        H --> J
        I --> J
        E --> K[Optimize Configuration]
        J --> L[Implement Fixes]
        K --> L
    end

9. Basic Connectivity Test

az network watcher test-connectivity \
  --source-resource /subscriptions/{subscription-id}/resourceGroups/myResourceGroup/providers/Microsoft.Compute/virtualMachines/sourceVM \
  --dest-address 10.0.1.4 \
  --dest-port 443 \
  --resource-group myResourceGroup
# Returns: Connectivity status, latency, hop-by-hop analysis

Test Results Include

  • Connection status (Reachable/Unreachable)
  • Average latency in milliseconds
  • Hop-by-hop path analysis
  • Failure point identification

10. Advanced Connectivity Testing

az network watcher test-connectivity \
  --source-resource /subscriptions/{subscription-id}/resourceGroups/myResourceGroup/providers/Microsoft.Compute/virtualMachines/sourceVM \
  --dest-resource /subscriptions/{subscription-id}/resourceGroups/myResourceGroup/providers/Microsoft.Compute/virtualMachines/destVM \
  --protocol TCP \
  --dest-port 1433 \
  --resource-group myResourceGroup \
  --preferred-ip-version IPv4
# Returns: SQL Server connectivity test between VMs

Advanced Features

  • Resource-based targeting: Auto-resolves IP addresses
  • Protocol specification: TCP, UDP, ICMP
  • IP version preference: IPv4 or IPv6
  • Application-specific ports: Database, web, custom

🗺️ Network Routing Analysis

graph TB
    subgraph "Network Topology Discovery"
        VM[Source VM] --> RT[Route Table Lookup]
        RT --> NH{Next Hop Type}
        NH -->|Internet| IG[Internet Gateway]
        NH -->|VirtualAppliance| VA[Network Virtual Appliance]
        NH -->|VnetLocal| VL[Virtual Network Local]
        NH -->|None| DN[Drop/None]
        IG --> Internet[Internet Destination]
        VA --> FW[Firewall/Router]
        VL --> Target[Target VM]
        DN --> X[Traffic Dropped]
    end
    subgraph "Routing Decision Factors"
        UDR[User Defined Routes]
        BGP[BGP Routes]
        SYS[System Routes]
        UDR --> RT
        BGP --> RT
        SYS --> RT
    end

11. Network Topology Discovery

az network watcher show-topology \
  --resource-group myResourceGroup \
  --location eastus
# Returns: Complete network topology with resource relationships

Topology Information

  • Virtual networks and subnets
  • Virtual machines and network interfaces
  • Network security groups
  • Route tables and associations
  • Resource relationships and dependencies
Scope: Topology is limited to the specified resource group and region.

12. Next Hop Route Analysis

az network watcher show-next-hop \
  --vm /subscriptions/{subscription-id}/resourceGroups/myResourceGroup/providers/Microsoft.Compute/virtualMachines/sourceVM \
  --source-ip 10.0.0.4 \
  --dest-ip 10.0.1.4 \
  --resource-group myResourceGroup
# Returns: Next hop type, route table ID, next hop IP address

Next Hop Types

  • VirtualAppliance: Traffic routed through NVA
  • VnetLocal: Traffic stays within virtual network
  • Internet: Traffic routed to internet
  • None: Traffic dropped (no valid route)

🛡️ NSG Rule Evaluation Process

graph LR
    subgraph "NSG Rule Evaluation Process"
        A[Incoming Traffic] --> B[Subnet NSG Rules]
        B --> C{Allow/Deny}
        C -->|Allow| D[NIC NSG Rules]
        C -->|Deny| E[Traffic Blocked]
        D --> F{Allow/Deny}
        F -->|Allow| G[Traffic Permitted]
        F -->|Deny| H[Traffic Blocked]
    end
    subgraph "Rule Priority"
        I[Priority 100<br/>Highest]
        J[Priority 200]
        K[Priority 300]
        L[Priority 4096<br/>Lowest]
    end
    I --> J --> K --> L

13. Security Group Analysis

az network watcher show-security-group-view \
  --vm /subscriptions/{subscription-id}/resourceGroups/myResourceGroup/providers/Microsoft.Compute/virtualMachines/myVM \
  --resource-group myResourceGroup
# Returns: Effective security rules for VM including subnet and NIC NSGs

Security View Output

  • Effective security rules
  • Rule priorities and actions
  • Source NSG for each rule
  • Network interface associations
  • Subnet-level security rules

📈 Comprehensive Monitoring Pipeline

graph TB
    subgraph "Monitoring Pipeline"
        A[Network Events] --> B[Azure Monitor]
        B --> C[Log Analytics Queries]
        C --> D[Alert Rules]
        D --> E[Action Groups]
        E --> F[Notifications]
        G[Metrics Collection] --> H[Custom Dashboards]
        H --> I[Power BI Integration]
        B --> G
        C --> J[Workbooks]
        J --> K[Scheduled Reports]
    end
    subgraph "Alert Destinations"
        L[Email Notifications]
        M[SMS Alerts]
        N[Webhook Integrations]
        O[ITSM Connectors]
    end
    F --> L
    F --> M
    F --> N
    F --> O

14. Create Action Group

az monitor action-group create \
  --name NetworkWatcherAlerts \
  --resource-group myResourceGroup \
  --short-name NetWatch
# Returns: Action group created for alert notifications

Action Group Features

  • Short Name: Max 12 characters for SMS/email
  • Multiple Actions: Email, SMS, webhook, functions
  • Reusable: Use across multiple alert rules
  • ITSM Integration: ServiceNow, Cherwell, etc.

15. Email Notification Setup

az monitor action-group update \
  --name NetworkWatcherAlerts \
  --resource-group myResourceGroup \
  --add-action email "NetworkAdmin" "admin@company.com"
# Returns: Email notification added to action group

Additional Notification Types

  • SMS: --add-action sms "Name" "Country" "Phone"
  • Webhook: --add-action webhook "Name" "URI"
  • Azure Function: --add-action azurefunction "Name" "ResourceID"
  • Logic App: --add-action logicapp "Name" "ResourceID"

16. Performance Alert Configuration

az monitor metrics alert create \
  --name HighNetworkIn \
  --resource-group myResourceGroup \
  --scopes /subscriptions/{subscription-id}/resourceGroups/myResourceGroup/providers/Microsoft.Compute/virtualMachines/myVM \
  --condition "avg Network In Total > 1000000000" \
  --description "Alert when network input exceeds 1GB" \
  --evaluation-frequency 5m \
  --window-size 15m \
  --severity 2 \
  --action NetworkWatcherAlerts
# Returns: Performance alert configured with 1GB threshold

Alert Severity Levels

  • 0 - Critical: Immediate response required
  • 1 - Error: Significant issue
  • 2 - Warning: Potential problem
  • 3 - Informational: General information

17. Traffic Analysis with KQL

// Top source IPs by traffic volume
AzureNetworkAnalytics_CL
| where TimeGenerated > ago(1h)
| where FlowType_s == "ExternalPublic"
| summarize TotalBytes = sum(TotalBytesFlowed_d) by SrcIP_s
| top 10 by TotalBytes desc
// Results show potential data exfiltration or DoS sources

Query Components

  • TimeGenerated > ago(1h): Last hour filter
  • FlowType_s == "ExternalPublic": External traffic only
  • summarize TotalBytes: Aggregate by source IP
  • top 10: Highest traffic sources

18. Security Analysis with KQL

// Analyze blocked connections by NSG rules
AzureNetworkAnalytics_CL
| where TimeGenerated > ago(24h)
| where FlowStatus_s == "D" // Denied flows
| summarize BlockedCount = count() by NSGRule_s, DestPort_d, Protocol_s
| order by BlockedCount desc
// Shows which NSG rules are most active in blocking threats

Security Insights

  • FlowStatus_s == "D": Denied/blocked flows
  • NSGRule_s: Which rule blocked the traffic
  • DestPort_d, Protocol_s: Attack patterns
  • BlockedCount: Volume of blocked attempts

🔍 Systematic Troubleshooting Approach

graph TD
    subgraph "Troubleshooting Decision Matrix"
        A[Network Issue Reported] --> B{Issue Type}
        B -->|Connectivity| C[Connection Troubleshoot]
        B -->|Performance| D[Packet Capture + Metrics]
        B -->|Security| E[Flow Logs + NSG Analysis]
        B -->|Routing| F[Next Hop + Topology]
        C --> G[Test Connectivity Tool]
        D --> H[Capture Filtered Traffic]
        E --> I[Analyze Security Groups]
        F --> J[Route Table Analysis]
        G --> K{Resolution}
        H --> K
        I --> K
        J --> K
        K -->|Resolved| L[Document Solution]
        K -->|Escalate| M[Advanced Diagnostics]
    end

19. External Connectivity Testing

az network watcher test-connectivity \
  --source-resource /subscriptions/{subscription-id}/resourceGroups/myResourceGroup/providers/Microsoft.Compute/virtualMachines/sourceVM \
  --dest-address www.microsoft.com \
  --dest-port 443 \
  --resource-group myResourceGroup \
  --protocol TCP \
  --preferred-ip-version IPv4
# Returns: External HTTPS connectivity test with DNS resolution

External Test Benefits

  • Validates outbound internet access
  • Tests DNS resolution
  • Verifies firewall rules
  • Confirms application connectivity

20. Performance-Focused Packet Capture

az network watcher packet-capture create \
  --name performanceAnalysis \
  --vm myVM \
  --resource-group myResourceGroup \
  --location eastus \
  --storage-account networkwatcherstorage \
  --time-limit 600 \
  --filters '[{
    "protocol": "TCP",
    "localPort": "443",
    "remoteIPAddress": "*"
  }]'
# Returns: 10-minute HTTPS-focused capture for performance analysis

Performance Analysis Strategy

  • Longer capture windows: Identify intermittent issues
  • Protocol-specific filtering: Reduce noise
  • Port-based targeting: Focus on problem areas
  • Time-bound sessions: Prevent storage bloat

🎯 Best Practices & Cost Optimization

💰 Cost Optimization

  • Enable flow logs selectively on critical NSGs
  • Use packet capture limits and filters
  • Implement data retention policies
  • Monitor Log Analytics ingestion costs

🔒 Security Considerations

  • Secure storage accounts with proper access controls
  • Limit packet capture access to authorized personnel
  • Regular review and deletion of sensitive data
  • Use Azure Key Vault for connection strings

🤖 Automation Opportunities

  • Logic Apps: Automated response to alerts
  • Azure Functions: Custom diagnostic workflows
  • PowerShell/CLI Scripts: Scheduled monitoring tasks
  • Azure Automation: Runbook-based remediation

Thank You! πŸŽ‰

Questions & Discussion
