6.1) VoIP Fundamentals
1. The Standard Call Path (The Chain)
Understanding the journey of a packet is essential for isolating call drops and audio degradation. This “Chain” connects the internal user to the global telephone network.
- PSTN (Public Switched Telephone Network): The global collection of interconnected circuit-switched networks. This is the “outside world” where traditional E.164 phone numbers reside.
- SIP Trunk: A virtual version of an analog phone line. It utilizes the Session Initiation Protocol (SIP) to connect a PBX to the PSTN over an internet connection.
- SBC (Session Border Controller): The gatekeeper and firewall for VoIP. It sits at the network “border” to manage security, NAT traversal, and protocol translation.
- PABX / PBX (Private Automatic Branch Exchange): The “brain” of the phone system. It manages internal switching and call routing logic. In the M365 ecosystem, this is replaced by the Microsoft Phone System.
- IVR (Interactive Voice Response): The automated menu logic (e.g., “Press 1 for Sales”). In Teams architecture, these are configured as Auto Attendants.
- Call Queues: The logic used to hold callers in a line until an agent is available, typically incorporating Music on Hold (MoH) and specific routing methods (Circular, Longest Idle).
- Handsets / Endpoints: The physical IP phones or “softphone” clients (the Teams app) where the audio termination occurs.
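
The chain above lends itself to hop-by-hop fault isolation. The following is a minimal sketch, assuming an inbound call that traverses the components in the order listed; the `CALL_PATH` names and the `first_failing_hop` helper are hypothetical and only illustrate the idea of walking the path until a check fails.

```python
# Inbound call path in the order listed above (PSTN towards the user).
# The hop names are illustrative labels, not identifiers from any product.
CALL_PATH = ["PSTN", "SIP Trunk", "SBC", "PBX", "IVR", "Call Queue", "Endpoint"]


def first_failing_hop(results):
    """Return the first hop in CALL_PATH whose check failed, or None.

    `results` maps hop name -> bool (True means the hop looked healthy).
    Hops missing from `results` are treated as unchecked and skipped.
    """
    for hop in CALL_PATH:
        if results.get(hop) is False:
            return hop
    return None
```

For example, if the trunk checks out but the SBC does not, `first_failing_hop({"PSTN": True, "SIP Trunk": True, "SBC": False})` points at the SBC as the first place to investigate.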
2. Technical VoIP Vocabulary
When troubleshooting with “comms” or telco engineers, the following metrics and terms define the health of the voice stream:
- Signaling (SIP): The “handshake” process that sets up, manages, and tears down a call. If signaling fails, the call will never connect.
- Media (RTP/SRTP): The actual voice data packets, carried via the Real-time Transport Protocol (SRTP is its encrypted variant). If signaling succeeds but media fails, the call connects but the result is “one-way audio” or no audio at all.
- Codecs: Algorithms that compress and decompress audio data (e.g., G.711 for “toll quality” or SILK/Satin for high-fidelity Teams audio).
- E.164: The international numbering plan ensuring unique global phone numbers (e.g., +61 3 ...). Teams requires all numbers to be normalized to this format.
- Latency (Delay): The time taken for a packet to travel between points. For acceptable VoIP quality, target < 200ms.
- Jitter: The variation in the arrival time of packets. High jitter causes “robotic” audio. Target < 30ms.
- Packet Loss: The percentage of packets that fail to reach their destination. Even 1% loss can cause significant audio degradation.
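
The three health metrics above can be computed from packet timestamps and counters. The sketch below is illustrative only: the jitter function follows the smoothing formula from RFC 3550 (section 6.4.1), while the function names, the input shape, and the use of the targets quoted above as hard limits are assumptions made for this example.

```python
# Targets quoted in the list above; real deployments may use stricter limits.
MAX_LATENCY_MS = 200.0
MAX_JITTER_MS = 30.0
MAX_LOSS_PCT = 1.0


def interarrival_jitter(packets):
    """Running jitter estimate in the style of RFC 3550, section 6.4.1.

    `packets` is a list of (sent_ms, received_ms) pairs in arrival order.
    """
    jitter = 0.0
    for (prev_sent, prev_recv), (sent, recv) in zip(packets, packets[1:]):
        # D = change in transit time between two consecutive packets.
        d = (recv - prev_recv) - (sent - prev_sent)
        # Smooth with gain 1/16 so a single late packet does not dominate.
        jitter += (abs(d) - jitter) / 16.0
    return jitter


def loss_percent(expected, received):
    """Percentage of expected packets that never arrived."""
    if expected <= 0:
        return 0.0
    return 100.0 * (expected - received) / expected


def assess(latency_ms, jitter_ms, loss_pct):
    """Return the names of the metrics that miss the targets above."""
    problems = []
    if latency_ms >= MAX_LATENCY_MS:
        problems.append("latency")
    if jitter_ms >= MAX_JITTER_MS:
        problems.append("jitter")
    if loss_pct >= MAX_LOSS_PCT:
        problems.append("packet loss")
    return problems
```

A perfectly paced stream (constant transit time) yields a jitter of zero; any change in transit time between consecutive packets pushes the estimate up gradually rather than all at once.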
3. Mapping Teams to Traditional VoIP
| Traditional Term | Teams Equivalent | Technical Role |
|---|---|---|
| Trunk / Line | Calling Plan / Operator Connect | The PSTN connection |
| SBC | Direct Routing SBC | The physical/virtual gateway |
| Extension | User UPN / Line URI | The unique identifier for a user |
| IVR | Auto Attendant | The “Front Door” menu logic |
| Hunt Group | Call Queue | Distributing calls to a group |
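
Because an extension maps to a Line URI and Teams expects E.164, a dialed string usually has to be normalized before it can be assigned or routed. The following is a minimal sketch, not the Teams normalization engine: `to_e164` and `to_line_uri` are hypothetical helpers, the default country code and trunk prefix assume Australian dialing, and the sample number is fictitious.

```python
import re

# E.164: a leading "+" followed by at most 15 digits, first digit non-zero.
E164_PATTERN = re.compile(r"^\+[1-9]\d{1,14}$")


def to_e164(dialed, country_code="61", trunk_prefix="0"):
    """Normalize a dialed string to E.164 (hypothetical helper).

    Assumes national numbers start with `trunk_prefix`, which is dropped
    when the country code is prepended.
    """
    digits = re.sub(r"[^\d+]", "", dialed)  # drop spaces, dashes, brackets
    if digits.startswith("+"):
        candidate = digits
    elif digits.startswith(trunk_prefix):
        candidate = "+" + country_code + digits[len(trunk_prefix):]
    else:
        candidate = "+" + country_code + digits
    if not E164_PATTERN.match(candidate):
        raise ValueError(f"not a valid E.164 number: {dialed!r}")
    return candidate


def to_line_uri(e164_number):
    """Wrap an E.164 number in the tel: form used for a Line URI."""
    return "tel:" + e164_number
```

With these assumptions, `to_line_uri(to_e164("(03) 5550 0123"))` produces `tel:+61355500123`, and a string that cannot be normalized raises `ValueError` instead of silently passing through.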
4. The Multi-Faceted Role of the SBC
In modern environments, the Session Border Controller (SBC) acts as both an external gateway and an internal aggregator for legacy hardware.
A. Northbound: The PSTN Gateway
This “External” face sits between the Microsoft Phone System and the telecommunications provider.
- Carrier Interconnect: Terminates the SIP Trunk from providers (e.g., Telstra, Optus) and translates provider-specific SIP signaling into Teams-compatible formats.
- Security & NAT Traversal: Protects the internal phone system from DoS attacks and hides internal IP addressing from the public internet.
- Media Bypass: Allows voice traffic (media) to travel directly between the Teams client and the SBC, bypassing the Microsoft cloud to reduce round-trip latency.
B. Southbound: The Local Anchor (Handsets & Legacy)
This “Internal” face manages on-premises hardware and local survival scenarios.
- Legacy PBX Integration: Enables “sandwich” migrations where Teams and legacy PBX systems coexist during a phased rollout.
- Analog Gateway Service: Provides connectivity for devices that cannot “speak” Teams natively, such as:
  - Fax machines.
  - Elevator/Lift emergency phones.
  - Warehouse paging/PA systems.
- Local Survivability (SBA): A Survivable Branch Appliance allows local handsets to make PSTN calls even if the office loses its connection to the Microsoft 365 cloud.
C. Technical “Dual Use” Terminology
- SIP Registrar: The function where an endpoint “checks in” with the SBC to announce its online status and IP address.
- Transcoding: The real-time re-encoding of audio when the “Northbound” side uses one codec (e.g., G.711) and the “Southbound” device requires another (e.g., Opus).
- Header Manipulation: The ability to rewrite SIP packets on the fly, such as masking a user’s direct extension with the main company number.
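
Header manipulation, as described above, amounts to pattern-based rewriting of SIP header fields. Real SBCs use their own vendor-specific rule languages, so the sketch below is only an illustration in plain Python: `mask_caller` is a hypothetical function, it handles only the long-form `From:` header with a `sip:` URI in angle brackets, and all numbers and hostnames are made up.

```python
import re

# Captures the text before and after the user part of the From header's SIP URI.
FROM_USER = re.compile(r"^(From:.*?<sip:)([^@>]+)(@)", re.IGNORECASE | re.MULTILINE)


def mask_caller(sip_message, main_number):
    """Replace the calling user in the From header with the main number.

    Other headers (To, Contact, and so on) are left untouched.
    """
    return FROM_USER.sub(
        lambda m: m.group(1) + main_number + m.group(3),
        sip_message,
        count=1,
    )


# Fictitious INVITE fragment used only to demonstrate the rewrite.
SAMPLE_INVITE = (
    "INVITE sip:+61355500199@carrier.example.com SIP/2.0\r\n"
    'From: "Alice" <sip:+61355500123@sbc.example.com>;tag=abc\r\n'
    "To: <sip:+61355500199@carrier.example.com>\r\n\r\n"
)
```

Calling `mask_caller(SAMPLE_INVITE, "+61355500100")` swaps the direct number in the `From` header for the main company number while the `To` header and the tag parameter stay as they were.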
5. Common VoIP Symptoms & Root Causes
When diagnosing telephony issues, matching the user’s symptom to the underlying protocol (Signaling vs. Media) is the first step in root cause analysis.
- One-Way Audio:
  - Symptom: The call connects, but only one party can hear the other.
  - Root Cause: The SIP Signaling successfully established the call, but the Media (RTP) stream is being blocked in one direction. This is almost always a network issue, typically caused by a firewall blocking specific UDP port ranges, NAT (Network Address Translation) traversal failures, or asymmetric routing where return packets take a different, blocked path.
- The “X-Second” Call Drop (e.g., 30 seconds or 15 minutes):
  - Symptom: Calls connect and audio works perfectly, but the call reliably disconnects after a specific, consistent amount of time.
  - Root Cause: This is a Signaling timeout. It is commonly caused by SIP ALG (Application Layer Gateway) running on a local firewall, which improperly rewrites SIP packets. Drops at roughly 30 seconds often mean the ACK for the 200 OK never arrived, so the far end gives up retransmitting after about 32 seconds and ends the call. Drops at longer, regular intervals point to SIP Session Timers expiring: if a firewall’s UDP timeout is shorter than the SIP session refresh interval (a re-INVITE or UPDATE), the refresh is lost, and the SBC or carrier assumes the endpoint is dead and tears down the call.
- Robotic, Choppy, or Garbled Voice:
  - Symptom: The audio cuts in and out, or sounds metallic and “robotic.”
  - Root Cause: This is a Media degradation issue, usually caused by network congestion. Specifically, it points to high Jitter (RTP packets arriving at uneven intervals, sometimes out of order) or high Packet Loss. The standard remediation is implementing QoS (Quality of Service) on the local network to prioritize voice VLAN traffic over standard data traffic.
- Post-Dial Delay (PDD):
  - Symptom: A user dials a number, but there is a long silence (5–10 seconds) before they hear the ringback tone.
  - Root Cause: This indicates a delay in the Signaling reaching the destination. It can occur if the SBC is struggling to process complex Dial Plans/normalization rules, or if the carrier is taking too long to route the call through the PSTN to the destination endpoint.
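
For the timed call drop described earlier in this list, the firewall-versus-session-timer comparison can be checked with simple arithmetic. This is a rough sketch under stated assumptions: the refresh is taken to occur at half the `Session-Expires` value (the recommendation in RFC 4028), the function names are hypothetical, and an optional keepalive interval stands in for any other traffic that keeps the UDP mapping open.

```python
def refresh_interval_s(session_expires_s):
    """Seconds between session refreshes, assuming a refresh at half of
    Session-Expires as RFC 4028 recommends."""
    return session_expires_s / 2


def pinhole_at_risk(firewall_udp_timeout_s, session_expires_s, keepalive_s=None):
    """True if the firewall may drop its UDP mapping between SIP messages.

    `keepalive_s` is an optional interval of other signaling traffic
    (for example, periodic OPTIONS pings) that would keep the mapping alive.
    """
    gap = refresh_interval_s(session_expires_s)
    if keepalive_s is not None:
        gap = min(gap, keepalive_s)
    return firewall_udp_timeout_s < gap
```

With a `Session-Expires` of 1800 seconds the refresh falls at 900 seconds, which lines up with the 15-minute example: `pinhole_at_risk(30, 1800)` returns `True`, while adding a 20-second keepalive makes it `False`.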