Comstream · Intern Training · VoIP Track

SIP, line by line.

Session Initiation Protocol is how every call on our platforms is set up, modified, and torn down. The good news: it's plain text, and you can read it. By the end of this page you'll be able to follow a call flow, decode a packet capture, and explain why one-way audio is almost always NAT.

INVITE sip:bob@comstream.example SIP/2.0
Module 01 · Signaling vs Media

SIP sets up the call. It never carries the voice.

The single most important idea on this page. SIP is a signaling protocol (RFC 3261): it finds the other party, negotiates how to talk, and ends the session. The audio itself travels separately over RTP, usually on a completely different port — and often a completely different network path.

SIP — the handshake

Text messages over UDP, TCP, or TLS, usually port 5060 (5061 for TLS). Says who wants to talk to whom, and carries an SDP body describing how.

RTP — the voice

Binary audio packets over UDP, on high ports negotiated in the SDP (e.g. 10000–20000). If a call connects but nobody hears anything, signaling worked and media failed.

Who's who in a SIP network

User Agent (UA)

Anything that originates or answers calls: a Yealink desk phone, a softphone, a WebRTC client. Acts as UAC (client) when sending a request, UAS (server) when receiving one.

Registrar

Accepts REGISTER requests and remembers where each user currently is ("alice is at 10.0.1.24:5060"). This address book is the location service.

Proxy

Routes requests toward their destination. Doesn't answer calls itself — it forwards. Hosted platforms like NetSapiens play this role (plus much more).

B2BUA

Back-to-back user agent: terminates the call on one side and re-originates it on the other, giving it full control (hold, transfer, recording). Most PBXes — 3CX included — are B2BUAs, not pure proxies.

SBC

Session Border Controller: sits at the network edge handling security, NAT traversal, and protocol normalization between carriers and the platform.

SIP Trunk

A carrier connection (Telnyx, Twilio…) delivering PSTN calls to the platform over SIP instead of physical phone lines.

Debugging instinct #1: Call won't set up → look at SIP. Call sets up but audio is missing, one-way, or choppy → look at RTP, codecs, and NAT.
Module 02 · Message Anatomy

Tap any line of this INVITE

A SIP request is a start line, a stack of headers, a blank line, and (sometimes) a body. This is a real INVITE; tap each line to decode it. Five headers (Via, From, To, Call-ID, CSeq) appear in every message; requests also carry Max-Forwards, and Contact is required on INVITEs and their 2xx responses. Together they answer 90% of trace-reading questions.
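To make the anatomy concrete, here is a minimal Python sketch that splits a message into start line, headers, and body. The sample INVITE is illustrative only: its addresses, branch, tag, and Call-ID values are invented for this example, and the SDP body is left out for brevity. The parser assumes CRLF line endings and does not handle folded or compact-form headers.

```python
# Illustrative INVITE; every value below is made up for this sketch.
SAMPLE_INVITE = (
    "INVITE sip:bob@comstream.example SIP/2.0\r\n"
    "Via: SIP/2.0/UDP 10.0.1.24:5060;branch=z9hG4bK776asdhds;rport\r\n"
    "Max-Forwards: 70\r\n"
    "From: \"Alice\" <sip:alice@comstream.example>;tag=1928301774\r\n"
    "To: <sip:bob@comstream.example>\r\n"
    "Call-ID: a84b4c76e66710@10.0.1.24\r\n"
    "CSeq: 314159 INVITE\r\n"
    "Contact: <sip:alice@10.0.1.24:5060>\r\n"
    "Content-Length: 0\r\n"
    "\r\n"
)


def parse_sip_message(raw: str):
    """Split a SIP message into (start_line, headers, body).

    Headers are returned as {lowercased-name: [values]} because some
    headers (Via in particular) may legitimately appear more than once.
    """
    head, _, body = raw.partition("\r\n\r\n")  # blank line ends the headers
    lines = head.split("\r\n")
    start_line = lines[0]
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")  # split on the first colon only
        headers.setdefault(name.strip().lower(), []).append(value.strip())
    return start_line, headers, body


start_line, headers, body = parse_sip_message(SAMPLE_INVITE)
method, request_uri, version = start_line.split(" ")
print(method, request_uri, headers["call-id"][0])
```

Splitting on the first colon only matters: header values such as SIP URIs contain colons of their own.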

Module 03 · Methods

The six core verbs (and the extensions you'll meet)

Every SIP request starts with a method — a verb saying what the sender wants. RFC 3261 defines six; extensions added more that you'll see daily in PBX work.

Core methods

INVITE

Start (or modify) a session. Carries the SDP offer. A mid-call INVITE is a "re-INVITE" — used for hold, codec changes, or moving media.

ACK

Confirms receipt of the final response to an INVITE. Completes the three-way handshake. The only request that never gets a response.

BYE

Ends an established session. Either side can send it. Answered with 200 OK.

CANCEL

Aborts a pending INVITE that hasn't received a final response yet — the caller hung up while it was still ringing. Triggers a 487 on the INVITE.

REGISTER

Tells the registrar where you are. Refreshed periodically (the expires value). Phone shows "offline"? Check registration first.

OPTIONS

Asks "what do you support?" — but in practice it's SIP's ping. Trunk keep-alives and monitoring almost always use OPTIONS.

Extensions you'll actually see

REFER

"Please call this other party" — the mechanism behind call transfer. The Refer-To header carries the target.

SUBSCRIBE / NOTIFY

Event packages: BLF lamp state, voicemail message-waiting (MWI), presence. SUBSCRIBE asks, NOTIFY delivers.

UPDATE

Modify session parameters before the call is fully answered (early dialog), without a full re-INVITE.

PRACK

Reliable provisional responses (RFC 3262): acknowledges a 180/183 the way ACK acknowledges a 200.

INFO

Mid-call application info. Historically used for DTMF (now mostly RFC 4733 telephone-events inside RTP instead).

MESSAGE

Instant messaging over SIP — pager-mode text without setting up a session.

Module 04 · Responses

Read the first digit, then the rest

Responses are three-digit codes, HTTP-style. The first digit is the class — it tells you instantly whether things are progressing, done, or broken. Filter by class; these codes cover nearly every trace you'll read.

1xx vs everything else: Provisional (1xx) responses say "working on it" and the transaction stays open. Any 2xx–6xx is final: the transaction completes, and (for INVITE) an ACK must follow.
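The first-digit rule is simple enough to write down. The sketch below is one way to do it in Python; the function name and return shape are choices made for this example, and the class names follow the six response classes in RFC 3261.

```python
RESPONSE_CLASSES = {
    1: "provisional",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
    6: "global failure",
}


def classify_response(status_line: str):
    """Return (code, class_name, is_final) for a line like 'SIP/2.0 180 Ringing'."""
    _version, code_text, *_reason = status_line.split(" ", 2)
    code = int(code_text)
    class_name = RESPONSE_CLASSES[code // 100]  # first digit picks the class
    is_final = code >= 200                      # anything but 1xx ends the transaction
    return code, class_name, is_final


for line in ("SIP/2.0 100 Trying", "SIP/2.0 200 OK", "SIP/2.0 486 Busy Here"):
    print(classify_response(line))
```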
Module 05 · Call Flows — The Simulator

Watch a call happen

Pick a scenario and step through it. Each arrow is a real SIP message — tap any arrow (or just step forward) to inspect the actual packet below the ladder. This is exactly how you'll read traces in NetSapiens, sngrep, or Wireshark.

Why ACK after 200, and after 487? INVITE transactions are three-way: INVITE → final response → ACK. That's true for success (200) and failure (486, 487…). A missing ACK makes the far end retransmit its final response, so if you see repeated 200 OKs in a trace, the ACK isn't getting through (usually NAT).
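The "repeated 200 OK" symptom can be spotted mechanically. The sketch below assumes you have already pulled (Call-ID, CSeq, status code) out of each response in a capture; that tuple shape and the function name are assumptions made for this example, not the output format of any particular tool. Treat a hit as a hint to look for the missing ACK, not as proof.

```python
from collections import Counter


def retransmitted_finals(responses):
    """Find final INVITE responses that appear more than once.

    responses: iterable of (call_id, cseq, status_code) tuples, where cseq
    is the raw header value such as '314159 INVITE'.
    Returns {(call_id, cseq, status_code): times_seen} for repeats only.
    """
    counts = Counter(
        (call_id, cseq, code)
        for call_id, cseq, code in responses
        if code >= 200 and cseq.upper().endswith("INVITE")
    )
    return {key: seen for key, seen in counts.items() if seen > 1}


# Hypothetical trace: call-1 keeps resending its 200 OK, call-2 is healthy.
trace = [
    ("call-1", "1 INVITE", 180),
    ("call-1", "1 INVITE", 200),
    ("call-1", "1 INVITE", 200),
    ("call-1", "1 INVITE", 200),
    ("call-2", "7 INVITE", 200),
    ("call-2", "8 BYE", 200),
]
print(retransmitted_finals(trace))
```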
Module 06 · SDP

The body that negotiates the media

SIP carries a Session Description Protocol body (RFC 8866) to negotiate media as an offer/answer exchange: the INVITE offers ("here's where to send my audio, and the codecs I speak"), the 200 OK answers. Tap each line.

Debugging instinct #2: One-way or no audio? Read the c= (address) and m= (port) lines on both sides. If a c= line carries a private IP (10.x, 192.168.x, 172.16–31.x) that crossed the internet, you've found the bug. Codec mismatch instead? You'd see a 488 Not Acceptable Here.
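The private-address check can be sketched in a few lines of Python. The SDP below is a made-up example, and the helper names are hypothetical. The code assumes IPv4 literals in c= lines, a plain port number in m= lines, and at least one c= line covering each media stream. It tests the three RFC 1918 ranges named above explicitly rather than relying on a broader "private" flag.

```python
import ipaddress

RFC1918_NETS = [
    ipaddress.ip_network(net)
    for net in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]

# Illustrative SDP offer from a phone behind NAT; values are invented.
SAMPLE_SDP = (
    "v=0\r\n"
    "o=alice 2890844526 2890844526 IN IP4 192.168.1.50\r\n"
    "s=-\r\n"
    "c=IN IP4 192.168.1.50\r\n"
    "t=0 0\r\n"
    "m=audio 10000 RTP/AVP 0 8 101\r\n"
    "a=rtpmap:0 PCMU/8000\r\n"
    "a=rtpmap:8 PCMA/8000\r\n"
    "a=rtpmap:101 telephone-event/8000\r\n"
)


def is_rfc1918(addr: str) -> bool:
    """True if addr falls in 10/8, 172.16/12, or 192.168/16."""
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in RFC1918_NETS)


def media_targets(sdp: str):
    """Return [(media, address, port, is_private)] for each m= line."""
    session_addr = None
    targets = []
    for line in sdp.splitlines():
        if line.startswith("c="):
            addr = line.split()[-1]
            if targets:
                targets[-1]["addr"] = addr   # media-level c= overrides
            else:
                session_addr = addr          # session-level default
        elif line.startswith("m="):
            media, port = line[2:].split()[:2]
            targets.append({"media": media, "port": int(port), "addr": session_addr})
    return [(t["media"], t["addr"], t["port"], is_rfc1918(t["addr"])) for t in targets]


for media, addr, port, private in media_targets(SAMPLE_SDP):
    flag = "PRIVATE, check NAT handling" if private else "ok"
    print(f"{media}: send RTP to {addr}:{port} ({flag})")
```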
Module 07 · Registration & Digest Auth

How a phone proves who it is

SIP never sends the password. It uses digest authentication: the server issues a one-time challenge (a nonce), and the client returns a hash that could only be computed with the password. Run the "Registration with auth" scenario in Module 05 to watch it live.

The 401 dance

First REGISTER goes out bare → registrar replies 401 Unauthorized with a WWW-Authenticate header carrying the realm and nonce → client computes the digest and resends REGISTER with an Authorization header → 200 OK. Proxies do the same dance on INVITEs using 407 and Proxy-Authenticate.

The math (MD5 digest)

HA1 = MD5( username : realm : password )
HA2 = MD5( method : request-uri )
response = MD5( HA1 : nonce : HA2 )
# with qop=auth, a nonce-count and client nonce join the final hash —
# that's what stops an attacker replaying a captured response.
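The same math as a runnable Python sketch, using only hashlib from the standard library. The function name and argument order are choices made for this example; the qop=auth branch follows the RFC 2617 layout (nonce, nonce-count, client nonce, qop, then HA2). The credentials in the usage lines are invented.

```python
import hashlib


def md5_hex(text: str) -> str:
    """MD5 of a string, as lowercase hex (the form used in the headers)."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def digest_response(username, realm, password, method, uri, nonce,
                    qop=None, nc=None, cnonce=None):
    """Compute the 'response' value for an Authorization header."""
    ha1 = md5_hex(f"{username}:{realm}:{password}")
    ha2 = md5_hex(f"{method}:{uri}")
    if qop == "auth":
        # nonce-count and client nonce are mixed in alongside the server nonce
        return md5_hex(f"{ha1}:{nonce}:{nc}:{cnonce}:{qop}:{ha2}")
    return md5_hex(f"{ha1}:{nonce}:{ha2}")


# Hypothetical REGISTER challenge; all values are made up.
print(digest_response(
    username="alice",
    realm="comstream.example",
    password="not-a-real-password",
    method="REGISTER",
    uri="sip:comstream.example",
    nonce="abc123",
))
```

Note that the method is part of HA2, so a response computed for REGISTER cannot be reused to authorize an INVITE.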

Two registration facts that save support tickets: the expires value is how long the binding lives (phones re-register at roughly half of it), and a 403 Forbidden after correct credentials usually means a policy block — not a wrong password (a wrong password just gets re-challenged with 401, forever).
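A minimal sketch of how a binding's lifetime behaves, assuming one contact per user and timestamps passed in as plain seconds. The LocationService class and next_refresh helper are hypothetical names for this example, not any platform's API; real registrars track multiple contacts per user.

```python
class LocationService:
    """Toy registrar store: address-of-record -> (contact, expiry time)."""

    def __init__(self):
        self._bindings = {}

    def register(self, aor, contact, expires, now):
        if expires == 0:
            # Expires: 0 is how a phone unregisters
            self._bindings.pop(aor, None)
        else:
            self._bindings[aor] = (contact, now + expires)

    def lookup(self, aor, now):
        """Return the contact if the binding is still alive, else None."""
        binding = self._bindings.get(aor)
        if binding is None or binding[1] <= now:
            return None
        return binding[0]


def next_refresh(expires):
    """Seconds until a phone that refreshes at half-life re-registers."""
    return expires // 2


registrar = LocationService()
registrar.register("alice", "10.0.1.24:5060", expires=3600, now=0)
print(registrar.lookup("alice", now=1800))   # binding still alive
print(registrar.lookup("alice", now=3600))   # binding has lapsed
```

A phone that misses its refresh simply ages out of the store, which is why "offline" tickets start with the registration trace.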

Module 08 · NAT Traversal

Why SIP and NAT hate each other

SIP writes IP addresses inside its messages — and a NAT router only rewrites the packet headers, not the SIP payload. So a phone behind NAT confidently tells the world to reach it at 192.168.1.50. The world cannot.

The three lies a NATed phone tells

Via header

Responses route back via this address. Private IP here → responses go nowhere → retransmissions and dropped calls.

Fix: ;rport (RFC 3581)

Contact header

Where in-dialog requests (ACK, BYE) should go. Private IP here → BYEs never arrive → ghost calls that won't hang up.

Fix: store the address actually seen on the wire

SDP c= / m= lines

Where to send RTP. Private IP here → audio sent into the void → the classic one-way audio ticket.

Fix: symmetric RTP / media latching
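The Via fix is the easiest of the three to show in code. The sketch below follows the RFC 3581 behavior: when a request's Via carries a bare ;rport, the server fills in the observed source port and adds a received parameter with the observed source IP, then routes the response there. The function names are hypothetical, and the parsing assumes an IPv4 sent-by and lowercase parameter names.

```python
def apply_rport(via: str, source_ip: str, source_port: int) -> str:
    """Rewrite a Via value the way a server does when it sees a bare ;rport."""
    sent_by_part, *params = via.split(";")
    out = [sent_by_part]
    saw_rport = False
    for param in params:
        if param.strip().lower() == "rport":
            out.append(f"rport={source_port}")   # fill in the observed port
            saw_rport = True
        else:
            out.append(param)
    if saw_rport:
        out.append(f"received={source_ip}")      # record the observed address
    return ";".join(out)


def response_target(via: str):
    """Where to send the response: received/rport if present, else sent-by."""
    sent_by_part, *params = via.split(";")
    host, _, port = sent_by_part.split()[-1].partition(":")
    values = dict(p.strip().partition("=")[::2] for p in params)
    target_host = values.get("received", host)
    target_port = int(values.get("rport") or port or 5060)
    return target_host, target_port


# Phone behind NAT writes its private address; the server saw 203.0.113.7:41234.
via = "SIP/2.0/UDP 192.168.1.50:5060;branch=z9hG4bK776asdhds;rport"
rewritten = apply_rport(via, "203.0.113.7", 41234)
print(rewritten)
print(response_target(rewritten))
```

Without the rewrite, response_target falls back to the private sent-by address, which is exactly the failure described above.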

The toolbox

rport (RFC 3581)

The client adds ;rport to its Via; the server replies to the actual source IP and port it observed, not the address written in the header.

STUN

"What's my public address?" — the client asks an outside server and writes the public mapping into its messages. Works for most NATs; fails on symmetric NAT.

TURN

When direct media is impossible, relay RTP through a server. Costs bandwidth, but works as long as both sides can reach the relay. The fallback of last resort.

ICE

Tries every candidate pair — host, STUN-derived, TURN relay — and picks the best path that actually works. Mandatory in WebRTC.

SBC / far-end NAT traversal

The platform's border element ignores the addresses written in SIP/SDP and latches signaling and media to wherever packets actually arrive from.

Keep-alives

NAT mappings expire when idle. Short registration expirations, OPTIONS pings, or CRLF keep-alives hold the pinhole open so inbound calls still reach the phone.

Field rule: Turn SIP ALG off on customer routers. It tries to "helpfully" rewrite SIP payloads and mangles them more often than it fixes them. One-way audio plus random drops at a new site? Check ALG before anything else.