Detecting DNS over HTTPS (DoH) via Traffic Pattern Analysis

The transition from traditional DNS (UDP/53) to DNS over HTTPS (DoH, RFC 8484) represents one of the most significant shifts in network privacy and security visibility in the last decade. While DoH provides much-needed protection against eavesdropping and Man-in-the-Middle (MitM) attacks on the DNS layer, it simultaneously creates a massive blind spot for network defenders.

By encapsulating DNS queries within standard HTTPS (TCP/443) traffic, DoH effectively hides the "intent" of a connection within the noise of general web browsing. For security practitioners, the traditional method of monitoring DNS logs or inspecting Port 53 traffic is no longer sufficient. Detecting DoH requires a shift from signature-based inspection to sophisticated traffic pattern analysis.

The Visibility Gap: Why Traditional Methods Fail

In a traditional environment, DNS traffic is distinct. It uses a dedicated port (UDP 53, with TCP 53 as a fallback for large responses), and the plaintext nature of the protocol allows Deep Packet Inspection (DPI) engines to parse queries and identify malicious domains (C2, phishing, or DGA-generated domains) in real time.

DoH breaks this paradigm by leveraging the same protocol, port, and encryption used for almost all modern web traffic. When a client initiates a DoH request, the packet looks identical to a standard TLS-encrypted GET or POST request to a web server. The payload, which contains the actual domain being queried, is encrypted within the TLS tunnel. Consequently, simple firewall rules or basic Intrusion Detection System (IDS) signatures targeting Port 53 are rendered useless.

The Core of Detection: Traffic Pattern Analysis

Since the payload is opaque, detection must rely on the "side channels" of the communication: metadata, packet dynamics, and statistical distributions.

1. Packet Size Distribution (PSD) Analysis

DNS queries and responses have highly characteristic sizes. A standard A-record query is typically small, while responses containing multiple records or large TXT records (often used in DNS tunneling) exhibit specific size ranges.

Even when wrapped in TLS, the length of the encrypted record is a strong proxy for the original plaintext size. By analyzing the Packet Size Distribution (PSD), we can identify patterns that deviate from standard web browsing (which usually involves large, variable-sized downstream bursts) and align more closely with the request-response cadence of DNS.
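A minimal sketch of the idea, assuming the TLS record lengths for a flow have already been extracted by a sensor. The function names, the 32-byte bin width, and the 300-byte "small record" threshold are illustrative assumptions, and the sample lengths are made up rather than measured.

```python
from collections import Counter


def size_histogram(record_lengths, bin_width=32):
    """Bucket record lengths into fixed-width bins and return relative frequencies."""
    if not record_lengths:
        return {}
    counts = Counter((length // bin_width) * bin_width for length in record_lengths)
    total = len(record_lengths)
    return {bin_start: count / total for bin_start, count in sorted(counts.items())}


def small_record_fraction(record_lengths, threshold=300):
    """Fraction of records at or below `threshold` bytes (an assumed cut-off)."""
    if not record_lengths:
        return 0.0
    small = sum(1 for length in record_lengths if length <= threshold)
    return small / len(record_lengths)


# Made-up example flows (record lengths in bytes), for illustration only.
doh_like = [88, 104, 92, 180, 96, 210, 90, 164]
web_like = [517, 1400, 1400, 1400, 980, 1400, 1400, 233]

doh_fraction = small_record_fraction(doh_like)
web_fraction = small_record_fraction(web_like)
```

A flow dominated by small records in both directions is only a hint, not proof; padding applied by the client or resolver can blur these size signatures.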

2. Inter-Arrival Time (IAT) and Burstiness

Web browsing typically follows a "bursty" pattern: a flurry of requests to fetch HTML, CSS, and JavaScript, followed by periods of relative inactivity as the user consumes the content.

In contrast, DoH traffic often exhibits a different temporal signature. DNS queries are often triggered by specific application events (e.g., a user clicking a link or an application checking for updates) and are followed by a specific cadence of small, periodic packets. Analyzing the Inter-Arrival Time (IAT), the time delta between successive packets in a flow, can reveal the rhythmic nature of DNS resolution compared to the more stochastic nature of HTTP/2 or HTTP/3 stream multiplexing.
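One way to quantify this, sketched here under the assumption that per-packet timestamps (in seconds) are available for each flow, is the burstiness parameter of Goh and Barabasi, B = (sigma - mu) / (sigma + mu), computed over the IATs. It is a stand-in metric chosen for illustration, not a method prescribed by this article; the function names are hypothetical.

```python
import statistics


def inter_arrival_times(timestamps):
    """Return the time deltas between successive packets of one flow."""
    ordered = sorted(timestamps)
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]


def burstiness(iats):
    """Burstiness B = (sigma - mu) / (sigma + mu).

    B is -1 for perfectly periodic traffic, near 0 for Poisson-like
    traffic, and approaches 1 for highly bursty traffic.
    """
    if len(iats) < 2:
        return 0.0
    mu = statistics.fmean(iats)
    sigma = statistics.pstdev(iats)
    if mu + sigma == 0:
        return 0.0
    return (sigma - mu) / (sigma + mu)


# Made-up timestamps: a steady cadence versus two bursts separated by idle time.
periodic = burstiness(inter_arrival_times([0.0, 1.0, 2.0, 3.0, 4.0]))
bursty = burstiness(inter_arrival_times([0.0, 0.01, 0.02, 0.03, 5.0, 5.01, 5.02]))
```

In practice the value would be compared against baselines learned from known DoH and known browsing flows rather than against a fixed cut-off.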

3. TLS Fingerprinting (JA3 and SNI)

While the payload is encrypted, the TLS handshake remains partially visible.

  • Server Name Indication (SNI): If the client is using a known DoH provider (e.g., `cloudflare-dns.com` or `dns.google`), the SNI field in the Client Hello packet will explicitly reveal the destination.
  • JA3 Fingerprinting: Every TLS client (a browser, a Python script, or malware) has a characteristic way of negotiating a connection (TLS version, supported ciphers, extensions, and elliptic curves). By generating a JA3 hash of the TLS Client Hello, defenders can identify specific libraries or applications known to implement DoH, even if the SNI is masked or obscured.
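Both checks can be sketched in a few lines, assuming the Client Hello has already been parsed into integer fields by a sensor. JA3 joins five fields (TLS version, cipher suites, extensions, elliptic curves, point formats) into a comma-separated string with dash-separated decimal values, drops GREASE values, and takes the MD5 of the result. The function names and the two-entry provider set (taken from the examples above) are illustrative; a real deployment would maintain a much larger list.

```python
import hashlib

# GREASE values (RFC 8701): 0x0A0A, 0x1A1A, ..., 0xFAFA. JA3 ignores them.
GREASE = {0x0A0A + 0x1010 * i for i in range(16)}

# Illustrative provider list; only the two hostnames named in the text.
KNOWN_DOH_HOSTS = {"cloudflare-dns.com", "dns.google"}


def ja3_string(version, ciphers, extensions, curves, point_formats):
    """Build the JA3 input string from parsed Client Hello fields."""
    def join(values):
        return "-".join(str(v) for v in values if v not in GREASE)

    return ",".join(
        [str(version), join(ciphers), join(extensions), join(curves), join(point_formats)]
    )


def ja3_hash(version, ciphers, extensions, curves, point_formats):
    """Return the 32-character hex MD5 digest of the JA3 string."""
    raw = ja3_string(version, ciphers, extensions, curves, point_formats)
    return hashlib.md5(raw.encode("ascii")).hexdigest()


def is_known_doh_sni(sni):
    """True if the SNI equals, or is a subdomain of, a known DoH hostname."""
    if not sni:
        return False
    host = sni.rstrip(".").lower()
    return any(host == known or host.endswith("." + known) for known in KNOWN_DOH_HOSTS)
```

A JA3 match identifies a client stack, not an intent: a browser produces the same fingerprint for DoH and for ordinary HTTPS, so the hash is most useful for spotting non-browser tooling.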

Implementation Strategies

To implement DoH detection at scale, organizations should move toward a multi-layered telemetry approach.

Feature Engineering for Machine Learning

For high-fidelity detection, a supervised machine learning model (such as a Random Forest or a 1D-CNN) can be trained on extracted features. The feature vector for a flow should include:

  • Mean/Median/Variance of Packet Lengths: Capturing the size characteristics.
  • Entropy of Packet Sizes: Measuring the randomness of the payload sizes.
  • Flow Duration: DNS queries are typically short-lived compared to long-lived HTTPS sessions.
  • Byte Ratio: The ratio of upstream (client-to-server) to downstream (server-to-client) bytes.
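The feature vector listed above could be computed roughly as follows. This is a sketch that assumes each flow is available as a list of `(timestamp_seconds, size_bytes, direction)` tuples, with direction given as `"up"` or `"down"`; that input shape and the dictionary keys are assumptions made for illustration.

```python
import math
import statistics
from collections import Counter


def shannon_entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of `values`."""
    counts = Counter(values)
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def flow_features(packets):
    """Compute size, entropy, duration, and byte-ratio features for one flow."""
    if not packets:
        raise ValueError("flow has no packets")
    sizes = [size for _, size, _ in packets]
    times = [ts for ts, _, _ in packets]
    up_bytes = sum(size for _, size, direction in packets if direction == "up")
    down_bytes = sum(size for _, size, direction in packets if direction == "down")
    return {
        "mean_len": statistics.fmean(sizes),
        "median_len": statistics.median(sizes),
        "var_len": statistics.pvariance(sizes),
        "size_entropy": shannon_entropy(sizes),
        "duration": max(times) - min(times),
        # Upstream-to-downstream ratio; infinite if nothing came back.
        "byte_ratio": up_bytes / down_bytes if down_bytes else float("inf"),
    }
```

The resulting dictionaries can be turned into rows of a training matrix for whichever classifier is chosen.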

Integration with Network Detection and Response (NDR)

The detection engine should ideally reside within an NDR or a high-performance sensor like Zeek (formerly Bro). Zeek allows for the creation of custom scripts to extract these statistical features from flow records in real time.
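As a starting point before writing custom Zeek scripts, flow-level totals can be pulled from Zeek's standard `conn.log`. The sketch below assumes Zeek is configured for JSON log output and reads the standard `conn.log` fields (`uid`, `proto`, `id.resp_p`, `duration`, `orig_bytes`, `resp_bytes`); the function name and output shape are hypothetical. Note that `conn.log` only carries per-flow totals, so per-packet IAT and PSD features would still need a custom script or a packet-level source.

```python
import json


def https_flows(lines):
    """Yield summary records for TCP/443 flows from JSON-formatted conn.log lines."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        record = json.loads(line)
        if record.get("proto") != "tcp" or record.get("id.resp_p") != 443:
            continue
        # Zeek omits unset fields in JSON output, so default missing counters to 0.
        orig_bytes = record.get("orig_bytes") or 0
        resp_bytes = record.get("resp_bytes") or 0
        yield {
            "uid": record.get("uid"),
            "duration": record.get("duration") or 0.0,
            "orig_bytes": orig_bytes,
            "resp_bytes": resp_bytes,
            "byte_ratio": orig_bytes / resp_bytes if resp_bytes else None,
        }
```

Typical usage would be `for flow in https_flows(open("conn.log")): ...`. The filter covers TCP only; DoH carried over HTTP/3 would appear as UDP/443 and needs a separate rule.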

Example Workflow:

  1. Capture: A network tap captures traffic at the perimeter.
  2. Extract: Zeek processes the traffic, calculating the IAT and PSD for all Port 443 flows.
  3. Classify: A lightweight classifier compares these features against known DoH profiles.
  4. Alert: If a flow matches the DoH profile and the SNI is an unrecognized or suspicious domain, a high-priority alert is sent to the SIEM.
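Steps 3 and 4 of the workflow can be sketched as a small triage function. It assumes an upstream classifier has already produced a DoH likelihood score between 0 and 1 for the flow; the 0.8 threshold, the `APPROVED_RESOLVERS` policy list, and the `FlowVerdict` type are all hypothetical and would be set by local policy.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical policy list: resolvers the organization has sanctioned.
APPROVED_RESOLVERS = {"dns.google", "cloudflare-dns.com"}


@dataclass
class FlowVerdict:
    alert: bool
    priority: str
    reason: str


def triage(doh_score: float, sni: Optional[str], threshold: float = 0.8) -> FlowVerdict:
    """Combine a classifier score with the SNI to decide whether to alert."""
    if doh_score < threshold:
        return FlowVerdict(False, "none", "flow does not match the DoH profile")
    host = (sni or "").rstrip(".").lower()
    if host in APPROVED_RESOLVERS:
        return FlowVerdict(False, "none", "DoH to an approved resolver")
    if not host:
        return FlowVerdict(True, "high", "DoH-like flow with no visible SNI")
    return FlowVerdict(True, "high", f"DoH-like flow to unrecognized host {host}")
```

Keeping the decision in one pure function makes it straightforward to replay historical flows through it when tuning the threshold.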

Operational Challenges and Trade-offs

Implementing pattern-based detection is not without significant hurdles.

The False Positive Dilemma

The primary risk is the "False Positive" (FP). Modern web applications, particularly those using WebSockets or long-polling, can mimic the small, frequent packet patterns of DoH. Over-aggressive detection can lead to "alert fatigue" in the SOC, where legitimate application traffic is flagged as suspicious.
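One way to keep this trade-off visible is to measure precision and false positive rate on a labeled validation set at several alert thresholds before deployment. The sketch below assumes classifier scores in the range 0 to 1 and ground-truth labels where 1 means DoH; the function names and sample values are made up for illustration.

```python
def confusion_at(scores, labels, threshold):
    """Count (tp, fp, fn, tn) when flows scoring >= threshold are flagged as DoH."""
    tp = fp = fn = tn = 0
    for score, label in zip(scores, labels):
        flagged = score >= threshold
        if flagged and label == 1:
            tp += 1
        elif flagged and label == 0:
            fp += 1
        elif not flagged and label == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def precision_and_fpr(scores, labels, threshold):
    """Return (precision, false positive rate) at the given threshold."""
    tp, fp, fn, tn = confusion_at(scores, labels, threshold)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return precision, fpr


# Made-up validation data: raising the threshold trades recall for fewer false alerts.
scores = [0.9, 0.8, 0.7, 0.4, 0.2]
labels = [1, 0, 1, 0, 0]
loose = precision_and_fpr(scores, labels, 0.75)
strict = precision_and_fpr(scores, labels, 0.85)
```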

The Rise of Encrypted Client Hello (ECH)

The industry is moving toward Encrypted Client Hello (ECH), a TLS extension designed to encrypt the SNI. Once ECH becomes widespread, the "easy" detection method of inspecting the SNI will disappear.

Conclusion

As shown across "The Visibility Gap: Why Traditional Methods Fail", "The Core of Detection: Traffic Pattern Analysis", and "Implementation Strategies", detecting DNS over HTTPS (DoH) via traffic pattern analysis depends on execution discipline as much as design.

The practical hardening path is to enforce strict token/claim validation and replay resistance, certificate lifecycle governance with strict chain/revocation checks, and protocol-aware normalization, rate controls, and malformed-traffic handling. This combination reduces both exploitability and attacker dwell time by forcing failures across multiple independent control layers.

Operational confidence should be measured, not assumed. Track the mean time to detect and remediate configuration drift, and measure detection precision under peak traffic and adversarial packet patterns. Then use those results to tune preventive policy, detection fidelity, and response runbooks on a fixed review cadence.
