Device Recognition

Device recognition re-identifies returning visitors using server-side signals when cookies are cleared or unavailable. It operates on signals that the browser already sends with every HTTP request, without relying on browser fingerprinting libraries or third-party scripts.

Device recognition is not browser fingerprinting. It does not use canvas rendering, font enumeration, or other intrusive techniques. Beyond the signals that browsers already send with every HTTP request, it uses only a small number of optional attributes, which Datafly.js collects when they are explicitly enabled.

The Problem

The _dfid cookie provides persistent first-party identity, but it can be lost:

  • The visitor clears their browser cookies or browsing data
  • The visitor uses private/incognito browsing
  • Safari ITP clears storage in edge cases
  • The visitor switches to a different browser profile

When the cookie is gone, the Ingestion Gateway generates a new _dfid, and all previously collected vendor IDs, click IDs, and identity associations are orphaned. The visitor appears as a brand new user.

Device recognition provides a probabilistic recovery path: if the incoming request’s device signals match a previously seen device, Signal can reconnect the visitor to their existing identity record.
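
The fallback order described above can be sketched as follows. This is an illustration under stated assumptions, not Signal's implementation: the function name `resolve_dfid`, the `recover_identity` callback, and the use of a random UUID as the new ID are all hypothetical.

```python
import uuid
from http.cookies import SimpleCookie
from typing import Callable, Optional, Tuple


def resolve_dfid(
    cookie_header: str,
    recover_identity: Callable[[], Optional[str]],
) -> Tuple[str, str]:
    """Return (dfid, source), where source is "cookie", "recovered", or "new".

    recover_identity is a hypothetical stand-in for device recognition:
    it returns a previously seen ID, or None when no match is found.
    """
    cookies = SimpleCookie()
    cookies.load(cookie_header)

    # 1. The cookie is present: use it as-is.
    if "_dfid" in cookies:
        return cookies["_dfid"].value, "cookie"

    # 2. The cookie is gone: try the probabilistic recovery path.
    recovered = recover_identity()
    if recovered is not None:
        return recovered, "recovered"

    # 3. No match: the visitor appears as a brand new user.
    return uuid.uuid4().hex, "new"
```

Without step 2, every request that arrives without the cookie falls through to step 3, which is the orphaning behaviour described above.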

The Two Cookies

Signal sets two distinct first-party cookies:

| Cookie | Purpose | Survives cookie clear? |
|--------|---------|------------------------|
| _dfid | Anonymous visitor ID. Regenerated if the browser clears cookies. | No |
| _dfdid | Device recognition ID. When device recognition is enabled, this ID is recovered via signal matching after a cookie clear and re-set on the response. | Yes (probabilistically) |

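Both are set server-side via Set-Cookie response headers with the HttpOnly, Secure, and SameSite=Lax attributes. Because they are set by the server rather than written by JavaScript through document.cookie, they are not subject to the 7-day expiry cap that Safari ITP applies to script-written cookies.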
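
As an illustration of what such a header looks like, the sketch below builds one with Python's standard `http.cookies` module. The helper name `build_identity_cookie`, the `Path=/` scope, and the one-year lifetime are assumptions made for the example; the source does not state the actual cookie lifetime or path.

```python
from http.cookies import SimpleCookie


def build_identity_cookie(name: str, value: str, max_age_days: int = 365) -> str:
    """Return the value of a Set-Cookie header for a first-party ID cookie.

    The lifetime and path are assumed values, not documented behaviour.
    """
    cookie = SimpleCookie()
    cookie[name] = value
    morsel = cookie[name]
    morsel["path"] = "/"
    morsel["max-age"] = max_age_days * 24 * 60 * 60
    morsel["httponly"] = True    # not readable from JavaScript
    morsel["secure"] = True      # only sent over HTTPS
    morsel["samesite"] = "Lax"   # sent on top-level navigations, not on cross-site subrequests
    return morsel.OutputString()


print("Set-Cookie:", build_identity_cookie("_dfid", "abc123"))
```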

How It Works

Device recognition builds a probabilistic device profile from the signals on each HTTP request. When a visitor’s cookies are cleared, Signal checks the incoming request’s signals against known device profiles to recover the visitor’s identity. The process is automatic and requires no action from the visitor.

Signals are collected from two sources:

  • Server-side signals (always available): information sent by the browser with every HTTP request, such as network and browser metadata
  • Client-side signals (optional): additional device attributes collected by Datafly.js when enabled, such as screen resolution and timezone

These signals are combined into a device profile that is distinctive but not guaranteed to be unique, which is why matching is probabilistic. When a returning visitor arrives without a cookie, Signal compares the incoming signals against stored profiles and, if a match meets the configured confidence threshold, recovers the visitor’s previous identity.
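
The matching step can be sketched roughly as below. Everything specific here is an assumption made for illustration: the signal names, the weights, the numeric thresholds, and the per-signal SHA-256 hashing are invented, since the actual signal set and scoring method are not documented. A production system would also likely use a keyed hash rather than a bare digest.

```python
import hashlib
from typing import Dict, Optional

# Hypothetical weights (summing to 100) and thresholds; not documented values.
WEIGHTS = {"ip": 40, "user_agent": 30, "accept_language": 10,
           "screen": 10, "timezone": 10}
THRESHOLDS = {"high": 90, "medium": 70, "low": 50}


def hash_signals(signals: Dict[str, str]) -> Dict[str, str]:
    """Hash each signal separately so raw values never need to be stored."""
    return {
        name: hashlib.sha256(f"{name}={value}".encode("utf-8")).hexdigest()
        for name, value in signals.items()
    }


def match_score(incoming: Dict[str, str], stored: Dict[str, str]) -> int:
    """Sum the weights of signals whose hashes agree; missing signals score 0."""
    return sum(
        weight
        for name, weight in WEIGHTS.items()
        if name in incoming and incoming[name] == stored.get(name)
    )


def recover_device_id(
    signals: Dict[str, str],
    profiles: Dict[str, Dict[str, str]],
    level: str,
) -> Optional[str]:
    """Return the best-matching device ID if it meets the threshold, else None."""
    incoming = hash_signals(signals)
    best_id, best_score = None, 0
    for device_id, stored in profiles.items():
        score = match_score(incoming, stored)
        if score > best_score:
            best_id, best_score = device_id, score
    return best_id if best_score >= THRESHOLDS[level] else None
```

With these made-up numbers, a visitor whose network address has changed but whose other signals still agree scores 60, which would be accepted at the Low threshold and rejected at High. That trade-off between recall and false matches is what the confidence threshold controls.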

Configuration

Device recognition is configured per pipeline in the Identity tab of pipeline settings in the Management UI.

You can configure the following options:

  • Enable/disable device recognition for the pipeline
  • Confidence threshold — choose High, Medium, or Low depending on your use case. Paywall enforcement may require high confidence, while analytics enrichment can tolerate medium.
  • Retention period — how long device profiles are stored (default 30 days, maximum 90 days)
  • Optional signals — enable additional device attributes for higher accuracy. These are disabled by default and require Datafly.js to perform a small amount of additional work on page load.

Optional signals increase accuracy but should only be enabled when higher accuracy is needed and the privacy implications have been reviewed.
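
These options are set in the Management UI, but it can help to see them as a single settings object with the documented constraints applied. The sketch below is a hypothetical model, not an actual Signal API: the class name, field names, the default threshold, and the one-day minimum retention are assumptions. Only the 30-day default, the 90-day maximum, and the disabled-by-default flags come from this page.

```python
from dataclasses import dataclass

MAX_RETENTION_DAYS = 90
VALID_THRESHOLDS = ("high", "medium", "low")


@dataclass(frozen=True)
class DeviceRecognitionSettings:
    """Hypothetical per-pipeline settings mirroring the options listed above."""

    enabled: bool = False               # disabled until explicitly enabled
    confidence_threshold: str = "high"  # default value is an assumption
    retention_days: int = 30            # documented default
    optional_signals: bool = False      # disabled by default

    def __post_init__(self) -> None:
        if self.confidence_threshold not in VALID_THRESHOLDS:
            raise ValueError(
                f"confidence_threshold must be one of {VALID_THRESHOLDS}"
            )
        # The lower bound of 1 day is assumed; only the maximum is documented.
        if not 1 <= self.retention_days <= MAX_RETENTION_DAYS:
            raise ValueError(
                f"retention_days must be between 1 and {MAX_RETENTION_DAYS}"
            )


# Example: an analytics pipeline that tolerates medium-confidence matches.
analytics = DeviceRecognitionSettings(enabled=True, confidence_threshold="medium")
```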

⚠️

Device recognition is probabilistic, not deterministic. A high-confidence match is strong evidence that the request comes from the same device, and therefore likely the same visitor, but it is not a guarantee. Configure your confidence threshold based on your use case.

Publisher Paywall Integration

For publishers using device recognition to enforce metered paywalls (detecting returning visitors who have cleared cookies to reset their article count), Signal provides a dedicated server-side integration for device checks.

This allows the publisher’s paywall logic to query Signal directly and determine whether the visitor has been seen before, regardless of cookie state.

Contact your account team for integration details and setup documentation.
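
To show why a device check matters for metering, the sketch below keys the article count on a recognised device when one is available and falls back to the cookie ID otherwise. The device check is represented by a hypothetical callable, `check_device`, because the real integration is not described on this page. The free-article limit and the in-memory counter are likewise assumptions.

```python
from typing import Callable, Dict, Optional

FREE_ARTICLES = 3  # assumed meter limit for the example


def should_show_paywall(
    cookie_id: str,
    check_device: Callable[[], Optional[str]],
    article_counts: Dict[str, int],
) -> bool:
    """Count this article view and report whether the meter is exhausted.

    check_device is a hypothetical stand-in for the server-side device
    check: it returns a previously seen device ID, or None if no match.
    """
    device_id = check_device()
    # Prefer the recognised device so clearing cookies does not reset the meter.
    meter_key = device_id if device_id is not None else cookie_id
    article_counts[meter_key] = article_counts.get(meter_key, 0) + 1
    return article_counts[meter_key] > FREE_ARTICLES
```

In this sketch, a visitor who clears cookies between articles arrives with a new cookie ID each time, but the views still accumulate under the same device key.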

Privacy Considerations

Device recognition is designed with privacy in mind:

  • Opt-in only — disabled by default; must be explicitly enabled per pipeline
  • Hashed signals — raw signals are never stored; only a hashed representation is persisted
  • Configurable retention — device profiles expire after the configured retention period (default 30 days, max 90 days)
  • No third-party sharing — device profiles are used only within the customer’s own Signal deployment
  • GDPR considerations — device profiles may constitute personal data under GDPR. Customers should include device recognition in their privacy policy and cookie consent mechanism. Profiles can be deleted via the Management API as part of a data subject access request (DSAR)
⚠️

Consult your data protection officer or legal team before enabling device recognition. While the signals used are less intrusive than traditional browser fingerprinting, the resulting profile may be considered personal data under GDPR, ePrivacy, or equivalent regulations in your jurisdiction.
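
The retention rule listed among the privacy properties above can be pictured as a periodic purge. The sketch below is illustrative only: the function name, the in-memory store, and the choice to measure retention from the time a profile was last seen (rather than when it was created) are assumptions.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List


def purge_expired(
    last_seen: Dict[str, datetime],
    retention_days: int,
    now: datetime,
) -> List[str]:
    """Delete profiles not seen within the retention period; return their IDs."""
    cutoff = now - timedelta(days=retention_days)
    expired = [device_id for device_id, seen in last_seen.items() if seen < cutoff]
    for device_id in expired:
        del last_seen[device_id]
    return expired


now = datetime(2024, 6, 30, tzinfo=timezone.utc)
store = {
    "dev-old": now - timedelta(days=31),  # outside a 30-day retention window
    "dev-new": now - timedelta(days=5),   # still inside it
}
removed = purge_expired(store, retention_days=30, now=now)
```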

Limitations

Device recognition is probabilistic and has inherent limits to its accuracy:

  • Shared networks — multiple users behind the same NAT (corporate offices, university campuses) can produce similar profiles, lowering match confidence.
  • Mobile and roaming visitors — mobile networks rotate addresses frequently, which weakens any network-derived component of the profile.
  • Browser and device changes — when a visitor updates their browser or changes their device, part of their profile is invalidated and they may appear as a new visitor until the profile is re-established.
  • Privacy-focused browsers and anti-tracking tools — browsers and extensions that randomise or strip request metadata reduce the signal available for matching.

Despite these limits, device recognition provides meaningful identity recovery for the majority of desktop browser sessions where cookies have been cleared — the most common identity loss scenario. For paywall enforcement or other use cases that require stronger guarantees, set the confidence threshold to High and combine with deterministic identifiers (logged-in user ID, hashed email) where available.
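
One way to combine the two kinds of identifier is a simple precedence rule: use a deterministic identifier when one exists, and accept a device recognition match only when it meets the required confidence. The sketch below assumes hypothetical names and a hypothetical `(identity_id, confidence)` shape for the device match; it is not a documented Signal interface.

```python
from typing import Optional, Tuple

CONFIDENCE_RANK = {"low": 1, "medium": 2, "high": 3}


def resolve_identity(
    user_id: Optional[str],
    hashed_email: Optional[str],
    device_match: Optional[Tuple[str, str]],
    required_confidence: str = "high",
) -> Optional[Tuple[str, str]]:
    """Return (source, identifier), preferring deterministic identifiers.

    device_match is a hypothetical (identity_id, confidence) pair, where
    confidence is "low", "medium", or "high".
    """
    if user_id:
        return ("user_id", user_id)
    if hashed_email:
        return ("hashed_email", hashed_email)
    if device_match is not None:
        identity_id, confidence = device_match
        if CONFIDENCE_RANK[confidence] >= CONFIDENCE_RANK[required_confidence]:
            return ("device", identity_id)
    return None  # nothing reliable enough: treat as a new visitor
```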