Browser Fingerprinting: How We Detect 97.4% of Review Bots
Canvas, WebGL, audio context — how fingerprinting works and why IP filtering alone is dead.
IP filtering catches lazy reviewers. Smart reviewers route through residential proxies. The IP looks like a Comcast home connection. The user-agent is a real Chrome string. They look human. So how do you catch them?
Browser fingerprinting. Specifically, the side-channels that real Chrome instances have but headless / instrumented browsers don't perfectly replicate.
What we fingerprint
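- Canvas: render a hidden 2D canvas with specific text and hash the pixel output. Differences in GPU, driver, OS, and font rendering change the pixels, so the hash varies across hardware and software stacks (identical machines can still share a hash). Reviewers running in cloud VMs cluster around a small set of canvas hashes.
- WebGL: the same idea, but in 3D. WebGL fingerprints carry more entropy and are harder to spoof.
- Audio context: synthesize a tone and hash the resulting audio buffer. The output depends on the machine's audio stack and floating-point behavior.
- Font enumeration: the set of installed fonts, usually probed by measuring rendered text because browsers do not hand over the list without a permission prompt. Headless Chrome on a server typically sees only a minimal default font set; real machines have personal fonts (Office, Adobe, design tools).
- WebRTC: can expose the client's real address even when HTTP traffic is proxied. Modern Chrome masks local IPs behind mDNS hostnames, so the local-address leak is less reliable than it used to be. We use this to detect proxy users.
- Screen resolution + device pixel ratio: cloud VMs cluster around a handful of values instead of the spread seen on real devices.
- Timezone vs IP geo: a real user's OS timezone usually matches the timezone of their IP's location. Reviewers often run machines set to UTC.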
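The timezone check is the simplest of these to express in code. The following is a minimal sketch, not the production logic: it assumes the browser reports the value of `new Date().getTimezoneOffset()` and that a geo-IP lookup (not shown) has already produced the UTC offset for the IP's location. `ClientReport` and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClientReport:
    # Value of `new Date().getTimezoneOffset()` reported by the browser.
    js_timezone_offset_min: int
    # UTC offset (minutes) of the IP's geolocated timezone.
    # Assumed to come from a geo-IP lookup that is not shown here.
    ip_utc_offset_min: int


def browser_utc_offset_min(report: ClientReport) -> int:
    # getTimezoneOffset() counts minutes *behind* UTC (New York in winter
    # reports 300), so the sign is flipped to get a normal UTC offset.
    return -report.js_timezone_offset_min


def timezone_mismatch(report: ClientReport, tolerance_min: int = 0) -> bool:
    """True when the browser clock and the IP's location disagree."""
    gap = abs(browser_utc_offset_min(report) - report.ip_utc_offset_min)
    return gap > tolerance_min


if __name__ == "__main__":
    # Browser on UTC, IP geolocated to US Eastern (UTC-5).
    print(timezone_mismatch(ClientReport(0, -300)))
```

A mismatch on its own is weak evidence, since travelers and VPN users trip it too, so it probably belongs in a combined score and not in a hard block.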
TLS / JA3 fingerprinting
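Below the application layer, every TLS handshake starts with a ClientHello that lists the client's TLS version, cipher suites, extensions, elliptic curves, and point formats. JA3 hashes those fields, in the order sent, into a single fingerprint. Each TLS stack has its own pattern: real Chrome looks different from curl, Python, or Go clients, so a Chrome user-agent arriving over a non-Chrome handshake stands out. Recent Chrome versions shuffle the extension order between connections, so raw JA3 hashes for Chrome are not stable and need an order-insensitive comparison. Anti-detect browsers (Multilogin, Kameleo) also have detectable signatures. We compute the JA3 at the edge and compare against known anomalous hashes.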
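To make the JA3 step concrete, here is a small sketch of how the hash is derived once the ClientHello fields have been parsed. Parsing the handshake itself is assumed to happen elsewhere (for example at the edge proxy). The `sort_extensions` option is not part of standard JA3; it is an illustrative normalization for clients that shuffle extension order.

```python
import hashlib
from typing import Iterable, Sequence


def is_grease(value: int) -> bool:
    # GREASE placeholders look like 0x0a0a, 0x1a1a, ... 0xfafa and are
    # dropped before hashing because they change between connections.
    return (value & 0x0F0F) == 0x0A0A and (value >> 8) == (value & 0xFF)


def _join(values: Iterable[int]) -> str:
    return "-".join(str(v) for v in values if not is_grease(v))


def ja3_string(
    version: int,
    ciphers: Sequence[int],
    extensions: Sequence[int],
    curves: Sequence[int],
    point_formats: Sequence[int],
    sort_extensions: bool = False,
) -> str:
    """Build the comma-separated JA3 string from decimal field values."""
    if sort_extensions:
        # Non-standard: makes the result independent of extension order.
        extensions = sorted(extensions)
    fields = [
        str(version),
        _join(ciphers),
        _join(extensions),
        _join(curves),
        _join(point_formats),
    ]
    return ",".join(fields)


def ja3_hash(
    version: int,
    ciphers: Sequence[int],
    extensions: Sequence[int],
    curves: Sequence[int],
    point_formats: Sequence[int],
    sort_extensions: bool = False,
) -> str:
    text = ja3_string(
        version, ciphers, extensions, curves, point_formats, sort_extensions
    )
    # MD5 is used only as a fingerprint here, not for security.
    return hashlib.md5(text.encode("ascii"), usedforsecurity=False).hexdigest()
```

The resulting hex digest can then be looked up in whatever list of known hashes is maintained; that list is deployment-specific and not shown.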
Why fingerprinting matters more than IP
Residential proxy networks are a commodity now. $50/month gets you rotating IPs from real ISPs. Your IP filter sees a Comcast Maryland IP, which looks fine. But the proxy only changes the network path; the page still renders on the reviewer's machine, so the canvas hash matches one of 50 hashes that all originate from a verifier farm in AWS Frankfurt. Caught.
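One way to arrive at such a hash list is to learn it from past traffic: record which canvas hashes show up almost exclusively from datacenter addresses, then flag the same hash when it later appears behind a residential IP. The sketch below assumes each session already carries a `from_datacenter` flag from an ASN or hosting-range lookup; `Session`, the function names, and the thresholds are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Iterable, Set


@dataclass(frozen=True)
class Session:
    canvas_hash: str
    from_datacenter: bool  # assumed result of an ASN / hosting-range lookup


def datacenter_hashes(
    history: Iterable[Session],
    min_sessions: int = 3,
    min_share: float = 0.9,
) -> Set[str]:
    """Hashes seen at least `min_sessions` times, mostly from datacenters."""
    counts = defaultdict(lambda: [0, 0])  # hash -> [datacenter, total]
    for session in history:
        counts[session.canvas_hash][1] += 1
        if session.from_datacenter:
            counts[session.canvas_hash][0] += 1
    return {
        canvas_hash
        for canvas_hash, (dc, total) in counts.items()
        if total >= min_sessions and dc / total >= min_share
    }


def is_suspect(session: Session, known: Set[str]) -> bool:
    # The IP may look residential; the hash still points at the farm.
    return session.canvas_hash in known
```

For example, after three datacenter sessions with hash `"aaa"`, a later `Session("aaa", from_datacenter=False)` is flagged even though its IP looks clean. Popular real devices also share hashes, which is why the share threshold matters.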
Defenses bot operators use
- Canvas / WebGL noise (small random perturbation): defeats exact-match fingerprinting, but the resulting hashes often change between renders and match no real device, which makes the cluster even more anomalous.
- Stealth plugins for Puppeteer / Playwright: patch the obvious leaks, but the patches themselves are detectable.
- Anti-detect browsers: try to spoof every dimension, but the combinations they offer cluster around their template profiles.
The arms race continues. Layered detection — IP + headers + fingerprint + behavior — beats any single defense.
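Layering can be as simple as a weighted sum over per-layer scores. The sketch below is illustrative only: the four layer names come from the sentence above, while the weights, the threshold, and the assumption that each layer emits a score between 0 and 1 are made up for the example.

```python
from typing import Mapping

# Hypothetical weights; a real deployment would tune these on labeled traffic.
WEIGHTS = {"ip": 0.2, "headers": 0.2, "fingerprint": 0.35, "behavior": 0.25}
THRESHOLD = 0.5


def combined_score(signals: Mapping[str, float]) -> float:
    """Weighted sum of per-layer scores, each clamped to the 0..1 range."""
    total = 0.0
    for layer, weight in WEIGHTS.items():
        value = signals.get(layer, 0.0)  # a missing layer counts as clean
        total += weight * min(max(value, 0.0), 1.0)
    return total


def is_flagged(signals: Mapping[str, float]) -> bool:
    return combined_score(signals) >= THRESHOLD


if __name__ == "__main__":
    # Clean residential IP, but fingerprint and behavior both look automated.
    print(is_flagged({"ip": 0.0, "fingerprint": 1.0, "behavior": 1.0}))
```

With these example weights, no single layer can cross the threshold alone, and a clean IP cannot rescue a session whose fingerprint and behavior both look automated.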
Stop running cloaking on duct tape.
Overcloak ships the 11-layer detection stack described above out of the box. $97/mo locked forever for the first 50 customers — only 13 founder seats left.