IDN Homograph Attacks: The Visual Spoofing Trap
· by Spicy Stromboli · link-analysis, homograph-attack, domain-spoofing, punycode, phishpond
An IDN (Internationalized Domain Name) homograph attack is a visual spoofing technique where attackers register domain names containing characters from other scripts, such as Cyrillic or Greek, that look like Latin letters. Because these characters appear identical to standard English letters, users often cannot distinguish the spoofed domain from the legitimate brand. On the network, these internationalized domains are represented in ASCII using an encoding called Punycode, and its xn-- prefix is the main signal for detecting this exploit.
The human brain is an incredibly advanced pattern-matching engine, but it is prone to shortcuts. When we look at a domain name in a browser address bar, we do not inspect every single pixel of the letters. We recognize the overall shape of the word. If we see apple.com or microsoft.com spelled correctly, we trust the link.
In 2026, cybercriminals are exploiting this cognitive shortcut using a technique known as an IDN homograph attack. By registering domains that swap standard English letters for visually identical characters from other alphabets, they create near-perfect visual clones of famous website addresses. A link might look exactly like your bank’s portal, but clicking it takes you to a server the attacker controls.
Here is an analysis of the mechanics behind homograph spoofing, how browsers translate these domains using Punycode, and how you can spot these visual traps before entering your data.
The Unicode Problem: Expanding the Domain Character Set
In the early days of the internet, domain names were restricted to the basic Latin alphabet, numbers, and hyphens (the ASCII character set). This worked well for English-speaking countries, but it excluded languages that use non-Latin scripts, such as Russian, Chinese, Greek, or Arabic.
To make the internet truly global, the Internet Engineering Task Force standardized Internationalized Domain Names (IDNs) in 2003. This allowed domains to contain a wide range of characters from the Unicode standard, a registry of over 140,000 characters covering hundreds of languages.
This globalization introduced a massive security vulnerability: the existence of Homoglyphs.
Homoglyphs are characters from different scripts that look identical or nearly identical to each other. For example, the Latin letter a (Unicode U+0061) looks exactly like the Cyrillic letter а (Unicode U+0430). To a computer, they are completely different numbers. To a human reader, they are indistinguishable. By registering a domain like аpple.com using the Cyrillic а, an attacker creates a link that is visually identical to the real site but routes traffic to a server they control.
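The difference between the two letters is easy to confirm in code. The following is a minimal sketch using Python's standard unicodedata module; the helper name describe is only for illustration.

```python
import unicodedata

latin_a = "\u0061"     # the "a" typed on an English keyboard
cyrillic_a = "\u0430"  # the look-alike Cyrillic letter


def describe(ch: str) -> str:
    """Return the code point and official Unicode name of one character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


print(describe(latin_a))      # U+0061 LATIN SMALL LETTER A
print(describe(cyrillic_a))   # U+0430 CYRILLIC SMALL LETTER A
print(latin_a == cyrillic_a)  # False: same shape on screen, different characters
```

Any comparison a computer performs, including a DNS lookup, treats the two strings as unrelated.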
Punycode: How Computers Read Unicode Domains
Because the core internet domain name system (DNS) still only understands ASCII, browsers need a way to translate Unicode domains into a format the network can route. This translation system is called Punycode.
Punycode encodes Unicode strings using only ASCII letters, digits, and hyphens. Every Punycode-encoded domain label begins with the prefix xn--.
For example, if an attacker registers аpple.com (using the Cyrillic а in place of the first a), the browser translates the domain to its Punycode representation:
- Visual Domain: аpple.com
- Punycode Translation: xn--pple-43d.com
If you look at the Punycode version, the spoof is immediately obvious. The xn-- prefix flags that the domain contains internationalized characters, and the characters after the final hyphen encode which non-ASCII characters appear and where. The challenge is that browsers try to make the web experience seamless, so they often decode Punycode in the address bar, displaying the clean Unicode visual to the user.
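The translation can be reproduced with Python's built-in idna codec. This is a sketch, not a description of how any particular browser does it: the standard-library codec follows the older IDNA 2003 rules, and the helper names to_ascii and to_unicode are illustrative.

```python
def to_ascii(hostname: str) -> str:
    """Encode a Unicode hostname into its ASCII (Punycode) form, label by label."""
    return hostname.encode("idna").decode("ascii")


def to_unicode(hostname: str) -> str:
    """Decode an ASCII hostname back into the Unicode form a user would see."""
    return hostname.encode("ascii").decode("idna")


spoofed = "\u0430pple.com"  # Cyrillic а followed by Latin "pple.com"

print(to_ascii(spoofed))               # xn--pple-43d.com
print(to_ascii("apple.com"))           # apple.com (pure ASCII is left unchanged)
print(to_unicode("xn--pple-43d.com") == spoofed)  # True
```

Comparing the ASCII form of a link against the ASCII form of the expected brand domain is a simple way to expose a swap that is invisible on screen.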
Common Homograph Spoofing Variations
Scammers do not just swap single letters. They use several visual tricks to build convincing spoofs:
- Single-Character Swaps: Swapping a single letter (like replacing a Latin o with a Cyrillic о) in a major brand domain.
- Subdomain Stacking: Creating complex subdomain hierarchies that push the real domain off the screen. For example, in the URL login.paypal.com.account-verify.net, the user sees the trusted brand at the start, but the actual domain they are visiting is account-verify.net.
- Look-Alike Character Families: Using characters that look like English letters but have subtle differences, such as the Cyrillic і (which looks like the English i) or the Greek ο (which looks like o).
- Character Extensions: Adding small Unicode accents or dots beneath letters that users might mistake for dust on their screen (such as the Latin small letter e with a dot below: ẹ).
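Two of these tricks can be checked mechanically. The sketch below uses only the Python standard library; both helper names are hypothetical, and the "last two labels" rule is a deliberate simplification that ignores multi-part suffixes such as .co.uk (a real tool would consult the Public Suffix List).

```python
import unicodedata
from urllib.parse import urlsplit


def naive_registrable_domain(url: str) -> str:
    """Return the last two labels of the hostname.

    Assumption: the suffix is a single label such as .com or .net.
    """
    host = urlsplit(url).hostname or ""
    return ".".join(host.split(".")[-2:])


def has_combining_marks(label: str) -> bool:
    """True if any character decomposes into a base letter plus an accent or dot."""
    decomposed = unicodedata.normalize("NFD", label)
    return any(unicodedata.combining(ch) for ch in decomposed)


# Subdomain stacking: the trusted brand is only a subdomain.
print(naive_registrable_domain("https://login.paypal.com.account-verify.net/signin"))
# account-verify.net

# Character extension: e with a dot below (U+1EB9) hides in the last letter.
print(has_combining_marks("appl\u1eb9"))  # True
print(has_combining_marks("apple"))       # False
```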
Why Gateways and Users Struggle to Catch Them
IDN homograph attacks are highly effective because they exploit the interface layer rather than security vulnerabilities in software:
- Valid SSL Certificates: Scammers can easily obtain free, automated SSL/TLS certificates (such as Let’s Encrypt certificates) for their Punycode domains. The browser then shows its usual secure-connection indicator, such as a padlock icon, reassuring the user.
- Organic Appearance: In email messages, the links look completely clean. The hover-over preview often displays the visual Unicode domain, not the Punycode string, meaning users who inspect links before clicking may still see the spoof.
- Authentic Content: The landing page is usually a clone of the official site, often scraped and served in real-time, making visual detection of the page content impossible.
Defensive Strategies: How to Unmask the Homograph
While the threat is highly visual, you can protect yourself by changing how you inspect links and utilizing browser tools.
1. Force Punycode Display in Your Browser
Most modern web browsers have built-in defenses against homographs. If a domain mixes characters from different alphabets (e.g. combining Latin and Cyrillic in a single word), the browser will display the raw Punycode version (xn--...) in the address bar instead of the visual Unicode.
However, if an entire domain is registered using a single script (like registering the entire word apple using Cyrillic characters), some browsers might still display the visual version. You can configure your browser to always show the Punycode version of domains.
In Firefox, for example, you can toggle this setting:
- Type about:config in the address bar.
- Search for network.IDN_show_punycode.
- Set the value to true.
Once enabled, any domain containing internationalized characters will display as xn--..., making spoofs instantly visible.
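The mixed-script rule described in this section, and the gap it leaves, can be approximated in a few lines. This is a rough sketch and not the algorithm any browser actually ships: Python's standard library does not expose the Unicode Script property, so the code assumes the first word of each character's Unicode name (LATIN, CYRILLIC, GREEK) is a good enough stand-in. The function names are illustrative.

```python
import unicodedata


def scripts_in(label: str) -> set[str]:
    """Collect an approximate script name for every letter in a domain label."""
    return {
        unicodedata.name(ch, "UNKNOWN").split()[0]
        for ch in label
        if ch.isalpha()
    }


def is_mixed_script(label: str) -> bool:
    """True if the label combines letters from more than one script."""
    return len(scripts_in(label)) > 1


mixed = "\u0430pple"                          # Cyrillic а + Latin "pple"
whole_cyrillic = "\u0430\u0440\u0440\u04cf\u0435"  # every letter is Cyrillic

print(is_mixed_script(mixed))           # True: would trigger the Punycode fallback
print(is_mixed_script(whole_cyrillic))  # False: passes a mixed-script check
```

The second result is why a single-script spoof can slip past a purely mixed-script check, and why forcing Punycode display is the stricter setting.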
2. Inspect Links via Real-Time Tools
If you are ever unsure about a link, even if it looks correct, do not click it. Copy the link address and analyze it before loading the page.
You can paste the URL into PhishPond.io and run the Link Scout scanner. Link Scout follows the redirect hops, resolves the domain to its raw IP address, and automatically decodes any Punycode strings, highlighting the true ASCII destination of the domain. Taking this extra step is one of the most effective ways to check if a link is a scam before it has a chance to execute in your browser. Our guide on reading a redirect chain details what else you should look for at each step of a URL trace.
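For readers who prefer to check a link locally, the decoding step can be sketched with the Python standard library. This is an illustration only and does not represent how Link Scout is implemented; the function name non_ascii_report is hypothetical, and the idna codec used here follows the older IDNA 2003 rules.

```python
import unicodedata
from urllib.parse import urlsplit


def non_ascii_report(url: str) -> list[str]:
    """List every non-ASCII character hiding in a URL's hostname."""
    host = urlsplit(url).hostname or ""
    if host.isascii():
        # Turn any xn-- labels back into the Unicode a user would see.
        host = host.encode("ascii").decode("idna")
    return [
        f"U+{ord(ch):04X} {unicodedata.name(ch, 'UNKNOWN')}"
        for ch in host
        if ord(ch) > 127
    ]


print(non_ascii_report("https://xn--pple-43d.com/login"))
# ['U+0430 CYRILLIC SMALL LETTER A']
print(non_ascii_report("https://apple.com/"))
# []
```

An empty list means the hostname is plain ASCII; any entry names the exact character that was swapped in.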
3. Navigate Manually
The gold standard of domain safety remains manual navigation. If you need to access your bank, tax portal, or password manager, never click a link in an email, SMS, or direct message. Type the official domain address directly into your browser’s address bar and bookmark it for future use.
For corporate environments, administrators should monitor their DNS logs for requests containing the xn-- prefix. While some internationalized domains are legitimate, a sudden influx of queries to unfamiliar Punycode domains is a strong indicator of a targeted phishing campaign.
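A starting point for that monitoring might look like the sketch below. It assumes the queried names have already been extracted from the DNS logs into an iterable of strings, since log formats differ between resolvers; the function name and the allowlist parameter are hypothetical.

```python
from collections import Counter
from typing import Iterable


def punycode_queries(
    names: Iterable[str],
    allowlist: frozenset[str] = frozenset(),
) -> Counter:
    """Count queried hostnames that contain at least one xn-- label."""
    hits: Counter = Counter()
    for name in names:
        # Normalize case and drop the trailing root dot found in many logs.
        host = name.strip().rstrip(".").lower()
        if host in allowlist:
            continue  # known, legitimate internationalized domain
        if any(label.startswith("xn--") for label in host.split(".")):
            hits[host] += 1
    return hits


sample = ["example.com.", "xn--pple-43d.com.", "XN--PPLE-43D.com", "mail.example.org"]
print(punycode_queries(sample))
# Counter({'xn--pple-43d.com': 2})
```

Sorting the result with most_common() and comparing it against a baseline from earlier weeks is one way to spot the sudden influx described above.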