Detecting AiTM Phishing Sites with Fuzzy Hashing

Background

In this blog, we will cover how Obsidian detects phishing kits or Phishing-as-a-Service (PhaaS) websites for our customers by analyzing the fuzzy hashes of visited website content.

This concept draws on established industry practice: IOCs (e.g., SHA-1/SHA-256 hashes) and fuzzy hashes (e.g., SSDEEP, TLSH) have been used for hunting and detection on endpoints for some time. If you are unfamiliar with it, fuzzy hashing produces a hash value such that similar inputs yield similar hashes, so comparing two hashes estimates how similar the underlying content is at the byte level.

The examples cover the EvilProxy/Tycoon phishing kit and a kit used by a sophisticated APT group.


EvilProxy/Tycoon Phishing Kit

Menlo Security [1], Proofpoint [2], Microsoft [3], Trend Micro [4], and Sekoia.io [5] have blogged about EvilProxy/Tycoon, an Adversary-in-the-Middle (AiTM) phishing kit that steals credentials and session cookies in real time.

Recent campaigns can be observed on any.run: https://app.any.run/submissions/#tag:tycoon

An example: https://dzse[.]izmqf[.]ru/nY8gx7

Most of these websites are protected with Cloudflare’s bot/scraping protection, which hinders attempts at automated scraping and analysis by many security products. Cloudflare’s protection looks for things such as mouse movements, clicks, and key presses while also using other techniques such as canvas fingerprinting.


Once the Cloudflare check is passed, the user is presented with a page impersonating the Microsoft login page.


When we view the HTML content, it’s a single external script resource:

<script language="Javascript" src="https://dzse[.]izmqf[.]ru/myscr602166.js"></script>

The JavaScript itself is heavily obfuscated:

var erp = new Array;
erp[0] = 218774561;
erp[1] = 1146045268;
erp[2] = 1498432800;
erp[3] = 1752460652;
erp[4] = 1041041980;
erp[5] = 1752460652;
erp[6] = 543973742;
erp[7] = 1732059749;
erp[8] = 1847737869;
……
erp[1191] = 1041041933;
erp[1192] = 10;
var em = '';
for(i=0;i<erp.length;i++){
tmp = erp[i];
if(Math.floor((tmp/Math.pow(256,3)))>0){
em += String.fromCharCode(Math.floor((tmp/Math.pow(256,3))));
};
tmp = tmp - (Math.floor((tmp/Math.pow(256,3))) * Math.pow(256,3));
……
};
document.write(em);
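
The integers in the erp array appear to pack up to four characters each, most significant byte first (for example, 218774561 is 0x0D0A3C21). The following Python sketch reproduces that decoding offline for the values shown above. It assumes the elided steps repeat the same divide-and-subtract operation for the remaining three bytes and skip zero bytes; the function name decode_packed is hypothetical.

```python
def decode_packed(values):
    """Unpack each integer into up to four characters, high byte first.

    Assumption: zero bytes are skipped, mirroring the "> 0" check
    visible in the obfuscated sample.
    """
    parts = []
    for value in values:
        raw = value.to_bytes(4, "big")          # 4 bytes, most significant first
        kept = bytes(b for b in raw if b)       # drop zero bytes
        parts.append(kept.decode("latin-1"))    # one character per byte
    return "".join(parts)


# The first nine array entries quoted in the sample.
sample = [218774561, 1146045268, 1498432800, 1752460652, 1041041980,
          1752460652, 543973742, 1732059749, 1847737869]
decoded = decode_packed(sample)
print(repr(decoded))  # '\r\n<!DOCTYPE html>\r\n<html lang="en">\r'
```

The output is the start of an ordinary HTML document, which is what document.write injects into the page.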


However, once the Javascript runs, the Document Object Model (DOM) reveals what is displayed to the user:


Computing a fuzzy hash for the raw HTML would be of little use: it is short and not distinctive (a single external script resource), and the script URL changes frequently.

However, computing a fuzzy hash for the DOM will prove useful, as this is after the Javascript obfuscation has been unwound.

With some minification of the DOM, the computed TLSH hash we get for this website is: T1140351705096AE3B8193C1E1AA751B4E33A1CA0DCFE306564AFEC3AECBC7D89CE45551
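
A rough sketch of this step in Python follows. The exact minification used is not described here, so minify_dom is a hypothetical stand-in that strips comments and collapses whitespace. The DOM string is assumed to have been captured already (for example, from a browser extension or a headless browser). tlsh.hash comes from the py-tlsh package and needs a reasonably long input to produce a digest.

```python
import re

try:
    import tlsh  # provided by the py-tlsh package
except ImportError:  # the minification helper still works without it
    tlsh = None


def minify_dom(dom_html: str) -> str:
    """Hypothetical minification: drop comments and collapse whitespace."""
    no_comments = re.sub(r"<!--.*?-->", "", dom_html, flags=re.DOTALL)
    no_gaps = re.sub(r">\s+<", "><", no_comments)   # whitespace between tags
    return re.sub(r"\s+", " ", no_gaps).strip()     # runs of whitespace in text


def dom_fuzzy_hash(dom_html: str):
    """Return the TLSH digest of the minified DOM, or None without py-tlsh."""
    if tlsh is None:
        return None
    return tlsh.hash(minify_dom(dom_html).encode("utf-8"))


rendered = ("<html>\n  <!-- tracking -->\n  <body>\n"
            "    <p>Sign   in</p>\n  </body>\n</html>")
print(minify_dom(rendered))  # <html><body><p>Sign in</p></body></html>
```

Normalizing the DOM before hashing keeps incidental differences such as indentation from inflating the distance between two pages built from the same kit.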

If we repeat this process for another EvilProxy/Tycoon website, such as https://295g[.]kirklimo[.]com/h040n, we have the following DOM TLSH hash: T19D0351705096AE378193C1E1A9B51B0E33A1CA0ECFE306564AFE83AECBC7D85CF45551

If we compare these two fuzzy hashes, they are very similar:

$ pip install py-tlsh
…
>>> import tlsh
>>> tlsh.diff('T1140351705096AE3B8193C1E1AA751B4E33A1CA0DCFE306564AFEC3AECBC7D89CE45551', 'T19D0351705096AE378193C1E1A9B51B0E33A1CA0ECFE306564AFE83AECBC7D85CF45551')
9

TLSH scores are distances, so lower means more similar. A score of 9 has a false positive rate of roughly 0.001%, per Trend Micro’s paper.
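
In practice, a newly computed DOM hash would be compared against a set of hashes from known kits. The sketch below is one way to do that, not a description of the product's implementation. The function best_match, the label names, and the max_distance value of 30 are illustrative assumptions; the distance function is passed in so that tlsh.diff can be supplied when py-tlsh is installed.

```python
def best_match(candidate, known, max_distance, diff):
    """Return (label, distance) for the closest known hash, or None.

    candidate    -- TLSH digest of the page being checked
    known        -- dict mapping a label to a known kit digest
    max_distance -- largest distance still treated as a match (assumption)
    diff         -- distance function, e.g. tlsh.diff
    """
    best = None
    for label, digest in known.items():
        distance = diff(candidate, digest)
        if distance <= max_distance and (best is None or distance < best[1]):
            best = (label, distance)
    return best


# Hashes quoted in this post; the label is hypothetical.
KNOWN = {
    "evilproxy-tycoon": "T1140351705096AE3B8193C1E1AA751B4E33A1CA0DCFE306564AFEC3AECBC7D89CE45551",
}
CANDIDATE = "T19D0351705096AE378193C1E1A9B51B0E33A1CA0ECFE306564AFE83AECBC7D85CF45551"

try:
    import tlsh  # provided by the py-tlsh package
    print(best_match(CANDIDATE, KNOWN, 30, tlsh.diff))
except ImportError:
    print("py-tlsh is not installed; skipping the comparison")
```

The threshold trades detection coverage against false positives, so it would need tuning against real traffic rather than being fixed at the value used here.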


APT Phishing Kit

The same technique can be used to catch users visiting phishing websites created by a currently prominent APT group.

The websites look like the following, with the logo swapped out for each target.


Comparing two different campaigns, one targeting a telecom company and another an insurance company, we find the hashes are very similar for both the HTML and the DOM.

>>> import tlsh
>>> tlsh.diff('T11B7173044CFFCC1290034895E9B2F8582E9DE8679308DC8975DC95569F52FC74A53BAD', 'T1747171049CFFCC1290034896E9B2F85C1EADE4A79208DC8975DC96665F92FC74A53AAC')
28


Conclusions

While there are many ways to catch, detect, or block a phishing website, companies continue to be compromised by targeted spear-phishing attacks and by more sophisticated red teamers. A fuzzy hashing approach gives defenders another way of catching both commoditized and targeted phishing attacks. We hope other companies start to incorporate this capability into their products.


Interested in learning more about this capability in our product? Get in touch with our team.
