UC San Diego research team develops a system that lets automated web-crawling programs identify spam-advertised sales websites and connect them to large affiliate networks.
A 2014 paper by UC San Diego researchers Matthew Der, Lawrence K. Saul, Stefan Savage, and Geoffrey M. Voelker, titled “Knock It Off: Profiling the Online Storefronts of Counterfeit Merchandise,” describes how analyzing HTML tags and other network identifiers made it possible to connect hundreds of thousands of disparate fake online pharmacy websites to the handful of affiliate networks that run them.
As Der et al. explain, they developed an automated system “able to identify the affiliate programs of 180,690 online storefronts.” The authors estimate that designing and validating the “regular expressions” for all 45 affiliate programs that were the focus of their study took over 200 man-hours.
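To illustrate what matching storefronts with hand-written regular expressions can look like, here is a minimal sketch in Python. The program names and patterns below are invented for illustration; they are not taken from the paper, and real signatures would need the kind of manual design and validation the authors describe.

```python
import re

# Hypothetical signatures: program names and patterns are made up for this example.
SIGNATURES = {
    "example-pharma-program": re.compile(
        r'<meta name="generator" content="PharmaShop', re.IGNORECASE
    ),
    "example-software-program": re.compile(r"/cart\.php\?aff_id=\d+"),
}


def match_affiliate_program(html):
    """Return the first program whose pattern matches the page, or None."""
    for program, pattern in SIGNATURES.items():
        if pattern.search(html):
            return program
    return None
```

Each new affiliate program needs its own pattern, and each pattern has to be checked against real pages, which is one reason this approach is labor-intensive.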
The researchers' pool of online storefronts was collected from millions of spam emails offering counterfeit medication, counterfeit products, and pirated software. As they put it, their most profound discovery was that “the online storefronts of several dozen affiliate programs can be distinguished from automatically extracted features on their Web pages.” Using “nearest neighbor (NN)” classifiers on HTML and network features, they were able to achieve nearly 99.99% accuracy in connecting spam-based fake online pharmacy URLs with the affiliate programs that create and operate them.
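The idea of nearest neighbor classification on HTML features can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the feature set (counts of tags and tag/attribute/value tokens), the cosine similarity measure, and all function names are choices made for this example, not the features or distance used in the paper.

```python
import math
from collections import Counter
from html.parser import HTMLParser


class TagFeatureExtractor(HTMLParser):
    """Collect a bag of tag names and tag:attribute=value tokens from a page."""

    def __init__(self):
        super().__init__()
        self.features = Counter()

    def handle_starttag(self, tag, attrs):
        self.features[tag] += 1
        for name, value in attrs:
            self.features[f"{tag}:{name}={value}"] += 1


def extract_features(html):
    """Parse the page and return its token counts."""
    parser = TagFeatureExtractor()
    parser.feed(html)
    parser.close()
    return parser.features


def cosine_similarity(a, b):
    """Cosine similarity between two token-count vectors (0.0 if either is empty)."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest_neighbor_label(html, labeled_pages):
    """Return the label of the most similar page in labeled_pages.

    labeled_pages is a list of (html, program_name) pairs (hypothetical data).
    """
    query = extract_features(html)
    best_label, best_score = None, -1.0
    for page_html, label in labeled_pages:
        score = cosine_similarity(query, extract_features(page_html))
        if score > best_score:
            best_label, best_score = label, score
    return best_label
```

A new storefront is assigned the affiliate program of the labeled page it most resembles, so pages built from the same underlying template tend to land together even when the visible text differs.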
They conclude that automating their HTML-based classification could make it relatively easy to trace websites back to the criminals who run the affiliate networks.
As a consumer, you can protect yourself from fake online pharmacy criminals. Learn how to decode a fake online pharmacy, and keep yourself and your family safe.