All ETDs from UAB

Advisory Committee Chair

Alan Sprague

Advisory Committee Members

Michael Bailey

Tyler Moore

Nitesh Saxena

Chengcui Zhang

Document Type


Date of Award


Degree Name by School

Doctor of Philosophy (PhD) College of Arts and Sciences


Phishing has been a problem since before the early 2000s and has only become more prevalent and diverse since. Phishing countermeasures have been developed to prevent or mitigate phishing attacks, but each has strengths and weaknesses, and none is effective in every situation. Choosing the best-suited countermeasure or combination of countermeasures, and tracking its effectiveness, requires phish grouping: grouping phish by their common characteristics and tactics. To be effective, phish grouping needs to produce dependable groupings, produce them quickly, and handle large volumes of phish. This dissertation develops the Simple Set Comparison (SSC) tool, which enables existing phish grouping processes to run faster and lowers the maximum amount of memory they require, allowing a larger number of phish to be grouped. The SSC tool achieves this through a multi-step approach that makes use of parallel processing. This dissertation evaluates the efficiency and quality of using the SSC tool with the SLINK style phish grouping algorithm used by Malcovery Security. The SLINK style algorithm with the SSC tool is compared to the same algorithm without it on three criteria: the ability to produce a clustering, the quality of the clustering produced, and the runtime needed to produce it. Four experiments are run using three different implementations of the SLINK style clustering algorithm over large phishing data sets. The SSC tool improved the runtime of the SLINK style algorithm in every experiment: with the tool, the algorithm produced results 37 times faster in the first experiment, 404 times faster in the second, 6 times faster in the third, and 10.8 times faster in the fourth, while maintaining equivalent quality.
The SSC tool improves the SLINK style algorithm's runtime and reduces the maximum amount of memory required to produce a clustering, which allows larger volumes of phish to be grouped, and it produces clusterings similar to those of the SLINK style algorithm without the tool.
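
The abstract does not give the SSC tool's internals, so the following is only a minimal sketch of the general idea it describes: compare sets in parallel, then group items with single-linkage clustering. Everything here is an assumption made for illustration, not the dissertation's method: the Jaccard similarity measure, the threshold, the chunking scheme, and the names `jaccard`, `similar_pairs`, and `group_phish` are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations


def jaccard(a, b):
    """Overlap of two sets: |a & b| / |a | b| (0.0 when both are empty).

    Assumption: Jaccard is a stand-in; the source does not name a measure.
    """
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def similar_pairs(chunk, sets, threshold):
    """Return the pairs in this chunk whose sets meet the threshold."""
    return [(x, y) for x, y in chunk if jaccard(sets[x], sets[y]) >= threshold]


def group_phish(sets, threshold=0.5, workers=4, chunk_size=1000):
    """Single-linkage grouping: link any two items at or above the threshold."""
    pairs = list(combinations(sorted(sets), 2))
    chunks = [pairs[i:i + chunk_size] for i in range(0, len(pairs), chunk_size)]

    # Step 1 (hypothetical): compare chunks of pairs concurrently.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(lambda c: similar_pairs(c, sets, threshold), chunks)
        links = [pair for chunk_links in results for pair in chunk_links]

    # Step 2: merge linked items with union-find, which yields the same
    # groups as cutting a single-linkage dendrogram at the threshold.
    parent = {k: k for k in sets}

    def find(k):
        while parent[k] != k:
            parent[k] = parent[parent[k]]  # path halving
            k = parent[k]
        return k

    for x, y in links:
        parent[find(x)] = find(y)

    groups = {}
    for k in sets:
        groups.setdefault(find(k), set()).add(k)
    return sorted(groups.values(), key=lambda g: sorted(g))
```

As a usage example, `group_phish({"a": {1, 2, 3}, "b": {2, 3, 4}, "c": {3, 4, 5}, "d": {9}})` places `a`, `b`, and `c` in one group through chained links and leaves `d` on its own. This toy version holds every pair in memory and uses threads, which in CPython do not speed up CPU-bound comparisons, so it shows only the shape of a compare-then-link pipeline and not the runtime or memory savings reported above.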


