WikiPhish: A Diverse Wikipedia-Based Dataset for Phishing Website Detection

I’m proud to announce that our latest article, "WikiPhish: A Diverse Wikipedia-Based Dataset for Phishing Website Detection", has been accepted at CODASPY 2024 (ACM Conference on Data and Application Security and Privacy).

TL;DR: Phishing is a persistent threat that calls for advanced detection systems. Supervised machine learning is widely used to automate detection, but it relies on large amounts of annotated data. The WikiPhish dataset, comprising 110,606 webpages (from Wikipedia, OpenPhish and PhishTank), addresses this need for diverse and robust data and supports the development of phishing detection models.

🇬🇧 Gabriel Loiseau, Valentin Lefils, Maxime Meyer, Damien Riquet. WikiPhish: A Diverse Wikipedia-Based Dataset for Phishing Website Detection (2024). CODASPY 2024 (ACM Conference on Data and Application Security and Privacy).