A Smart Model for Web Phishing Detection Based on New Proposed Feature Selection Technique

Document Type : Original Article

Author

Computer Science and Engineering Department, Faculty of Electronic Engineering, Menoufia University, Egypt

Abstract

Web-phishing attacks are among the most serious cybercrimes. They enable hackers to access the devices of many users and spy on personal data such as passwords and credit card details. Hackers use many tricks over the internet to make users share data, download files, or open links that attack a computer. This research proposes a meta-heuristic-based approach to protect internet users from web phishing. It consists of three phases. The first phase uses a newly proposed method to evaluate and rank the features of the URL, HTML and JavaScript code, text, images, and domain name of the web page. The second phase extracts the effective subset of the ranked features that achieves the highest web-phishing classification accuracy. The third phase constructs a Random Forest classifier trained on the features of the extracted subset. The proposed feature selection method achieved the highest classification accuracy compared to the correlation feature selection, information gain, principal component analysis, and Relief feature selection algorithms. The proposed web-phishing detection methodology was also evaluated; it obtained the highest classification accuracy in the least time compared to the adaptive neuro-fuzzy inference system.
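
The rank-then-select structure of the first two phases can be illustrated with a minimal Python sketch. It is not the authors' method: the abstract does not specify the proposed ranking technique, so information gain (one of the baselines named above) stands in for it, and a leave-one-out nearest-neighbour score stands in for the Random Forest of the third phase so that the example needs only the standard library. All function names and the toy data are hypothetical.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def information_gain(column, labels):
    """Entropy reduction from splitting the labels on a discrete feature column."""
    groups = {}
    for value, label in zip(column, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

def rank_features(rows, labels):
    """Phase 1 stand-in: feature indices sorted by information gain, best first."""
    n_features = len(rows[0])
    gains = [information_gain([r[j] for r in rows], labels) for j in range(n_features)]
    return sorted(range(n_features), key=lambda j: gains[j], reverse=True)

def loo_accuracy(rows, labels, features):
    """Stand-in scorer (assumption): leave-one-out 1-nearest-neighbour accuracy."""
    correct = 0
    for i, row in enumerate(rows):
        nearest = min(
            (j for j in range(len(rows)) if j != i),
            key=lambda j: sum(row[f] != rows[j][f] for f in features),
        )
        correct += labels[nearest] == labels[i]
    return correct / len(rows)

def best_prefix(rows, labels, ranking, score=loo_accuracy):
    """Phase 2: score the top-1, top-2, ... ranked features; keep the best prefix."""
    best_subset, best_score = [], -1.0
    for k in range(1, len(ranking) + 1):
        subset = ranking[:k]
        s = score(rows, labels, subset)
        if s > best_score:  # strict, so ties favour the smaller subset
            best_subset, best_score = subset, s
    return best_subset, best_score

# Toy data (made up): features take values in {-1, 1}; label 1 = legitimate, -1 = phishing.
rows = [
    [1, 1, 1], [1, -1, 1], [1, 1, -1], [1, -1, 1],
    [-1, 1, -1], [-1, -1, -1], [-1, 1, 1], [-1, -1, -1],
]
labels = [1, 1, 1, 1, -1, -1, -1, -1]

ranking = rank_features(rows, labels)
subset, accuracy = best_prefix(rows, labels, ranking)
print(ranking, subset, accuracy)  # [0, 2, 1] [0] 1.0
```

In a fuller setup, the `score` argument would presumably be replaced by the cross-validated accuracy of a Random Forest trained on the candidate subset, and the final classifier would then be fitted on the selected features only.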
