Machine learning-based Naive Bayes approach for divulgence of Spam Comment in Youtube station

Sohom Bhattacharya; Shubham Bhattacharjee; Anup Das; Anirban Mitra; Ishita Bhattacharya; Subir Gupta

Authors

Sohom Bhattacharya Department of Masters of Computer Application, Dr. B. C. Roy Engineering College https://orcid.org/0000-0002-8837-5418
Shubham Bhattacharjee Department of Masters of Computer Application, Dr. B. C. Roy Engineering College Durgapur, West Bengal – 713206, India
Anup Das CEO and founder of AnupTechTips, New Jalpaiguri, West Bengal – 734013, India
Anirban Mitra Department of Computer Science & Engineering, Amity University Kolkata, West Bengal – 700135, India
Ishita Bhattacharya Department of Life Science, Binod Bihari Mahto Koyalanchal University, Dhanbad, Jharkhand – 828130, India
Subir Gupta Dr B C ROY ENGINEERING COLLEGE https://orcid.org/0000-0002-0941-0749

Keywords:

Artificial Intelligence, Naïve Bayes, Spam comment, QQ-plot, Ham comment

Abstract

In the 21st century, web-based media plays an indispensable part in human interaction and communication. Platforms such as YouTube, Facebook, and Twitter can raise the social standing of an individual as well as a group. Yet every technology has its pros and cons. On some YouTube channels, machine-generated spam comments are posted on videos, and some fake users also post spam comments, which harms the affected channel. Spam comments can be identified using artificial intelligence (AI) techniques based on algorithms such as Naive Bayes, SVM, Random Forest, and ANN. The present investigation focuses on a machine learning-based Naive Bayes classification methodology for identifying spam comments on YouTube.
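To make the idea of Naive Bayes classification of spam and ham comments concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not state the Naive Bayes variant, the features, or the dataset, so the sketch assumes a multinomial model with add-one (Laplace) smoothing over whitespace-separated tokens. The class name `NaiveBayesSpamFilter`, the helper `tokenize`, and the six training comments are hypothetical and invented purely for illustration.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


class NaiveBayesSpamFilter:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing.

    Assumption: this variant is chosen for illustration only; the
    abstract does not specify which Naive Bayes model was used.
    """

    def fit(self, comments, labels):
        # Count documents per class and word occurrences per class.
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for comment, label in zip(comments, labels):
            self.word_counts[label].update(tokenize(comment))
        self.vocabulary = {
            word for counts in self.word_counts.values() for word in counts
        }
        return self

    def log_scores(self, comment):
        """Return the unnormalized log posterior for each class."""
        total_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocabulary)
        scores = {}
        for label, doc_count in self.class_counts.items():
            score = math.log(doc_count / total_docs)  # log prior
            total_words = sum(self.word_counts[label].values())
            for word in tokenize(comment):
                if word not in self.vocabulary:
                    continue  # skip words never seen during training
                count = self.word_counts[label][word]
                # Smoothed log likelihood of the word given the class.
                score += math.log((count + 1) / (total_words + vocab_size))
            scores[label] = score
        return scores

    def predict(self, comment):
        scores = self.log_scores(comment)
        return max(scores, key=scores.get)


# Hypothetical toy data, invented for this sketch (not from the paper).
train_comments = [
    "check out my channel and subscribe",
    "free gift card click this link",
    "subscribe to my channel for free followers",
    "great song love the guitar solo",
    "this video made my day",
    "love this song so much",
]
train_labels = ["spam", "spam", "spam", "ham", "ham", "ham"]

model = NaiveBayesSpamFilter().fit(train_comments, train_labels)

for text in ["click the link and subscribe to my channel", "love this song"]:
    print(text, "->", model.predict(text))
```

Working in log space avoids numerical underflow when many small word probabilities are multiplied, and the smoothing term keeps a word that was never seen in one class from forcing that class's probability to zero.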
