Heterogeneous Data Analysis in Intelligent Fraud Detection Systems
DOI:
https://doi.org/10.31649/1997-9266-2019-143-2-78-90
Keywords:
fraud detection, anomaly detection, classification model, method for analyzing heterogeneous data
Abstract
This work treats fraud as an anomaly in data and develops a method of heterogeneous data analysis for intelligent fraud detection systems. The detection of fraud as an anomaly in heterogeneous data during mobile application installation is formalized using set theory, which makes further data analysis in such systems possible. The paper proposes a mathematical model of the heterogeneous data analysis process, an algorithm for heterogeneous data analysis, and a method of heterogeneous data analysis based on the proposed scales and coefficients. These make it possible to process varied input data (data with different metrics, templates, and dimensions) and, in the course of the analysis, to form a generalized fingerprint of a fraudster. The developed method uses databases and knowledge bases, through which the generalized fingerprint is formed; having this fingerprint speeds up the detection of fraudsters in new data sets and allows even implicit fraudsters to be detected. The proposed method is intended for intelligent systems that detect fraud based on anomalies in data. Unlike existing methods, it allows analyzing the heterogeneous data on which decisions about fraud are based, reducing the dimensionality of the data, and classifying users. The method was studied experimentally as part of detecting fraud as anomalies in heterogeneous data, together with a classification model built on fully connected deep neural networks with three hidden layers, using the developed software and a representative sample. A scheme for the experimental study of anomaly detection in heterogeneous data during mobile application installation is proposed, and the method of heterogeneous data analysis is presented on its basis.
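The abstract does not give the proposed scales and coefficients, so the following is only a minimal sketch of the general idea: fields of different types are mapped onto a common [0, 1] similarity scale and combined with weights into one score against a stored fingerprint. All field names, ranges, and weights here are hypothetical and are not taken from the paper.

```python
from typing import Any, Dict, Tuple

# Hypothetical numeric fields with assumed value ranges (not from the paper).
NUMERIC_RANGES: Dict[str, Tuple[float, float]] = {
    "seconds_click_to_install": (0.0, 3600.0),
}

# Hypothetical weighting coefficients per field (not from the paper).
WEIGHTS: Dict[str, float] = {
    "seconds_click_to_install": 0.5,
    "device_model": 0.3,
    "is_emulator": 0.2,
}


def clamp(value: float, lo: float, hi: float) -> float:
    """Restrict value to the closed interval [lo, hi]."""
    return min(max(value, lo), hi)


def field_similarity(name: str, a: Any, b: Any) -> float:
    """Return a similarity in [0, 1] for one field, whatever its type."""
    if name in NUMERIC_RANGES:
        lo, hi = NUMERIC_RANGES[name]
        a_c = clamp(float(a), lo, hi)
        b_c = clamp(float(b), lo, hi)
        # Numeric fields: 1 minus the range-normalized distance.
        return 1.0 - abs(a_c - b_c) / (hi - lo)
    # Categorical and boolean fields: exact match or no match.
    return 1.0 if a == b else 0.0


def similarity_to_fingerprint(record: Dict[str, Any],
                              fingerprint: Dict[str, Any]) -> float:
    """Weighted mean of per-field similarities over the fingerprint's fields."""
    total_weight = sum(WEIGHTS[name] for name in fingerprint)
    score = sum(
        WEIGHTS[name] * field_similarity(name, record[name], fingerprint[name])
        for name in fingerprint
    )
    return score / total_weight


if __name__ == "__main__":
    fingerprint = {
        "seconds_click_to_install": 0,
        "device_model": "generic-x86",
        "is_emulator": True,
    }
    install = {
        "seconds_click_to_install": 1800,
        "device_model": "generic-x86",
        "is_emulator": False,
    }
    print(similarity_to_fingerprint(install, fingerprint))
```

A score close to 1 means the record resembles the stored fingerprint on most weighted fields. In a real system the threshold for flagging a record would have to be tuned on labeled data.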
The proposed method is shown to be effective in the fraud detection system: the classification accuracy was 99.14 %, and the accuracy of fraud detection was 82.76 %. Moreover, as the number of rules in the developed knowledge base grows with each run on new data, the accuracy of the system will increase.
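The two figures can differ this much when fraud is rare. Assuming the second figure is the share of fraudulent records that were correctly flagged (the abstract does not define it), the sketch below shows how overall accuracy and the fraud detection rate are computed from the same confusion counts. The counts are invented for illustration and are not the paper's data.

```python
def accuracy(tp: int, fp: int, tn: int, fn: int) -> float:
    """Share of all records (fraud and legitimate) classified correctly."""
    return (tp + tn) / (tp + fp + tn + fn)


def detection_rate(tp: int, fn: int) -> float:
    """Share of actual fraud records that were flagged as fraud (recall)."""
    return tp / (tp + fn)


if __name__ == "__main__":
    # Invented counts: 1000 records, of which 50 are fraud.
    tp, fn = 40, 10   # fraud records caught / missed
    fp, tn = 5, 945   # legitimate records wrongly flagged / correctly passed
    print(f"accuracy:       {accuracy(tp, fp, tn, fn):.3f}")
    print(f"detection rate: {detection_rate(tp, fn):.3f}")
```

Because legitimate records dominate the sample, the many correct "not fraud" decisions keep overall accuracy high even when a noticeable share of fraud cases is missed.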