Information Technology for the Cryptocurrency Rate Forecasting on the Basics of Complex Feature Engineering
DOI: https://doi.org/10.31649/1997-9266-2022-161-2-81-93
Keywords: cryptocurrency, feature engineering, forecasting, information technology, bitcoin
Abstract
The paper presents an information technology for forecasting cryptocurrency exchange rates based on complex feature engineering. Its distinguishing characteristic is a systematic approach to feature selection. External and internal groups of factors that may influence the cryptocurrency market are analyzed. An analysis of the features that characterize changes in cryptocurrency rates shows that, beyond the basic primary features available for many cryptocurrencies, secondary features matter more for subsequent forecasting; these are derived from the primary features by applying various mathematical operations and/or algorithmic transformations. A review of a large number of sources shows that cryptocurrencies have several characteristics that account for their great popularity and that should also be taken into account when constructing and choosing features. The paper systematizes these key characteristics and proposes how to formalize them as features. The features are formalized following a systematic approach grounded in the postulates of technical cybernetics, which state that any object of study can be represented as a black box (BB) that is in contact with its environment at five points; in the multidimensional case, each point can be a set of features or variables. A general mathematical model for forming these factors is given: a large number of secondary features are generated through simple mathematical, algorithmic, and statistical transformations, and the most relevant of them are then selected. The technology involves synthesizing new secondary features from other secondary features, with some exceptions that are formalized as a system of rules. This is intended to reduce overfitting of the model and improve its generalization ability.
To demonstrate the efficiency of the developed technology, an example of its application to bitcoin, using daily data for 2020-2021, is considered. The studies and computer experiments showed the suggested technology to be efficient.
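The generate-then-select idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the particular transformations (daily range, candle body, one-day return, rolling mean and standard deviation), the selection criterion (absolute Pearson correlation with the next-day return), the function names, and the toy price rows are all assumptions chosen for illustration.

```python
from statistics import mean, pstdev


def rolling(values, window, func):
    """Apply func to each trailing window; None until the window is full."""
    return [
        func(values[i - window + 1 : i + 1]) if i >= window - 1 else None
        for i in range(len(values))
    ]


def make_secondary_features(ohlc, window=3):
    """Derive secondary features from primary open/high/low/close values."""
    o = [r["open"] for r in ohlc]
    h = [r["high"] for r in ohlc]
    l = [r["low"] for r in ohlc]
    c = [r["close"] for r in ohlc]
    return {
        "range": [hi - lo for hi, lo in zip(h, l)],
        "body": [cl - op for cl, op in zip(c, o)],
        "return_1d": [None] + [c[i] / c[i - 1] - 1 for i in range(1, len(c))],
        f"sma_{window}": rolling(c, window, mean),
        f"std_{window}": rolling(c, window, pstdev),
    }


def pearson(xs, ys):
    """Pearson correlation; 0.0 if either series is constant."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5 if vx > 0 and vy > 0 else 0.0


def select_features(features, target, top_k=2):
    """Rank features by |correlation| with the target and keep the top_k."""
    scores = {}
    for name, values in features.items():
        # Use only the days on which both the feature and the target exist.
        pairs = [
            (v, t) for v, t in zip(values, target)
            if v is not None and t is not None
        ]
        scores[name] = abs(pearson(*zip(*pairs))) if len(pairs) > 2 else 0.0
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


# Hypothetical daily OHLC rows, made up for illustration (not real BTC data).
ohlc = [
    {"open": 100, "high": 110, "low": 95, "close": 105},
    {"open": 105, "high": 112, "low": 101, "close": 103},
    {"open": 103, "high": 108, "low": 98, "close": 107},
    {"open": 107, "high": 115, "low": 104, "close": 113},
    {"open": 113, "high": 114, "low": 102, "close": 104},
    {"open": 104, "high": 109, "low": 100, "close": 108},
    {"open": 108, "high": 120, "low": 107, "close": 118},
    {"open": 118, "high": 121, "low": 110, "close": 112},
]
closes = [r["close"] for r in ohlc]
# Target: next-day return (unknown for the last day).
target = [closes[i + 1] / closes[i] - 1 for i in range(len(closes) - 1)] + [None]

features = make_secondary_features(ohlc)
selected = select_features(features, target)
print(selected)
```

In practice the candidate set would be far larger, and a correlation filter is only one of many possible relevance criteria. The sketch also omits the rule system that the abstract mentions for restricting which secondary features may be built from other secondary features.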
License
This work is licensed under a Creative Commons Attribution 4.0 International License.