17.06.2025
Machine learning approach to predicting systemic risk in Russian banks
Abstract
This study contributes to the literature on systemic risk in the Russian banking sector. Economic downturns, interest rate fluctuations, geopolitical events, and regulatory changes can impact the entire financial system, creating systemic risks for banks. Individual banks are often unable to cope with these risks, which can threaten their stability and profitability. Thus, good risk management is required to predict and mitigate the impact of such systemic hazards. Banks using machine learning techniques can improve their ability to identify trends, anticipate impending crises, and implement decisions to maintain financial resilience in the face of systemic risks.
Purpose – The goal of this study is to create a robust machine learning framework for predicting systemic risk in Russia’s banking sector. It also aims to provide insights into the determinants of systemic risk, as well as a data-driven approach to risk management and decision-making for banks and regulators in emerging markets.
Problems – Selecting and implementing the best machine learning models for detecting systemic risk is difficult. Furthermore, evaluating and improving the performance of machine learning models when dealing with imbalanced datasets in systemic risk prediction situations is another significant challenge.
Methodology – The study uses a comprehensive machine learning framework to detect systemic risk in Russian banks. A variety of machine learning methods, such as gradient boosting machines (GBM) and deep learning models, are used to detect complicated patterns in financial data. To solve the issue of data imbalance, the study uses advanced resampling techniques including Adaptive Synthetic Sampling (ADASYN). Furthermore, feature engineering and dimensionality reduction methods, such as Principal Component Analysis (PCA), are utilized to improve model performance and interpretability. The DCoVaR approach is used to assess individual banks’ systemic risk contributions, and cross-validation techniques are used to check that the predictive models are resilient and reliable.
Keywords: Systemic risk, Risk Prediction, Banks, Machine learning, Classification problem, Emerging economy.
2 — Methodology
2.1 Data and sample
The study will focus on Russian banks and will use data from a variety of sources, including individual bank share prices on the Moscow Exchange, government variables from the Central Bank of Russia, and banking statistics from the Center for Monitoring the Russian Economy. The dataset contains both categorical and continuous variables, organized into two categories: aspects of the bank’s balance sheet (e.g., ownership, size, profitability) and characteristics of the bank’s shares (e.g., valuations, market risk). These characteristics influence systemic risk, which is estimated biweekly for each bank and averaged quarterly.
The analysis is divided into two sections, one focusing on high systemic risk and the other on moderate systemic risk as a robustness check. The study uses machine learning (ML) models to classify banks by systemic risk category (high or moderate). Discretized quartiles of systemic risk (DCoVaR) are used to generate the classes, simplifying model training and making predictions more useful. The goal is to create a classifier that can predict cases of high and moderate systemic risk, allowing regulators and institutions to anticipate future problems without relying on human experts. The main goal is to create a model that generalizes accurately to previously unseen data.
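The quartile-based labelling described above can be sketched as follows. This is a minimal illustration, not the study's procedure: the function name, the toy scores, the convention that a larger score means more systemic risk, and the mapping of the top quartile to "high" and the third quartile to "moderate" are all assumptions.

```python
from statistics import quantiles

def label_by_quartile(scores):
    """Map each score to a quartile index 1..4 (4 = top quartile).

    Assumption: a larger score means a larger systemic risk contribution.
    """
    q1, q2, q3 = quantiles(scores, n=4)  # the three quartile cut points
    labels = []
    for s in scores:
        if s <= q1:
            labels.append(1)
        elif s <= q2:
            labels.append(2)
        elif s <= q3:
            labels.append(3)
        else:
            labels.append(4)
    return labels

# Hypothetical quarterly risk scores for eight banks.
scores = [0.2, 1.5, 0.7, 3.1, 0.9, 2.4, 0.4, 1.1]
quartile = label_by_quartile(scores)
high_risk = [int(q == 4) for q in quartile]      # binary target for the main analysis
moderate_risk = [int(q == 3) for q in quartile]  # binary target for the robustness check
```

Each binary target can then be passed to a classifier separately, which matches the two-part structure of the analysis.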
2.2 Estimation of systemic risk
The DCoVaR (ΔCoVaR) metric of Adrian and Brunnermeier (2016) is used to estimate systemic risk in the Russian banking sector. Quantile regression is used for the estimation, with the stress state defined as the 1% quantile. DCoVaR measures the spillover of an individual bank’s risk into the system, which accumulates as systemic risk. The estimation methodology includes state variables such as the call money rate, the 91-day treasury bill rate, and the 10-year GSec rate. The VaR of bank asset growth is estimated by quantile regression on the lagged state variables.
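For reference, the standard definitions behind this measure can be written as below. The notation is an assumption about how the estimation is set up here: X^i is bank i's asset growth, X^sys the system-wide counterpart, M_{t-1} the vector of lagged state variables, and q the quantile (1% for the stress state, 50% for the median state).

```latex
\begin{align}
\Pr\!\left(X^{i} \le \mathrm{VaR}^{i}_{q}\right) &= q, \\
\Pr\!\left(X^{sys} \le \mathrm{CoVaR}^{sys|i}_{q} \,\middle|\, X^{i} = \mathrm{VaR}^{i}_{q}\right) &= q, \\
X^{i}_{t} &= \alpha^{i}_{q} + \gamma^{i}_{q} M_{t-1} + \varepsilon^{i}_{q,t}, \\
X^{sys}_{t} &= \alpha^{sys|i}_{q} + \gamma^{sys|i}_{q} M_{t-1} + \beta^{sys|i}_{q} X^{i}_{t} + \varepsilon^{sys|i}_{q,t}, \\
\mathrm{VaR}^{i}_{q,t} &= \hat{\alpha}^{i}_{q} + \hat{\gamma}^{i}_{q} M_{t-1}, \\
\Delta\mathrm{CoVaR}^{i}_{q,t} &= \hat{\beta}^{sys|i}_{q}\left(\mathrm{VaR}^{i}_{q,t} - \mathrm{VaR}^{i}_{50\%,t}\right).
\end{align}
```

The third and fourth lines are quantile regressions estimated at quantile q; the median-state VaR in the last line comes from repeating the third line at q = 50%.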
2.3 Machine learning models
Model characteristics are the key parameters that determine the model’s complexity. Irrelevant features can reduce the accuracy of prediction models. Feature selection is therefore an important component of machine learning, since it removes redundant features that may degrade the model’s performance. The factors described above are the input features for this investigation. Redundant features are eliminated before a robust model is trained, which improves model accuracy and reduces overfitting. This subsection provides an overview of the ML models used in the research.
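One simple way to remove redundant features is a correlation filter, sketched below. The source does not say which selection method is used, so this is an illustrative stand-in; the feature names, values, and the 0.95 threshold are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length numeric lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)  # undefined for constant features (zero variance)

def drop_redundant(features, threshold=0.95):
    """Keep a feature only if it is not highly correlated with one already kept.

    `features` maps a feature name to its list of values.
    """
    kept = []
    for name, values in features.items():
        if all(abs(pearson(values, features[k])) < threshold for k in kept):
            kept.append(name)
    return kept

# Hypothetical bank-level features; "assets_rub" is a rescaled copy of "assets_usd".
features = {
    "assets_usd": [1.0, 2.0, 3.0, 4.0, 5.0],
    "assets_rub": [90.0, 180.0, 270.0, 360.0, 450.0],
    "roa": [0.8, 0.1, 0.5, 0.9, 0.2],
}
selected = drop_redundant(features)
```

The filter is order-dependent: the first feature of a correlated pair is the one that survives.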
2.3.1 Logistic regression
Logistic regression (LR) is a technique for modelling dichotomous categorical outcome variables using one or more explanatory variables. LR estimates the probability that a given bank falls into the high or moderate systemic risk class.
LR performs less effectively when the input variables have complex, nonlinear associations.
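A minimal sketch of logistic regression fitted by batch gradient descent is given below, to show how the class probability is produced. The two features, the toy data, the learning rate, and the number of epochs are assumptions for illustration only.

```python
from math import exp

def sigmoid(z):
    return 1.0 / (1.0 + exp(-z))

def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            # Prediction error for this observation: p - y.
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict_proba(w, b, x):
    """Estimated probability that x belongs to the positive (high-risk) class."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

# Hypothetical standardized features: [leverage, market beta].
X = [[-1.2, -0.8], [-0.5, -1.0], [-0.3, 0.2], [0.4, -0.1], [0.9, 1.1], [1.3, 0.7]]
y = [0, 0, 0, 1, 1, 1]
w, b = fit_logistic(X, y)
p_low = predict_proba(w, b, [-1.0, -1.0])
p_high = predict_proba(w, b, [1.0, 1.0])
```

Because the decision boundary is linear in the inputs, a model of this form cannot capture interactions or non-monotonic effects unless they are added as engineered features.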
2.3.2 Deep Learning Models in Systemic Risk Prediction
Deep learning algorithms are used in this study to predict systemic risk in Russian banks because they can capture complicated, nonlinear relationships in financial data. Deep learning, a form of machine learning, employs artificial neural networks with several hidden layers to detect complex patterns in banks’ financial statements, balance sheets, and market data. These models, such as feedforward neural networks, can learn from vast amounts of data, allowing them to identify subtle indicators of systemic risk that traditional models may miss. The application of deep learning models enables a more accurate and complete assessment of which institutions may be contributing to Russia’s financial instability.
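The forward pass of a small feedforward network can be sketched as follows. The architecture (three inputs, two hidden layers of two units, one sigmoid output) and the hand-picked weights are hypothetical; training by backpropagation is omitted, so the output is not a meaningful risk estimate.

```python
from math import exp

def relu(v):
    return [max(0.0, a) for a in v]

def dense(x, weights, bias):
    """One fully connected layer; `weights` holds one row per output unit."""
    return [sum(w * a for w, a in zip(row, x)) + b for row, b in zip(weights, bias)]

def forward(x, layers):
    """Pass x through the hidden layers with ReLU, then a sigmoid output unit."""
    h = x
    for weights, bias in layers[:-1]:
        h = relu(dense(h, weights, bias))
    out_w, out_b = layers[-1]
    z = dense(h, out_w, out_b)[0]
    return 1.0 / (1.0 + exp(-z))

# Hypothetical weights: 3 inputs -> 2 hidden -> 2 hidden -> 1 output.
layers = [
    ([[0.5, -0.2, 0.1], [-0.3, 0.8, 0.4]], [0.0, 0.1]),
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, -0.1]),
    ([[0.7, -0.4]], [0.05]),
]
risk_probability = forward([1.0, 0.5, -0.5], layers)
```

The stacked nonlinear layers are what allow such a model to represent relationships that a single linear boundary cannot.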
2.3.3 Gradient boosting machine
GBM models are gaining popularity due to their efficacy on difficult data sets (Bissacco et al., 2007; Hutchinson et al., 2011; Pittman and Brown, 2011; Johnson and Zhang, 2012). Boosting was originally designed to turn a weak learning algorithm into a strong predictive model. The GBM classifier is an ensemble technique that combines multiple weak learners sequentially and additively, with each new learner correcting the errors of the ensemble built so far. Gradient boosting algorithms are typically implemented using decision trees. For all ML models, the study first selects the most relevant variables in the data set, a step known as feature selection, which separates valuable factors from noise and other irrelevant variables.
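The sequential, additive construction can be illustrated with a minimal gradient boosting sketch that uses one-split decision stumps as weak learners and the log-loss as the objective. The single toy feature, the number of rounds, and the learning rate are assumptions; production libraries add deeper trees, regularization, and better leaf estimates.

```python
from math import exp

def sigmoid(z):
    return 1.0 / (1.0 + exp(-z))

def fit_stump(X, residuals):
    """Find the single-feature threshold split that minimizes squared error."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            left = [r for row, r in zip(X, residuals) if row[j] <= t]
            right = [r for row, r in zip(X, residuals) if row[j] > t]
            if not left or not right:
                continue
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
            if best is None or sse < best[0]:
                best = (sse, j, t, lm, rm)
    _, j, t, lm, rm = best
    return lambda row: lm if row[j] <= t else rm

def fit_gbm(X, y, n_rounds=50, lr=0.3):
    """Add stumps one at a time, each fitted to the negative gradient of the log-loss."""
    stumps = []
    scores = [0.0] * len(X)  # raw log-odds for each training row
    for _ in range(n_rounds):
        residuals = [yi - sigmoid(s) for yi, s in zip(y, scores)]  # y - p
        stump = fit_stump(X, residuals)
        stumps.append(stump)
        scores = [s + lr * stump(row) for s, row in zip(scores, X)]

    def predict_proba(row):
        return sigmoid(sum(lr * stump(row) for stump in stumps))

    return predict_proba

# Hypothetical feature where risk is high only in a middle band (non-monotonic).
X = [[0.1], [0.2], [0.4], [0.5], [0.6], [0.8], [0.9]]
y = [0, 0, 1, 1, 1, 0, 0]
predict_proba = fit_gbm(X, y)
```

A single stump cannot isolate the middle band, but the sum of many stumps can, which is the sense in which boosting builds a strong model from weak ones.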
2.4 Addressing Data Imbalance
Addressing data imbalance is critical in systemic risk prediction to avoid model bias towards the dominant class. Techniques such as Adaptive Synthetic Sampling (ADASYN) generate synthetic samples for the minority class, improving the model’s capacity to identify high-risk banks. This method increases the sensitivity and forecast accuracy of the models in detecting systemic risk.
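The following sketch shows why accuracy alone is misleading on imbalanced data and why sensitivity is tracked. The class counts (5 high-risk observations out of 100) and the always-negative classifier are hypothetical.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels where 1 = high risk."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn

def accuracy(y_true, y_pred):
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + tn + fp + fn)

def sensitivity(y_true, y_pred):
    """Share of truly high-risk cases that the classifier flags (recall)."""
    tp, _, _, fn = confusion_counts(y_true, y_pred)
    return tp / (tp + fn) if (tp + fn) else 0.0

# Hypothetical sample: 5 high-risk observations out of 100.
y_true = [1] * 5 + [0] * 95
y_always_low = [0] * 100  # a classifier biased entirely towards the majority class

acc = accuracy(y_true, y_always_low)
sens = sensitivity(y_true, y_always_low)
```

Here the biased classifier is right 95% of the time yet detects none of the high-risk cases, which is the failure mode that resampling is meant to reduce.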
2.4.1 Adaptive Synthetic Sampling (ADASYN) for Data Imbalance
Given the rarity of systemic risk events in comparison to non-risk events, the dataset used in this analysis is significantly skewed. To solve this issue, adaptive synthetic sampling (ADASYN) is used as a data preprocessing technique. ADASYN creates synthetic examples of the minority class based on the difficulty of categorization in the feature space. By adaptively providing more synthetic samples in places where the model struggles to distinguish between high-risk and low-risk banks, ADASYN helps machine learning models learn well from imbalanced data. This method improves the model’s capacity to reliably anticipate systemic risk among Russian banks, especially when faced with an uneven distribution of risk occurrences.
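The adaptive step can be sketched as below: each minority point receives a share of the synthetic samples proportional to the fraction of majority points among its nearest neighbours. This is a simplified variant written for illustration, assuming Euclidean distance, label 1 for the minority class, and at least two minority points; the toy coordinates are hypothetical, and because of per-point rounding the total may differ slightly from the target count.

```python
import random
from math import dist

def adasyn(X, y, k=3, beta=1.0, seed=0):
    """Return synthetic minority samples (label 1) generated ADASYN-style."""
    rng = random.Random(seed)
    minority = [x for x, label in zip(X, y) if label == 1]
    n_major = len(X) - len(minority)
    total_new = int((n_major - len(minority)) * beta)  # target number of new samples

    # r_i: share of majority points among the k nearest neighbours of x_i.
    ratios = []
    for xi in minority:
        neighbours = sorted(
            ((dist(xi, xj), label) for xj, label in zip(X, y) if xj is not xi),
            key=lambda pair: pair[0],
        )[:k]
        ratios.append(sum(1 for _, label in neighbours if label == 0) / k)
    total_ratio = sum(ratios)
    if total_ratio == 0:
        return []  # no minority point sits near the majority class

    synthetic = []
    for xi, r in zip(minority, ratios):
        n_new = round(r / total_ratio * total_new)  # more samples where r_i is high
        mates = sorted((m for m in minority if m is not xi), key=lambda m: dist(xi, m))[:k]
        for _ in range(n_new):
            mate = rng.choice(mates)
            lam = rng.random()
            # Interpolate between x_i and a nearby minority point.
            synthetic.append([a + lam * (b - a) for a, b in zip(xi, mate)])
    return synthetic

# Hypothetical 2-D data: one minority point is surrounded by majority points.
X = [[0, 0], [0, 1], [1, 0], [1, 1], [0.5, 0.5], [2, 2], [2, 1], [1, 2],
     [1.5, 1.5], [4, 4], [4.2, 4.1], [3.9, 4.3]]
y = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
synthetic = adasyn(X, y)
```

In this toy example the point at (1.5, 1.5), whose neighbours are all majority points, receives twice as many synthetic samples as each point in the well-separated minority cluster.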
2.5 Feature Engineering and Dimensionality Reduction
Feature engineering involves creating and transforming input variables to improve model performance, whereas dimensionality reduction seeks to reduce the number of features while retaining critical information. Principal Component Analysis (PCA) simplifies high-dimensional datasets, increasing model efficiency and interpretability by focusing on the most important factors for systemic risk prediction.
2.5.1 Principal Component Analysis (PCA)
The dataset used to predict systemic risk in Russian banks has a large number of financial and macroeconomic factors, which can result in high dimensionality and potential difficulties like multicollinearity. To address this, Principal Component Analysis (PCA) is used as a dimensionality reduction technique. PCA reduces the original high-dimensional data to a smaller set of uncorrelated principal components that retain most of the variation in the source data. The study uses PCA to efficiently reduce the dataset’s complexity, enhancing the efficiency and interpretability of machine learning models. This technique helps to narrow the study to the most important aspects contributing to systemic risk, making the predictive framework more robust and manageable.
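A minimal sketch of the idea follows: it extracts only the first principal component by power iteration on the sample covariance matrix. The toy data (two nearly collinear features and one low-variance feature) are hypothetical, and features are centred but not standardized, which is itself an assumption; in practice a library routine based on SVD would return all components at once.

```python
from math import sqrt

def center(X):
    """Subtract the column mean from every feature."""
    n, d = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(d)]
    return [[row[j] - means[j] for j in range(d)] for row in X]

def covariance(Xc):
    """Sample covariance matrix of centred data."""
    n, d = len(Xc), len(Xc[0])
    return [[sum(row[a] * row[b] for row in Xc) / (n - 1) for b in range(d)]
            for a in range(d)]

def first_component(cov, iters=200):
    """Leading eigenvector and eigenvalue of `cov` by power iteration."""
    d = len(cov)
    v = [1.0 / sqrt(d)] * d
    for _ in range(iters):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    eigenvalue = sum(v[a] * sum(cov[a][b] * v[b] for b in range(d)) for a in range(d))
    return v, eigenvalue

# Hypothetical data: the second feature is roughly twice the first.
X = [[2.0, 4.1, 0.5], [3.0, 6.0, 0.4], [4.0, 7.9, 0.6], [5.0, 10.2, 0.5], [6.0, 11.8, 0.4]]
Xc = center(X)
cov = covariance(Xc)
direction, variance = first_component(cov)
explained = variance / sum(cov[j][j] for j in range(len(cov)))  # share of total variance
scores = [sum(a * b for a, b in zip(row, direction)) for row in Xc]  # 1-D projection
```

When features are strongly collinear, as here, a single component captures almost all of the variance, so the three original columns can be replaced by one score per bank.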
3 — Limitations of the study
The study may have several limitations, including reliance on historical data that may not reflect future systemic risk scenarios, a lack of generalizability to banks in other areas, issues with imbalanced datasets impacting model accuracy, and potential bias in feature selection. Despite these limitations, the study makes a substantial contribution to knowledge by improving our understanding of systemic risk factors, offering machine learning approaches for categorization and prediction, and generating insights that might help regulators and policymakers. It indicates essential features that influence systemic risk, enhances risk management techniques in banking, and lays the groundwork for future research in the area.
4 — Conclusions
1. This study confirms the effectiveness of machine learning, in particular GBM and deep learning, in predicting systemic risk in Russian banks, allowing for a more proactive approach to risk management.
2. The study highlights the importance of addressing data imbalances using strategies such as ADASYN to improve model accuracy and minimize bias in predicting systemic risk.
3. Machine learning models help Russian banks detect systemic threats at an early stage, leading to greater financial stability and resilience in the face of economic and geopolitical challenges.
4. Russian policymakers can use machine learning models to improve systemic risk management practices, especially in response to economic sanctions, geopolitical events, and oil market fluctuations.
5. Future research should focus on integrating real-time data sources, expanding the use of advanced machine learning techniques, and examining how external economic and geopolitical variables affect systemic risk in the Russian financial system.
References
1. Bissacco, A., Yang, M.-H. and Soatto, S. (2007), “Fast human pose estimation using appearance and motion via multi-dimensional boosting regression”, IEEE Conference on Computer Vision and Pattern Recognition, CVPR’07, Minneapolis, MN, doi: 10.1109/CVPR.2007.383129.
2. Hutchinson, R.A., Liu, L.-P. and Dietterich, T.G. (2011), “Incorporating boosted regression trees into ecological latent variable models”, AAAI’11, San Francisco, CA, pp. 1343-1348, available at: www.aaai.org/ocs/index.php/AAAI/AAAI11/paper/view/3711.
3. Johnson, R. and Zhang, T. (2012), “Learning nonlinear functions using regularized greedy forest”, Technical Report, arXiv:1109.0887, doi: 10.2172/1052139.
4. Pittman, S.J. and Brown, K.A. (2011), “Multi-scale approach for predicting fish species distributions across coral reef seascapes”, PLoS ONE, Vol. 6 No. 5, p. e20583, doi: 10.1371/journal.pone.0020583.
5. Adrian, T. and Brunnermeier, M.K. (2016), “CoVaR”, American Economic Review, Vol. 106 No. 7, p. 1705.