4.1 RQ1: What were the key research themes that outline the research landscape for ESG and AI in finance?
The study performed topic modelling and selected the number of topic clusters that gave the highest topic coherence. The topic modelling results were then cross-examined against the topic clusters produced by network analysis, an unsupervised learning approach used here for pattern recognition to group similar topics together.
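The coherence-based choice of topic count can be illustrated with a minimal sketch. The study does not state which coherence measure or software it used, so the UMass measure below is an assumption, and the toy corpus and candidate keyword lists are hypothetical.

```python
import math
from itertools import combinations


def umass_coherence(top_words, documents):
    """UMass coherence of one topic (assumed measure, not confirmed by the study).

    Sums log((D(w_i, w_j) + 1) / D(w_i)) over word pairs, where w_i is ranked
    above w_j and D counts the documents containing the given words.
    """
    doc_sets = [set(doc) for doc in documents]

    def doc_freq(*words):
        # Number of documents that contain every word in `words`.
        return sum(all(w in doc for w in words) for doc in doc_sets)

    score = 0.0
    # combinations() keeps list order, so `earlier` always outranks `later`.
    for earlier, later in combinations(top_words, 2):
        base = doc_freq(earlier)
        if base == 0:
            continue  # word never occurs; skip to avoid division by zero
        score += math.log((doc_freq(earlier, later) + 1) / base)
    return score


# Hypothetical toy corpus: each document is a bag of keywords.
toy_docs = [
    ["esg", "rating", "score", "firm"],
    ["esg", "rating", "firm"],
    ["esg", "score", "rating"],
    ["stock", "price", "return"],
    ["stock", "price", "bond"],
    ["bond", "return", "price"],
]

# Hypothetical candidate models, keyed by number of topics.
candidates = {
    1: [["esg", "price", "bond"]],
    2: [["esg", "rating", "score"], ["stock", "price", "bond"]],
}

# Mean coherence per candidate; the highest mean decides the topic count.
mean_coherence = {
    k: sum(umass_coherence(t, toy_docs) for t in topics) / len(topics)
    for k, topics in candidates.items()
}
best_k = max(mean_coherence, key=mean_coherence.get)
print(best_k, mean_coherence)
```

In practice the candidate models would come from fitting a topic model at several topic counts; only the scoring and selection step is shown here.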
Both topic modelling and network analysis identified eight latent topics, and their results mirrored each other closely. The cluster members derived from network analysis in Fig. 2 corresponded well to the top keywords from the latent topics in Table 1. As an illustration, Fig. 3 shows the topic modelling output for the latent topic ESG Disclosure, Measurement and Governance.
Cross-examination of topic modelling, network analysis and full-paper reviews allowed the identification of eight research thematic archetypes. The archetypical domains were: (i) Trading and Investment, (ii) ESG Disclosure, Measurement and Governance, (iii) Firm Governance, (iv) Financial Markets and Instruments, (v) Risk Management, (vi) Forecasting and Valuation, (vii) Data, and (viii) Responsible Use of AI.
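One simple way to quantify how well network clusters correspond to topic keywords is set overlap. The sketch below uses Jaccard similarity, which is an assumption (the study does not describe a numerical matching procedure). The topic keywords are taken from Table 1, while the network cluster members are hypothetical, since the contents of Fig. 2 are not reproduced here.

```python
def jaccard(a, b):
    """Jaccard similarity between two keyword collections."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Top keywords for two latent topics, as listed in Table 1.
topic_keywords = {
    "Risk Management": ["ESG", "Risk", "Score", "Rating", "Assessment",
                        "Default", "Credit", "News", "Transition"],
    "Data": ["Data", "Dataset", "Method", "Quality", "Generate",
             "Methodology", "Parameter", "Decision", "Issue"],
}

# Hypothetical network-analysis clusters (illustrative only).
network_clusters = {
    "cluster_a": ["Risk", "Credit", "Default", "Rating", "Score", "Climate"],
    "cluster_b": ["Data", "Dataset", "Quality", "Method", "Model"],
}

# For each cluster, pick the latent topic with the largest keyword overlap.
best_match = {
    name: max(topic_keywords, key=lambda t: jaccard(members, topic_keywords[t]))
    for name, members in network_clusters.items()
}
print(best_match)
```

A high overlap for every cluster would support the claim that the two methods agree; low or ambiguous overlaps would flag topics that need a manual full-paper review.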
Table 1
Topic modelling – Latent topic and top keywords
Topic No. | Latent Topic | Percentage of Tokens | Top Keywords
1 | ESG Disclosure, Measurement and Governance | 15.10% | Performance; Examine; Financial; CSR; Corporate; ESG; Firm; Predict; Governance
2 | Responsible Use of AI | 9.20% | Responsible; Explainable; Trustworthy; Auditable; Policy; Tool; Governance; Protocol; Manage
3 | Firm Governance | 14.70% | ESG; Firm; CSR; Corporate; Rating; Capital; Environment; Stock; Management
4 | Financial Markets and Instruments | 11.90% | Index; Bank; Green; Carbon; Climate; ESG; Financing; Islamic; Technology
5 | Data | 9.90% | Data; Dataset; Method; Quality; Generate; Methodology; Parameter; Decision; Issue
6 | Forecasting and Valuation | 10.70% | Predict; Price; Factor; Score; Rating; Emission; Reaction; Stock; Bond
7 | Risk Management | 11.20% | ESG; Risk; Score; Rating; Assessment; Default; Credit; News; Transition
8 | Trading and Investment | 17.30% | Stock; ESG; Market; Price; Impact; Return; Rate; Bond; Sovereign
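As a quick consistency check on Table 1, the token shares can be summed and ranked. The sketch below uses only the percentages reported in the table; the variable names are illustrative.

```python
import math

# Percentage of tokens per latent topic, as reported in Table 1.
token_share = {
    "ESG Disclosure, Measurement and Governance": 15.10,
    "Responsible Use of AI": 9.20,
    "Firm Governance": 14.70,
    "Financial Markets and Instruments": 11.90,
    "Data": 9.90,
    "Forecasting and Valuation": 10.70,
    "Risk Management": 11.20,
    "Trading and Investment": 17.30,
}

# The eight shares should account for the whole corpus.
total = sum(token_share.values())

# Rank topics from the largest to the smallest share of tokens.
ranked = sorted(token_share.items(), key=lambda item: item[1], reverse=True)

print(f"total = {total:.2f}%")
for topic, share in ranked:
    print(f"{share:5.2f}%  {topic}")
```

The shares sum to 100%, with Trading and Investment holding the largest share and Responsible Use of AI the smallest.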
The eight research thematic archetypes are introduced below. To give more granular insight into the archetypical domains, the study further broke the archetypes down into sub-research themes wherever possible. Selected recent literature is discussed in Tables 2 to 9 to provide a thematic overview of the state of the art.
The Trading and Investment archetype comprises papers on the application of AI in trading and investment activities within the ESG domain, whether associated with ESG news or factors, or expressing ESG considerations.
Efficient and effective investment strategies and portfolio selection and optimization have been major areas of research in finance. In recent years, the application of AI techniques has revolutionized investment management. These techniques enable automation and intelligent investment, which involves predicting market trends, recommending trading signals, and minimizing risk through causal representations of limit order book markets (Cao et al., 2020).
Incorporating ESG factors into investment strategies and portfolio selection has become increasingly important. One example is portfolio risk analysis, where AI techniques are employed to detect exceptional decoupling scenarios, such as correlation changes, structural breaks or simultaneous asset shocks in portfolio assets, to minimize risk (e.g., Zhang et al., 2022; Taleb et al., 2020; Serafeim and Yoon, 2022; Fabozzi and Karagozoglu, 2021). AI techniques can also be used to analyze the impact of ESG factors on investment performance (e.g., Yoo, 2022; Ullah et al., 2021; Twinamatsiko and Kumar, 2022; Ielasi, Ceccherini and Zito, 2020; Erhardt, 2020). For instance, natural language processing (NLP) can be applied to analyze news and social media content to identify ESG-related trends and issues.
One critical area of exploration is the design and implementation of intelligent trading and investment decision-support platforms, online services, and mobile applications that consider ESG factors (e.g., Yang et al., 2020; Sokolov et al., 2021b; Katterbauer et al., 2022; Katterbauer and Moschetta, 2022; He et al., 2021; Hakala, 2019). For example, machine learning-enabled recommenders and game theory can be used to analyze loan supply-demand equilibrium and risk-return balance in peer-to-peer lending. Personalized stock recommendations can also be made by considering investor preferences, behaviors, past performance, and ESG factors.
This thematic area had the highest number of papers (\(n=91\)), with an average citation count of 17.3. The publication types were diverse, comprising book (\(n=1\)), case study (\(n=1\)), review article (\(n=1\)), perspectives, opinions and commentaries (\(n=3\)), and original research (\(n=85\)). Publication venues include Journal of Banking and Finance, Journal of Sustainable Finance and Investment, Journal of Portfolio Management, Journal of Impact and ESG Investing, Journal of Financial Data Science, Sustainability, Decision Support Systems, IEEE Symposium on Computational Intelligence for Financial Engineering and Economics, and ACM International Conference on AI in Finance.
Papers from this thematic area can be further segregated into the following sub-research themes: (i) trading and investing design and strategies, (ii) online and offline portfolio optimization, (iii) automated and smart investment, and (iv) market anomaly analysis (Table 2).
Table 2
Sub-theme(s) within research thematic area of Trading and Investment
Research sub-theme(s)
|
Example(s) of Related Literature
|
Trading and investing design and strategies
|
• Melas (2021), Coqueret and Guida (2020), Haghshenas and Karim (2022), and Cherief et al. (2022) implemented ESG factors in factor investing, mainly utilizing multivariate regression. The latter two further utilized a hidden Markov model, and a combination of gradient boosting and enhanced random forest, respectively.
• Jacobsen, Lee and Ma (2019), Antoncic (2020), Erhardt (2020), and Chen and Liu (2020) implemented multivariate regression, natural language processing, XGBoost and a basket of regularized regression, support vector regression, random forest and LSTM approaches respectively, to identify alpha signals in ESG investing.
• Guo et al. (2020) proposed a deep learning framework incorporating natural language processing, known as esg2risk, to predict stock volatility using ESG news.
• Brusseau (2021) proposed a model for ethical investing for AI-intensive companies.
• Quinn, Fisch and Robertson (2021) examined the consistency of ESG funds delivering ESG investing using multivariate regression.
• Rannou, Boutabba and Mercadier (2022) applied fuzzy c-means clustering and k-means clustering to examine greenness in the portfolios of funds holding the socially responsible investment (SRI) labels at stock level.
• Novak, Amicis and Mozetič (2018) investigated players and their interactions in the impact investing market using support vector machine, naïve Bayes and network analysis.
• Zhang (2021) examined impact ventures, fundraising and business performance using multivariate regression and a dynamic Bayesian model.
• Begenau and Siriwardane (2022), Kieffer et al. (2021) and Kirppu (2019) examined private equity and ESG considerations; the first two used multivariate regression, while the last utilized a genetic algorithm.
• de la Barcena Grau (2021) discussed the application of machine learning to generate social impact returns.
|
Online and offline portfolio optimization
|
• Hanine et al. (2021) explored a fuzzy approach that incorporates ESG factors within multiple objective portfolio optimization.
• Meier and Danzinger (2022), and Dreżewski, Dziuban and Pająk (2018) applied evolutionary learning-based genetic algorithms, whereas Maree and Omlin (2022), Yang et al. (2020) and Vo et al. (2019) proposed reinforcement learning-based approaches to portfolio selection expressing ESG considerations.
• Reyners (2021) employed tree-based regression with gradient boosting, and Gaussian process regression, to examine the impact of ESG criteria on the risk and returns of optimal portfolios.
• Gobet and Lage (2021) employed an optimal transport and multistage optimization criterion approach to align credit scores with ESG scores for credit portfolios.
|
Automated and smart investment
|
• Katterbauer and Moschetta (2022) and Katterbauer et al. (2022) looked at robo-advisory platforms for Islamic finance instruments.
• Hakala (2019) studied the use of AI in robo-advisors for private wealth advisory services.
|
Market anomaly analysis
|
Market anomaly analysis generally involves multivariate regression.
• Deng et al. (2022a) performed an event study analysis of the stock market's reaction to the Russia-Ukraine war, particularly in relation to sanctions, energy and ESG.
• Caferra et al. (2022) examined the sustainable orientation of investors during Covid-19.
• Faccini, Matin, and Skiadopoulos (2021), and Mousa, Saleem and Sági (2021) studied the impact of climate risk on stock prices.
Recent studies further incorporate natural language processing to identify signals from text data. These studies can be divided into two sub-types.
• One approach processes and converts text data into market signals to identify event impacts, as in Bessec and Fouquau (2020), who examined the impact of green sentiment on stock returns.
• Another approach converts post-processed sentiment data into numerical input attributes that can be added directly to multivariate regressions to investigate event impacts, as in Naumer and Yurtoglu (2020), who examined the tonality of news flow and the cross section of expected stock returns, taking ESG scores into account.
|
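The event study approach mentioned under market anomaly analysis can be sketched with a standard market-model calculation. This is a textbook formulation given as an assumption, not the specification used by any of the cited papers, and all return series below are hypothetical.

```python
def fit_market_model(stock_returns, market_returns):
    """Ordinary least squares fit of stock = alpha + beta * market."""
    n = len(market_returns)
    mean_m = sum(market_returns) / n
    mean_s = sum(stock_returns) / n
    cov = sum((m - mean_m) * (s - mean_s)
              for m, s in zip(market_returns, stock_returns))
    var = sum((m - mean_m) ** 2 for m in market_returns)
    beta = cov / var
    alpha = mean_s - beta * mean_m
    return alpha, beta


def abnormal_returns(stock_returns, market_returns, alpha, beta):
    """Actual return minus the return predicted by the market model."""
    return [s - (alpha + beta * m)
            for s, m in zip(stock_returns, market_returns)]


# Hypothetical estimation window (before the event).
market_est = [0.01, -0.02, 0.015, 0.0, -0.005]
stock_est = [0.001 + 1.5 * m for m in market_est]

# Hypothetical event window: the stock deviates from the model by `shocks`.
market_evt = [0.01, -0.01, 0.0]
shocks = [-0.03, -0.02, 0.01]
stock_evt = [0.001 + 1.5 * m + s for m, s in zip(market_evt, shocks)]

alpha, beta = fit_market_model(stock_est, market_est)
ar = abnormal_returns(stock_evt, market_evt, alpha, beta)
car = sum(ar)  # cumulative abnormal return over the event window
print(f"alpha={alpha:.4f}, beta={beta:.2f}, CAR={car:.4f}")
```

In the sentiment-based variants described above, a numerical sentiment score would enter as an additional regressor alongside the market return; that extension is not shown here.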
The ESG Disclosure, Measurement and Governance archetype covers research on ESG disclosure, measurement and governance that involves the application of AI, spanning firm-level to macro-level considerations. Examples of related literature for the sub-research themes of ESG disclosure, ESG measurement and ESG governance are shown in Table 3.
ESG disclosure refers to the information that companies provide to investors about their ESG performance. This information can include data on a company's environmental impact, social policies, and governance practices. Analyzing ESG disclosure data can identify trends and insights associated with equity research and investor sentiment analysis, among others (e.g., Clarkson et al., 2020; Goloshchapova et al., 2019; Raman, Bang and Nourbakhsh, 2020). One example of how AI is being used to analyze ESG disclosure data is through textual analysis. Researchers have used machine learning algorithms to analyze corporate ESG reports and identify trends and insights associated with credit assessment. Another approach is the use of NLP to examine corporate social responsibility (CSR) disclosures and identify the specific ESG issues that companies are addressing.
ESG measurement refers to the process of measuring the social and financial impact of ESG factors on companies and investments. AI is being used to develop ESG measurement tools such as indices (e.g., Green Sentiment Index) that help investors and analysts anticipate changes in a company's ESG performance and make decisions accordingly (e.g., Huang, Wang and Yang, 2021; Reig-Mullor et al., 2022; Briere and Ramelli, 2020; Chang and Lee, 2020).
ESG governance refers to the processes and structures that companies have in place to identify and assess potential ESG risks, as well as to monitor and ensure compliance with relevant regulations and policies. AI is being used to automate the tracking and analysis of ESG data, which can assist regulators in enforcing ESG regulations and encourage companies to improve their ESG governance practices (e.g., Chen, Wang and Jin, 2022; Fan and Wu, 2022).
This thematic area had the second highest number of papers (\(n=77\)), with an average citation count of 10.5. The publication types were diverse, comprising book (\(n=1\)), review article (\(n=4\)), perspectives, opinions and commentaries (\(n=2\)), regulatory study or guidelines (\(n=1\)), and original research (\(n=69\)). Publication venues include Journal of Business Ethics, Review of Accounting Studies, European Journal of Finance, Economic Research, Sustainability, Technological Forecasting and Social Change, Machine Learning and Knowledge Extraction, and AAAI Workshop on Knowledge Discovery from Unstructured Data in Financial Services.
Table 3
Sub-theme(s) within research thematic area of ESG Disclosure, Measurement and Governance
Research sub-theme(s)
|
Example(s) of Related Literature
|
ESG disclosure
|
• Clarkson et al. (2020) performed textual analysis to analyze ESG disclosure using classifier algorithms including random forest and XGBoost.
• Goloshchapova et al. (2019) examined corporate social responsibility disclosure using natural language processing.
• Raman, Bang and Nourbakhsh (2020) detected historical trends of ESG discussions by analyzing the transcripts of corporate earnings calls using pre-trained BERT, XLNet and RoBERTa.
|
ESG measurement
|
• Bouyé and Menville (2020) examined sovereign ESG ratings using regularized regression and principal component regression.
• Huang, Wang and Yang (2021) developed FinBERT, a language model adapted to the financial domain that can help improve ESG sentiment analysis and classification accuracy.
• Briere and Ramelli (2020) proposed a Green Sentiment Index that captures shifts in investors' appetite for environmental responsibility using multivariate regression.
• Chang and Lee (2020) proposed a Sustainable Development Progress Index based on the circular economy, using forest-based learning, clustering and multivariate regression.
• Reig-Mullor et al. (2022) proposed a neutrosophic AHP-TOPSIS based approach that applies fuzzy set, analytic hierarchy process, and multicriteria decision analysis to assess corporate ESG performance.
|
ESG governance
|
• Chen, Wang and Jin (2022) evaluated ESG guidelines, green innovation and environmental impact, and financial performance of green enterprises using multivariate regression.
• Fan and Wu (2022) studied the effect of environmental regulations on firm valuation and policies using natural language processing and multivariate regression.
|
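Several of the ESG measurement approaches in Table 3 rest on multicriteria decision analysis. The sketch below shows plain (crisp) TOPSIS with fixed weights as a simplified illustration. It is an assumption-laden stand-in, not the neutrosophic AHP-TOPSIS method of Reig-Mullor et al. (2022): there is no fuzzy or neutrosophic arithmetic, and the weights are supplied directly rather than derived by AHP. The firms, criteria and scores are hypothetical.

```python
import math


def topsis(matrix, weights, is_benefit):
    """Crisp TOPSIS closeness scores, one per alternative (row of `matrix`)."""
    n_crit = len(weights)
    # Vector-normalize each criterion column, then apply the weights.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_crit)]
    weighted = [[weights[j] * row[j] / norms[j] for j in range(n_crit)]
                for row in matrix]
    columns = list(zip(*weighted))
    # Ideal point: best value per criterion; anti-ideal: worst value.
    ideal = [max(c) if is_benefit[j] else min(c) for j, c in enumerate(columns)]
    anti = [min(c) if is_benefit[j] else max(c) for j, c in enumerate(columns)]
    scores = []
    for row in weighted:
        d_plus = math.dist(row, ideal)   # distance to the ideal point
        d_minus = math.dist(row, anti)   # distance to the anti-ideal point
        total = d_plus + d_minus
        # If all alternatives are identical, both distances are zero.
        scores.append(d_minus / total if total else 0.5)
    return scores


# Hypothetical firms scored on: environment, social (both higher is better)
# and number of controversies (lower is better).
firms = ["Firm A", "Firm B", "Firm C"]
decision_matrix = [
    [80, 70, 2],
    [60, 60, 5],
    [90, 80, 1],
]
weights = [0.4, 0.4, 0.2]            # illustrative weights, not AHP-derived
is_benefit = [True, True, False]

closeness = topsis(decision_matrix, weights, is_benefit)
ranking = [f for _, f in sorted(zip(closeness, firms), reverse=True)]
print(dict(zip(firms, closeness)), ranking)
```

Because Firm C is best on every criterion and Firm B worst on every criterion, they sit exactly at the ideal and anti-ideal points in this toy example, which makes the ranking easy to verify by hand.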
The Firm Governance archetype covers research on the application of AI to firm-related ESG studies, encompassing several sub-research themes: (i) smart operations, CSR and regulations, (ii) intelligent e-commerce and supply chain management, (iii) corporate finance, and (iv) intelligent marketing. Examples of related literature are shown in Table 4.
Smart operations, CSR and regulations involve a thorough analysis, optimization, and evaluation of operational, financial, and service risks to enhance performance and mitigate low-performing areas, failures, and losses (e.g., Seele, 2017; Chava, Du, and Malakar, 2021; Svanberg et al., 2022; Hernandez-Perdomo, Guney and Rocco, 2019). For instance, firms can adopt an ESG perspective to assess and mitigate environmental risks associated with business operations. This can include identifying and reducing energy waste, monitoring and reducing carbon emissions, and predicting and preventing equipment failure, all of which can help firms reduce their carbon footprint.
Intelligent e-commerce and supply chain management require the estimation, prediction, and optimization of various business aspects, including pricing, demand, supply, production, storage, logistics, delivery, marketing, risk, and fraud (e.g., Jebamikyous et al., 2023; Soni et al., 2022; Coqueret and Tran, 2022). To adopt an ESG perspective, firms can optimize logistics and delivery processes and predict demand for sustainable products and services to reduce their carbon footprint.
Corporate finance involves the analysis, prediction, and optimization of various financial aspects of business, including corporate financial budget, accounting integrity, auditing issues, and payment accuracy. Firms must detect and mitigate financial fraud, irregularities, and unethical behavior by company executives. From an ESG perspective, analyzing the impact of environmental policies and regulations on financial performance is necessary. Evaluating the social impact of corporate financial decisions, such as layoffs and plant closures, can help companies improve their overall governance practices (e.g., Chava, 2014; Bussmann, Tanda and Yu, 2022; De Lucia, Pazienza and Bartlett, 2020; Yadav, Kar and Kashiramka, 2022; Turunen, 2021; Teoh et al., 2019).
In terms of intelligent marketing, AI can help firms analyze marketing performance, recommend and optimize marketing campaigns, and understand customer needs, sentiment, satisfaction, and concerns (e.g., Akter et al., 2022; Wirtz et al., 2023; He et al., 2021; Dash and Kajiji, 2020). To adopt an ESG perspective, firms can analyze the social impact of marketing campaigns and recommend socially responsible marketing strategies that address issues like diversity and inclusion. Firms should also consider the environmental impact of marketing campaigns and recommend sustainable marketing strategies to reduce their carbon footprint.
This thematic area had the third highest number of papers (\(n=52\)), with an average citation count of 20.8. The publication types comprised review article (\(n=4\)) and original research (\(n=48\)). Publication venues include Management Science, Journal of Applied Corporate Finance, Journal of Enterprise Information Management, Technological Forecasting and Social Change, Journal of the Operational Research Society, and Sustainability.
Table 4
Sub-theme(s) within research thematic area of Firm Governance
Research sub-theme(s)
|
Example(s) of Related Literature
|
Smart operations, CSR and governance
|
• Seele (2017) studied predictive policing of corporate sustainability management.
• Hrazdil, Mahmoudian, and Nazari (2021) examined corporate executive personalities and CSR performance based on S&P 500 firms using multicriteria decision analysis and multivariate regression.
• Chava, Du, and Malakar (2021) investigated if managers walk the talk on environmental and social issues, using pre-trained RoBERTa and multivariate regression.
• Moniz and Jong (2014) examined the impact of employee satisfaction on firm earnings using natural language processing.
• Ranta and Ylinen (2021) looked at workplace diversity as a predictor of firm performance using XGBoost and regularized regression.
• Kiriu and Nozaki (2020) evaluated firms’ CSR activities using natural language processing.
• Chae and Park (2018) employed natural language processing to examine CSR using Twitter and topic modelling.
• Economidou et al. (2022) examined whether a firm's engagement in ESG practices is material to market participants, using k-nearest neighbor and multivariate regression.
• Sukthomya and Laosiritaworn (2018) employed a neural network to examine the relationship between CSR and stock price.
• Chen et al. (2021) applied natural language processing on CSR reports of Russell 1000 companies to measure corporate alignment with the United Nations Sustainable Development Goals.
• Hernandez-Perdomo, Guney and Rocco (2019) assessed corporate governance and optimal risk-taking using association analysis and decision tree.
• Svanberg et al. (2022) predicted corporate governance performance ratings using a range of machine learning techniques, including linear and RBF support vector machine, and quadratic discriminant analysis.
|
Intelligent e-commerce and supply chain management
|
• Jebamikyous et al. (2023) leveraged AI and blockchain in e-commerce applications.
• Coqueret and Tran (2022) evaluated how ESG shocks can affect the returns of clients and suppliers using multivariate regression.
• Soni et al. (2022) applied fuzzy set to propose a framework for decision making for sustainable supply chain finance.
|
Corporate finance
|
• Chava (2014) examined environmental externalities and cost of capital using multivariate regression.
• Bussmann, Tanda and Yu (2022) reviewed the role of ESG in a firm’s cost of capital using XGBoost.
• De Lucia, Pazienza and Bartlett (2020) investigated if good ESG performances are linked to better financial performances using a range of supervised and deep learning techniques, including support vector machine, regularized regression and neural network.
• Alkaraan et al. (2022) examined the impact of ESG on financial performance using textual analysis and multivariate regression.
|
Intelligent marketing
|
• Wirtz et al. (2023) examined corporate digital responsibility in service firms and their ecosystems.
• He et al. (2021) investigated cognitive user interface for portfolio optimization to enhance user experience.
|
The Financial Markets and Instruments archetype covers research on the application of AI to ESG issues pertinent to the financial markets, market participants, products, activities and technology solutions. In particular, papers from the cluster can be further segregated into: (i) market systems and simulation, (ii) smart banking and payment, (iii) financial products and services, and (iv) financial technology (Table 5).
Market systems and simulation (i) examines the interplay, associations, and effects among macro and microeconomic elements, and (ii) emulates and evaluates market mechanisms, models, hypotheses, policies, innovative products and services, trading rules, and regulations using multiagent systems. It involves developing models to understand the impact of various economic, social, cultural, and political factors on financial markets and their resilience. From an ESG perspective, this encompasses studying the consequences of environmental threats on economic expansion, investigating the role of societal and cultural components on financial markets, and examining the outcomes of policy interventions on sustainable economic growth (e.g., Tufail et al., 2022; Semet, Roncalli and Stagnol, 2021; Wang et al., 2021; Yu et al., 2022; Zhang and Han, 2022).
Smart banking and payment addresses the design and analysis of intelligent, secure, and risk-averse digital banking and payment methods, tools, behaviors, and services. The goal is to study and forecast banking and payment trends, growth, risk, fraud, security, and malfunctions. From an ESG perspective, this involves supporting eco-friendly banking practices, investing in sustainable initiatives, ensuring access to banking services for underserved populations, safeguarding customer data and privacy, and promoting financial literacy among customers (e.g., Oro, Ruffolo and Pupo, 2020; Nguyen et al., 2023).
Financial products and services covers a range of products and services, from wealth management products to internet finance services. For instance, in the area of insurance, studies focus on estimating, predicting, optimizing, and recommending insurance products and services, along with their pricing and market positioning. It also involves personalized product customization, fraud detection, and risk assessment. From an ESG standpoint, this includes identifying environmental risks linked to insured assets and liabilities, evaluating the carbon footprint and environmental impact of insured assets and services, and developing sustainable insurance products and services (e.g., Sætra, 2022; Dugo, 2021; Duchin, Gao and Xu, 2022; Cherrington et al., 2020).
The realm of financial technology can include studies on blockchain systems and mechanisms. This involves analyzing the intricacies of blockchain systems to improve their design and functioning, evaluating and enhancing bitcoin and cryptographic contract models, and optimizing pricing and portfolio management. From an ESG standpoint, this includes ensuring that blockchain systems are energy-efficient to minimize the environmental impact of cryptocurrency mining, examining the ESG implications of blockchain on various industries, such as supply chain transparency, ethical sourcing, and carbon footprint reduction, and developing sustainable crypto models (e.g., Yu, 2022; Jebamikyous et al., 2023).
This thematic area had the fourth highest number of papers (\(n=50\)), with an average citation count of 10.8. The publication types comprised review article (\(n=7\)), case study (\(n=1\)), and original research (\(n=42\)). Publication venues include Journal of Corporate Finance, Research in International Business and Finance, Review of Financial Economics, Economic Research, Omega, Technological Forecasting and Social Change, ACM Computing Surveys, Journal of Artificial Intelligence and Technology, Computational Intelligence and Neuroscience, Neural Computing and Applications, and Annals of Operations Research.
Table 5
Sub-theme(s) within research thematic area of Financial Markets and Instruments
Research sub-theme(s)
|
Example(s) of Related Literature
|
Market systems and simulation
|
• Rolnick et al. (2022) discussed the applications of machine learning in climate change problems, including climate finance.
• Klusak et al. (2021) examined the effect of climate change on sovereign creditworthiness using random forest and multivariate regression.
• Sun (2022) examined the relationship between green finance and carbon emissions using neural network.
• Hemanand et al. (2022) analyzed green finance for environmental development using neural network and financial maximally filtered graph (FMFG) algorithm.
• Lin and Zhao (2022) examined impact of green finance on the ecologicalization of urban industrial structures using k-nearest neighbor, random forest and multivariate regression.
• Khalfaoui, Jabeur and Dogan (2022) examined spillover effects and connectedness among green commodities, Bitcoins, and US stock markets using multivariate regression.
• Bariz (2022) examined if credit default swap markets incorporate ESG-related information in their assessment of a firm's risk, using natural language processing and multivariate regression.
• Deng et al. (2022b) examined the stock market in relation to the Russia-Ukraine war and climate policy expectations.
• Roy, Rao and Zhu (2020) examined relationship between CSR and stock market liquidity using multivariate regression.
• Entrop, Rohleder, and Seruset (2022) examined if religiosity affects liquidity for U.S. listed companies using multivariate regression.
• Strignert and Malm (2021) examined volatility between green and non-green ETFs using multivariate regression.
|
Smart banking and payment
|
• Ishizaka, Lokman and Tasiou (2021) applied hierarchical clustering to evaluate bank performances using ESG criteria.
• Citterio (2020) used a range of machine learning approaches on ESG indicators, including linear discriminant analysis and k-nearest neighbor, to predict bank failures.
• Kneppers (2022) developed a CSR tool using natural language processing for performance analysis in banks.
• Miranti and Oktaviana (2022) examined capital structure and financial sustainability of Sharia public financing bank using multivariate regression.
• Similarly, applying multivariate regression, Newton et al. (2022) analyzed the relationship between banks, public bond market and borrowers' ESG performance, and Demir and Danisman (2021) analyzed relationship between bank stock price and ESG scores.
|
Financial products and services
|
• Sokolov et al. (2020) applied pre-trained BERT for ESG index construction.
• Slimane (2021) utilized genetic algorithm for bond index tracking with constraints on ESG considerations.
• Lichtenberger, Braga and Semmler (2022) examined performance of green against non-green bonds using decision tree and multivariate regression.
• Kanzi and Moula (2021) examined Islamic finance indices and their conventional counterparts using multivariate regression.
• Soni et al. (2022) used fuzzy set to propose a framework for decision making for sustainable supply chain finance.
• Pettinari (2021), Chen (2021) and Hong et al. (2021) investigated mergers and acquisitions in relation to ESG value, risk and sustainable development, respectively. The first two utilized multivariate regression, whereas the last utilized support vector machine and AdaBoost.
• Fedorova, Druchok and Drogovoz (2021) used transfer learning (BERT), random forest and multivariate regression to examine the impact of media sentiment on climate change and environmental policies on IPO underpricing.
• Vo (2020) applied a range of deep learning and machine learning models, including LSTM and ensemble models, to tackle challenges in wealth management.
|
Financial technology
|
• Hakala (2019) discussed robo-advisory, with considerations of ethical and ESG investing.
• Khan and Ahmad (2022) proposed integrating ESG and machine learning considerations in decentralized finance.
• Faust (2022) examined effects of altruism on crowdfunding outcomes in initial coin offerings using natural language processing and multivariate regression.
• Gidron et al. (2021) identified tech impact startups (TIS) within startup databases using pre-trained BERT.
|
The Risk Management archetype comprises papers on the application of AI to risk management, spanning macro-level risk to portfolio-level risk. In particular, papers from the cluster can be further segregated into: (i) risk management, and (ii) credit management (Table 6).
Within the realm of risk management, AI techniques are employed to model, forecast, and control risk elements, implications, and intensity, as well as fraud, criminal activity, security incidents, and money laundering linked to a wide range of financial products, methods, markets, and participants. With respect to ESG, this involves utilizing ESG data to identify and prevent fraud and crime, estimate and tackle the repercussions of ESG-related occurrences, and enhance risk management approaches by incorporating ESG factors (e.g., Yang, Caporin and Jiménez-Martin, 2022; Roy and Shaw, 2021).
Credit management entails the application of AI to assess, forecast, and fine-tune credit scores, ceilings, assessments, timelines, defaults, repayments, refinancing, risk management, and fraud prevention. In the context of ESG, this encompasses providing green financing for renewable energy initiatives, granting credit accessibility for disadvantaged communities, determining the carbon footprint of credit portfolios, endorsing eco-friendly business operations, and supporting equitable lending practices (e.g., Liermann, Li and Waizner, 2021; Gobet and Lage, 2021; Brogi, Lagasio and Porretta, 2022; Mansouri and Momtaz, 2022).
This thematic area had the fifth highest number of papers (\(n=41\)), with an average citation count of 25.1. The publication types comprised review article (\(n=1\)), perspectives, opinions and commentaries (\(n=3\)), and original research (\(n=37\)). Publication venues include Risks, Risk Management, The Accounting Review, Sustainability and Journal of Big Data.
Table 6
Sub-theme(s) within research thematic area of Risk Management
Research sub-theme(s)
|
Example(s) of Related Literature
|
Risk management
|
• Patterson et al. (2022) and Yang and Broby (2020) applied geospatial analytics for macro ESG insights using computer vision.
• Angelova et al. (2021) examined sovereign rating methodologies, ESG and climate change risk using multivariate regression.
• Hofinger (2021) developed a scoring model that captures the quality of regulatory risk disclosures in the banking industry using principal component analysis, natural language processing and multivariate regression.
• Apel, Betzer and Scherer (2021) developed a transition risk point-in-time index to approximate changes in transition risk from climate-related news events, using pre-trained BERT and multivariate regression.
• Jan (2021) used LSTM to detect financial statement fraud for sustainable development of capital markets.
• Fianu (2021) examined ESG contributions to risk measurement in the insurance sector using network analysis and hierarchical clustering.
• Nguyen, Diaz-Rainey and Kuruppuarachchi (2021) used a range of machine learning approaches, including tree-based ensemble, k-nearest neighbor, and regularized regression, to predict corporate carbon footprints for financial risk analysis.
• Khan, Serafeim and Yoon (2016) examined the materiality levels of corporate sustainability issues using multivariate regression.
• Hisano, Sornette, and Mizuno (2020) predicted which firms appear on negative screening lists under ESG criteria, using natural language processing and random forest.
• Duong et al. (2022) analyzed firm carbon risk management from credit default swap market using multivariate regression.
• Hamidi et al. (2021) examined corporate reputation risk in social media using natural language processing.
• Mehra, Louka, and Zhang (2022) proposed ESGBERT, a text mining approach based on pre-trained BERT and multivariate regression, to identify material ESG risk and growth opportunities for investing.
|
Credit management
|
• Nguyen et al. (2020) examined the relationship between loan portfolios and transition risk using multivariate regression.
• Hasan, Lynch and Siddique (2022) used regularized regression and XGBoost to examine corporate default risk and ESG factors.
• Roy and Shaw (2022) utilized a fuzzy BWM and fuzzy TOPSIS approach to develop a multi-criteria sustainable credit score system.
• Katterbauer and Moschetta (2022) applied random forest on credit scoring for inclusive and microfinance.
|
This research archetype comprises papers involving the application of AI to forecast ESG-related asset pricing, ratings or scores, or value ESG-related instruments, carbon emissions or biodiversity. In particular, papers from the cluster can be further segregated into: (i) Valuation, and (ii) Forecasting (Table 7).
The Valuation research area centers on the use of AI for estimating and predicting values, prices, demand, and supply with an emphasis on ESG factors. Examples include optimizing the pricing, movement, supply, and demand of various energy sources, including electricity, oil, solar, gas, wind, nuclear, and water power. This research area also involves determining the environmental impact of factors such as carbon emissions, water consumption, and waste management. Additionally, it ascertains the supply and demand for sustainable practices in areas like investments (e.g., Sætra, 2022; Dugo, 2021; Duchin, Gao and Xu, 2022; Cherrington et al., 2020).
Forecasting focuses on developing models to predict market movements, trends, volatility, anomalies, and events from an ESG perspective. This includes anticipating the consequences of environmental and social events, such as natural disasters and resource depletion, on financial markets. Furthermore, it entails forecasting the influence of CSR practices on financial performance and projecting the future effects of governance policies on market volatility. Predicting pricing, movement, supply, and demand within the energy market is also an integral part of this research area (e.g., Larsson and Ling, 2021; Anas et al., 2020; Jabeur, Khalfaoui and Arfi, 2021; Gao, Wang and Yang, 2022; Guliyev and Mustafayev, 2022).
This thematic area had the sixth highest number of papers (\(n=38\)), with the smallest average citation count of 4.7. All publications belonged to the original research category. Publication avenues include Decisions in Economics and Finance, Computational Management Science, Journal of Sustainable Finance and Investment, Sustainability, Science of the Total Environment, and Journal of Environmental Management.
Table 7
Sub-theme(s) within research thematic area of Forecasting and Valuation
Research sub-theme(s)
|
Example(s) of Related Literature
|
Forecasting
|
• Wang et al. (2021) used a random forest-based ensemble for feature extraction and deep learning LSTM model for carbon price forecasting.
• Jabeur, Khalfaoui and Arfi (2021) used a range of tree-based ensembles, including LightGBM, CatBoost, XGBoost, and random forest, together with a deep learning neural network model, to predict oil prices using green energy resources, ESG indices, and stock markets.
• García et al. (2020) used a rough set approach with multivariate regression to predict ESG ratings using financial performance variables.
• D’Amato, D’Ecclesia, and Levantesi (2022) predicted ESG scores for ESG investing using random forest.
• Marozzi and Lanza (2022) used LSTM, pre-trained BERT, and BERT derivatives such as RoBERTa and XLM-RoBERTa to formulate a real-time, forward-looking Twitter-based ESG score for stocks.
• Hossain (2020) used a range of supervised and deep learning methods, such as naïve bayes and multilayer perceptron, to predict corporate green Islamic bond ratings.
• Yu et al. (2018) applied natural language processing, support vector machine, and a dynamic time warping-based clustering approach, to predict stock price reaction to negative news, including ESG news.
• Chen et al. (2019) used a tree-based ensemble, a restricted Boltzmann machine and bi-directional LSTM to incorporate fine-grained events, such as ESG news, into stock movement prediction.
|
Valuation
|
• Applying regularized multivariate regression, Semet, Roncalli and Stagnol (2021) priced ESG and sovereign risk in credits and credit ratings, and Joshi and Chauhan (2021) valued stocks incorporating ESG factors.
• Obrizzo (2021) applied a web-scraping approach using natural language processing and multivariate regression to ESG valuation.
• Han et al. (2021) used gradient boosted trees and a neural network to estimate corporate greenhouse gas emissions for investment decision making.
• Agarwala et al. (2022) used random forest to value the impact of loss of nature and biodiversity on sovereign credit ratings.
• Katterbauer et al. (2022) used XGBoost to price Islamic bonds.
• Yang and Jiménez-Martin (2022) used multivariate regression to construct ESG risk factors for asset pricing models.
• Agliardi and Agliardi (2021) studied the determinants of green bond prices using multivariate regression.
• Sautner et al. (2021) estimated the risk premium for firm-level climate change exposure among S&P 500 stocks using natural language processing in addition to multivariate regression.
|
This research archetype comprises papers discussing issues related to the utilization of ESG data or methodologies proposed to create new ESG datasets. Examples of related literature are shown in Table 8.
This approach aims to identify and extract ESG-related information from vast amounts of unstructured data, including news articles, social media posts, company reports, and satellite images. Such data can be used to quantify various environmental factors, such as deforestation, air pollution, and water usage. From an ESG perspective, among other techniques, this research area includes leveraging textual data from corporate sustainability reports to extract ESG-related information. By using natural language processing techniques, researchers can create ESG scores for companies, which can help evaluate their ESG performance (e.g., Lopez, Contreras and Bendix, 2020; Geissler et al., 2022; Gupta, Sharma and Gupta, 2021; Sokolov et al., 2021b).
This thematic area had the second smallest number of papers (\(n=14\)), with an average citation count of 14.7. All publications belonged to the original research category. Publication avenues include Journal of Applied Corporate Finance, Journal of Impact and ESG Investing, Global Economic Review, Sustainability, and Big Data and Society.
Table 8
Sub-theme(s) within research thematic area of ESG as an Alternative Data
Research sub-theme(s)
|
Example(s) of Related Literature
|
ESG as an Alternative Data
|
These are generally qualitative papers, with the exception of papers that focus on introducing novel methodologies to generate ESG datasets.
• Kotsantonis and Serafeim (2019), and Lopez, Contreras and Bendix (2020) discussed the quality of ESG data and issues regarding its use.
• In, Rook and Monk (2019) discussed the integration of ESG data in investment decision making.
• In terms of generating ESG datasets, Geissler et al. (2022) used a generative adversarial network to generate multivariate data, such as ESG scores, for financial scenario generation, and Sokolov et al. (2021b) used the pre-trained BERT model for natural language processing to generate ESG scores.
• Gupta, Sharma and Gupta (2021) presented a methodology to create an ESG dataset, and a framework to gauge the importance of ESG parameters for investment decisions, using multivariate regression and random forest.
|
This research archetype comprises papers discussing issues related to the responsible and explainable use of AI in finance, including but not limited to more efficient management of carbon emissions when developing AI models, and the introduction of firm or industry level ESG policing and governance to manage AI assets, capabilities and activities. Examples of related literature are shown in Table 9.
By adopting techniques like explainable AI, financial institutions can develop more transparent and interpretable models. This allows for a better understanding of the decision-making process, ensuring that it is responsible and ethical. From an ESG standpoint, key considerations include preventing machine learning algorithms from exacerbating or amplifying biases against any social or demographic group and guaranteeing the transparency and interpretability of models and their decisions (e.g., Lacoste et al., 2019; Hoepner et al., 2021; Fritz-Morgenthal, Hein and Papenbrock, 2022).
This thematic area was the smallest in terms of the number of papers published (\(n=7\)), but it had the highest average citation count, at 46.3. The publication types comprised review articles (\(n=1\)), perspectives, opinions and commentaries (\(n=3\)), and original research (\(n=3\)). Publication avenues include European Journal of Finance, Frontiers in Artificial Intelligence, and Nature Machine Intelligence.
Table 9
Sub-theme(s) within research thematic area of Responsible Use of AI
Research sub-theme(s)
|
Example(s) of Related Literature
|
Responsible and Explainable AI
|
These are generally framework and/or qualitative-type papers.
• Lacoste et al. (2019) proposed a tool that quantifies carbon emissions of machine learning models for corporate practitioners.
• Hoepner et al. (2021) discussed the importance of explainability issues in financial data science research.
• Fritz-Morgenthal, Hein and Papenbrock (2022) exposited on governance issues to support the establishment of responsible, trustworthy, explainable, auditable and manageable AI in production in the finance industry.
• Sætra (2022) developed a company's ESG protocol for better firm governance and stakeholder communication for AI capabilities, assets, and activities.
• Seele (2017) discussed predictive policing of corporate sustainability management, and its value to shareholders and financial analysts.
|
Through the identification of distinct thematic research focal areas and sub-themes that underlie existing research efforts, as evidenced by extant literature, this review provides a thematic classification and disposition of the state-of-the-art.
4.2 RQ2: How were the research intensity and research interest across various research archetypical domains, and how did they evolve across time?
To understand the magnitude of key research efforts, broken down by the archetypical domains, we next studied the overall research intensity. Table 10 provides an indication of the overall research intensity for each thematic research cluster, in terms of total paper count, total citation count and citation impact. A citation-based research metric, the Citation Impact, measures research impact by normalizing the citation count by the number of publications, to obtain the average number of times each publication is cited (McMaster University, 2022).
Table 10
Publication and citation numbers by research cluster
Thematic Research Area | Total Paper Count | Total Citation Count | Citation Impact
Trading and Investment | 91 | 1578 | 17.3
ESG Disclosure, Measurement and Governance | 77 | 809 | 10.5
Firm Governance | 52 | 1082 | 20.8
Financial Markets and Instruments | 50 | 542 | 10.8
Risk Management | 41 | 1030 | 25.1
Forecasting and Valuation | 38 | 178 | 4.7
Data | 14 | 206 | 14.7
Responsible Use of AI | 7 | 324 | 46.3
Overall | 370 | 5749 | 15.5
Research covered 370 publications generating a total of 5749 citations. This translated to a mean citation count per publication of 15.5. On an overall basis, it was observed that Trading and Investment generated the highest research attention in terms of the total number of papers published, and in turn the highest total citation count, translating to an above average citation count per publication. ESG Disclosure, Measurement and Governance followed in second place, but its overall citation count was relatively low, resulting in the second lowest citation count per publication.
Interestingly, it was Responsible Use of AI that gained the highest traction, with a citation impact of 46.3, followed by Risk Management and Firm Governance. In contrast, Forecasting and Valuation generated a meagre 4.7 citations per publication, roughly an order of magnitude below Responsible Use of AI. Alongside Forecasting and Valuation, the aforementioned ESG Disclosure, Measurement and Governance, and Financial Markets and Instruments round out the bottom three positions.
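The Citation Impact figures in Table 10 can be reproduced by dividing each domain's total citation count by its total paper count. The following Python sketch does this with the values copied from Table 10; the variable and function names are illustrative and are not taken from the study.

```python
# Paper and citation counts per thematic research area, copied from Table 10.
TABLE_10 = {
    "Trading and Investment": (91, 1578),
    "ESG Disclosure, Measurement and Governance": (77, 809),
    "Firm Governance": (52, 1082),
    "Financial Markets and Instruments": (50, 542),
    "Risk Management": (41, 1030),
    "Forecasting and Valuation": (38, 178),
    "Data": (14, 206),
    "Responsible Use of AI": (7, 324),
}


def citation_impact(papers: int, citations: int) -> float:
    """Average number of citations per publication."""
    return citations / papers


# Citation impact per domain, rounded to one decimal place as in Table 10.
impact = {name: round(citation_impact(p, c), 1) for name, (p, c) in TABLE_10.items()}

# Overall figures across all eight domains.
total_papers = sum(p for p, _ in TABLE_10.values())
total_citations = sum(c for _, c in TABLE_10.values())
overall_impact = round(citation_impact(total_papers, total_citations), 1)

# Domains ranked from highest to lowest citation impact.
ranking = sorted(impact, key=impact.get, reverse=True)

for name in ranking:
    print(f"{name}: {impact[name]}")
print(f"Overall: {total_papers} papers, {total_citations} citations, impact {overall_impact}")
```

Sorting by the computed values places Responsible Use of AI, Risk Management and Firm Governance in the top three positions and Forecasting and Valuation last, in line with the discussion above.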
The next natural question to evaluate was how the archetypical domains evolved across time. Figure 4 shows the absolute publication numbers, broken down by the archetypical domains, from 2008 to 2022. It was observed that 2018 appeared to be the inflexion point, beyond which research efforts grew significantly.
In terms of the evolution of the absolute citation count as shown in Fig. 5, Trading and Investment had historically generated the highest interest across the years. Citations within the Risk Management, Firm Governance and ESG Disclosure, Measurement and Governance domains grew considerably in the later years. The other domains generated relatively smaller absolute citation count numbers.
To measure the citation impact contribution of each research domain across the years, the study proportioned the citation impact values such that the relative contributions of each archetypical domain to the total contributions of all domains across time could be observed (Fig. 6).
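One way to read this proportioning step is as a per-year normalization: each domain's citation impact in a given year is divided by the sum of citation impacts across all domains in that year, so that the shares for each year sum to one. The Python sketch below illustrates that interpretation only. The function name, the domain labels and the input numbers are hypothetical placeholders rather than values from the study, and the exact normalization behind Fig. 6 may differ.

```python
from typing import Dict


def relative_contributions(
    impact_by_year: Dict[int, Dict[str, float]]
) -> Dict[int, Dict[str, float]]:
    """Convert per-year citation impacts into per-year shares that sum to 1."""
    shares = {}
    for year, impacts in impact_by_year.items():
        total = sum(impacts.values())
        if total == 0:
            # No citation impact recorded in this year: all shares are zero.
            shares[year] = {domain: 0.0 for domain in impacts}
        else:
            shares[year] = {domain: value / total for domain, value in impacts.items()}
    return shares


# Hypothetical placeholder values, for illustration only (not data from the review).
example = {
    2019: {"Domain A": 30.0, "Domain B": 10.0},
    2020: {"Domain A": 20.0, "Domain B": 20.0, "Domain C": 10.0},
}

shares = relative_contributions(example)
for year, by_domain in shares.items():
    print(year, {domain: round(share, 2) for domain, share in by_domain.items()})
```

Under this reading, a domain with a large absolute citation impact can still show a shrinking share if other domains grow faster in the same year.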
In Fig. 6, it was noted that while Trading and Investment was a crowded space in terms of total publication numbers and citation counts, the relative research impact per publication was reduced over time. In contrast, while research within the Risk Management and Responsible Use of AI domains started later, their relative citation impacts were higher, and these impacts were picked up within a relatively short space of time.
Further points to note in terms of relative research interest were as follows: (i) as a relative proportion of the collective papers in the search space, citation impact for the Trading and Investment and Firm Governance domains appeared to exhibit downward trends, and (ii) while citation impacts for the domains of Data, ESG Disclosure, Measurement and Governance, Financial Markets and Instruments, and Forecasting and Valuation appeared to gain only modest traction, this was offset by their continued growth trends and relatively late entries into the research space. They could yet make impressive strides going forward.
The insights above allow researchers to recognize how research intensity and interest evolved in recent years, so as to better assess and allocate their research efforts going forward.
4.3 RQ3: How were the use and evolution of different AI techniques employed across the research archetypical domains, and how did they evolve over time?
It was interesting to see that different research archetypes exhibited different dominant AI techniques, as observed in Fig. 7.
- Trading and Investment: The AI techniques used by the papers in this domain were highly diversified, ranging from the application of relatively simple tree-based algorithms such as decision trees (e.g., Lanza, Bernardini and Faiella, 2020) and ensemble models such as tree-based regression (e.g., Reyners, 2021), to portfolio optimization algorithms such as Michaud optimization (e.g., De Spiegeleer et al., 2021; He et al., 2021), and deep reinforcement learning algorithms such as Q-learning (e.g., Maree and Omlin, 2022; Yang et al., 2020).
- ESG Disclosure, Measurement and Governance: ESG information obtained through sources such as web-scraped reporting disclosures might be text-based. Papers applying text mining algorithms therefore represented a significant proportion of this domain. These algorithms, most commonly Bidirectional Encoder Representations from Transformers (BERT) or BERT variants such as RoBERTa, LinkBERT, FinBERT, ClimateBERT or SBERT, were applied to extract insights (e.g., Ghosh and Naskar, 2022; Bingler et al., 2022; Koloski et al., 2022; Huang, Wang, and Yang, 2023).
- Firm Governance: The majority of papers in Firm Governance utilized multivariate regression to examine the impact of ESG at the company level (e.g., Alkaraan et al., 2022; Heath et al., 2021).
- Financial Markets and Instruments: Research practices in this domain applied multivariate regression and/or unsupervised analysis to examine relationships between ESG and macro market issues, market microstructure, market participants, financial products and activities, and technology influence (e.g., Klusak et al., 2021; Bai et al., 2022; Ishizaka, Lokman and Tasiou, 2021). Evolutionary learning, deep learning and text mining have been applied to create new ESG products, such as ESG index construction (e.g., Slimane, 2021; Chang et al., 2021; Sokolov et al., 2021a).
- Risk Management: The domain of Risk Management employed multivariate regression to analyze risk factors and contributions (e.g., Brogi, Lagasio and Porretta, 2018; Michalski and Low, 2021), text mining and deep learning to detect changes in risk level or predict negative screening ESG lists (e.g., Jan, 2021; Hisano, Sornette and Mizuno, 2020), and unsupervised learning and computer vision to recognize risk patterns (e.g., Patterson et al., 2020; Yang and Broby, 2020; Patterson et al., 2022).
- Forecasting and Valuation: This domain primarily employed supervised learning approaches, which might be augmented by ensembling through different regularization (e.g., lasso) and boosting (e.g., XGBoost, LightBoost) techniques (e.g., Guliyev and Mustafayev, 2022; Semet, Roncalli and Stagnol, 2021). Deep learning methods were increasingly being utilized to improve forecasting results (e.g., Wang et al., 2021; Jabeur, Khalfaoui and Arfi, 2021), and fine-grained news events could be integrated through text mining to achieve real-time prediction (e.g., Chen et al., 2019; Yu et al., 2018; Zhang, 2022).
- Data: Approaches in the Data domain ranged from simple supervised learning and natural language processing (e.g., Gupta, Sharma and Gupta, 2021; Sokolov et al., 2021b) to the use of generative AI (e.g., Geissler et al., 2022) for the creation of ESG scores and rating datasets. Only 36% of the papers reviewed employed AI techniques. The majority of the papers were concerned with providing qualitative assessments of the quality and use of ESG data.
- Responsible Use of AI: AI techniques were relatively scarce in this domain, with only 29% of the papers reviewed employing some form of AI technique. For instance, neural networks had been employed as a tool to quantify carbon emissions of machine learning systems (Lacoste et al., 2019), and to assess token offerings to finance socially good sustainable entrepreneurship (Mansouri and Momtaz, 2021). Other useful AI techniques include algorithmic fairness and bias mitigation techniques, and feature importance analysis, model interpretation, and local explanation methods for explainable AI. The majority of the papers reviewed qualitatively discussed AI ethics issues and social good initiatives.
Table 11
Number of unique dominant AI techniques utilized across time
Year | Count of Unique Type of Learning Algorithms
2016 | 2
2017 | 1
2018 | 7
2019 | 18
2020 | 57
2021 | 106
2022 | 129
Total | 324
When viewed across time, it was observed that the use of AI techniques, in terms of the number of unique techniques employed, was growing at an exponential rate (Table 11). 2018 appeared to be the inflexion point. In 2018, seven unique AI techniques were applied by the papers in this emerging field. By 2022, this number had grown more than 18-fold, to 129.
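The growth figures quoted above follow directly from the yearly counts in Table 11. A short Python sketch, with illustrative variable names that are not taken from the study, computes the 2018 to 2022 growth multiple and the year-over-year ratios:

```python
# Count of unique AI techniques per year, copied from Table 11.
unique_techniques = {
    2016: 2,
    2017: 1,
    2018: 7,
    2019: 18,
    2020: 57,
    2021: 106,
    2022: 129,
}

# Growth multiple between the 2018 inflexion point and 2022.
growth_multiple = unique_techniques[2022] / unique_techniques[2018]

# Year-over-year ratio: count in a year divided by the count in the previous year.
years = sorted(unique_techniques)
yoy_ratio = {
    year: unique_techniques[year] / unique_techniques[prev]
    for prev, year in zip(years, years[1:])
}

print(f"2018 to 2022 growth multiple: {growth_multiple:.1f}x")
for year, ratio in yoy_ratio.items():
    print(f"{year}: {ratio:.2f}x the previous year")
```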
The bump graph analysis in Fig. 8 provided a visualization of the evolution. While only the more dominant techniques were represented in the graph, it could be clearly observed that a myriad of AI techniques were adopted and applied in recent years across different papers to uncover hidden ESG driven insights. Multivariate regression (and its various regularized or ensemble forms) continued to dominate, especially for papers looking to evaluate relationships between ESG and non-ESG attributes. This was followed by natural language processing techniques, which were useful to inspect semi-structured or unstructured alphanumeric ESG-related datasets.
The analyses above provide researchers insights into how each research archetypical domain can differ in terms of the application of AI techniques. A time-series review of the techniques also revealed that while the application of unique techniques was experiencing exponential growth, multivariate regression continued to command the largest influence, and natural language processing models were increasingly dominant. This review provides an insight into the use and evolution of different AI techniques within the research scape.