Machine Learning

  • Positive Press, Greener Progress: The Role of ESG Media Reputation in Corporate Energy Innovation
    The growing emphasis on Environmental, Social, and Governance (ESG) principles, particularly in the corporate sector, shapes investment trends and operational strategies. This shift is supported by the increasing role of media in monitoring and influencing corporate ESG performance, which in turn drives energy innovation. Based on reported events from Baidu News and patent text information of Chinese A-share listed companies from 2012 to 2022, this study innovatively applies machine learning and text analysis to measure ESG news sentiment and corporate energy innovation indicators. Drawing on reputation, stakeholder, and agency theories, we find that a good reputation conveyed by positive ESG textual sentiment in the media significantly promotes corporate energy innovation, and the effect is mainly realized through alleviating financing constraints and agency problems and promoting green investment. Further analysis shows that ESG news sentiment promotes corporate energy innovation mainly among private firms, non-growth-stage firms, high-energy-consuming firms, and firms in regions with better green finance development and higher ESG governance intensity. From the perspective of ESG news content and information content, greater ESG news attention can also exert an energy innovation incentive effect, in which the incentive effect exerted by positive media sentiment in the environmental (E) and social (S) dimensions, as well as excellent attention, is more robust. This study provides new insights for promoting green and low-carbon development and understanding the external governance role of media in corporate ESG development.
  • Tracing the Green Footprint: The Evolution of Corporate Environmental Disclosure Through Deep Learning Models
    Environmental disclosure in emerging markets remains poorly understood, despite its critical role in sustainability governance. Here, we analyze 42,129 firm-year environmental disclosures from 4,571 Chinese listed firms (2008-2022) using machine learning techniques to characterize disclosure patterns and regulatory responses. We show that increased disclosure volume primarily comprises boilerplate content rather than material information. Cross-sectional analyses reveal systematic variations across industries, with manufacturing and high-pollution sectors exhibiting more comprehensive disclosures than consumer and technology sectors. Notably, regional rankings in environmental disclosure volume do not align with local economic development levels. Through examination of staggered regulatory implementation, we demonstrate that market-based mechanisms generate more substantive disclosures compared to command-and-control approaches. These results provide empirical evidence that firms strategically manage environmental disclosures in response to institutional pressures. Our findings have important implications for regulatory design in emerging markets and advance understanding of voluntary disclosure mechanisms in sustainability governance.
  • Artificial Intelligence, Stakeholders and Maturity Mismatch: Exploring the Differential Impacts of Climate Risk
    Corporate maturity mismatch is highly likely to trigger systemic financial risk and is a practical problem commonly faced by businesses. In the context of the intelligent era, the impact of artificial intelligence on maturity mismatch has emerged as a focal point of academic inquiry. Leveraging data from Chinese A-share companies over the 2011–2023 timeframe, this research employs a double machine learning approach to systematically examine the influence and underlying mechanisms of artificial intelligence on maturity mismatch. The findings reveal that artificial intelligence significantly exacerbates maturity mismatch. However, this effect is notably mitigated by government subsidies, media attention, and collectivist culture. Further analysis indicates that in high-climate-risk scenarios, collectivist culture exerts a notably strong moderating influence. By contrast, government subsidies and media attention exhibit stronger moderating influences in low-climate-risk environments. This study constructs a multi-stakeholder collaborative governance framework, which helps to reveal the 'black box' between artificial intelligence and maturity mismatch, thereby offering a theoretical basis for monitoring maturity mismatch.
  • From Green-Washing to Innovation-Washing: Environmental Information Intangibility and Corporate Green Innovation in China
    We use a sample of China’s listed firms and employ a naïve Bayesian machine learning algorithm to reveal that environmental information intangibility superficially promotes green innovation. We demonstrate that this effect is channelled through the acquisition of institutional resources, including bank loans and government subsidies. The impact of environmental information intangibility on green innovation is most pronounced within state-owned enterprises, large firms, and politically connected firms. Furthermore, we confirm that environmental information intangibility does not lead to improvements in innovation efficiency or quality. This implies that green innovation may serve as a symbolic environmental activity. Our findings contribute to the understanding of the consequences of environmental information intangibility, greenwashing behaviour, and their relationship to green innovation.
  • Spatiotemporal Correlation in Stock Liquidity Through Corporate Networks from Information Disclosure Texts
    The healthy operation of the stock market relies on sound liquidity. We utilize the semantic information from disclosure texts of listed companies on the China Science and Technology Innovation Board (STAR Market) to construct a daily corporate network. Through empirical tests and performance analyses of machine learning models, we elucidate the relationship between the similarity of company disclosure text contents and the temporal and spatial correlations of stock liquidity. Our liquidity indicators span four recognized dimensions: trading costs, market depth, trading speed, and price impact. Furthermore, we reveal that the information loss caused by employing Minimum Spanning Tree (MST) topology significantly affects the explanatory power of network topology indicators for stock liquidity, with a more pronounced impact observed at the document level. Subsequently, by establishing a neural network model to predict next-day liquidity indicators, we demonstrate the temporal relationship of stock liquidity. We formulate a liquidity prediction task and train a daily liquidity prediction model incorporating Graph Convolutional Network (GCN) modules to solve it. Compared to models with the same parameter structure containing only fully connected layers, the GCN prediction model, which leverages company network structure information, exhibits stronger performance and faster convergence. We provide new insights for research on company disclosure and capital market liquidity.
  • ESG Rating Results and Corporate Total Factor Productivity
    ESG is emerging as a new benchmark for measuring a company's sustainable development capabilities and social impact. As a measure of ESG performance, ESG ratings are increasingly receiving attention from companies, the general public, and government institutions, and are becoming an important reference factor influencing their decision-making. This paper investigates the impact of corporate ESG ratings on Total Factor Productivity (TFP) and its mechanisms of action. Focusing on listed companies in China, we find that higher ESG ratings contribute to improving a company's TFP, and this conclusion remains valid after robustness tests and addressing endogeneity issues. Further exploration into the reasons behind this result reveals that ESG ratings can be seen as a signal that a company sends to the outside world, representing its overall performance. Higher ESG ratings enhance a company's TFP by reducing market financing constraints and obtaining government subsidies. Heterogeneity analysis shows that the positive impact of ESG ratings on TFP is more pronounced for companies with higher levels of attention, reputation, and audit quality. Additionally, we explore whether ESG ratings can serve as a predictive indicator for measuring a company's TFP. This hypothesis was tested using machine learning algorithms, and the results indicate that models incorporating ESG rating indicators significantly improve the accuracy of predicting a company's TFP capabilities.
  • Predicting Stock Price Crash Risk in China: A Modified Graph Wavenet Model
    The stock price of a firm is dynamically influenced by its own factors as well as those of its peers. In this study, we introduce a Graph Attention Network (GAT) integrated with WaveNet architecture—termed the GAT-WaveNet model—to capture both time-series and spatial dependencies for forecasting the stock price crash risk of Chinese listed firms from 2012 to 2021. Utilizing node-rolling techniques to prevent overfitting, our results show that the GAT-WaveNet model significantly outperforms traditional machine learning models in prediction accuracy. Moreover, investment portfolios leveraging the GAT-WaveNet model substantially exceed the cumulative returns of those based on other models.
  • Different Opinion or Information Asymmetry: Machine-Based Measure and Consequences
    We leverage machine learning to introduce belief dispersion measures that distinguish different opinion (DO) from information asymmetry (IA). Our measures align with the human-based measure and relate to economic outcomes in a manner consistent with theoretical prediction: DO is positively related to trading volume and negatively related to the bid-ask spread, whereas IA shows the opposite effects. Moreover, IA negatively predicts the cross-section of stock returns, while DO positively predicts returns for underpriced stocks and negatively for overpriced ones. Our findings reconcile conflicting disagreement-return relations in the literature and are consistent with the model of Atmaz and Basak (2018). We also show that the return predictability of DO and IA stems from their unique economic rationales, underscoring that components of disagreement can influence market equilibrium via distinct mechanisms.
  • Risk-Based Peer Networks and Return Predictability: Evidence from textual analysis on 10-K filings
    We construct a novel risk-based similarity peer network by applying machine learning techniques to extract a comprehensive set of disclosed risk factors from firms' annual reports. We find that a firm's future returns can be significantly predicted by the past returns of its risk-similar peers, even after excluding firms within the same industry. A long-short portfolio, formed based on the returns of these risk-similar peers, generates an alpha of 84 basis points per month. This return predictability is particularly pronounced for negative-return stocks and those with limited investor attention, suggesting that the effect is driven by slow information diffusion across firms with similar risk exposures. Our findings highlight that the risk factors disclosed in 10-K filings contain valuable information that is often overlooked by investors.
  • The Transformative Role of Artificial Intelligence and Big Data in Banking
    This paper examines how the integration of artificial intelligence (AI) and big data affects banking operations, emphasizing the crucial role of big data in unlocking the full potential of AI. Leveraging a comprehensive dataset of over 4.5 million loans issued by a leading commercial bank in China and exploiting a policy mandate as an exogenous shock, we document significant improvements in credit rating accuracy and loan performance, particularly for SMEs. Specifically, the adoption of AI and big data reduces the rate of unclassified credit ratings by 40.1% and decreases loan default rates by 29.6%. Analyzing the bank's phased implementation, we find that integrating big data analytics substantially enhances the effectiveness of AI models. We further identify significant heterogeneity: improvements are especially pronounced for unsecured and short-term loans, borrowers with incomplete financial records, first-time borrowers, long-distance borrowers, and firms located in economically underdeveloped or linguistically diverse regions. Our findings underscore the powerful synergy between big data and AI, demonstrating their joint capability to alleviate information frictions and enhance credit allocation efficiency.