machine learning

  • Spatiotemporal Correlation in Stock Liquidity Through Corporate Networks from Information Disclosure Texts
    The healthy operation of the stock market relies on sound liquidity. We utilize the semantic information from disclosure texts of listed companies on the China Science and Technology Innovation Board (STAR Market) to construct a daily corporate network. Through empirical tests and performance analyses of machine learning models, we elucidate the relationship between the similarity of company disclosure text contents and the temporal and spatial correlations of stock liquidity. Our liquidity indicators span four recognized dimensions: trading cost, market depth, trading speed, and price impact. Furthermore, we reveal that the information loss caused by employing Minimum Spanning Tree (MST) topology significantly affects the explanatory power of network topology indicators for stock liquidity, with a more pronounced impact observed at the document level. Subsequently, by establishing a neural network model to predict next-day liquidity indicators, we demonstrate the temporal relationship of stock liquidity. We formulate a liquidity prediction task and train a daily liquidity prediction model incorporating Graph Convolutional Network (GCN) modules to solve it. Compared to models with the same parameter structure containing only fully connected layers, the GCN prediction model, which leverages company network structure information, exhibits stronger performance and faster convergence. We provide new insights for research on company disclosure and capital market liquidity.
  • ESG Rating Results and Corporate Total Factor Productivity
    ESG is emerging as a new benchmark for measuring a company's sustainable development capabilities and social impact. As a measure of ESG performance, ESG ratings are increasingly receiving attention from companies, the general public, and government institutions, and are becoming an important reference factor influencing their decision-making. This paper investigates the impact of corporate ESG ratings on Total Factor Productivity (TFP) and its mechanisms of action. Focusing on listed companies in China, we find that higher ESG ratings contribute to improving a company's TFP, and this conclusion remains valid after robustness tests and addressing endogeneity issues. Further exploration into the reasons behind this result reveals that ESG ratings can be seen as a signal that a company sends to the outside world, representing its overall performance. Higher ESG ratings enhance a company's TFP by reducing market financing constraints and obtaining government subsidies. Heterogeneity analysis shows that the positive impact of ESG ratings on TFP is more pronounced for companies with higher levels of attention, reputation, and audit quality. Additionally, we explore whether ESG ratings can serve as a predictive indicator for measuring a company's TFP. We test this hypothesis using machine learning algorithms, and the results indicate that models incorporating ESG rating indicators predict a company's TFP significantly more accurately.
  • Predicting Stock Price Crash Risk in China: A Modified Graph Wavenet Model
    The stock price of a firm is dynamically influenced by its own factors as well as those of its peers. In this study, we introduce a Graph Attention Network (GAT) integrated with WaveNet architecture—termed the GAT-WaveNet model—to capture both time-series and spatial dependencies for forecasting the stock price crash risk of Chinese listed firms from 2012 to 2021. Utilizing node-rolling techniques to prevent overfitting, our results show that the GAT-WaveNet model significantly outperforms traditional machine learning models in prediction accuracy. Moreover, investment portfolios leveraging the GAT-WaveNet model substantially exceed the cumulative returns of those based on other models.
  • Different Opinion or Information Asymmetry: Machine-Based Measure and Consequences
    We leverage machine learning to introduce belief dispersion measures that distinguish different opinion (DO) from information asymmetry (IA). Our measures align with the human-based measure and relate to economic outcomes in a manner consistent with theoretical predictions: DO is positively related to trading volume and negatively related to the bid-ask spread, whereas IA shows the opposite effects. Moreover, IA negatively predicts the cross-section of stock returns, while DO positively predicts returns for underpriced stocks and negatively for overpriced ones. Our findings reconcile conflicting disagreement-return relations in the literature and are consistent with the model of Atmaz and Basak (2018). We also show that the return predictability of DO and IA stems from their unique economic rationales, underscoring that components of disagreement can influence market equilibrium via distinct mechanisms.
  • Risk-Based Peer Networks and Return Predictability: Evidence from textual analysis on 10-K filings
    We construct a novel risk-based similarity peer network by applying machine learning techniques to extract a comprehensive set of disclosed risk factors from firms' annual reports. We find that a firm's future returns can be significantly predicted by the past returns of its risk-similar peers, even after excluding firms within the same industry. A long-short portfolio, formed based on the returns of these risk-similar peers, generates an alpha of 84 basis points per month. This return predictability is particularly pronounced for negative-return stocks and those with limited investor attention, suggesting that the effect is driven by slow information diffusion across firms with similar risk exposures. Our findings highlight that the risk factors disclosed in 10-K filings contain valuable information that is often overlooked by investors.
  • The Transformative Role of Artificial Intelligence and Big Data in Banking
    This paper examines how the integration of artificial intelligence (AI) and big data affects banking operations, emphasizing the crucial role of big data in unlocking the full potential of AI. Leveraging a comprehensive dataset of over 4.5 million loans issued by a leading commercial bank in China and exploiting a policy mandate as an exogenous shock, we document significant improvements in credit rating accuracy and loan performance, particularly for SMEs. Specifically, the adoption of AI and big data reduces the rate of unclassified credit ratings by 40.1% and decreases loan default rates by 29.6%. Analyzing the bank's phased implementation, we find that integrating big data analytics substantially enhances the effectiveness of AI models. We further identify significant heterogeneity: improvements are especially pronounced for unsecured and short-term loans, borrowers with incomplete financial records, first-time borrowers, long-distance borrowers, and firms located in economically underdeveloped or linguistically diverse regions. Our findings underscore the powerful synergy between big data and AI, demonstrating their joint capability to alleviate information frictions and enhance credit allocation efficiency.
  • How Does China's Household Portfolio Selection Vary with Financial Inclusion?
    Portfolio underdiversification is one of the most costly losses accumulated over a household’s life cycle. We provide new evidence on the impact of financial inclusion services on households’ portfolio choice and investment efficiency using 2015, 2017, and 2019 survey data for Chinese households. We hypothesize that higher financial inclusion penetration encourages households to participate in the financial market, leading to better portfolio diversification and investment efficiency. The results of the baseline model are consistent with our proposed hypothesis that higher accessibility to financial inclusion encourages households to invest in risky assets and increases investment efficiency. We further estimate a dynamic double machine learning model to quantitatively investigate the non-linear causal effects and track the dynamic change of those effects over time. We observe that the marginal effect increases over time, and those effects are more pronounced among low-asset, less-educated households and those located in non-rural areas, except for investment efficiency for high-asset households.
  • Uncertainty and Market Efficiency: An Information Choice Perspective
    We develop an information choice model where information costs are sticky and co-move with firm-level intrinsic uncertainty as opposed to temporal variations in uncertainty. Incorporating analysts' forecasts, we predict a negative relationship between information costs and information acquisition, as proxied by the predictability of analysts' forecast biases. Finally, the model shows a contrasting pattern between information acquisition and intrinsic and temporal uncertainty, where intrinsic uncertainty strengthens return predictability of analysts' biases through the information cost channel, while temporal uncertainty weakens it through the information benefit channel. We empirically confirm these opposing relationships that existing theories struggle to explain.
  • Chinese Housing Market Sentiment Index: A Generative AI Approach and An Application to Monetary Policy Transmission
    We construct a daily Chinese Housing Market Sentiment Index by applying GPT-4o to Chinese news articles. Our method outperforms traditional models in several validation tests, including a test based on a suite of machine learning models. Applying this index to household-level data, we find that after monetary easing, an important group of homebuyers (who have a college degree and are aged between 30 and 50) in cities with more optimistic housing sentiment have lower responses in non-housing consumption, whereas for homebuyers in other age-education groups, such a pattern does not exist. This suggests that current monetary easing might be more effective in boosting non-housing consumption than in the past for China due to weaker crowding-out effects from pessimistic housing sentiment. The paper also highlights the need for complementary structural reforms to enhance monetary policy transmission in China, a lesson relevant for other similar countries. Methodologically, it offers a tool for monitoring housing sentiment and lays out some principles for applying generative AI models, adaptable to other studies globally.
  • Disagreement on Tail
    We propose a novel measure, DOT, to capture belief divergence on extreme tail events in stock returns. Defined as the standard deviation of expected probability forecasts generated by distinct information processing functions and neural network models, DOT exhibits significant predictive power for future stock returns. A value-weighted (equal-weighted) long-short portfolio based on DOT yields an average return of -1.07% (-0.98%) per month. Furthermore, we document novel evidence supporting a risk-sharing channel underlying the negative relation between DOT and the equity premium following extreme negative shocks. Finally, our findings are also in line with a mispricing channel in normal periods.
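
A dispersion measure of the kind described in the last abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the input layout, the use of the population standard deviation, and the equal-weighted top-minus-bottom sort are all assumptions made for illustration only.

```python
from statistics import pstdev


def disagreement_on_tail(forecasts):
    """Cross-model dispersion of tail-event probability forecasts.

    forecasts: dict mapping a stock id to a list of probabilities,
    one per forecasting model (hypothetical input layout).
    Population standard deviation is an assumption; the abstract
    only says "standard deviation".
    """
    return {stock: pstdev(probs) for stock, probs in forecasts.items()}


def long_short_return(signal, next_returns, fraction=0.25):
    """Equal-weighted return of high-signal minus low-signal stocks.

    signal: dict stock -> signal value (e.g. the dispersion above).
    next_returns: dict stock -> next-period return.
    fraction: share of stocks placed in each leg (assumed cutoff).
    """
    ranked = sorted(signal, key=signal.get)      # ascending by signal
    k = max(1, int(len(ranked) * fraction))      # stocks per leg
    low, high = ranked[:k], ranked[-k:]

    def mean_return(names):
        return sum(next_returns[n] for n in names) / len(names)

    return mean_return(high) - mean_return(low)
```

With per-stock forecasts from several models, `disagreement_on_tail` gives one dispersion value per stock, and `long_short_return` compares the subsequent returns of the most and least dispersed stocks. A negative value is in the direction the abstract reports for DOT-sorted portfolios.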