machine learning

  • How Does China's Household Portfolio Selection Vary with Financial Inclusion?
    Portfolio underdiversification is one of the most costly losses accumulated over a household’s life cycle. We provide new evidence on the impact of financial inclusion services on households’ portfolio choice and investment efficiency using 2015, 2017, and 2019 survey data for Chinese households. We hypothesize that higher financial inclusion penetration encourages households to participate in the financial market, leading to better portfolio diversification and investment efficiency. The results of the baseline model are consistent with our proposed hypothesis that higher accessibility to financial inclusion encourages households to invest in risky assets and increases investment efficiency. We further estimate a dynamic double machine learning model to quantitatively investigate the non-linear causal effects and track the dynamic change of those effects over time. We observe that the marginal effect increases over time, and those effects are more pronounced among low-asset, less-educated households and those located in non-rural areas, except for investment efficiency for high-asset households.
  • Uncertainty and Market Efficiency: An Information Choice Perspective
    We develop an information choice model where information costs are sticky and co-move with firm-level intrinsic uncertainty as opposed to temporal variations in uncertainty. Incorporating analysts' forecasts, we predict a negative relationship between information costs and information acquisition, as proxied by the predictability of analysts' forecast biases. Finally, the model shows a contrasting pattern between information acquisition and intrinsic and temporal uncertainty, where intrinsic uncertainty strengthens return predictability of analysts' biases through the information cost channel, while temporal uncertainty weakens it through the information benefit channel. We empirically confirm these opposing relationships that existing theories struggle to explain.
  • Chinese Housing Market Sentiment Index: A Generative AI Approach and An Application to Monetary Policy Transmission
    We construct a daily Chinese Housing Market Sentiment Index by applying GPT-4o to Chinese news articles. Our method outperforms traditional models in several validation tests, including a test based on a suite of machine learning models. Applying this index to household-level data, we find that after monetary easing, an important group of homebuyers (who have a college degree and are aged between 30 and 50) in cities with more optimistic housing sentiment have lower responses in non-housing consumption, whereas for homebuyers in other age-education groups, such a pattern does not exist. This suggests that current monetary easing might be more effective in boosting non-housing consumption than in the past for China due to weaker crowding-out effects from pessimistic housing sentiment. The paper also highlights the need for complementary structural reforms to enhance monetary policy transmission in China, a lesson relevant for other similar countries. Methodologically, it offers a tool for monitoring housing sentiment and lays out some principles for applying generative AI models, adaptable to other studies globally.
  • Disagreement on Tail
    We propose a novel measure, DOT, to capture belief divergence on extreme tail events in stock returns. Defined as the standard deviation of expected probability forecasts generated by distinct information processing functions and neural network models, DOT exhibits significant predictive power for future stock returns. A value-weighted (equal-weighted) long-short portfolio based on DOT yields an average return of -1.07% (-0.98%) per month. Furthermore, we document novel evidence supporting a risk-sharing channel underlying the negative relation between DOT and the equity premium following extreme negative shocks. Finally, our findings are also in line with a mispricing channel in normal periods.
  • Spatiotemporal Correlation in Stock Liquidity Through Corporate Networks from Information Disclosure Texts
    The healthy operation of the stock market relies on sound liquidity. We utilize the semantic information from disclosure texts of listed companies on the China Science and Technology Innovation Board (STAR Market) to construct a daily corporate network. Through empirical tests and performance analyses of machine learning models, we elucidate the relationship between the similarity of company disclosure text contents and the temporal and spatial correlations of stock liquidity. Our liquidity indicators cover four dimensions: trading costs, market depth, trading speed, and price impact. Furthermore, we reveal that the information loss caused by employing Minimum Spanning Tree (MST) topology significantly affects the explanatory power of network topology indicators for stock liquidity, with a more pronounced impact observed at the document level. Subsequently, by establishing a neural network model to predict next-day liquidity indicators, we demonstrate the temporal relationship of stock liquidity. We formulate a liquidity prediction task and train a daily liquidity prediction model incorporating Graph Convolutional Network (GCN) modules to solve it. Compared to models with the same parameter structure containing only fully connected layers, the GCN prediction model, which leverages company network structure information, exhibits stronger performance and faster convergence. We provide new insights for research on company disclosure and capital market liquidity.
  • Customers’ emotional impact on star rating and thumbs-up behavior towards food delivery service Apps
    This study explores the intricate relationship between emotional cues present in food delivery app reviews, normative ratings, and reader engagement. Utilizing lexicon-based unsupervised machine learning, our aim is to identify eight distinct emotional states within user reviews sourced from the Google Play Store. Our primary goal is to understand how reviewer star ratings impact reader engagement, particularly through thumbs-up reactions. By analyzing the influence of emotional expressions in user-generated content on review scores and subsequent reader engagement, we seek to provide insights into their complex interplay. Our methodology employs advanced machine learning techniques to uncover subtle emotional nuances within user-generated content, offering novel insights into their relationship. The findings reveal an inverse correlation between review length and positive sentiment, emphasizing the importance of concise feedback. Additionally, the study highlights the differential impact of emotional tones on review scores and reader engagement metrics. Surprisingly, user-assigned ratings negatively affect reader engagement, suggesting potential disparities between perceived quality and reader preferences. In summary, this study pioneers the use of advanced machine learning techniques to unravel the complex relationship between emotional cues in customer evaluations, normative ratings, and subsequent reader engagement within the food delivery app context.
  • Do Enterprises Adopting Digital Finance Exhibit Higher Values? Based on Textual Analysis
    In this paper, we investigate whether those enterprises adopting digital finance exhibit higher values. On the basis of the constructed fintech-related lexicon developed by the machine learning-based Word2Vec model, we employ the frequency of fintech-related words (phrases) in the management discussion sections of annual reports as a proxy variable for the degree to which enterprises apply digital finance. We utilize panel data regression and mediation models based on data of Chinese A-share listed companies from 2016 to 2022 and explore the impact of this degree of digital finance application on enterprise value. We find that the degree to which enterprises apply digital finance elevates their values. The in-depth integration of digital technology and finance directly enhances enterprise value by reducing financing costs. Additionally, the effects are more evident among small-scale firms and enterprises located in regions with lower marketization levels. However, in the face of the impact of the COVID-19 pandemic, the positive effects on enterprises are relatively low.
  • Treasury Bond Pricing Via No Arbitrage Arguments and Machine Learning: Evidence from China
    This paper proposes a novel bond return (price or yield curve) prediction methodology, unifying the classical no arbitrage pricing framework, which is ubiquitous and serves as a fundamental and theoretical building block in mathematical finance, and empirical asset (bond) pricing methodologies, e.g., Bianchi, Büchner, & Tamoni (2021) for treasury bonds and Gu, Kelly, & Xiu (2020) for equities. The methodology can be viewed as a unification of theoretical and empirical asset pricing frameworks. Our method is mathematically and theoretically rigorous and arbitrage-free, and at the same time enjoys the flexibility offered by the empirical asset pricing framework, i.e., a potentially rich factor structure and accurate function approximations via machine learning regression. Real market back-testing studies show that our predictions are accurate, in the sense that the formulated equally-weighted treasury bond portfolios in China's exchange-based markets bear significant positive returns. The average hit rate for yield curve prediction reaches 77.71% across all tenors, and the related long-only trading strategy based on these predictions yields an annualized absolute return as high as 12.35%, with a Calmar ratio of 7.31 for equally-weighted portfolios. As a by-product of our prediction framework, spot yield curves can be predicted accurately in an arbitrage-free manner.
  • AI-mimicked Behavior and Fundamental Momentum: The Evidence from China
    We track the behavior of fundamental informed traders (FITs) and document a fundamental momentum effect in the Chinese stock market. We train a deep learning model on a set of fundamental characteristics to extract the fundamental-implied component from realized returns. This fundamental component characterizes the price movement driven by FITs. Fundamental momentum differs from the fundamental trend and is not the quality-minus-junk (QMJ) factor. Underreaction bias helps explain the strategy, as it generates stronger profits during periods of low investor sentiment and aggregate idiosyncratic volatility. Fundamental momentum is not sensitive to changing beta and is robust across subsamples and machine learning models.
  • Managerial Risk Assessment and Fund Performance: Evidence from Textual Disclosure
    Fund managers’ ability to evaluate risk has important implications for their portfolio management and performance. We use a state-of-the-art deep learning model to measure fund managers’ forward-looking risk assessments from their narrative discussions. We validate that managers’ negative (positive) risk assessments lead to subsequent decreases (increases) in their portfolio risk-taking. However, only managers who identify negative risk generate superior risk-adjusted returns and higher Sharpe ratios, and have better intraquarter trading skills, suggesting that cautious, skilled managers are less subject to overconfidence biases. Interestingly, only sophisticated investors respond to the narrative-based risk assessment measure, consistent with limited attention by retail investors.
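
The "Disagreement on Tail" abstract above defines DOT as the standard deviation of tail-event probability forecasts produced by distinct models. The sketch below illustrates only that dispersion step. It is not the authors' implementation: the function names, the tickers, the forecast values, and the choice of a population (rather than sample) standard deviation are all assumptions made for illustration.

```python
from statistics import pstdev
from typing import Dict, List


def dot_scores(tail_probs: Dict[str, List[float]]) -> Dict[str, float]:
    """Map each stock to the dispersion of its tail-probability forecasts.

    tail_probs[stock] holds one forecast probability per model.
    Assumption: population standard deviation; the abstract does not say
    which estimator the paper uses.
    """
    scores = {}
    for stock, probs in tail_probs.items():
        if len(probs) < 2:
            raise ValueError(f"{stock}: need forecasts from at least two models")
        scores[stock] = pstdev(probs)
    return scores


def rank_by_dot(scores: Dict[str, float]) -> List[str]:
    """Return stocks ordered from lowest to highest disagreement."""
    return sorted(scores, key=scores.get)


# Hypothetical forecasts from three models for three made-up tickers.
tail_probs = {
    "AAA": [0.05, 0.05, 0.05],  # models agree, so disagreement is zero
    "BBB": [0.02, 0.10, 0.06],
    "CCC": [0.01, 0.20, 0.09],  # models disagree the most
}

scores = dot_scores(tail_probs)
ranking = rank_by_dot(scores)
```

A ranking like this is what a long-short portfolio sort on DOT would start from. Portfolio weighting and return computation are left out because the abstract does not specify them.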