Large Language Models and Return Prediction in China
We examine whether large language models (LLMs) can extract contextualized representations of Chinese public news articles to predict stock returns. Based on representativeness and influence, we consider seven LLMs: BERT, RoBERTa, FinBERT, Baichuan, ChatGLM, InternLM, and their ensemble model. We show that news tones and return forecasts extracted by LLMs from Chinese news significantly predict future returns. The value-weighted long-minus-short portfolios yield annualized returns between 35% and 67%, depending on the model. Building on this return predictive power, we further investigate the implications of LLM signals for information efficiency. The LLM signals contain firm fundamental information, and it takes two days for them to be fully incorporated into stock prices. Their predictive power is stronger for firms with more information frictions, for firms with more retail holdings, and for more complex news. Interestingly, many investors trade in the direction opposite to LLM signals upon news releases and could benefit from those signals. These findings suggest that LLMs can help process public news and thus contribute to overall market efficiency.
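To make the portfolio construction concrete, the following is a minimal sketch of a value-weighted long-minus-short strategy of the kind the abstract describes: stocks are sorted into deciles on an LLM-derived signal each period, and the portfolio goes long the top decile and short the bottom decile. The column names (`signal`, `mktcap`, `ret`) and the decile cutoffs are illustrative assumptions, not details taken from the paper.

```python
import pandas as pd

def long_short_return(df: pd.DataFrame) -> float:
    """One-period return of a value-weighted long-minus-short portfolio.

    Assumed (hypothetical) columns:
      signal - LLM-extracted news tone or return forecast per stock
      mktcap - market capitalization, used as the value weight
      ret    - realized return over the next holding period
    """
    df = df.copy()
    # Sort stocks into deciles by the LLM signal (0 = lowest, 9 = highest)
    df["decile"] = pd.qcut(df["signal"], 10, labels=False)

    def value_weighted(group: pd.DataFrame) -> float:
        # Weight each stock's return by its market capitalization
        return (group["ret"] * group["mktcap"]).sum() / group["mktcap"].sum()

    long_ret = value_weighted(df[df["decile"] == 9])   # long top decile
    short_ret = value_weighted(df[df["decile"] == 0])  # short bottom decile
    return long_ret - short_ret
```

Compounding such per-period spreads over a year would give the annualized long-minus-short returns the abstract reports; the actual sorting frequency and rebalancing rules are specified in the paper itself.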