Large Language Models

  • Autonomous Market Intelligence: Agentic AI Nowcasting Predicts Stock Returns
    Can fully agentic AI nowcast stock returns? We deploy a state-of-the-art Large Language Model to evaluate the attractiveness of each Russell 1000 stock each trading day, starting in April 2025 when AI web interfaces enabled real-time search. Our data contribution is unique along three dimensions. First, the nowcasting framework is completely out-of-sample and free of look-ahead bias by construction: predictions are collected at the current edge of time, ensuring the AI has no knowledge of future outcomes. Second, this temporal design is irreproducible once the information environment passes. Third, our framework is fully agentic: we do not feed the model curated news or disclosures; it autonomously searches the web, filters sources, and synthesises information into quantitative predictions. We find that AI possesses genuine stock-selection ability, but that its predictive power is concentrated in identifying future winners. A daily value-weighted portfolio of the 20 highest-ranked stocks earns a Fama-French five-factor plus momentum alpha of 19.4 basis points and an annualised Sharpe ratio of 2.68 over April 2025–March 2026. The same portfolio accumulates roughly 49.0% cumulative return, versus 21.2% for the Russell 1000 benchmark. The strategy is economically implementable: the average bid-ask spread of the daily Top-20 portfolio is 1.79 basis points, less than 10% of gross daily alpha. However, the signal remains asymmetric. Bottom-ranked portfolios generally exhibit alphas close to zero, while the strongest predictive content sits in the extreme top ranks. Delayed-entry tests further show that predictability does not vanish after a single day; rather, the signal remains positive over a broad window of subsequent entry dates, consistent with slow information diffusion rather than a fleeting overnight anomaly.
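The portfolio construction behind these numbers (a daily value-weighted portfolio of the top-ranked stocks, summarised by an annualised Sharpe ratio) can be sketched as follows. This is a minimal NumPy illustration, not the paper's code; the function names, the 252-trading-day convention, and the zero risk-free-rate Sharpe are our assumptions.

```python
import numpy as np

def top_n_value_weighted_return(scores, market_caps, returns, n=20):
    """Value-weighted one-day return of the n highest-scored stocks."""
    idx = np.argsort(scores)[-n:]                 # indices of the top-n scores
    w = market_caps[idx] / market_caps[idx].sum() # value weights within the top-n
    return float(np.dot(w, returns[idx]))

def annualized_sharpe(daily_returns, trading_days=252):
    """Annualised Sharpe ratio of a daily return series (zero risk-free rate)."""
    r = np.asarray(daily_returns, dtype=float)
    return float(r.mean() / r.std(ddof=1) * np.sqrt(trading_days))
```

In a backtest, `top_n_value_weighted_return` would be called once per trading day on that day's model scores, and the resulting series fed to `annualized_sharpe`.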
  • Technological Momentum in China: Large Language Model Meets Simple Classifications
    This study applies large language models (LLMs) to measure technological links and examines their predictive power in the Chinese stock market. Using the BAAI General Embedding (BGE) model, we extract semantic information from patent textual data to construct the technological momentum measure. As a comparison, a measure based on the traditional International Patent Classification (IPC) is also considered. Empirical analysis shows that both measures significantly predict stock returns and that they capture complementary dimensions of technological links. Further investigation through stratified analysis reveals the critical role of investor inattention in explaining their differential performance: in stocks with low investor inattention, the IPC-based measure loses its predictive power while the BGE-based measure remains significant, indicating that straightforward information is fully priced in while complex semantic relationships require greater cognitive processing; in stocks with high investor inattention, both measures exhibit predictability, with the BGE-based measure showing stronger effects. These findings support behavioral finance theories suggesting that complex information diffuses more slowly in markets, especially under significant cognitive constraints, and demonstrate LLMs' advantage in uncovering subtle technological connections that traditional methods overlook.
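The embedding-based link construction can be illustrated with a small sketch: pairwise cosine similarity between firms' patent-text embeddings defines the links, and technological momentum is the link-weighted average of peer firms' past returns. The embeddings below are placeholders (the paper obtains them from the BGE model), and the exact weighting scheme is our assumption.

```python
import numpy as np

def cosine_links(embeddings):
    """Pairwise cosine similarity between firms' patent-text embedding vectors."""
    E = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    return E @ E.T

def technological_momentum(embeddings, past_returns):
    """Link-weighted average of peer firms' past returns (self-link excluded)."""
    S = cosine_links(np.asarray(embeddings, dtype=float))
    np.fill_diagonal(S, 0.0)                  # a firm is not its own peer
    return S @ np.asarray(past_returns) / S.sum(axis=1)
```

Each firm's momentum signal is high when its technologically linked peers performed well, which is the cross-firm spillover the measure is designed to capture.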
  • A multifactor model using large language models and investor sentiment from photos and news: new evidence from China
    This study introduces an innovative approach for constructing multimodal investor sentiment indices and explores their varying impacts on stock market returns. We employ the RoBERTa model to quantify text-based sentiment, the Google Inception-v3 model for image-based sentiment measurement, and a multimodal semantic correlation fusion model to comprehensively consider the interplay between textual and visual sentiment features. These sentiment indices are further categorised into industry-specific investor sentiment and market-wide investor sentiment, enabling separate analyses of their effects on stock markets. We then leverage these indices to build a multifactor stock selection model and timing strategies. Our findings demonstrate that multimodal sentiment analysis yields superior predictive accuracy. Industry-specific investor sentiment exerts bidirectional positive influences on stock market returns, whereas market-wide investor sentiment indices exhibit unidirectional impacts. Integrating industry-specific investor sentiment into our multifactor stock selection model effectively enhances portfolio returns, and combining market-wide investor sentiment with timing-strategy optimisation further augments this advantage.
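One simple way to fuse standardised text and image sentiment into a single unit-variance index, while accounting for the correlation between the two modalities, is sketched below. This is an illustrative convex combination under our own assumptions, not the paper's multimodal semantic correlation fusion model.

```python
import numpy as np

def fuse_sentiment(text_s, image_s, rho):
    """Fuse two sentiment series into one index.

    Each series is standardised, then the sum is rescaled by its own
    standard deviation sqrt(2 + 2*rho), where rho is the correlation
    between the two modalities, so the fused index has unit variance.
    """
    t = np.asarray(text_s, dtype=float)
    i = np.asarray(image_s, dtype=float)
    z_t = (t - t.mean()) / t.std()
    z_i = (i - i.mean()) / i.std()
    return (z_t + z_i) / np.sqrt(2.0 + 2.0 * rho)
```

When the modalities are highly correlated (rho near 1) the rescaling prevents the fused index from overstating sentiment shocks that both channels merely report twice.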
  • Large Language Models and Return Prediction in China
    We examine whether large language models (LLMs) can extract contextualized representations of Chinese news articles and predict stock returns. The LLMs we examine include BERT, RoBERTa, FinBERT, Baichuan, ChatGLM, and their ensemble model. We find that tones and return forecasts extracted by LLMs from news significantly predict future returns. The equal- and value-weighted long-minus-short portfolios yield annualized returns of 90% and 69% on average for the ensemble model. Given that these news articles are public information, the predictive power lasts about two days. More interestingly, the signals extracted by LLMs contain information about firm fundamentals and can predict the aggressiveness of future trades. The predictive power is noticeably stronger for firms with less efficient information environments, such as firms with lower market cap, shorting volume, and institutional and state ownership. These results suggest that LLMs are helpful in capturing under-processed information in public news, particularly for firms with less efficient information environments, and thus contribute to overall market efficiency.
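The long-minus-short construction can be sketched as: rank stocks by LLM signal, go long the top quantile and short the bottom quantile, then annualize the mean daily spread. Equal weighting and the decile cutoff are our simplifying assumptions for illustration.

```python
import numpy as np

def long_short_return(signals, returns, q=0.1):
    """Equal-weighted daily return of top-quantile minus bottom-quantile stocks."""
    signals = np.asarray(signals, dtype=float)
    returns = np.asarray(returns, dtype=float)
    lo, hi = np.quantile(signals, [q, 1.0 - q])
    return float(returns[signals >= hi].mean() - returns[signals <= lo].mean())

def annualize(mean_daily_return, trading_days=252):
    """Compound a mean daily return over a trading year."""
    return (1.0 + mean_daily_return) ** trading_days - 1.0
```

Applied each trading day, the spread series gives the strategy return; compounding its mean shows how even small daily edges translate into large annualized figures like those reported above.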
  • Burden of Improvement: When Reputation Creates Capital Strain in Insurance
    A strong reputation is a cornerstone of corporate finance theory, widely believed to relax financial constraints and lower capital costs. We challenge this view by identifying a 'reputation paradox': under modern risk-sensitive regulation, a better reputation may paradoxically increase capital strain for firms with long-term liabilities. We argue that an improvement in a firm's reputation alters customer behavior, which extends liability duration and amplifies measured risk. Using the life insurance industry as an ideal laboratory, we develop an innovative framework that integrates LLMs with actuarial cash flow models, which confirms that improved reputation increases regulatory capital demands. Through a comparative analysis across major regulatory regimes (C-ROSS, Solvency II, and RBC) and two insurance products, we further demonstrate that improvements in reputation affect capital requirements unevenly across product types and regulatory frameworks. Our findings challenge the conventional view that reputation uniformly alleviates capital pressure, emphasizing the necessity for insurers to strategically align reputation management with solvency planning.
  • Reputation in Insurance: Unintended Consequences for Capital Allocation
    Reputation is widely regarded as a stabilizing factor in financial institutions, reducing capital constraints and enhancing firm resilience. However, in the insurance industry, where capital requirements are shaped by solvency regulations and policyholder behavior, the effects of reputation on capital management remain unclear. This paper examines the unintended consequences of reputation in insurance asset-liability management, focusing on its impact on capital allocation. Using a novel reputation risk measure based on large language models (LLMs) and actuarial models, we show that reputation shifts influence surrender rates, altering capital requirements. While higher reputation reduces surrender risk, it increases capital demand for investment-oriented insurance products, whereas protection products remain largely unaffected. These findings challenge the conventional wisdom that reputation always eases capital constraints, highlighting the need for insurers to integrate reputation management with capital planning to avoid unintended capital strain.
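The surrender-rate channel described in these two papers can be illustrated with a toy liability valuation: a lower surrender rate means more policies persist to later years, which lengthens the liability's duration and raises its present value, the mechanism by which better reputation can increase measured risk. Constant surrender and discount rates are our simplifying assumptions, not the papers' actuarial models.

```python
import numpy as np

def liability_pv_and_duration(cash_flows, surrender_rate, discount_rate):
    """PV and Macaulay duration of annual liability cash flows, where each
    year-t flow survives with probability (1 - surrender_rate)**t."""
    cf = np.asarray(cash_flows, dtype=float)
    t = np.arange(1, len(cf) + 1)
    survival = (1.0 - surrender_rate) ** t
    disc = (1.0 + discount_rate) ** (-t)
    pv_t = cf * survival * disc
    pv = pv_t.sum()
    duration = (t * pv_t).sum() / pv
    return float(pv), float(duration)
```

Comparing a high- and a low-surrender scenario on the same cash flows shows both the present value and the duration rising as surrenders fall, consistent with the capital-strain effect for investment-oriented products.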
  • Dissecting the Sentiment-Driven Green Premium in China with a Large Language Model
    Standard financial theory predicts a carbon premium, as brown stocks bear greater uncertainty under the climate transition. However, a contrary green premium has been identified in China, as evidenced by the return spread between green and brown sectors. Aggregate climate-transition sentiment, measured from news data using a large language model, explains 12%-33% of the variability in the anomalous alpha. This factor intensifies after China announced its national commitments. The sentiment-driven green premium is attributed to speculative trading by retail investors targeting green “concept stocks.” Additionally, the discussion highlights the advantages of large language models over lexicon-based sentiment analysis.
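The 12%-33% figure is an explained-variance statistic, presumably an R² from regressing the green-minus-brown alpha series on the sentiment factor. A univariate OLS R² can be computed as below (an illustrative sketch; the variable names and regression form are our assumptions).

```python
import numpy as np

def r_squared(y, x):
    """R^2 from a univariate OLS of y (e.g. green-minus-brown alpha)
    on x (e.g. the climate-transition sentiment factor)."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    X = np.column_stack([np.ones_like(x), x])     # intercept + regressor
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)  # OLS coefficients
    resid = y - X @ beta
    return float(1.0 - resid.var() / y.var())
```

An R² in the 0.12-0.33 range would mean the sentiment factor absorbs a meaningful, but far from complete, share of the anomalous alpha.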
  • Large Language Models and Return Prediction in China
    We examine whether large language models (LLMs) can extract contextualized representations of Chinese public news articles to predict stock returns. Based on representativeness and influence, we consider seven LLMs: BERT, RoBERTa, FinBERT, Baichuan, ChatGLM, InternLM, and their ensemble model. We show that news tones and return forecasts extracted by LLMs from Chinese news significantly predict future returns. The value-weighted long-minus-short portfolios yield annualized returns between 35% and 67%, depending on the model. Building on the return predictive power of LLM signals, we further investigate their implications for information efficiency. The LLM signals contain firm fundamental information, and it takes two days for them to be incorporated into stock prices. The predictive power of the LLM signals is stronger for firms with more information frictions, more retail holdings, and for more complex news. Interestingly, many investors trade in the opposite direction of LLM signals upon news releases, and could benefit from the LLM signals. These findings suggest LLMs can be helpful in processing public news, and thus contribute to overall market efficiency.
  • The Market Value of Generative AI: Evidence from China Market
    Our study explored the rise of public companies competing to launch large language models (LLMs) in the Chinese stock market after ChatGPT's success. We analyzed 25 companies listed on Chinese stock exchanges and discovered that the cumulative abnormal return (CAR) reached up to 3% before the LLMs' release, indicating a positive view from insiders. However, CAR dropped to around 1.5% after release. Early LLM releases drew better market reactions, especially those focused on customer service, design, and education. Conversely, LLMs dedicated to IT and civil service received negative feedback.
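The CAR statistic used in event studies like this one can be computed as below. In practice alpha and beta are estimated over a pre-event window; here they are passed in directly as a simplification of ours.

```python
import numpy as np

def cumulative_abnormal_return(stock_returns, market_returns, alpha=0.0, beta=1.0):
    """CAR over an event window under a market model:
    abnormal return = actual return - (alpha + beta * market return)."""
    ar = np.asarray(stock_returns, dtype=float) - (
        alpha + beta * np.asarray(market_returns, dtype=float)
    )
    return float(ar.sum())
```

Summing abnormal returns over the pre-announcement window gives the pre-release CAR; repeating the sum over the post-announcement window gives the post-release figure the abstract compares it against.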