This study introduces a new approach for constructing multimodal investor sentiment indices and examines their differing impacts on stock market returns. We use the RoBERTa model to quantify text-based sentiment, the Google Inception v3 model to measure image-based sentiment, and a multimodal semantic correlation fusion model to capture the interplay between textual and visual sentiment features. The resulting sentiment indices are divided into industry-specific and market-wide investor sentiment, so that their effects on stock markets can be analysed separately. We then use these indices to build a multifactor stock selection model and timing strategies. Our findings show that multimodal sentiment analysis yields superior predictive accuracy. Industry-specific investor sentiment has a bidirectional positive relationship with stock market returns, whereas market-wide investor sentiment exhibits only a unidirectional impact. Integrating industry-specific investor sentiment into the multifactor stock selection model enhances portfolio returns, and combining market-wide investor sentiment with timing strategy optimisation augments this advantage further.