We use supervised machine learning to develop a Chinese-language financial sentiment dictionary from 3.1 million financial news articles. Our dictionary maps semantically similar words to a subset of financial sentiment words generated by human experts. In article-level validation tests, our dictionary scores the sentiment of articles consistently with a human reading of the full articles. In return validation tests, our dictionary outperforms and subsumes previous Chinese financial sentiment dictionaries, such as direct translations of Loughran and McDonald’s (2011) financial words. We also generate a list of politically related positive words that is unique to China; this list has a weaker association with returns than does the list of otherwise positive words. We demonstrate that state media exhibit a sentiment bias by using more politically related positive words and fewer negative words, and that this bias renders state media’s sentiment less return-informative. Our findings demonstrate that dictionary-based sentiment analysis exhibits strong language and domain specificity.