natural language processing

  • Analyst Reports and Stock Performance: Evidence from the Chinese Market
    This article applies natural language processing (NLP) to extract and quantify textual information in order to predict stock performance. Leveraging an extensive dataset of Chinese analyst reports and a BERT deep learning model customized for Chinese text, the study classifies the sentiment of each report as positive, neutral, or negative. The findings underscore the predictive capacity of this sentiment indicator for stock volatility, excess returns, and trading volume. Specifically, reports with strong positive sentiment increase excess returns and intraday volatility, whereas reports with strong negative sentiment also increase volatility and trading volume but decrease future excess returns. The effect is larger for positive-sentiment reports than for negative-sentiment reports. This article contributes to the empirical literature on sentiment analysis and on the stock market’s response to news in the Chinese market.
  • Go with the Flow? Local Industrial Policymaking and its Influence on Firm Productivity
    This study examines factors that determine prefectural industrial policies and their impact on firm total factor productivity (TFP), utilizing a natural language processing algorithm and data from the Report on the Work of the Government in China. We find that compliance with upper-level governments is crucial in shaping prefectural industrial policies. When an industry is favored by the upper-level government, the probability of the prefectural government’s favoring that industry increases. However, prefectural policies driven by political compliance have a minimal positive impact on TFP, due to inadequate implementation of policy measures like tax deductions, preferential loans, and land price discounts.
  • Mixed Frequency Deep Factor Asset Pricing with Multi-Source Heterogeneous Information on Policy Guidance
    In the era of big data, asset pricing is influenced by factors extracted from multi-source heterogeneous information, such as high-frequency market and sentiment information and low-frequency firm-characteristic and macroeconomic information. In particular, low-frequency policy information plays a significant role in long-term pricing in China, but it has barely been investigated because of its textual form. To address this, we first extract policy variables from major national development plans (“Five-Year Plans”, “Government Work Reports”, and “Monetary Policy Reports”) using natural language processing (NLP) techniques and a Dynamic Topic Model (DTM). Because traditional models are inadequate for mixed-frequency data modeling and feature extraction, we then propose a mixed-frequency deep factor asset pricing model (MIDAS-DF), which addresses asset pricing in a mixed-frequency data environment by combining the mixed data sampling (MIDAS) technique with a deep learning architecture. Time-varying latent factors and factor loadings can be modeled directly from mixed-frequency data in a nonlinear, data-driven way, so the MIDAS-DF model is able to learn the nonlinear joint patterns hidden in multi-source heterogeneous information. Our empirical study of 4939 stocks on the Chinese A-share market from January 2003 to July 2022 shows that low-frequency policy information has a profound impact on asset pricing by anchoring the long-term pricing direction, while high-frequency market and sentiment information significantly influences stock prices and improves short-term pricing accuracy; together they enhance pricing performance. Consequently, the MIDAS-DF model outperforms the five competing models in pricing individual stocks, various test portfolios, and investment portfolios. Our findings on heterogeneous information have implications for governments and regulators as decision support in policymaking, and our investment portfolio is relevant to investors’ financial decisions.
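
As a rough illustration of the mixed data sampling (MIDAS) idea mentioned in the last abstract, the sketch below collapses a run of high-frequency observations into a single low-frequency value using normalized exponential Almon lag weights, a weighting scheme commonly used in the MIDAS literature. It is not the MIDAS-DF model: the abstract does not say which weight function the authors use, and the function names, parameter values, and sample data here are all hypothetical.

```python
import math


def exp_almon_weights(n_lags, theta1, theta2):
    """Normalized exponential Almon weights for lags 1..n_lags (assumed scheme)."""
    raw = [math.exp(theta1 * k + theta2 * k * k) for k in range(1, n_lags + 1)]
    total = sum(raw)
    return [w / total for w in raw]


def midas_aggregate(high_freq, n_lags, theta1, theta2):
    """Weighted sum of the newest n_lags high-frequency observations.

    high_freq is ordered oldest to newest; lag 1 refers to the newest value.
    """
    if len(high_freq) < n_lags:
        raise ValueError("need at least n_lags observations")
    weights = exp_almon_weights(n_lags, theta1, theta2)
    newest_first = high_freq[::-1][:n_lags]
    return sum(w * x for w, x in zip(weights, newest_first))


# Hypothetical daily sentiment scores, oldest to newest.
daily_sentiment = [0.1, -0.2, 0.3, 0.0, 0.4]

# theta2 < 0 makes the weights decay, so recent days count more.
weights = exp_almon_weights(5, 0.0, -0.1)
low_freq_signal = midas_aggregate(daily_sentiment, 5, 0.0, -0.1)
print(weights)
print(low_freq_signal)
```

In a full MIDAS regression the two shape parameters would be estimated jointly with the regression coefficients rather than fixed by hand; setting both to zero reduces the aggregation to a plain average of the lagged values.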