Heterogeneous data

  • Optimizing Tourism Resource Allocation Efficiency and Pathways to High-Quality Development in the Age of Artificial Intelligence
    In the context of digital transformation, artificial intelligence (AI) has emerged as a pivotal driver for enhancing tourism resource allocation efficiency and promoting the high-quality development of the tourism industry. Grounded in the Technology–Organization–Environment (TOE) framework, this study constructs a multidimensional indicator system by integrating heterogeneous data sources, including Baidu search indices, corporate annual reports, and policy documents. Using a balanced panel dataset covering 31 provincial-level regions in China from 2015 to 2023, we empirically examine the mechanisms through which AI penetration affects the efficiency of tourism resource allocation. The super-efficiency SBM-DEA model is employed to measure allocation efficiency, while the spatial Durbin model (SDM) and geographically weighted regression (GWR) are used to identify spatial spillover effects and regional heterogeneity. Furthermore, tourist satisfaction is quantified using a natural language processing (NLP)-based sentiment index derived from online reviews. The results indicate that AI penetration significantly improves tourism resource allocation efficiency, with stronger effects observed in regions with advanced technological infrastructure. Smart tourism pilot policies demonstrate significant spatial spillover effects, positively influencing scenic areas within a 100-kilometer radius. However, diminishing marginal returns are evident, highlighting absorptive capacity thresholds and institutional constraints. Based on the empirical findings, the study proposes targeted policy recommendations, including the establishment of provincial tourism data hubs, promotion of AI toolkit systems, enhancement of scenic area evaluation mechanisms, and reinforcement of collaborative governance between government and enterprises. These insights aim to provide both theoretical and practical guidance for the intelligent transformation and coordinated regional development of China’s tourism industry.
  • Integrated Multivariate Segmentation Tree for the Analysis of Heterogeneous Credit Data in Small and Medium-Sized Enterprises
    Traditional decision tree models, which rely exclusively on numerical variables, often encounter difficulties in handling high-dimensional data and fail to effectively incorporate textual information. To address these limitations, we propose the Integrated Multivariate Segmentation Tree (IMST), a comprehensive framework designed to enhance credit evaluation for small and medium-sized enterprises (SMEs) by integrating financial data with textual sources. The methodology comprises three core stages: (1) transforming textual data into numerical matrices through matrix factorization; (2) selecting salient financial features using Lasso regression; and (3) constructing a multivariate segmentation tree based on the Gini index or entropy, with weakest-link pruning applied to regulate model complexity. Experimental results derived from a dataset of 1,428 Chinese SMEs demonstrate that IMST achieves an accuracy of 88.9%, surpassing baseline decision trees (87.4%) as well as conventional models such as logistic regression and support vector machines (SVM). Furthermore, the proposed model exhibits superior interpretability and computational efficiency, featuring a more streamlined architecture and enhanced risk detection capabilities.
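
The split criteria named in the IMST abstract (Gini index or entropy) can be illustrated with a minimal sketch. This is not the authors' implementation: it scores a single numeric feature rather than the multivariate splits IMST uses, and the function names (`gini`, `entropy`, `best_split`) and the toy `debt_ratio` / `default` data are hypothetical, chosen only for illustration.

```python
import math
from collections import Counter


def gini(labels):
    """Gini impurity: 1 - sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def entropy(labels):
    """Shannon entropy in bits: -sum of p * log2(p) over classes."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def best_split(values, labels, impurity=gini):
    """Find the threshold t (rule: value <= t) with the lowest weighted impurity.

    Returns (threshold, weighted_impurity); threshold is None when no
    split improves on the impurity of the unsplit node.
    """
    n = len(values)
    best = (None, impurity(labels))
    candidates = sorted(set(values))
    # Candidate thresholds are midpoints between consecutive distinct values.
    for lo, hi in zip(candidates, candidates[1:]):
        t = (lo + hi) / 2
        left = [y for x, y in zip(values, labels) if x <= t]
        right = [y for x, y in zip(values, labels) if x > t]
        score = (len(left) * impurity(left) + len(right) * impurity(right)) / n
        if score < best[1]:
            best = (t, score)
    return best


# Hypothetical toy data: one financial ratio and a binary default label.
debt_ratio = [0.2, 0.3, 0.4, 0.6, 0.7, 0.9]
default = [0, 0, 0, 1, 1, 1]

threshold, score = best_split(debt_ratio, default)
print(threshold, score)
```

Passing `impurity=entropy` to `best_split` switches the criterion. A full tree would apply this search recursively to each child node and then prune, which the sketch leaves out.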