出 处:《Science China(Information Sciences)》2017年第9期32-45,共14页中国科学(信息科学)(英文版)
基 金:supported by National Basic Research Program of China(Grant No.2014CB340503);National Natural Science Foundation of China(Grant Nos.71532004,61133012,61472107)
摘 要:In this article, we revisit the task of movie box-office revenues prediction using multi-type features. The movie box-office revenues are affected by numerous factors. Previous work with discriminative models assumes these factors are identically and independently distributed. The correlations between these factors are rarely considered, which limited the performances of discriminative models in this task. To address these problems, we investigate a novel Gaussian copula regression model. Based on this model, we do not need to make any prior assumptions about the marginal distributions of the features. In particular, we perform a cumulative probability estimation on each of the smoothed features. The estimation learns the marginal distributions and maps all features into a uniform vector space. Sequentially, we bridge the marginal distributions with a copula function to create their joint distribution, and learn the dependency structure between them. Moreover, we propose a computational-efficient approximate algorithm for responsible variable inference. Experimental results on two movie datasets from Chinese and U.S. market show that our approach outperforms strong discriminative regression baselines.In this article, we revisit the task of movie box-office revenues prediction using multi-type features. The movie box-office revenues are affected by numerous factors. Previous work with discriminative models assumes these factors are identically and independently distributed. The correlations between these factors are rarely considered, which limited the performances of discriminative models in this task. To address these problems, we investigate a novel Gaussian copula regression model. Based on this model, we do not need to make any prior assumptions about the marginal distributions of the features. In particular, we perform a cumulative probability estimation on each of the smoothed features. The estimation learns the marginal distributions and maps all features into a uniform vector space. Sequentially, we bridge the marginal distributions with a copula function to create their joint distribution, and learn the dependency structure between them. Moreover, we propose a computational-efficient approximate algorithm for responsible variable inference. Experimental results on two movie datasets from Chinese and U.S. market show that our approach outperforms strong discriminative regression baselines.
关 键 词:Gaussian copula movie box-office revenue multi-variate regression text regression social media
分 类 号:O212.8[理学—概率论与数理统计]
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...