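Authors: HU Qinghui (胡庆辉)[1,2], WEI Shiwei (魏士伟)[2], XIE Zhongqian (解忠乾), REN Yafeng (任亚峰)[1]
Affiliations: [1] School of Computer Science, Wuhan University, Wuhan 430072; [2] Guangxi Colleges and Universities Key Laboratory Breeding Base of Robot and Welding Technology, Guilin University of Aerospace Technology, Guilin, Guangxi 541004
Source: Journal of Computer Applications (《计算机应用》), 2016, No. 1, pp. 154-157, 187 (5 pages)
Funding: National Natural Science Foundation of China (11301106); Guangxi Natural Science Foundation (2014GXNSFAA1183105); Guangxi Higher Education Scientific Research Project (ZD2014147; YB2014431)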
Abstract: Most existing work on the relevance of advertising phrases relies on literal matching and ignores the deeper semantic information that phrases carry, which limits task performance. To address this, a deep learning method for computing advertising phrase relevance is proposed. A Recursive Auto-Encoder (RAE) analyzes the deep structure of each phrase so that the resulting phrase vector carries deeper semantic information, and a phrase relevance measure for the advertising context is built on these vectors. Specifically, given a sequence of several words, every pair of adjacent elements is tentatively merged to produce a reconstruction error; the pair with the smallest reconstruction error is merged, and the process repeats until a phrase tree similar in structure to a Huffman tree is formed. Gradient descent is used to minimize the reconstruction error of the phrase tree, and cosine distance is used to measure the relevance between phrases. Experimental results show that introducing word weight information increases the contribution of important words to the final phrase vector, making RAE better suited to phrase computation. At a recall rate of 50%, the proposed algorithm improves accuracy by 4.59 and 3.21 percentage points over the traditional LDA (Latent Dirichlet Allocation) and BM25 algorithms, respectively, which demonstrates its effectiveness.
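The greedy tree construction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the tanh encoder, the linear decoder, and the randomly initialized (untrained) weights are all assumptions made for illustration. Gradient-descent training and the word weighting mentioned in the abstract are omitted, and cosine similarity stands in for the cosine-based relevance measure.

```python
import math
import random


def make_params(dim, seed=0):
    """Hypothetical, untrained RAE weights: encoder (dim x 2dim), decoder (2dim x dim)."""
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

    return {"W_enc": mat(dim, 2 * dim), "W_dec": mat(2 * dim, dim)}


def matvec(matrix, vector):
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def merge(params, left, right):
    """Encode two child vectors into a parent; return (parent, reconstruction error)."""
    children = left + right  # concatenation of the two child vectors
    parent = [math.tanh(x) for x in matvec(params["W_enc"], children)]
    reconstructed = matvec(params["W_dec"], parent)
    error = sum((r - c) ** 2 for r, c in zip(reconstructed, children))
    return parent, error


def build_phrase_tree(params, word_vectors):
    """Repeatedly merge the adjacent pair with the smallest reconstruction error.

    Returns (phrase vector, nested tuple of word indices, summed error).
    """
    if not word_vectors:
        raise ValueError("need at least one word vector")
    nodes = [(vec, idx) for idx, vec in enumerate(word_vectors)]
    total_error = 0.0
    while len(nodes) > 1:
        best = None  # (position, error, parent vector)
        for i in range(len(nodes) - 1):
            parent, error = merge(params, nodes[i][0], nodes[i + 1][0])
            if best is None or error < best[1]:
                best = (i, error, parent)
        i, error, parent = best
        nodes[i:i + 2] = [(parent, (nodes[i][1], nodes[i + 1][1]))]
        total_error += error
    return nodes[0][0], nodes[0][1], total_error


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


if __name__ == "__main__":
    params = make_params(dim=4)
    rng = random.Random(42)
    # Stand-in word embeddings; a real system would load trained vectors.
    phrase_a = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(3)]
    phrase_b = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(5)]
    vec_a, tree_a, _ = build_phrase_tree(params, phrase_a)
    vec_b, tree_b, _ = build_phrase_tree(params, phrase_b)
    print("tree A:", tree_a)
    print("tree B:", tree_b)
    print("relevance:", cosine_similarity(vec_a, vec_b))
```

In a trained model, the summed reconstruction error returned by `build_phrase_tree` would be the quantity minimized by gradient descent; here it is only reported.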