Authors: ZHANG Shu-nan (张姝楠); CAO Feng (曹峰); GUO Qian (郭倩) [1,2]; QIAN Yu-hua (钱宇华)
Affiliations: [1] Institute of Big Data Science and Industry, Shanxi University, Taiyuan 030006, China; [2] School of Computer and Information Technology, Shanxi University, Taiyuan 030006, China; [3] Key Laboratory of Computational Intelligence and Chinese Information Processing of Ministry of Education, Shanxi University, Taiyuan 030006, China
Source: Computer Science (《计算机科学》), 2021, No. 5, pp. 239-246 (8 pages)
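Funding: National Natural Science Foundation of China (61672332, 61802238, 61603228, 62006146, F060308); Shanxi Province Top Innovative Talent Support Program; Shanxi Province Key Research and Development Program (International Science and Technology Cooperation) (201903D421003); Shanxi Province Sanjin Scholars program; Shanxi Province Research Project for Returned Overseas Scholars (2017023, 2018172, HGKY2019001); Shanxi Province Youth Fund (201901D211171, 201901D211169).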
Abstract: Logical reasoning is the core of human intelligence and a challenging research topic in artificial intelligence. Human IQ tests are a common way to measure IQ and logical reasoning ability. An important research question is how to give computers a human-like logical reasoning ability, so that a computer learns reasoning patterns directly from given images without prior reasoning patterns being designed for it in advance. To this end, a new dataset, Fashion-IQ, is proposed. Each sample contains seven input images and one label: three question images that embody one or more logical rules, and four option images. The task is for the machine to learn the logic contained in the three question images, predict the next image, and thereby select the correct option. To solve this problem, a temporal relation model is proposed. For each option, the model first uses a convolutional neural network to extract spatial features from the three question images and the option image. It then uses a relation network to combine these four spatial features in pairs. Next, it uses an LSTM to extract temporal features from the three question images and the option, and combines these temporal features with the pairwise-combined spatial features to obtain a temporal-spatial fusion feature. Finally, the model reasons further over the temporal-spatial fusion features obtained for the three question images with each option and scores the options with a softmax function; the option with the highest score is taken as the answer. Experimental results show that the model achieves relatively high reasoning accuracy on this dataset.
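The scoring flow described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the CNN and LSTM stages are omitted, feature vectors are plain Python lists standing in for CNN outputs, the pairwise combination uses unordered pairs with concatenation, and the linear scorer with its `weights` argument is a hypothetical placeholder for the learned reasoning layers. All function names are assumptions made for illustration.

```python
import itertools
import math


def softmax(scores):
    """Numerically stable softmax over a list of raw scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def pairwise_combine(features):
    """Concatenate every unordered pair of feature vectors.

    Assumption: unordered pairs; four features give six pair vectors.
    """
    return [a + b for a, b in itertools.combinations(features, 2)]


def score_option(question_feats, option_feat, weights):
    """Hypothetical scorer: a linear map applied to each pair, then summed."""
    pairs = pairwise_combine(question_feats + [option_feat])
    return sum(sum(w * x for w, x in zip(weights, pair)) for pair in pairs)


def choose_option(question_feats, option_feats, weights):
    """Score each candidate option and return (best index, probabilities)."""
    scores = [score_option(question_feats, o, weights) for o in option_feats]
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return best, probs


# Toy stand-ins for the features of three question images and four options.
question = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
options = [[0.0, 0.0], [2.0, 2.0], [1.0, 0.0], [0.0, 1.0]]
weights = [1.0, 1.0, 1.0, 1.0]  # one weight per element of a pair vector

best, probs = choose_option(question, options, weights)
print(best)  # → 1
```

In a fuller version, the pair vectors would be fused with sequence features from an LSTM before scoring; that step is left out here to keep the sketch dependency-free.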
Keywords: logical reasoning; IQ test; reasoning pattern; temporal relation network; temporal-spatial fusion feature
Classification number: TP181 [Automation and Computer Technology: Control Theory and Control Engineering]