Affiliations: [1] School of Electronic and Information Engineering, Suzhou University of Science and Technology, Suzhou 215009, China; [2] College of Intelligence and Computing, Tianjin University, Tianjin 300072, China; [3] Suzhou Institute of Trade and Commerce, Suzhou 215009, China; [4] Department of Computer Science, Texas Tech University, Lubbock, TX 79409, USA
Source: Chinese Journal of Electronics (电子学报(英文版)), 2019, No. 6, pp. 1118-1126, 9 pages
Funding: Supported by the National Natural Science Foundation of China (No. 61876121, No. 61472267, No. 61728205, No. 61502329, No. 61672371); the Primary Research & Development Plan of Jiangsu Province (No. BE2017663); the Natural Science Foundation of the Higher Education Institutions of Jiangsu Province (No. 19KJB520054); and the Foundation of Key Laboratory in Science and Technology Development Project of Suzhou (No. SZS201609, No. SZS201813)
Abstract: Considerable effort has gone into using deep neural networks to recognize multi-label images. Because multi-label image classification is complicated, many studies use the attention mechanism as a form of guidance. Conventional attention-based methods analyze images directly and aggressively, which makes it difficult for them to understand complicated scenes. We propose a global/local attention method that recognizes a multi-label image from coarse to fine by mimicking how humans observe images. Our global/local attention method first concentrates on the whole image and then focuses on its specific local objects. We also propose a joint max-margin objective function, which enforces that the minimum score of positive labels should be larger than the maximum score of negative labels, both horizontally and vertically. This function further improves our multi-label image classification method. We evaluate the effectiveness of our method on two popular multi-label image datasets (i.e., Pascal VOC and MS-COCO). Our experimental results show that our method outperforms state-of-the-art methods.
Keywords: Multi-label classification; Convolutional neural network; Recurrent neural network; Attention
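Illustrative sketch: the abstract states only that the joint max-margin objective requires the minimum positive-label score to exceed the maximum negative-label score "horizontally and vertically". The code below is one possible reading of that constraint, not the authors' formulation. It assumes a batch score matrix with one row per image and one column per label, and interprets "horizontally" as across labels within an image and "vertically" as across images within a label. The hinge form, the `margin` value, the equal weighting of the two terms, and the function names are all assumptions made for illustration.

```python
import numpy as np


def max_margin_hinge(scores, labels, margin=1.0, axis=1):
    """Hinge penalty along `axis` (hypothetical helper, not from the paper).

    The penalty is zero when the lowest-scoring positive entry exceeds the
    highest-scoring negative entry by at least `margin`.
    """
    pos = labels.astype(bool)
    # Ignore negatives when taking the minimum over positives, and vice versa.
    min_pos = np.where(pos, scores, np.inf).min(axis=axis)
    max_neg = np.where(~pos, scores, -np.inf).max(axis=axis)
    # A slice contributes only if it has at least one positive and one negative.
    valid = pos.any(axis=axis) & (~pos).any(axis=axis)
    violation = np.maximum(0.0, margin - (min_pos - max_neg))
    return np.where(valid, violation, 0.0)


def joint_max_margin(scores, labels, margin=1.0):
    """Sum of the row-wise and column-wise penalties (assumed equal weights)."""
    horizontal = max_margin_hinge(scores, labels, margin, axis=1)  # per image
    vertical = max_margin_hinge(scores, labels, margin, axis=0)    # per label
    return horizontal.mean() + vertical.mean()


if __name__ == "__main__":
    # Two images, two labels; values are made up for the example.
    scores = np.array([[2.0, -1.0],
                       [0.5, 1.0]])
    labels = np.array([[1, 0],
                       [0, 1]])
    print(joint_max_margin(scores, labels))
```

In this toy batch only the second image violates the margin (its positive score leads its negative score by 0.5, less than the assumed margin of 1.0), so the horizontal term is nonzero while the vertical term is zero.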