supported in part to Dr. Wen-Gang Zhou by the Fundamental Research Funds for the Central Universities of China under Grant Nos. WK2100060014 and WK2100060011; the Start-Up Funding from the University of Science and Technology of China under Grant No. KY2100000036; the Open Project of Beijing Multimedia and Intelligent Software Key Laboratory in Beijing University of Technology, and the sponsor from Intel ICRI MNC project; in part to Dr. Hou-Qiang Li by the National Natural Science Foundation of China (NSFC) under Grant Nos. 61325009, 61390514, and 61272316; in part to Dr. Yijuan Lu by the Army Research Office (ARO) of USA under Grant No. W911NF-12-1-0057; the National Science Foundation of USA under Grant No. CRI 1305302; in part to Dr. Qi Tian by ARO under Grant No. W911NF-12-1-0057; the Faculty Research Award by NEC Laboratories of America, respectively; was supported in part by NSFC under Grant No. 61128007
Many recent state-of-the-art image retrieval approaches are based on the Bag-of-Visual-Words model and represent an image as a set of visual words obtained by quantizing local SIFT (scale-invariant feature transform) features. ...
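
As an illustration of the quantization step mentioned above, the following is a minimal sketch, not the method of this paper. It assumes a visual codebook is already available (for example, from k-means clustering of training descriptors) and assigns each 128-dimensional descriptor to its nearest codeword under Euclidean distance. The function names `quantize_to_visual_words` and `bag_of_visual_words`, the codebook size, and the synthetic descriptors are all hypothetical and chosen only for the example.

```python
import numpy as np


def quantize_to_visual_words(descriptors, codebook):
    """Map each local descriptor to the index of its nearest codeword.

    descriptors: (n, d) array of local features (d = 128 for SIFT).
    codebook:    (k, d) array of visual-word centroids (assumed given).
    """
    # Squared Euclidean distances via ||a||^2 - 2 a.b + ||b||^2, shape (n, k).
    d2 = (
        np.sum(descriptors ** 2, axis=1)[:, None]
        - 2.0 * descriptors @ codebook.T
        + np.sum(codebook ** 2, axis=1)[None, :]
    )
    return np.argmin(d2, axis=1)


def bag_of_visual_words(descriptors, codebook):
    """Count how often each visual word occurs in one image."""
    words = quantize_to_visual_words(descriptors, codebook)
    return np.bincount(words, minlength=codebook.shape[0])


# Synthetic data standing in for real SIFT descriptors (an assumption).
rng = np.random.default_rng(0)
codebook = rng.random((8, 128))  # 8 hypothetical visual words
# Three descriptors: slightly perturbed copies of codewords 3, 3, and 5.
descriptors = codebook[[3, 3, 5]] + 0.001 * rng.standard_normal((3, 128))

words = quantize_to_visual_words(descriptors, codebook)
histogram = bag_of_visual_words(descriptors, codebook)
```

Here `words` holds one visual-word index per descriptor and `histogram` is the resulting bag-of-visual-words vector for the image. Practical systems typically use codebooks with many thousands of words and approximate nearest-neighbor search rather than the exhaustive distance computation shown here.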