Authors: Muhammad Islam Satti, Jawad Ahmed, Hafiz Syed Muhammad Muslim, Akber Abid Gardezi, Shafiq Ahmad, Abdelaty Edrees Sayed, Salman Naseer, Muhammad Shafiq
Affiliations: [1] Faculty of Computing, Riphah International University, I-14 Campus, Islamabad, 44000, Pakistan; [2] Department of Computer Science Zabsolutions, SZABIST, Islamabad, Pakistan; [3] Department of Computer Science, COMSATS University Islamabad, Islamabad, 45550, Pakistan; [4] Industrial Engineering Department, College of Engineering, King Saud University, P.O. Box 800, Riyadh, 11421, Saudi Arabia; [5] Department of Information Technology, University of the Punjab Gujranwala Campus, Gujranwala, 52250, Pakistan; [6] Department of Information and Communication Engineering, Yeungnam University, Gyeongsan, 38541, Korea
Source: Computers, Materials & Continua, 2023, Issue 2, pp. 3913-3929 (17 pages; English-language journal)
Funding: King Saud University through Researchers Supporting Project number (RSP-2021/387), King Saud University, Riyadh, Saudi Arabia.
Abstract: Daily newspapers publish a tremendous amount of information that is disseminated through the Internet. Large online repositories are freely available and easily accessible, but they are not indexed and are in an un-processable format. The major hindrance to developing and evaluating existing or new methods for monolingual text in images is that the text is not linked and indexed. There is no method to reuse online news images because standardized benchmark corpora are unavailable, especially for South Asian languages. Such a corpus is a vital resource for developing and evaluating systems that reuse text in local news images, in general and specifically for the Urdu language. Lack of indexing, primarily semantic indexing of daily news items, makes the news items impracticable for any querying. Moreover, even the most straightforward search facility does not support these unindexed news resources. Our study addresses this gap by associating and marking newspaper images in one of the widely spoken but under-resourced languages, i.e., Urdu. The present work proposes a method to build a benchmark corpus of news in image form by introducing a web crawler. The corpus is then semantically linked and annotated with daily news items. Two techniques are proposed for image annotation: free annotation and fixed cross-examination annotation. The second technique achieved higher accuracy. A news ontology was built in Protégé using the Web Ontology Language (OWL), and the annotations were indexed under it. An application was also built and linked with Protégé so that readers and journalists have an interface to query the news items directly. Similarly, news items linked together will provide complete coverage and bring together different opinions at a single location so that readers can do the analysis themselves.
Keywords: annotations; corpus; information retrieval; semantic ontology
Classification number: TP391 [Automation and Computer Technology: Computer Application Technology]