检索规则说明:AND代表“并且”;OR代表“或者”;NOT代表“不包含”;(注意必须大写,运算符两边需空一格)
检 索 范 例 :范例一: (K=图书馆学 OR K=情报学) AND A=范并思 范例二:J=计算机应用与软件 AND (U=C++ OR U=Basic) NOT M=Visual
作 者:Lelisa Adeba Jilcha Jin Kwak
机构地区:[1]ISAA Lab.,Department of AI Convergence Network,Ajou University,Suwon,16499,Korea [2]Department of Cyber Security,Ajou University,Suwon,16499,Korea
出 处:《Computers, Materials & Continua》2022年第5期2883-2899,共17页计算机、材料和连续体(英文)
基 金:This research project was supported by the Ministry of Culture,Sports,and Tourism(MCST)and the Korea Copyright Commission in 2021(2019-PF-9500).
摘 要:In the contemporary world, digital content that is subject to copyright is facing significant challenges against the act of copyright infringement.Billions of dollars are lost annually because of this illegal act. The currentmost effective trend to tackle this problem is believed to be blocking thosewebsites, particularly through affiliated government bodies. To do so, aneffective detection mechanism is a necessary first step. Some researchers haveused various approaches to analyze the possible common features of suspectedpiracy websites. For instance, most of these websites serve online advertisement, which is considered as their main source of revenue. In addition, theseadvertisements have some common attributes that make them unique ascompared to advertisements posted on normal or legitimate websites. Theyusually encompass keywords such as click-words (words that redirect to installmalicious software) and frequently used words in illegal gambling, illegal sexual acts, and so on. This makes them ideal to be used as one of the key featuresin the process of successfully detecting websites involved in the act of copyrightinfringement. Research has been conducted to identify advertisements servedon suspected piracy websites. However, these studies use a static approachthat relies mainly on manual scanning for the aforementioned keywords. Thisbrings with it some limitations, particularly in coping with the dynamic andever-changing behavior of advertisements posted on these websites. Therefore,we propose a technique that can continuously fine-tune itself and is intelligentenough to effectively identify advertisement (Ad) banners extracted fromsuspected piracy websites. We have done this by leveraging the power ofmachine learning algorithms, particularly the support vector machine with theword2vec word-embedding model. After applying the proposed technique to1015 Ad banners collected from 98 suspected piracy websites and 90 normal orlegitimate websites, we were able to successfully identify Ad banners extractedfrom su
关 键 词:Copyright infringement piracy website detection online advertisement advertisement banners machine learning support vector machine word embedding word2vec
分 类 号:TP181[自动化与计算机技术—控制理论与控制工程]
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在链接到云南高校图书馆文献保障联盟下载...
云南高校图书馆联盟文献共享服务平台 版权所有©
您的IP:18.119.110.206