Authors: 李天举 (LI Tian-ju), 谢志峰 (XIE Zhi-feng) [1], 张侃弘 (ZHANG Kan-hong), 陶亦筠 (TAO Yi-jun), 范杰 (FAN Jie) [2], 汤臻 (TANG Zhen)
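Affiliations: [1] Shanghai University, Shanghai 200072, China; [2] Shanghai Tobacco Group Co., Ltd., Shanghai 200082, China; [3] Shanghai Tobacco Monopoly Administration, Shanghai 200120, China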
Source: Computer Technology and Development (《计算机技术与发展》), 2020, No. 11, pp. 128-135 (8 pages)
Funding: National Natural Science Foundation of China (61303093); Natural Science Foundation of Shanghai (19ZR1419100).
Abstract: To support the transformation of tobacco monopoly market supervision in Shanghai and raise the level of market supervision, anomalous data mining methods are introduced to strengthen the prediction and analysis of abnormal market activity. Drawing on current machine learning research, a tobacco anomalous data mining model based on multi-model Stacking ensemble learning is proposed, using Stacking to combine the strengths of the individual algorithm models. The data set consists of tobacco monopoly data from January 2016 to April 2019. Preprocessing converts the data into indicators, and data augmentation is used to alleviate data imbalance to some extent. Validation on this data indicates that the stronger the learning ability of the individual machine learning algorithms in the Stacking model, and the lower the correlation among them, the better the predictions of the ensemble model. Finally, field inspections were used to verify the model's effectiveness: after city-wide empirical verification, the rate at which market inspections confirm problems among retailers can rise from about 5% at present to more than 15%.
Keywords: anomalous data mining; ensemble learning; data preprocessing; data augmentation; Stacking model
Classification: TP399 [Automation and Computer Technology: Computer Application Technology]
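The record above names Stacking ensemble learning but does not list the base algorithms or show the data. As a rough illustration of the general technique only, the sketch below trains two deliberately different base learners, collects their out-of-fold predictions, and fits a logistic meta-learner on them. Everything in it is an assumption made for illustration: the synthetic imbalanced data from `make_data`, the `CentroidModel` and `KnnModel` base learners, and the `LogisticMeta` combiner are hypothetical stand-ins, not the models or data used in the paper.

```python
import math
import random

def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clamp so exp() cannot overflow
    return 1.0 / (1.0 + math.exp(-z))

def make_data(n_neg=170, n_pos=30, seed=0):
    """Synthetic imbalanced data (an assumption, not the tobacco data)."""
    rng = random.Random(seed)
    rows = [([rng.gauss(0, 1), rng.gauss(0, 1)], 0) for _ in range(n_neg)]
    rows += [([rng.gauss(2, 1), rng.gauss(2, 1)], 1) for _ in range(n_pos)]
    rng.shuffle(rows)
    return [r[0] for r in rows], [r[1] for r in rows]

class CentroidModel:
    """Hypothetical base learner: compares distances to the class means."""
    def fit(self, X, y):
        self.c = {}
        for label in (0, 1):
            rows = [x for x, t in zip(X, y) if t == label]
            self.c[label] = [sum(col) / len(rows) for col in zip(*rows)]
        return self
    def predict_proba(self, x):
        d0 = sum((a - b) ** 2 for a, b in zip(x, self.c[0]))
        d1 = sum((a - b) ** 2 for a, b in zip(x, self.c[1]))
        return sigmoid(d0 - d1)

class KnnModel:
    """Hypothetical base learner: share of positives among k nearest rows."""
    def __init__(self, k=5):
        self.k = k
    def fit(self, X, y):
        self.X, self.y = X, y
        return self
    def predict_proba(self, x):
        dist = sorted((sum((a - b) ** 2 for a, b in zip(x, r)), t)
                      for r, t in zip(self.X, self.y))
        return sum(t for _, t in dist[:self.k]) / self.k

class LogisticMeta:
    """Meta-learner: logistic regression fitted by batch gradient descent."""
    def fit(self, Z, y, lr=0.5, epochs=300):
        self.w, self.b, n = [0.0] * len(Z[0]), 0.0, len(Z)
        for _ in range(epochs):
            gw, gb = [0.0] * len(self.w), 0.0
            for z, t in zip(Z, y):
                err = self.predict_proba(z) - t
                gw = [g + err * v for g, v in zip(gw, z)]
                gb += err
            self.w = [w - lr * g / n for w, g in zip(self.w, gw)]
            self.b -= lr * gb / n
        return self
    def predict_proba(self, z):
        return sigmoid(self.b + sum(w * v for w, v in zip(self.w, z)))

def stratified_folds(y, k):
    """Assign fold ids per class so every training split sees both classes."""
    counts, folds = {0: 0, 1: 0}, []
    for t in y:
        folds.append(counts[t] % k)
        counts[t] += 1
    return folds

def fit_stacking(X, y, make_bases, k=5):
    folds = stratified_folds(y, k)
    Z = [[0.0] * len(make_bases) for _ in X]  # out-of-fold meta-features
    for f in range(k):
        train = [i for i in range(len(X)) if folds[i] != f]
        held = [i for i in range(len(X)) if folds[i] == f]
        for j, make in enumerate(make_bases):
            model = make().fit([X[i] for i in train], [y[i] for i in train])
            for i in held:
                Z[i][j] = model.predict_proba(X[i])
    meta = LogisticMeta().fit(Z, y)
    bases = [make().fit(X, y) for make in make_bases]  # refit on all rows
    return bases, meta, Z

def predict_stacking(bases, meta, x):
    return meta.predict_proba([m.predict_proba(x) for m in bases])

X, y = make_data()
bases, meta, Z = fit_stacking(X, y, [CentroidModel, KnnModel])
print(predict_stacking(bases, meta, [3.0, 3.0]))
print(predict_stacking(bases, meta, [-1.0, -1.0]))
```

The meta-learner is trained only on out-of-fold predictions, so it never scores a row that its base learners were fitted on. The two base learners are chosen to differ in kind (one parametric, one neighbor-based), loosely echoing the abstract's point that less correlated base models combine better.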