Authors: 李林睿 (LI Lin-rui); 常舒予 (CHANG Shu-yu); 乔一鸣 (QIAO Yi-ming) (Nanjing University of Posts and Telecommunications, Nanjing 210023, China)
Affiliation: [1] Nanjing University of Posts and Telecommunications, Nanjing, Jiangsu 210023, China
Source: Computer Knowledge and Technology (《电脑知识与技术》), 2021, No. 22, pp. 90-93 (4 pages)
Funding: Jiangsu Provincial College Students' Innovation and Entrepreneurship Training Program (201910293065Y, SYB2019015).
Abstract: LAMOST (the Large Sky Area Multi-Object Fiber Spectroscopic Telescope, also known as the Guo Shoujing Telescope) provides a large volume of astronomical spectral data, and celestial object classification is a widely studied problem in astronomy. Because the number of objects is large and the data are high-dimensional, processing spectra with machine learning methods has become a hot topic in recent years. For the classification problem, the authors propose HSODM (High-dimensional Spectral with Outlier Data Mining), an improved method for identifying outliers in high-dimensional data. It uses unsupervised learning and a random-distance criterion to pick out the very small number of unknown celestial objects or outlier records in a large set of high-dimensional spectra, which facilitates subsequent classification and outlier data mining. The project covers data preprocessing, dimensionality reduction by principal component analysis, construction and training of a long short-term memory (LSTM) neural network, parameter tuning, and prediction and analysis of results; the model is then assessed and presented through evaluation metrics and data visualization. The authors report that the improved method and the optimized neural network shorten training time and improve prediction accuracy. In their experiments, the improved method raised the area under the ROC (receiver operating characteristic) curve, the area under the P-R curve, the F1 score, and the G-mean score.
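The abstract names four evaluation metrics without defining them. As a rough illustration only, the sketch below computes three of them (F1, G-mean, and ROC area) with the Python standard library. The labels, scores, and the 0.5 threshold are made-up toy values, and the function names are hypothetical; none of this is taken from the paper or its HSODM implementation.

```python
import math


def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels where 1 marks an outlier."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall: 2*TP / (2*TP + FP + FN)."""
    tp, fp, _, fn = confusion_counts(y_true, y_pred)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def g_mean(y_true, y_pred):
    """Geometric mean of sensitivity (recall on outliers) and specificity."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(sensitivity * specificity)


def roc_auc(y_true, scores):
    """ROC area as the chance that a random outlier outscores a random
    inlier, counting ties as one half (the Mann-Whitney formulation)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one outlier and one inlier")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (assumed): 8 ordinary spectra and 2 outliers, with outlier scores.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
scores = [0.1, 0.2, 0.15, 0.3, 0.05, 0.4, 0.25, 0.7, 0.9, 0.35]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

f1 = f1_score(y_true, y_pred)
gm = g_mean(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(f"F1={f1:.3f}  G-mean={gm:.3f}  ROC AUC={auc:.3f}")
```

G-mean is commonly preferred over plain accuracy when outliers are rare, since a detector that labels everything as ordinary gets a sensitivity of zero and therefore a G-mean of zero.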
Classification number: TP18 [Automation and Computer Technology: Control Theory and Control Engineering]