Naive Bayes classification algorithm based on Gauss distribution and Gram-Schmidt orthogonalization  Cited by: 4


Authors: HUANG Xiaojie[1], LIU Zhixiu[1], DENG Ziyang, LIU Hongjun, WU Chun

Affiliations: [1] Department of Science, Nanchang Institute of Technology, Nanchang 330099, Jiangxi, China; [2] School of Mathematics and Computer Science, Nanchang University, Nanchang 330031, Jiangxi, China; [3] School of Mathematical Sciences, Guizhou Normal University, Guiyang 550025, Guizhou, China; [4] School of Mathematical Sciences, Chongqing Normal University, Chongqing 401331, China

Source: Journal of Nanchang University (Natural Science), 2023, No. 3, pp. 213-217 (5 pages)

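Funding: Jiangxi Provincial University Humanities and Social Sciences Research Project (JY22202); Guizhou Provincial Natural Science and Technology Foundation Project ([2020]1Y003); Chongqing Natural Science and Technology Foundation Project (cstc2019jcyj-msxmX0390); Shaanxi Railway Engineering Vocational and Technical College Fund Project (KY2019-46).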

Abstract: The naive Bayes classification algorithm is a simple and practical classification method. Its assumption of conditional independence between attributes has been studied extensively, with work aimed at eliminating redundant attributes and reducing the correlation between attributes so as to obtain new attributes to which naive Bayes can be applied. However, the independence of these new attributes is hard to measure, so the improvements lack sufficient theoretical support, and the effectiveness of the improved algorithms is borne out mainly by data experiments. This paper defines Gauss distributed data and proposes a naive Bayes algorithm improved by Gram-Schmidt orthogonalization, so that it can be conveniently applied to the classification of Gauss distributed data. Unlike previous methods, which explicitly construct a new attribute set or an attribute transformation matrix, the proposed method directly orthogonalizes the sample data of the attributes, and the paper proves that the abstract new attributes corresponding to the orthogonalized attribute data are independent. This shows that, for the classification of Gauss distributed data, the conditional independence assumption of the original naive Bayes algorithm is no obstacle to using the algorithm, since the constraint can be satisfied after Gram-Schmidt orthogonalization.
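The central idea of the abstract, orthogonalizing the attribute sample data directly, can be illustrated with a minimal sketch. This is not the paper's procedure: the function name `gram_schmidt_columns`, the centering step, the synthetic two-attribute Gaussian data, and the use of NumPy are all assumptions made for illustration. The sketch only shows that Gram-Schmidt applied to centered attribute columns yields columns with zero sample covariance, which for jointly Gaussian attributes is the property that corresponds to independence.

```python
import numpy as np

def gram_schmidt_columns(A):
    """Gram-Schmidt on the columns of A, without normalization.

    Returns B (mutually orthogonal columns) and a unit upper-triangular
    matrix R such that A = B @ R. Hypothetical helper for illustration.
    """
    A = np.asarray(A, dtype=float)
    d = A.shape[1]
    B = A.copy()
    R = np.eye(d)
    for j in range(d):
        for i in range(j):
            denom = B[:, i] @ B[:, i]
            if denom == 0.0:
                continue  # skip a zero column (linearly dependent attribute)
            coef = (B[:, i] @ B[:, j]) / denom
            R[i, j] = coef
            B[:, j] -= coef * B[:, i]  # remove the component along column i
    return B, R

# Synthetic correlated Gaussian attributes (assumed data, one class only).
rng = np.random.default_rng(0)
cov = np.array([[2.0, 1.2], [1.2, 1.0]])
X = rng.multivariate_normal(mean=[1.0, -1.0], cov=cov, size=500)

Xc = X - X.mean(axis=0)           # center each attribute column
B, R = gram_schmidt_columns(Xc)   # orthogonalize the sample data directly

S_before = np.cov(Xc, rowvar=False)
S_after = np.cov(B, rowvar=False)
print("off-diagonal covariance before:", S_before[0, 1])
print("off-diagonal covariance after: ", S_after[0, 1])
```

Because the centered columns stay centered under these column operations, orthogonality of the columns of `B` is the same as zero sample covariance between them. How the paper applies this within each class and classifies new samples is not stated in the abstract and is not reproduced here.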

Keywords: Gauss distributed data; Gram-Schmidt orthogonalization; naive Bayes; classification

Classification code: TP181 [Automation and Computer Technology: Control Theory and Control Engineering]

 
