Author: Shuichi Shinmura
Affiliation: [1] Faculty of Economics, Seikei University, Japan
Source: Journal of Statistical Science and Application (统计科学与应用, English edition), 2016, No. 4, pp. 165-178 (14 pages)
Abstract: There are four serious problems in discriminant analysis. We developed an optimal linear discriminant function (optimal LDF) based on the minimum number of misclassifications (minimum NM, MNM) using integer programming (IP). We call this LDF Revised IP-OLDF. Only this LDF can discriminate the cases on the discriminant hyperplane (Problem 1). This LDF and a hard-margin SVM (H-SVM) can discriminate linearly separable data (LSD) exactly. In theory, other LDFs may fail to discriminate the LSD (Problem 2). When Revised IP-OLDF discriminates the Swiss banknote data with six variables, we find that the MNM of a two-variable model such as (X4, X6) is zero. Because MNMk decreases monotonically (MNMk >= MNM(k+1)), the MNMs of the sixteen models that include (X4, X6) are zero. Because there had been no research on LSD until now, we surveyed three other kinds of linearly separable data: 18 exam score data sets, the Japanese 44 cars data, and six microarray data sets. When we discriminate the exam scores with MNM = 0, we find that the generalized inverse matrix technique causes the serious Problem 3, and we confirmed this fact with the cars data. Finally, we claim that discriminant analysis is not inferential statistics because there are no standard errors (SEs) of error rates and discriminant coefficients (Problem 4). Therefore, we proposed the "100-fold cross validation for the small sample" method (the method). With this breakthrough, we can choose the best model, the one with the minimum mean error rate (M2) in the validation sample, and obtain two 95% confidence intervals (CIs), one for the error rate and one for the discriminant coefficients. When we discriminate the exam scores by this new method, we obtain the surprising result that seven LDFs other than Fisher's LDF are almost the same as the trivial LDFs. In this research, we discriminate the Japanese 44 cars data because they allow us to discuss all four problems. There are six independent variables to discriminate 29 regular cars and 15 small cars. This data is linear separab
Keywords: Model Selection Procedure; Means of Error Rates; Fisher's LDF; Logistic Regression; Support Vector Machine (SVM); Minimum Number of Misclassifications (minimum NM, MNM); Revised IP-OLDF based on MNM criterion; Revised IPLP-OLDF; Revised LP-OLDF; Linear Separable Data and Model; K-fold Cross-validation.