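ML Classification: using seven machine learning algorithms (kNN, logistic regression, SVM, decision tree, random forest, boosted trees, neural network) for binary classification on the diabetes dataset (8→1), as a case study for understanding the template workflow of machine-learning classification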
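Contents
Seven machine learning algorithms (kNN, logistic regression, SVM, decision tree, random forest, boosted trees, neural network) for binary classification on the diabetes dataset (8→1)
Understanding the dataset
1. kNN
2. Logistic regression
3. SVM
4. Decision tree
5. Random forest
6. Boosted trees
7. Neural network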
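Related articles
ML Classification: using seven machine learning algorithms (kNN, logistic regression, SVM, decision tree, random forest, boosted trees, neural network) for binary classification on the diabetes dataset (8→1), as a case study for understanding the template workflow of machine-learning classification
ML Classification: using seven machine learning algorithms (kNN, logistic regression, SVM, decision tree, random forest, boosted trees, neural network) for binary classification on the diabetes dataset (8→1), as a case study for understanding machine-learning classification (complete)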
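Seven machine learning algorithms (kNN, logistic regression, SVM, decision tree, random forest, boosted trees, neural network) for binary classification on the diabetes dataset (8→1)
Understanding the dataset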
data.shape: (768, 9)
data.columns:
Index(['Pregnancies', 'Glucose', 'BloodPressure', 'SkinThickness', 'Insulin',
'BMI', 'DiabetesPedigreeFunction', 'Age', 'Outcome'],
dtype='object')
data.head:
Pregnancies Glucose BloodPressure ... DiabetesPedigreeFunction Age Outcome
0 6 148 72 ... 0.627 50 1
1 1 85 66 ... 0.351 31 0
2 8 183 64 ... 0.672 32 1
3 1 89 66 ... 0.167 21 0
4 0 137 40 ... 2.288 33 1
[5 rows x 9 columns]
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 768 entries, 0 to 767
Data columns (total 9 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 Pregnancies 768 non-null int64
1 Glucose 768 non-null int64
2 BloodPressure 768 non-null int64
3 SkinThickness 768 non-null int64
4 Insulin 768 non-null int64
5 BMI 768 non-null float64
6 DiabetesPedigreeFunction 768 non-null float64
7 Age 768 non-null int64
8 Outcome 768 non-null int64
dtypes: float64(2), int64(7)
memory usage: 54.1 KB
data.info:
None
8
data_column_X: ['Pregnancies', 'Glucose', 'BloodPressure', 'SkinThickness', 'Insulin', 'BMI', 'DiabetesPedigreeFunction', 'Age']
['Pregnancies', 'Glucose', 'BloodPressure', 'SkinThickness', 'Insulin', 'BMI', 'DiabetesPedigreeFunction', 'Age']
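The output above implies a feature/target split: 8 columns in, `Outcome` out. The accuracies reported in the sections that follow are also consistent with a 576/192 train/test split (for example 149/192 ≈ 0.776), which is what scikit-learn's default 25% test size gives for 768 rows. Here is a minimal sketch of that preparation step, assuming pandas and scikit-learn. The source does not show its loading code, so the random stand-in table, the `stratify` option, and `random_state=0` are all assumptions.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Column names copied from the data.columns output above.
columns = ['Pregnancies', 'Glucose', 'BloodPressure', 'SkinThickness', 'Insulin',
           'BMI', 'DiabetesPedigreeFunction', 'Age', 'Outcome']

# ASSUMPTION: random stand-in values with the same shape (768, 9).
# With the real file you would load it instead, e.g. via pd.read_csv(...).
rng = np.random.default_rng(0)
data = pd.DataFrame(rng.random((768, 9)), columns=columns)
data['Outcome'] = (data['Outcome'] > 0.65).astype(int)  # arbitrary 0/1 labels

# 8 feature columns in, 1 target column out (the "8→1" in the title).
data_column_X = [c for c in data.columns if c != 'Outcome']
X = data[data_column_X]
y = data['Outcome']

# Default test_size is 0.25, so 768 rows become 576 train / 192 test.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0)

print(len(data_column_X))
print('data_column_X:', data_column_X)
print(X_train.shape, X_test.shape)
```

Stratifying on `y` keeps the class ratio the same in both splits, which matters when the positive class is the minority.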
1. kNN
kNNC(n_neighbors=9):Training set accuracy: 0.792
kNNC(n_neighbors=9):Test set accuracy: 0.776
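The label `n_neighbors=9` suggests that several values of k were compared before reporting the best one. A minimal sketch of such a sweep, assuming scikit-learn's `KNeighborsClassifier`; the synthetic data from `make_classification` is only a stand-in for the diabetes table, and the range 1 to 10 is an assumption.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# ASSUMPTION: synthetic stand-in for the diabetes table (768 rows, 8 features).
X, y = make_classification(n_samples=768, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0)

# Score every k from 1 to 10 on both splits.
scores = {}
for k in range(1, 11):
    knn = KNeighborsClassifier(n_neighbors=k).fit(X_train, y_train)
    scores[k] = (knn.score(X_train, y_train), knn.score(X_test, y_test))

# Pick the k with the highest test accuracy (ties go to the smaller k).
best_k = max(scores, key=lambda k: scores[k][1])
train_acc, test_acc = scores[best_k]
print(f"kNNC(n_neighbors={best_k}):Training set accuracy: {train_acc:.3f}")
print(f"kNNC(n_neighbors={best_k}):Test set accuracy: {test_acc:.3f}")
```

Choosing k by test accuracy makes the reported test score slightly optimistic; a separate validation split or cross-validation is the safer choice.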
2. Logistic regression
LoR(c_regular=1):Training set accuracy: 0.785
LoR(c_regular=1):Test set accuracy: 0.771
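The `c_regular=1` in the label most likely refers to the inverse regularization strength, called `C` in scikit-learn's `LogisticRegression`. This is an assumption, since the source does not show the call. The sketch below compares three values of `C` on synthetic stand-in data; the other two values and `max_iter=1000` are illustrative choices.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# ASSUMPTION: synthetic stand-in for the diabetes table (768 rows, 8 features).
X, y = make_classification(n_samples=768, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0)

# Smaller C means stronger regularization; C=1 is the library default.
lor_scores = {}
for C in (0.01, 1, 100):
    lor = LogisticRegression(C=C, max_iter=1000).fit(X_train, y_train)
    lor_scores[C] = (lor.score(X_train, y_train), lor.score(X_test, y_test))
    print(f"LoR(c_regular={C}):Training set accuracy: {lor_scores[C][0]:.3f}")
    print(f"LoR(c_regular={C}):Test set accuracy: {lor_scores[C][1]:.3f}")
```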
3. SVM
SVMC_Init:Training set accuracy: 0.769
SVMC_Init:Test set accuracy: 0.755
SVMC_Best(max_dept=1,learning_rate=0.1):Training set accuracy: 0.788
SVMC_Best(max_dept=1,learning_rate=0.1):Test set accuracy: 0.781
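The `SVMC_Best` label carries `max_dept` and `learning_rate`, which are not SVM parameters and look carried over from the boosted-tree label, so the settings that were actually tuned are unknown. One common way to get from an initial SVM to a better one is to rescale the features and adjust `C`. The sketch below shows that route as an assumption, not as the author's method; the inflated first column imitates features on very different scales, such as `Insulin` next to `DiabetesPedigreeFunction`.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

# ASSUMPTION: synthetic stand-in for the diabetes table (768 rows, 8 features).
X, y = make_classification(n_samples=768, n_features=8, random_state=0)
X[:, 0] *= 1000  # ASSUMPTION: mimic one feature with a much larger scale
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0)

# Initial model: default RBF-kernel SVC on the raw features.
svm_init = SVC().fit(X_train, y_train)
init_acc = (svm_init.score(X_train, y_train), svm_init.score(X_test, y_test))

# Candidate improvement: scale every feature to [0, 1], then fit with C=10.
# The scaler is fitted on the training split only, via the pipeline.
svm_scaled = make_pipeline(MinMaxScaler(), SVC(C=10)).fit(X_train, y_train)
scaled_acc = (svm_scaled.score(X_train, y_train), svm_scaled.score(X_test, y_test))

print(f"SVMC_Init:Training set accuracy: {init_acc[0]:.3f}")
print(f"SVMC_Init:Test set accuracy: {init_acc[1]:.3f}")
print(f"SVMC_Scaled(C=10):Training set accuracy: {scaled_acc[0]:.3f}")
print(f"SVMC_Scaled(C=10):Test set accuracy: {scaled_acc[1]:.3f}")
```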
4. Decision tree
DTC(max_dept=3):Training set accuracy: 0.773
DTC(max_dept=3):Test set accuracy: 0.740
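Limiting the depth to 3 is a form of pre-pruning: the tree can ask at most three questions per sample, which trades training accuracy for less overfitting. A minimal sketch assuming scikit-learn's `DecisionTreeClassifier` on synthetic stand-in data; `random_state=0` and the feature names `f0` to `f7` are hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# ASSUMPTION: synthetic stand-in for the diabetes table (768 rows, 8 features).
X, y = make_classification(n_samples=768, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0)

# Pre-pruned tree: no path from root to leaf is longer than 3 splits.
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_train, y_train)
dt_train_acc = tree.score(X_train, y_train)
dt_test_acc = tree.score(X_test, y_test)
print(f"DTC(max_depth=3):Training set accuracy: {dt_train_acc:.3f}")
print(f"DTC(max_depth=3):Test set accuracy: {dt_test_acc:.3f}")

# Importances are non-negative and sum to 1 across the 8 features.
feature_names = [f"f{i}" for i in range(X.shape[1])]  # hypothetical names
for name, importance in zip(feature_names, tree.feature_importances_):
    print(f"{name}: {importance:.3f}")
```

On the real table, the same loop over `feature_importances_` with the column names from `data_column_X` shows which measurements the tree relies on.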
5. Random forest
RFC_Best:Training set accuracy: 0.764
RFC_Best:Test set accuracy: 0.750
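The `RFC_Best` label implies that some hyperparameter search was done, but the source does not say which parameters or values were tried. The sketch below shows one way to do it with cross-validated grid search, assuming scikit-learn's `GridSearchCV`; the grid itself is made up for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

# ASSUMPTION: synthetic stand-in for the diabetes table (768 rows, 8 features).
X, y = make_classification(n_samples=768, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0)

# ASSUMPTION: a small illustrative grid; the source does not list its own.
param_grid = {"n_estimators": [50, 100], "max_depth": [2, 3, 5]}

# 5-fold cross-validation on the training split only; the test split stays
# untouched until the final score.
search = GridSearchCV(RandomForestClassifier(random_state=0), param_grid, cv=5)
search.fit(X_train, y_train)

rf_train_acc = search.score(X_train, y_train)
rf_test_acc = search.score(X_test, y_test)
print("best params:", search.best_params_)
print(f"RFC_Best:Training set accuracy: {rf_train_acc:.3f}")
print(f"RFC_Best:Test set accuracy: {rf_test_acc:.3f}")
```

Because the selection happens inside cross-validation, the final test accuracy is not used to choose the model.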
6. Boosted trees
GBC(max_dept=1,learning_rate=0.1):Training set accuracy: 0.804
GBC(max_dept=1,learning_rate=0.1):Test set accuracy: 0.781
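With `max_depth=1` every tree in the ensemble is a single-split stump, and `learning_rate=0.1` shrinks each stump's contribution. A minimal sketch of that configuration next to the library's default depth, assuming scikit-learn's `GradientBoostingClassifier` on synthetic stand-in data; `random_state=0` is an assumption.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

# ASSUMPTION: synthetic stand-in for the diabetes table (768 rows, 8 features).
X, y = make_classification(n_samples=768, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0)

gbc_scores = {}
for depth in (1, 3):  # 1 matches the label above, 3 is the library default
    gbc = GradientBoostingClassifier(
        max_depth=depth, learning_rate=0.1, random_state=0)
    gbc.fit(X_train, y_train)
    gbc_scores[depth] = (gbc.score(X_train, y_train), gbc.score(X_test, y_test))
    label = f"GBC(max_depth={depth},learning_rate=0.1)"
    print(f"{label}:Training set accuracy: {gbc_scores[depth][0]:.3f}")
    print(f"{label}:Test set accuracy: {gbc_scores[depth][1]:.3f}")
```

Deeper trees usually fit the training split more closely, so comparing the two rows gives a quick read on how much of that gain carries over to the test split.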
7. Neural network
MLPC_Init:Training set accuracy: 0.743
MLPC_Init:Test set accuracy: 0.672
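The initial network has the lowest test accuracy of the seven models (0.672 against 0.743 on the training set). Only the initial run is reported, so the cause is not known; unscaled inputs are a frequent culprit for multilayer perceptrons. The sketch below puts an untouched `MLPClassifier` next to one that standardizes the features first. The scaled variant, `max_iter=1000`, `random_state=0`, and the inflated column are assumptions, not part of the source.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# ASSUMPTION: synthetic stand-in for the diabetes table (768 rows, 8 features).
X, y = make_classification(n_samples=768, n_features=8, random_state=0)
X[:, 0] *= 1000  # ASSUMPTION: mimic one feature with a much larger scale
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0)

# Initial model: default architecture on the raw features.
mlp_init = MLPClassifier(max_iter=1000, random_state=0).fit(X_train, y_train)
mlp_init_acc = (mlp_init.score(X_train, y_train), mlp_init.score(X_test, y_test))

# Variant: zero mean and unit variance per feature before the network.
mlp_scaled = make_pipeline(
    StandardScaler(), MLPClassifier(max_iter=1000, random_state=0))
mlp_scaled.fit(X_train, y_train)
mlp_scaled_acc = (mlp_scaled.score(X_train, y_train),
                  mlp_scaled.score(X_test, y_test))

print(f"MLPC_Init:Training set accuracy: {mlp_init_acc[0]:.3f}")
print(f"MLPC_Init:Test set accuracy: {mlp_init_acc[1]:.3f}")
print(f"MLPC_Scaled:Training set accuracy: {mlp_scaled_acc[0]:.3f}")
print(f"MLPC_Scaled:Test set accuracy: {mlp_scaled_acc[1]:.3f}")
```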