sklearn (3) Computing recall: using metrics.recall_score() to compute recall for binary classification

Post author:xfxia
Post published: September 10, 2023
Post category: Other

1. How recall is computed

Excerpted from the article "Common Evaluation Metrics":

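Recall measures coverage: how many of the actual positive examples are classified as positive. recall = TP/(TP+FN) = TP/P, i.e. the proportion of all actual positive samples that the model predicts as positive. Recall reflects the model's ability to identify positive samples: the higher the recall, the better the model identifies them.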
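As a quick check of the formula, the sketch below counts TP and FN by hand and compares the result with sklearn.metrics.recall_score. The y_true and y_pred values are made up for illustration and are not data from this post.

```python
from sklearn.metrics import recall_score

# Illustrative labels (assumed for this example only)
y_true = [1, 1, 1, 1, 0, 0, 0, 1]
y_pred = [1, 0, 1, 1, 0, 1, 0, 0]

# TP: actual positive and predicted positive
# FN: actual positive but predicted negative
tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

manual_recall = tp / (tp + fn)                  # TP / (TP + FN)
sklearn_recall = recall_score(y_true, y_pred)   # same quantity from sklearn

print('TP =', tp, 'FN =', fn)
print('manual recall :', manual_recall)
print('sklearn recall:', sklearn_recall)
```

Here 5 samples are actually positive and 3 of them are predicted as positive, so both values come out to 3/5 = 0.6. The false positive at index 5 does not affect recall.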

2. Using sklearn.metrics.recall_score() (binary classification)

Usage:

sklearn.metrics.recall_score(y_true, y_pred, *, labels=None, pos_label=1, average='binary', sample_weight=None, zero_division='warn')

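Input parameters:

y_true: the true labels.

y_pred: the predicted labels.

labels: optional, a list. Not needed for binary classification.

pos_label: str or int, default 1. The class treated as positive.

average: str, one of [None, 'binary' (default), 'micro', 'macro', 'samples', 'weighted']. The default 'binary' reports the recall of the positive class only, that is, the class given by pos_label (1 by default).

sample_weight: optional per-sample weights (not used here).

zero_division: the value returned when recall is undefined because there are no actual positive samples (TP + FN = 0). The default 'warn' returns 0 and raises a warning.

Output:

The recall of the positive class, as a float.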
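To make the parameters above concrete, here is a minimal sketch showing how pos_label, average and zero_division change the result. The label lists are made up for illustration; only the recall_score parameters come from the signature above.

```python
from sklearn.metrics import recall_score

# Illustrative labels (assumed for this example only)
y_true = [0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 1, 0, 1, 1, 1, 0]

# Default: average='binary', pos_label=1 -> recall of class 1 (3 of 4 found)
recall_pos = recall_score(y_true, y_pred)

# Treat class 0 as the positive class instead (2 of 3 found)
recall_neg = recall_score(y_true, y_pred, pos_label=0)

# average=None returns one recall per class, in sorted label order (0, then 1)
recall_per_class = recall_score(y_true, y_pred, average=None)

# No actual positives here, so TP + FN = 0 and recall is undefined;
# zero_division=0 returns 0 without emitting a warning
recall_undefined = recall_score([0, 0, 0], [0, 1, 0], zero_division=0)

print('pos_label=1 :', recall_pos)
print('pos_label=0 :', recall_neg)
print('per class   :', recall_per_class)
print('undefined   :', recall_undefined)
```

With these labels the recall is 0.75 for class 1 and 2/3 for class 0, and the last call returns 0 because no sample is actually positive.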

3. Example

3.1 Data format:

Store the data in the same directory as the script, in new_two.xlsx.

id	label	pred_label	pred_score	model_predict_scores
1537	0	0	0.98361117	[0.98361117 0.01638886]
1548	0	0	0.9303047	[0.9303047 0.06969527]
1540	0	0	0.978964	[0.978964 0.02103605]
15525	1	1	0.9876039	[0.01239602 0.9876039 ]
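The sketch below shows how rows in this format map onto the inputs of recall_score. It assumes the four sample rows above have been copied out as tab-separated text, which is an illustration only; the post's actual data is read from the .xlsx file.

```python
from sklearn.metrics import recall_score

# The four sample rows shown above, as tab-separated text (assumed format)
rows = """1537\t0\t0\t0.98361117\t[0.98361117 0.01638886]
1548\t0\t0\t0.9303047\t[0.9303047 0.06969527]
1540\t0\t0\t0.978964\t[0.978964 0.02103605]
15525\t1\t1\t0.9876039\t[0.01239602 0.9876039 ]"""

labels, preds, class1_probs = [], [], []
for line in rows.splitlines():
    _id, label, pred_label, pred_score, scores = line.split("\t")
    labels.append(int(label))       # column "label": true label
    preds.append(int(pred_label))   # column "pred_label": predicted label
    # "[p0 p1]" -> p0, p1; split() with no argument tolerates extra spaces
    p0, p1 = (float(x) for x in scores.strip("[]").split())
    class1_probs.append(p1)         # probability of class 1

recall = recall_score(labels, preds)
print('labels:', labels)
print('preds :', preds)
print('recall:', recall)
```

Only the label and pred_label columns are needed for recall. In these four rows the single positive sample is predicted as positive, so the recall is 1.0.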

# -*- encoding:utf-8 -*-
import xlrd  # note: xlrd >= 2.0 only reads .xls files; reading .xlsx needs xlrd < 2.0
from sklearn import metrics


def calculate_recall(read_path):
    workbook = xlrd.open_workbook(read_path)  # open the workbook
    sheets = workbook.sheet_names()  # names of all sheets in the workbook
    worksheet = workbook.sheet_by_name(sheets[0])  # take the first sheet
    label = []  # true labels
    pred = []  # predicted labels
    score = []  # model score for the predicted label
    first = []  # probability of class 0, as an n x 1 array
    preds = []  # probability of class 1, as an n x 1 array
    # Reads the first 100 rows; if the sheet has a header row, start from row 1 instead.
    for i in range(0, 100):
        label.append(int(worksheet.cell_value(i, 1)))
        pred.append(int(worksheet.cell_value(i, 2)))
        score.append(float(worksheet.cell_value(i, 3)))
        # Column 4 holds a string such as "[0.98361117 0.01638886]"
        d = worksheet.cell_value(i, 4).replace('[', '').replace(']', '').split()
        first.append([float(d[0])])
        preds.append([float(d[-1])])
    # Recall only needs the true and predicted labels
    recall_score = metrics.recall_score(label, pred)
    print('--recall_score:', recall_score)


if __name__ == '__main__':
    read_path = './new_two.xlsx'
    calculate_recall(read_path)

Output:

--recall_score: 0.8775510204081632

References:

1. Official documentation:

https://scikit-learn.org/stable/modules/generated/sklearn.metrics.recall_score.html?highlight=recall_score

Original article: https://blog.csdn.net/pearl8899/article/details/109873555
