COCO2017 Dataset: Per-Category Statistics


I recently used the COCO2017 dataset for object detection, so I took the opportunity to put together some statistics on it.

The COCO dataset comes with a dedicated Python API (pycocotools) that makes it easy to read the image data and annotations. For details, see

https://github.com/cocodataset/cocoapi
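For orientation, the annotation file that the API loads is a JSON document with `images`, `annotations`, and `categories` lists. The sketch below is only an illustration: it uses a tiny made-up dataset in that layout instead of the real file, and shows how per-category image and box counts can be derived from those fields directly. The helper `count_per_category` and the sample records are hypothetical and not part of the COCO API.

```python
from collections import defaultdict

# Tiny hand-made stand-in for an instances_*.json file (made-up records).
dataset = {
    "images": [{"id": 1}, {"id": 2}],
    "categories": [
        {"id": 3, "name": "car"},
        {"id": 57, "name": "carrot"},
    ],
    "annotations": [
        {"id": 10, "image_id": 1, "category_id": 3, "bbox": [0, 0, 10, 10]},
        {"id": 11, "image_id": 1, "category_id": 3, "bbox": [5, 5, 10, 10]},
        {"id": 12, "image_id": 2, "category_id": 57, "bbox": [1, 1, 4, 4]},
    ],
}


def count_per_category(dataset):
    """Return {category name: (number of images, number of boxes)}."""
    images = defaultdict(set)  # category_id -> ids of images containing it
    boxes = defaultdict(int)   # category_id -> number of annotations
    for ann in dataset["annotations"]:
        images[ann["category_id"]].add(ann["image_id"])
        boxes[ann["category_id"]] += 1
    return {
        cat["name"]: (len(images[cat["id"]]), boxes[cat["id"]])
        for cat in dataset["categories"]
    }


stats = count_per_category(dataset)
for name, (n_img, n_box) in stats.items():
    print("{:<15} {:<6d}     {:<10d}".format(name, n_img, n_box))
```

With the real data, `dataset` would presumably come from `json.load` on the annotation file. Counting by `category_id` avoids matching categories by name altogether.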

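Here the focus is on per-category statistics, which show whether there is enough training data for each class and whether the class distribution is balanced.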

The following code counts the categories and, for each one, the number of images and the number of annotated boxes:

from pycocotools.coco import COCO

dataDir = './COCO'
dataType = 'val2017'
# dataType = 'train2017'
annFile = '{}/annotations/instances_{}.json'.format(dataDir, dataType)

# Initialize the COCO API for instance annotations
coco = COCO(annFile)

# Display the COCO category names
cats = coco.loadCats(coco.getCatIds())
cat_nms = [cat['name'] for cat in cats]
print('number of categories: ', len(cat_nms))
print('COCO categories: \n', cat_nms)

# Count the images and annotated boxes for each category
for cat_name in cat_nms:
    # Pass the name inside a list: a bare string is matched by substring,
    # so 'carrot' would also select 'car'
    catId = coco.getCatIds(catNms=[cat_name])   # category ids range from 1 to 90
    imgId = coco.getImgIds(catIds=catId)        # ids of images containing the category
    annId = coco.getAnnIds(catIds=catId)        # ids of the category's annotations

    print("{:<15} {:<6d}     {:<10d}".format(cat_name, len(imgId), len(annId)))
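A note on the `catNms` argument of `getCatIds`: as far as I can tell, pycocotools treats any object that has `__iter__` and `__len__` as a list of names, and a Python string has both. A bare string such as 'carrot' is then tested with `name in 'carrot'`, which is a substring check, so 'car' matches as well. The sketch below imitates that filter in plain Python. `is_array_like` and `match_names` are simplified stand-ins written for this illustration, not the library's code, so it is worth checking against the version you have installed.

```python
def is_array_like(obj):
    # Assumed to mirror the library's check; note that str passes it too.
    return hasattr(obj, "__iter__") and hasattr(obj, "__len__")


def match_names(all_names, cat_nms):
    """Simplified stand-in for the name filter (not the library code)."""
    cat_nms = cat_nms if is_array_like(cat_nms) else [cat_nms]
    # For a string, `name in cat_nms` is a substring test;
    # for a list, it is an exact membership test.
    return [name for name in all_names if name in cat_nms]


names = ["car", "carrot", "dog", "hot dog", "bear", "teddy bear"]

by_string = match_names(names, "carrot")    # bare string
by_list = match_names(names, ["carrot"])    # name wrapped in a list

print(by_string)
print(by_list)
```

Wrapping the name in a list, as in `catNms=[cat_name]`, keeps the match exact. Among the 80 COCO names, the pairs that collide this way are car/carrot, dog/hot dog, and bear/teddy bear.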

Validation set (val2017) output:

Rows marked * (carrot, hot dog, teddy bear) are not per-class counts. They were recorded with the bare name string passed as catNms to getCatIds, which pycocotools matches by substring, so 'carrot' also selected 'car', 'hot dog' also selected 'dog', and 'teddy bear' also selected 'bear'. In those rows the image count is the number of images that contain both classes, and the box count is the sum over both classes (for carrot, 2303 includes the 1932 car boxes, which implies 371 boxes for carrot alone).

Category        Images     Boxes
person          2693       11004
bicycle         149        316
car             535        1932
motorcycle      159        371
airplane        97         143
bus             189        285
train           157        190
truck           250        415
boat            121        430
traffic light   191        637
fire hydrant    86         101
stop sign       69         75
parking meter   37         60
bench           235        413
bird            125        440
cat             184        202
dog             177        218
horse           128        273
sheep           65         361
cow             87         380
elephant        89         255
bear            49         71
zebra           85         268
giraffe         101        232
backpack        228        371
umbrella        174        413
handbag         292        540
tie             145        254
suitcase        105        303
frisbee         84         115
skis            120        241
snowboard       49         69
sports ball     169        263
kite            91         336
baseball bat    97         146
baseball glove  100        148
skateboard      127        179
surfboard       149        269
tennis racket   167        225
bottle          379        1025
wine glass      110        343
cup             390        899
fork            155        215
knife           181        326
spoon           153        253
bowl            314        626
banana          103        379
apple           76         239
sandwich        98         177
orange          85         287
broccoli        71         316
carrot          3          2303 *
hot dog         0          345 *
pizza           153        285
donut           62         338
cake            124        316
chair           580        1791
couch           195        261
potted plant    172        343
bed             149        163
dining table    501        697
toilet          149        179
tv              207        288
laptop          183        231
mouse           88         106
remote          145        283
keyboard        106        153
cell phone      214        262
microwave       54         55
oven            115        143
toaster         8          9
sink            187        225
refrigerator    101        126
book            230        1161
clock           204        267
vase            137        277
scissors        28         36
teddy bear      0          262 *
hair drier      9          11
toothbrush      34         57

Training set (train2017) output:

Rows marked * (carrot, hot dog, teddy bear) are not per-class counts. They were recorded with the bare name string passed as catNms to getCatIds, which pycocotools matches by substring, so 'carrot' also selected 'car', 'hot dog' also selected 'dog', and 'teddy bear' also selected 'bear'. In those rows the image count is the number of images that contain both classes, and the box count is the sum over both classes (for carrot, 51719 includes the 43867 car boxes, which implies 7852 boxes for carrot alone).

Category        Images     Boxes
person          64115      262465
bicycle         3252       7113
car             12251      43867
motorcycle      3502       8725
airplane        2986       5135
bus             3952       6069
train           3588       4571
truck           6127       9973
boat            3025       10759
traffic light   4139       12884
fire hydrant    1711       1865
stop sign       1734       1983
parking meter   705        1285
bench           5570       9838
bird            3237       10806
cat             4114       4768
dog             4385       5508
horse           2941       6587
sheep           1529       9509
cow             1968       8147
elephant        2143       5513
bear            960        1294
zebra           1916       5303
giraffe         2546       5131
backpack        5528       8720
umbrella        3968       11431
handbag         6841       12354
tie             3810       6496
suitcase        2402       6192
frisbee         2184       2682
skis            3082       6646
snowboard       1654       2685
sports ball     4262       6347
kite            2261       9076
baseball bat    2506       3276
baseball glove  2629       3747
skateboard      3476       5543
surfboard       3486       6126
tennis racket   3394       4812
bottle          8501       24342
wine glass      2533       7913
cup             9189       20650
fork            3555       5479
knife           4326       7770
spoon           3529       6165
bowl            7111       14358
banana          2243       9458
apple           1586       5851
sandwich        2365       4373
orange          1699       6399
broccoli        1939       7308
carrot          24         51719 *
hot dog         11         8426 *
pizza           3166       5821
donut           1523       7179
cake            2925       6353
chair           12774      38491
couch           4423       5779
potted plant    4452       8652
bed             3682       4192
dining table    11837      15714
toilet          3353       4157
tv              4561       5805
laptop          3524       4970
mouse           1876       2262
remote          3076       5703
keyboard        2115       2855
cell phone      4803       6434
microwave       1547       1673
oven            2877       3334
toaster         217        225
sink            4678       5610
refrigerator    2360       2637
book            5332       24715
clock           4659       6334
vase            3593       6613
scissors        947        1481
teddy bear      16         6087 *
hair drier      189        198
toothbrush      1007       1954
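To turn counts like these into a quick balance check, the rows can be parsed back into numbers and compared. The following is a rough sketch under my own assumptions: `parse_rows` is a hypothetical helper, the four sample rows are copied from the training-set figures above, and the ratio between the largest and smallest box counts is just one simple indicator of imbalance, not an established metric.

```python
def parse_rows(text):
    """Parse 'name images boxes' rows; category names may contain spaces."""
    rows = {}
    for line in text.strip().splitlines():
        # Split from the right so that names like 'traffic light' stay whole.
        name, n_img, n_box = line.strip().rsplit(None, 2)
        rows[name] = (int(n_img), int(n_box))
    return rows


# A few rows taken from the training-set output.
sample = """
person 64115 262465
toaster 217 225
hair drier 189 198
traffic light 4139 12884
"""

rows = parse_rows(sample)

# Categories with the most and the fewest annotated boxes.
most = max(rows, key=lambda n: rows[n][1])
least = min(rows, key=lambda n: rows[n][1])
ratio = rows[most][1] / rows[least][1]
print("most boxes: ", most, rows[most][1])
print("fewest boxes:", least, rows[least][1])
print("max/min box ratio: {:.1f}".format(ratio))

# Average number of boxes per image that contains the category.
for name, (n_img, n_box) in rows.items():
    print("{:<15} {:.2f} boxes per image".format(name, n_box / n_img))
```

On these four rows, person has over a thousand times as many boxes as hair drier, which is the kind of gap that may call for resampling or class weighting during training.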



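Copyright notice: This is an original article by u012505617, licensed under CC 4.0 BY-SA. When reposting, please include a link to the original source and this notice.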