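I recently used the COCO 2017 dataset for object detection and took the opportunity to tabulate its contents.
COCO comes with a dedicated Python API (pycocotools) that makes it easy to read the annotations and image data; for details see
https://github.com/cocodataset/cocoapi
The goal here is to count the dataset's categories, which shows whether there is enough training data for each class and whether the class distribution is balanced.
The following code counts the categories, the number of images, and the number of annotation boxes per category: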
from pycocotools.coco import COCO

dataDir = './COCO'
dataType = 'val2017'
# dataType = 'train2017'
annFile = '{}/annotations/instances_{}.json'.format(dataDir, dataType)

# initialize COCO api for instance annotations
coco = COCO(annFile)

# display COCO category names
cats = coco.loadCats(coco.getCatIds())
cat_nms = [cat['name'] for cat in cats]
print('number of categories: ', len(cat_nms))
print('COCO categories: \n', cat_nms)

# count the images and annotation boxes for each category
for cat_name in cat_nms:
    # Pass the name inside a list: a bare string is matched by substring,
    # so 'carrot' would also select 'car'.
    catId = coco.getCatIds(catNms=[cat_name])  # category ids fall in 1~90
    imgId = coco.getImgIds(catIds=catId)       # ids of images containing the category
    annId = coco.getAnnIds(catIds=catId)       # ids of the category's annotation boxes
    print("{:<15} {:<6d} {:<10d}".format(cat_name, len(imgId), len(annId)))
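As a cross-check that does not depend on pycocotools, the same per-category counts can be computed from the annotation dictionary itself. The following is a minimal sketch, assuming the standard COCO layout with `categories` and `annotations` lists and the `image_id` / `category_id` fields; the function name `count_per_category` and the toy data are hypothetical and only serve the illustration.

```python
from collections import Counter, defaultdict


def count_per_category(dataset):
    """Return {category name: (image count, box count)} for a COCO-style dict."""
    id_to_name = {cat["id"]: cat["name"] for cat in dataset["categories"]}
    box_counts = Counter()              # category id -> number of boxes
    images_by_cat = defaultdict(set)    # category id -> set of image ids
    for ann in dataset["annotations"]:
        cat_id = ann["category_id"]
        box_counts[cat_id] += 1
        images_by_cat[cat_id].add(ann["image_id"])
    return {
        name: (len(images_by_cat[cat_id]), box_counts[cat_id])
        for cat_id, name in id_to_name.items()
    }


# Tiny hand-made dataset (hypothetical values, not real COCO data).
toy = {
    "categories": [{"id": 3, "name": "car"}, {"id": 57, "name": "carrot"}],
    "annotations": [
        {"image_id": 1, "category_id": 3},
        {"image_id": 1, "category_id": 3},
        {"image_id": 2, "category_id": 57},
        {"image_id": 1, "category_id": 57},
    ],
}
stats = count_per_category(toy)
for name, (n_images, n_boxes) in stats.items():
    print("{:<15} {:<6d} {:<10d}".format(name, n_images, n_boxes))
```

Because the counts are keyed by category id rather than by name, similar names such as car and carrot cannot be mixed up. For the real data, the dictionary would come from loading the instances_*.json file with `json.load`.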
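The figures below were recorded with each category name passed to getCatIds as a bare string, which pycocotools matches by substring, so three rows are wrong: carrot also selects car, hot dog also selects dog, and teddy bear also selects bear. In those rows the image count is the number of images that contain both categories, and the box count is the sum over both. Subtracting gives 2303 - 1932 = 371 boxes for carrot, 345 - 218 = 127 for hot dog, and 262 - 71 = 191 for teddy bear; the image counts for these three require a re-run with the name wrapped in a list.
Validation set (val2017) output:
Category | Images | Annotation boxes |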
person | 2693 | 11004 |
bicycle | 149 | 316 |
car | 535 | 1932 |
motorcycle | 159 | 371 |
airplane | 97 | 143 |
bus | 189 | 285 |
train | 157 | 190 |
truck | 250 | 415 |
boat | 121 | 430 |
traffic light | 191 | 637 |
fire hydrant | 86 | 101 |
stop sign | 69 | 75 |
parking meter | 37 | 60 |
bench | 235 | 413 |
bird | 125 | 440 |
cat | 184 | 202 |
dog | 177 | 218 |
horse | 128 | 273 |
sheep | 65 | 361 |
cow | 87 | 380 |
elephant | 89 | 255 |
bear | 49 | 71 |
zebra | 85 | 268 |
giraffe | 101 | 232 |
backpack | 228 | 371 |
umbrella | 174 | 413 |
handbag | 292 | 540 |
tie | 145 | 254 |
suitcase | 105 | 303 |
frisbee | 84 | 115 |
skis | 120 | 241 |
snowboard | 49 | 69 |
sports ball | 169 | 263 |
kite | 91 | 336 |
baseball bat | 97 | 146 |
baseball glove | 100 | 148 |
skateboard | 127 | 179 |
surfboard | 149 | 269 |
tennis racket | 167 | 225 |
bottle | 379 | 1025 |
wine glass | 110 | 343 |
cup | 390 | 899 |
fork | 155 | 215 |
knife | 181 | 326 |
spoon | 153 | 253 |
bowl | 314 | 626 |
banana | 103 | 379 |
apple | 76 | 239 |
sandwich | 98 | 177 |
orange | 85 | 287 |
broccoli | 71 | 316 |
carrot | 3 | 2303 |
hot dog | 0 | 345 |
pizza | 153 | 285 |
donut | 62 | 338 |
cake | 124 | 316 |
chair | 580 | 1791 |
couch | 195 | 261 |
potted plant | 172 | 343 |
bed | 149 | 163 |
dining table | 501 | 697 |
toilet | 149 | 179 |
tv | 207 | 288 |
laptop | 183 | 231 |
mouse | 88 | 106 |
remote | 145 | 283 |
keyboard | 106 | 153 |
cell phone | 214 | 262 |
microwave | 54 | 55 |
oven | 115 | 143 |
toaster | 8 | 9 |
sink | 187 | 225 |
refrigerator | 101 | 126 |
book | 230 | 1161 |
clock | 204 | 267 |
vase | 137 | 277 |
scissors | 28 | 36 |
teddy bear | 0 | 262 |
hair drier | 9 | 11 |
toothbrush | 34 | 57 |
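Training set (train2017) output. The carrot, hot dog, and teddy bear rows were recorded with the category name passed as a bare string, which also matches car, dog, and bear by substring, so their image counts cover only images containing both categories and their box counts are sums over both. Subtracting gives 51719 - 43867 = 7852 boxes for carrot, 8426 - 5508 = 2918 for hot dog, and 6087 - 1294 = 4793 for teddy bear.
Category | Images | Annotation boxes |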
person | 64115 | 262465 |
bicycle | 3252 | 7113 |
car | 12251 | 43867 |
motorcycle | 3502 | 8725 |
airplane | 2986 | 5135 |
bus | 3952 | 6069 |
train | 3588 | 4571 |
truck | 6127 | 9973 |
boat | 3025 | 10759 |
traffic light | 4139 | 12884 |
fire hydrant | 1711 | 1865 |
stop sign | 1734 | 1983 |
parking meter | 705 | 1285 |
bench | 5570 | 9838 |
bird | 3237 | 10806 |
cat | 4114 | 4768 |
dog | 4385 | 5508 |
horse | 2941 | 6587 |
sheep | 1529 | 9509 |
cow | 1968 | 8147 |
elephant | 2143 | 5513 |
bear | 960 | 1294 |
zebra | 1916 | 5303 |
giraffe | 2546 | 5131 |
backpack | 5528 | 8720 |
umbrella | 3968 | 11431 |
handbag | 6841 | 12354 |
tie | 3810 | 6496 |
suitcase | 2402 | 6192 |
frisbee | 2184 | 2682 |
skis | 3082 | 6646 |
snowboard | 1654 | 2685 |
sports ball | 4262 | 6347 |
kite | 2261 | 9076 |
baseball bat | 2506 | 3276 |
baseball glove | 2629 | 3747 |
skateboard | 3476 | 5543 |
surfboard | 3486 | 6126 |
tennis racket | 3394 | 4812 |
bottle | 8501 | 24342 |
wine glass | 2533 | 7913 |
cup | 9189 | 20650 |
fork | 3555 | 5479 |
knife | 4326 | 7770 |
spoon | 3529 | 6165 |
bowl | 7111 | 14358 |
banana | 2243 | 9458 |
apple | 1586 | 5851 |
sandwich | 2365 | 4373 |
orange | 1699 | 6399 |
broccoli | 1939 | 7308 |
carrot | 24 | 51719 |
hot dog | 11 | 8426 |
pizza | 3166 | 5821 |
donut | 1523 | 7179 |
cake | 2925 | 6353 |
chair | 12774 | 38491 |
couch | 4423 | 5779 |
potted plant | 4452 | 8652 |
bed | 3682 | 4192 |
dining table | 11837 | 15714 |
toilet | 3353 | 4157 |
tv | 4561 | 5805 |
laptop | 3524 | 4970 |
mouse | 1876 | 2262 |
remote | 3076 | 5703 |
keyboard | 2115 | 2855 |
cell phone | 4803 | 6434 |
microwave | 1547 | 1673 |
oven | 2877 | 3334 |
toaster | 217 | 225 |
sink | 4678 | 5610 |
refrigerator | 2360 | 2637 |
book | 5332 | 24715 |
clock | 4659 | 6334 |
vase | 3593 | 6613 |
scissors | 947 | 1481 |
teddy bear | 16 | 6087 |
hair drier | 189 | 198 |
toothbrush | 1007 | 1954 |
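To put a number on how uneven the class distribution is, one option is to compare the most and least frequent classes and list the classes that fall under some minimum box count. The sketch below uses a handful of box counts copied from the train2017 table above; the helper names `imbalance_ratio` and `rare_classes` and the threshold of 1000 boxes are assumptions chosen for illustration, not part of any COCO tooling.

```python
# Box counts copied from the train2017 table above (a small subset of rows).
train_boxes = {
    "person": 262465,
    "car": 43867,
    "chair": 38491,
    "scissors": 1481,
    "toaster": 225,
    "hair drier": 198,
}


def imbalance_ratio(counts):
    """Ratio between the most frequent and the least frequent class."""
    return max(counts.values()) / min(counts.values())


def rare_classes(counts, threshold):
    """Names of classes with fewer than `threshold` boxes, rarest first."""
    rare = [name for name, n in counts.items() if n < threshold]
    return sorted(rare, key=counts.get)


print("max/min ratio: {:.1f}".format(imbalance_ratio(train_boxes)))
print("classes under 1000 boxes:", rare_classes(train_boxes, 1000))
```

In this subset, person has over a thousand times as many boxes as hair drier, which is the kind of gap that may call for resampling or class-weighted losses during training.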
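Copyright notice: this is an original article by u012505617, released under the CC 4.0 BY-SA license. When reposting, please include a link to the original source and this notice.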