目标检测 YoloV5

2021-08-28 Introduction PV:

环境配置

GPU环境配置

查看当前环境列表
conda env list

创建yolov5 python环境
conda create -n yolov5gpu python=3.7

查看下载源地址
conda config –show-sources

添加镜像源地址
conda config –add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
conda config –add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main

########### 设置搜索时显示通道地址
conda config –set show_channel_urls yes

切换默认源地址
conda config –remove-key channels

链接数据库

Python链接hive

JDBC链接hive数据库

Jar包下载地址
https://mvnrepository.com/
https://mvnrepository.com/artifact/org.slf4j/slf4j-simple/1.7.25

Jdk安装包

TypeError: Class org.apache.hive.jdbc.HiveDriver is not found
加上jdbc配置文件

下载图片到本地

Mmdetection安装

1.10.1+cu113

mmcv-full==1.3.9

https://github.com/yaochenglouis/mmdetection/blob/yolov4zhang/docs/en/get_started.md

mmlab安装版本选择

安装语句
pip install mmcv-full==1.3.9 -f https://download.openmmlab.com/mmcv/dist/cu113/torch1.10.0/index.html

参数调整与优化

模型配置文件

yolov5s模型配置文件如下

# YOLOv5 🚀 by Ultralytics, GPL-3.0 license
# Parameters
nc: 80  # number of classes
depth_multiple: 0.33  # model depth multiple
width_multiple: 0.50  # layer channel multiple
anchors:
  - [10,13, 16,30, 33,23]  # P3/8
  - [30,61, 62,45, 59,119]  # P4/16
  - [116,90, 156,198, 373,326]  # P5/32

# YOLOv5 v6.0 backbone
# backbone:
  # [from, number, module, args]
  [[-1, 1, Conv, [64, 6, 2, 2]],  # 0-P1/2
   [-1, 1, Conv, [128, 3, 2]],  # 1-P2/4
   [-1, 3, C3, [128]],
   [-1, 1, Conv, [256, 3, 2]],  # 3-P3/8
   [-1, 6, C3, [256]],
   [-1, 1, Conv, [512, 3, 2]],  # 5-P4/16
   [-1, 9, C3, [512]],
   [-1, 1, Conv, [1024, 3, 2]],  # 7-P5/32
   [-1, 3, C3, [1024]],
   [-1, 1, SPPF, [1024, 5]],  # 9
  ]

# YOLOv5 v6.0 head
# head:
  [[-1, 1, Conv, [512, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 6], 1, Concat, [1]],  # cat backbone P4
   [-1, 3, C3, [512, False]],  # 13

   [-1, 1, Conv, [256, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 4], 1, Concat, [1]],  # cat backbone P3
   [-1, 3, C3, [256, False]],  # 17 (P3/8-small)

   [-1, 1, Conv, [256, 3, 2]],
   [[-1, 14], 1, Concat, [1]],  # cat head P4
   [-1, 3, C3, [512, False]],  # 20 (P4/16-medium)

   [-1, 1, Conv, [512, 3, 2]],
   [[-1, 10], 1, Concat, [1]],  # cat head P5
   [-1, 3, C3, [1024, False]],  # 23 (P5/32-large)

   [[17, 20, 23], 1, Detect, [nc, anchors]],  # Detect(P3, P4, P5)
  ]

from：输入来自那一层，-1代表上一次，1代表第1层，3代表第3层
number：模块的数量，最终数量需要乘width，然后四舍五入取整，如果小于1，取1。
module：子模块
args：模块参数，channel，kernel_size，stride，padding，bias等
Focus：对特征图进行切片操作，[64,3]得到[3,32,3]，即输入channel=3（RGB），输出为640.5(width_multiple)=32，3为卷积核尺寸。
Conv：nn.conv(kenel_size=1，stride=1，groups=1，bias=False) + Bn + Leaky_ReLu。[-1, 1, Conv, [128, 3, 2]]：输入来自上一层，模块数量为1个，子模块为Conv，网络中最终有1280.5=32个卷积核，卷积核尺寸为3，stride=2,。
BottleNeckCSP：借鉴CSPNet网络结构，由3个卷积层和X个残差模块Concat组成，若有False，则没有残差模块，那么组成结构为nn.conv+Bn+Leaky_ReLu
SPP：[-1, 1, SPP, [1024, [5, 9, 13]]]表示5×5，9×9，13×13的最大池化方式，进行多尺度融合

超参文件

yolov5/data/hyps/hyp.scratch-low.yaml ：YOLOv5 COCO训练从头优化，数据增强低
yolov5/data/hyps/hyp.scratch-mdeia.yaml（数据增强中）
yolov5/data/hyps/hyp.scratch-high.yaml（数据增强高）

lr0: 0.01  # 初始学习率 (SGD=1E-2, Adam=1E-3)
lrf: 0.2  # 循环学习率 (lr0 * lrf)
momentum: 0.937  # SGD momentum/Adam beta1 学习率动量
weight_decay: 0.0005  # 权重衰减系数
warmup_epochs: 3.0  # 预热学习 (fractions ok)
warmup_momentum: 0.8  # 预热学习动量
warmup_bias_lr: 0.1  # 预热初始学习率
box: 0.05  # iou损失系数
cls: 0.5  # cls损失系数
cls_pw: 1.0  # cls BCELoss正样本权重
obj: 1.0  # 有无物体系数(scale with pixels)
obj_pw: 1.0  # 有无物体BCELoss正样本权重
iou_t: 0.20  # IoU训练时的阈值
anchor_t: 4.0  # anchor的长宽比（长:宽 = 4:1）
# anchors: 3  # 每个输出层的anchors数量(0 to ignore)

#以下系数是数据增强系数，包括颜色空间和图片空间

fl_gamma: 0.0  # focal loss gamma (efficientDet default gamma=1.5)
hsv_h: 0.015  # 色调 (fraction)
hsv_s: 0.7  # 饱和度 (fraction)
hsv_v: 0.4  # 亮度 (fraction)
degrees: 0.0  # 旋转角度 (+/- deg)
translate: 0.1  # 平移(+/- fraction)
scale: 0.5  # 图像缩放 (+/- gain)
shear: 0.0  # 图像剪切 (+/- deg)
perspective: 0.0  # 透明度 (+/- fraction), range 0-0.001
flipud: 0.0  # 上下翻转概率 (probability)
fliplr: 0.5  # 左右翻转概率 (probability)
mosaic: 1.0  # 进行Mosaic概率 (probability)
mixup: 0.0  # 进行图像混叠概率（即，多张图像重叠在一起） (probability)

代码中参数

train.py参数解析

parser = argparse.ArgumentParser()
    parser.add_argument('--weights', type=str, default=ROOT / 'yolov5s.pt', help='initial weights path')
    parser.add_argument('--cfg', type=str, default='', help='model.yaml path')
    parser.add_argument('--data', type=str, default=ROOT / 'data/coco128.yaml', help='dataset.yaml path')
    parser.add_argument('--hyp', type=str, default=ROOT / 'data/hyps/hyp.scratch-low.yaml', help='hyperparameters path')
    parser.add_argument('--epochs', type=int, default=300, help='total training epochs')
    parser.add_argument('--batch-size', type=int, default=16, help='total batch size for all GPUs, -1 for autobatch')
    parser.add_argument('--imgsz', '--img', '--img-size', type=int, default=640, help='train, val image size (pixels)')
    parser.add_argument('--rect', action='store_true', help='rectangular training')
    parser.add_argument('--resume', nargs='?', const=True, default=False, help='resume most recent training')
    parser.add_argument('--nosave', action='store_true', help='only save final checkpoint')
    parser.add_argument('--noval', action='store_true', help='only validate final epoch')
    parser.add_argument('--noautoanchor', action='store_true', help='disable AutoAnchor')
    parser.add_argument('--noplots', action='store_true', help='save no plot files')
    parser.add_argument('--evolve', type=int, nargs='?', const=300, help='evolve hyperparameters for x generations')
    parser.add_argument('--bucket', type=str, default='', help='gsutil bucket')
    parser.add_argument('--cache', type=str, nargs='?', const='ram', help='--cache images in "ram" (default) or "disk"')
    parser.add_argument('--image-weights', action='store_true', help='use weighted image selection for training')
    parser.add_argument('--device', default='', help='cuda device, i.e. 0 or 0,1,2,3 or cpu')
    parser.add_argument('--multi-scale', action='store_true', help='vary img-size +/- 50%%')
    parser.add_argument('--single-cls', action='store_true', help='train multi-class data as single-class')
    parser.add_argument('--optimizer', type=str, choices=['SGD', 'Adam', 'AdamW'], default='SGD', help='optimizer')
    parser.add_argument('--sync-bn', action='store_true', help='use SyncBatchNorm, only available in DDP mode')
    parser.add_argument('--workers', type=int, default=8, help='max dataloader workers (per RANK in DDP mode)')
    parser.add_argument('--project', default=ROOT / 'runs/train', help='save to project/name')
    parser.add_argument('--name', default='exp', help='save to project/name')
    parser.add_argument('--exist-ok', action='store_true', help='existing project/name ok, do not increment')
    parser.add_argument('--quad', action='store_true', help='quad dataloader')
    parser.add_argument('--cos-lr', action='store_true', help='cosine LR scheduler')
    parser.add_argument('--label-smoothing', type=float, default=0.0, help='Label smoothing epsilon')
    parser.add_argument('--patience', type=int, default=100, help='EarlyStopping patience (epochs without improvement)')
    parser.add_argument('--freeze', nargs='+', type=int, default=[0], help='Freeze layers: backbone=10, first3=0 1 2')
    parser.add_argument('--save-period', type=int, default=-1, help='Save checkpoint every x epochs (disabled if < 1)')
    parser.add_argument('--seed', type=int, default=0, help='Global training seed')
    parser.add_argument('--local_rank', type=int, default=-1, help='Automatic DDP Multi-GPU argument, do not modify')

    # Logger arguments
    parser.add_argument('--entity', default=None, help='Entity')
    parser.add_argument('--upload_dataset', nargs='?', const=True, default=False, help='Upload data, "val" option')
    parser.add_argument('--bbox_interval', type=int, default=-1, help='Set bounding-box image logging interval')
    parser.add_argument('--artifact_alias', type=str, default='latest', help='Version of dataset artifact to use')

weights：指定预训练权重路径；如果这里设置为空的话，就是自己从头开始进行训练
cfg：模型配置文件，比如models/yolov5s.yaml。
data：数据集对应的yaml参数文件；里面主要存放数据集的类别和路径信息，例如：

yaml:
names:
- crazing
- inclusion
- pitted_surface
- scratches
- patches
- rolled-in_scale
nc: 6
path: /kaggle/working/
train: /kaggle/working/train.txt
val: /kaggle/working/val.txt

rect:是否使用矩阵推理的方式训练模型
resume：断点续训：即是否在之前训练的一个模型基础上继续训练，default 值默认是 false。一种方式是先将train.py中这一行default=False 改为 default=True：
1
parser.add_argument('--resume', nargs='?', const=True, default=True, help='resume most recent training')

然后执行代码：

1	python train.py --resume \runs\train\exp\weights\last.pt

或者参考其他写法

nosave：是否只保存最后一轮的pt文件；我们默认是保存best.pt和last.pt两个的。
noval：是否只在最后一轮测试
正常情况下每个epoch都会计算mAP，但如果开启了这个参数，那么就只在最后一轮上进行测试，不建议开启。
noautoanchor：是否禁用自动锚框；默认是开启的。
noplots：开启这个参数后将不保存绘图文件
evolve：yolov5使用遗传超参数进化，提供的默认参数是通过在COCO数据集上使用超参数进化得来的（也就是hpy文件夹下默认的超参数）。由于超参数进化会耗费大量的资源和时间，所以建议大家不要动这个参数。（开了貌似不评估mertic了）
bucket：谷歌云盘；通过这个参数可以下载谷歌云盘上的一些东西，但是现在没必要使用了
cache：是否提前缓存图片到内存，以加快训练速度，默认False；开启这个参数就会对图片进行缓存，从而更好的训练模型
image-weights：是否启用加权图像策略，默认是不开启的
主要是为了解决样本不平衡问题。开启后会对于上一轮训练效果不好的图片，在下一轮中增加一些权重
multi-scale：是否启用多尺度训练，默认是不开启的
多尺度训练是指设置几种不同的图片输入尺度，训练时每隔一定iterations随机选取一种尺度训练，这样训练出来的模型鲁棒性更强。
single-cls：训练数据集是否是单类别，默认False
optimizer：优化器；默认为SGD，可选SGD，Adam，AdamW。
sync-bn：是否开启跨卡同步BN
开启后，可使用 SyncBatchNorm 进行多 GPU分布式训练
workers：进程数
project：指定模型的保存路径；默认在runs/train。
name：模型保存的文件夹名，默认在exp文件夹。
exist-ok：每次模型预测结果是否保存在原来的文件夹
如果指定了这个参数的话，那么本次预测的结果还是保存在上一次保存的文件夹里
如果不指定，就是每预测一次结果，就保存在一个新的文件夹里。
quad：暂不明
cos-lr：是否开启余弦学习率。（下面是开启前后学习率曲线对比图）
label-smoothing：是否启用标签平滑处理，默认不启用
patience：早停轮数，默认100。
如果模型在100轮里没有提升，则停止训练模型
freeze：指定冻结层数量；可以在yolov5s.yaml中查看主干网络层数
save-period：多少个epoch保存一下checkpoint，default=-1。
seed：随机种子。v6.2版本更新的一个非常重要的参数，使用torch>=1.12.0的单GPU YOLOv5训练现在完全可再现
local_rank：DistributedDataParallel 单机多卡训练，单GPU设备不需要设置
entity：在线可视化工具wandb
upload_dataset：是否上传dataset到wandb tabel，默认False
启用后，将数据集作为交互式 dsviz表在浏览器中查看、查询、筛选和分析数据集
bbox_interval：设置界框图像记录间隔 Set bounding-box image logging interval for W&B 默认-1
artifact_alias：使用数据的版本

detact.py参数解析

def parse_opt():
    parser = argparse.ArgumentParser()
    parser.add_argument('--weights', nargs='+', type=str, default=ROOT / 'yolov5s.pt', help='model path or triton URL')
    parser.add_argument('--source', type=str, default=ROOT / 'data/images', help='file/dir/URL/glob/screen/0(webcam)')
    parser.add_argument('--data', type=str, default=ROOT / 'data/coco128.yaml', help='(optional) dataset.yaml path')
    parser.add_argument('--imgsz', '--img', '--img-size', nargs='+', type=int, default=[640], help='inference size h,w')
    parser.add_argument('--conf-thres', type=float, default=0.25, help='confidence threshold')
    parser.add_argument('--iou-thres', type=float, default=0.45, help='NMS IoU threshold')
    parser.add_argument('--max-det', type=int, default=1000, help='maximum detections per image')
    parser.add_argument('--device', default='', help='cuda device, i.e. 0 or 0,1,2,3 or cpu')
    parser.add_argument('--view-img', action='store_true', help='show results')
    parser.add_argument('--save-txt', action='store_true', help='save results to *.txt')
    parser.add_argument('--save-conf', action='store_true', help='save confidences in --save-txt labels')
    parser.add_argument('--save-crop', action='store_true', help='save cropped prediction boxes')
    parser.add_argument('--nosave', action='store_true', help='do not save images/videos')
    parser.add_argument('--classes', nargs='+', type=int, help='filter by class: --classes 0, or --classes 0 2 3')
    parser.add_argument('--agnostic-nms', action='store_true', help='class-agnostic NMS')
    parser.add_argument('--augment', action='store_true', help='augmented inference')
    parser.add_argument('--visualize', action='store_true', help='visualize features')
    parser.add_argument('--update', action='store_true', help='update all models')
    parser.add_argument('--project', default=ROOT / 'runs/detect', help='save results to project/name')
    parser.add_argument('--name', default='exp', help='save results to project/name')
    parser.add_argument('--exist-ok', action='store_true', help='existing project/name ok, do not increment')
    parser.add_argument('--line-thickness', default=3, type=int, help='bounding box thickness (pixels)')
    parser.add_argument('--hide-labels', default=False, action='store_true', help='hide labels')
    parser.add_argument('--hide-conf', default=False, action='store_true', help='hide confidences')
    parser.add_argument('--half', action='store_true', help='use FP16 half-precision inference')
    parser.add_argument('--dnn', action='store_true', help='use OpenCV DNN for ONNX inference')
    parser.add_argument('--vid-stride', type=int, default=1, help='video frame-rate stride')
    opt = parser.parse_args()
    opt.imgsz *= 2 if len(opt.imgsz) == 1 else 1  # expand
    print_args(vars(opt))
    return opt

source：测试集文件/文件夹
data：配置文件路径，和train.py里面的data是一样的
conf-thres：置信度的阈值超过这个阈值的预测框就会被预测出来。比如conf-thres参数依次设置成“0”, “0.25”，“0.8”
iou-thres：iou阈值
max-det：每张图最大检测数量，默认是最多检测1000个目标
view-img：检测的时候是否实时的把检测结果显示出来
如果输入代码python detect.py –view-img，在检测的时候系统要把我检测的结果实时的显示出来，假如我文件夹有5张图片，那么模型每检测出一张就会显示出一张，直到所有图片检测完成。
save-txt：是否把检测结果保存成一个.txt的格式
txt默认保存物体的类别索引和预测框坐标（YOLO格式），每张图一个txt，txt中每行表示一个物体
save-conf：上面保存的txt中是否包含置信度
save-crop：是否把模型检测的物体裁剪下来
开启了这个参数会在crops文件夹下看到几个以类别命名的文件夹，里面保存的都是裁剪下来的图片。
nosave：不保存预测的结果
但是还会生成exp文件夹，只不过是一个空的exp。这个参数应该是和“–view-img”配合使用的
classes：指定检测某几种类别。
比如coco128.yaml中person是第一个类别，classes指定“0”，则表示只检测图片中的person。
agnostic-nms：跨类别nms
比如待检测图像中有一个长得很像排球的足球，pt文件的分类中有足球和排球两种，那在识别时这个足球可能会被同时框上2个框：一个是足球，一个是排球。开启agnostic-nms后，那只会框出一个框
augment：数据增强。下面是启用前后的对比示例
visualize：是否可视化特征图。
如果开启了这和参数可以看到exp文件夹下又多了一些文件，这里.npy格式的文件就是保存的模型文件，可以使用numpy读写。还有一些png文件。下面来看一下保存
update：如果指定这个参数，则对所有模型进行strip_optimizer操作，去除pt文件中的优化器等信息
project：预测结果保存的路径
name：预测结果保存文件夹名
exist-ok：每次预测模型的结果是否保存在原来的文件夹
如果指定了这个参数的话，那么本次预测的结果还是保存在上一次保存的文件夹里；如果不指定就是每次预测结果保存一个新的文件夹下
line-thickness：调节预测框线条粗细的，default=3
因为有的时候目标重叠太多会产生遮挡，比如python detect.py –line-thickness 10
hide-labels：隐藏预测图片上的标签（只有预测框）
hide-conf：隐藏置信度（还有预测框和类别信息，但是没有置信度）
half：是否使用 FP16 半精度推理。
在training阶段，梯度的更新往往是很微小的，需要相对较高的精度，一般要用到FP32以上。在inference的时候，精度要求没有那么高，一般F16（半精度）就可以，甚至可以用INT8（8位整型），精度影响不会很大。同时低精度的模型占用空间更小了，有利于部署在嵌入式模型里面。
dnn：是否使用 OpenCV DNN 进行 ONNX 推理。

val.py参数解析

val.py的作用：
我们在训练结束后会打印出每个类别的一些评价指标，但是如果当时忘记记录，那么我们就可以通过这个文件再次打印这些评价指标
可以打印出测试集评价指标，测试集的图片也是需要标注的。

def parse_opt():
    parser = argparse.ArgumentParser()
    parser.add_argument('--data', type=str, default=ROOT / 'data/coco128.yaml', help='dataset.yaml path')
    parser.add_argument('--weights', nargs='+', type=str, default=ROOT / 'yolov5s.pt', help='model path(s)')
    parser.add_argument('--batch-size', type=int, default=32, help='batch size')
    parser.add_argument('--imgsz', '--img', '--img-size', type=int, default=640, help='inference size (pixels)')
    parser.add_argument('--conf-thres', type=float, default=0.001, help='confidence threshold')
    parser.add_argument('--iou-thres', type=float, default=0.6, help='NMS IoU threshold')
    parser.add_argument('--max-det', type=int, default=300, help='maximum detections per image')
    parser.add_argument('--task', default='val', help='train, val, test, speed or study')
    parser.add_argument('--device', default='', help='cuda device, i.e. 0 or 0,1,2,3 or cpu')
    parser.add_argument('--workers', type=int, default=8, help='max dataloader workers (per RANK in DDP mode)')
    parser.add_argument('--single-cls', action='store_true', help='treat as single-class dataset')
    parser.add_argument('--augment', action='store_true', help='augmented inference')
    parser.add_argument('--verbose', action='store_true', help='report mAP by class')
    parser.add_argument('--save-txt', action='store_true', help='save results to *.txt')
    parser.add_argument('--save-hybrid', action='store_true', help='save label+prediction hybrid results to *.txt')
    parser.add_argument('--save-conf', action='store_true', help='save confidences in --save-txt labels')
    parser.add_argument('--save-json', action='store_true', help='save a COCO-JSON results file')
    parser.add_argument('--project', default=ROOT / 'runs/val', help='save to project/name')
    parser.add_argument('--name', default='exp', help='save to project/name')
    parser.add_argument('--exist-ok', action='store_true', help='existing project/name ok, do not increment')
    parser.add_argument('--half', action='store_true', help='use FP16 half-precision inference')
    parser.add_argument('--dnn', action='store_true', help='use OpenCV DNN for ONNX inference')
    opt = parser.parse_args()
    opt.data = check_yaml(opt.data)  # check YAML
    opt.save_json |= opt.data.endswith('coco.yaml')
    opt.save_txt |= opt.save_hybrid
    print_args(vars(opt))
    return opt

前六个参数和detect.py意义一样。

task：可以是train, val, test。比如：python val.py –task test表示打印测试集指标
augment：测试是否使用TTA Test Time Augment，指定这个参数后各项指标会明显提升几个点。
verbose：是否打印出每个类别的mAP，默认False。
save-hybrid：将标签+预测混合结果保存到 .txt
save-json：是否按照coco的json格式保存预测框，并且使用cocoapi做评估（需要同样coco的json格式的标签）默认False
half：是否使用半精度推理默认False
其它参数内容同detect.py。

添加注意力机制

Epoch    GPU_mem   box_loss   obj_loss   cls_loss  Instances       Size
98/99      3.81G    0.03403     0.0175   0.002002         72        256: 100% 75/75 [00:29<00:00,  2.52it/s]
            Class     Images  Instances          P          R      mAP50   mAP50-95: 100% 7/7 [00:02<00:00,  2.93it/s]
              all        200        420      0.765      0.753      0.794      0.445