mmdet3d Preprocessing (Part 2) | train pipeline


mmdet3d Preprocessing (Part 2): train pipeline

Table of Contents

  • mmdet3d Preprocessing (Part 2): train pipeline
    • Base class BaseTransform
    • LoadPointsFromFile
    • LoadAnnotations3D
      • Annotation fields
      • Source code
    • ObjectSample
      • Source code
    • ObjectNoise
      • Input parameters
      • Source code
    • RandomFlip3D
      • Parameters
      • Source code
    • GlobalRotScaleTrans
      • Parameters
      • Source code
    • PointsRangeFilter
      • Source code
    • ObjectRangeFilter
      • Source code
    • ObjectNameFilter
      • Source code
    • PointShuffle
      • Source code
    • Pack3DDetInputs
      • Source code

This post focuses on what each transform in the training pipeline does, with a walkthrough of the source code.

(Figure: the train data pipeline)

Base class BaseTransform

All data transforms in MMDet3D 1.1 inherit from BaseTransform in MMCV >= 2.0.0rc0.

  • Some transform functionality (e.g. Resize) has been split into several smaller transforms to simplify and clarify usage.
  • The format of the data dict handled by each transform now follows the datasets' new data structures. Some inefficient transforms (such as normalization and padding) have been moved into the model's data preprocessor to speed up data loading and training.
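To make the interface concrete, here is a minimal sketch of a custom transform: a BaseTransform subclass only needs to implement `transform(results) -> results`. Note that `DropNearPoints` and its logic are hypothetical, not part of mmdet3d; the sketch also assumes that indexing a points object with a boolean mask is supported, as mmdet3d's point classes do.

```python
import numpy as np
from mmcv.transforms import BaseTransform
from mmdet3d.registry import TRANSFORMS


@TRANSFORMS.register_module()
class DropNearPoints(BaseTransform):
    """Hypothetical transform: drop points within `min_dist` of the sensor."""

    def __init__(self, min_dist: float = 1.0) -> None:
        self.min_dist = min_dist

    def transform(self, results: dict) -> dict:
        points = results['points']  # a BasePoints instance
        # BEV distance of every point to the origin
        dist = np.linalg.norm(points.coord.numpy()[:, :2], axis=1)
        results['points'] = points[dist > self.min_dist]
        return results
```

Once registered, such a transform can be referenced from a pipeline config by name, e.g. `dict(type='DropNearPoints', min_dist=1.0)`.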

(Figure: inheritance hierarchy of the BaseTransform class)

LoadPointsFromFile

mmdet3d/datasets/transforms/loading.py

Loads points from a point cloud file.

```python
@TRANSFORMS.register_module()
class LoadPointsFromFile(BaseTransform):
    """Load Points From File.

    Required Keys:

    - lidar_points (dict)

        - lidar_path (str)

    Added Keys:

    - points (np.float32)

    Args:
        coord_type (str): The type of coordinates of points cloud.
            Available options includes:

            - 'LIDAR': Points in LiDAR coordinates.
            - 'DEPTH': Points in depth coordinates, usually for indoor dataset.
            - 'CAMERA': Points in camera coordinates.
        load_dim (int): The dimension of the loaded points. Defaults to 6.
        use_dim (list[int] | int): Which dimensions of the points to use.
            Defaults to [0, 1, 2]. For KITTI dataset, set use_dim=4
            or use_dim=[0, 1, 2, 3] to use the intensity dimension.
        shift_height (bool): Whether to use shifted height. Defaults to False.
        use_color (bool): Whether to use color features. Defaults to False.
        norm_intensity (bool): Whether to normlize the intensity. Defaults to
            False.
        backend_args (dict, optional): Arguments to instantiate the
            corresponding backend. Defaults to None.
    """

    def __init__(self,
                 coord_type: str,
                 load_dim: int = 6,
                 use_dim: Union[int, List[int]] = [0, 1, 2],
                 shift_height: bool = False,
                 use_color: bool = False,
                 norm_intensity: bool = False,
                 backend_args: Optional[dict] = None) -> None:
        self.shift_height = shift_height
        self.use_color = use_color
        if isinstance(use_dim, int):
            use_dim = list(range(use_dim))
        assert max(use_dim) < load_dim, \
            f'Expect all used dimensions < {load_dim}, got {use_dim}'
        assert coord_type in ['CAMERA', 'LIDAR', 'DEPTH']

        self.coord_type = coord_type
        self.load_dim = load_dim
        self.use_dim = use_dim
        self.norm_intensity = norm_intensity
        self.backend_args = backend_args

    def _load_points(self, pts_filename: str) -> np.ndarray:
        """Private function to load point clouds data.

        Args:
            pts_filename (str): Filename of point clouds data.

        Returns:
            np.ndarray: An array containing point clouds data.
        """
        try:
            pts_bytes = get(pts_filename, backend_args=self.backend_args)
            points = np.frombuffer(pts_bytes, dtype=np.float32)
        except ConnectionError:
            mmengine.check_file_exist(pts_filename)
            if pts_filename.endswith('.npy'):
                points = np.load(pts_filename)
            else:
                points = np.fromfile(pts_filename, dtype=np.float32)

        return points

    def transform(self, results: dict) -> dict:
        """Method to load points data from file.

        Args:
            results (dict): Result dict containing point clouds data.

        Returns:
            dict: The result dict containing the point clouds data.
            Added key and value are described below.

            - points (:obj:`BasePoints`): Point clouds data.
        """
        pts_file_path = results['lidar_points']['lidar_path']
        points = self._load_points(pts_file_path)
        points = points.reshape(-1, self.load_dim)
        points = points[:, self.use_dim]
        if self.norm_intensity:
            assert len(self.use_dim) >= 4, \
                f'When using intensity norm, expect used dimensions >= 4, got {len(self.use_dim)}'  # noqa: E501
            points[:, 3] = np.tanh(points[:, 3])
        attribute_dims = None

        if self.shift_height:
            floor_height = np.percentile(points[:, 2], 0.99)
            height = points[:, 2] - floor_height
            points = np.concatenate(
                [points[:, :3],
                 np.expand_dims(height, 1), points[:, 3:]], 1)
            attribute_dims = dict(height=3)

        if self.use_color:
            assert len(self.use_dim) >= 6
            if attribute_dims is None:
                attribute_dims = dict()
            attribute_dims.update(
                dict(color=[
                    points.shape[1] - 3,
                    points.shape[1] - 2,
                    points.shape[1] - 1,
                ]))

        points_class = get_points_type(self.coord_type)
        points = points_class(
            points, points_dim=points.shape[-1], attribute_dims=attribute_dims)
        results['points'] = points

        return results
```

_load_points: loads the point cloud from a file and converts it to a numpy array.

transform

  • First load the point cloud from the file as usual.
  • If norm_intensity is set, squash the intensity values with tanh.
  • If shift_height is set, subtract the floor height from the point heights.
  • If use_color is set, record which dimensions hold the color features.
  • Return the result dict carrying the point cloud (a usage sketch follows below).
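As a reference point, here is a KITTI-style pipeline entry for this transform, together with the raw numpy steps it roughly performs; the `.bin` path is illustrative and the values follow the common KITTI configs.

```python
import numpy as np

# pipeline entry (typical KITTI setting: x, y, z, intensity)
load_points = dict(
    type='LoadPointsFromFile', coord_type='LIDAR', load_dim=4, use_dim=4)

# roughly what the transform does under the hood:
pts = np.fromfile('000003.bin', dtype=np.float32)  # flat float32 stream
pts = pts.reshape(-1, 4)          # load_dim=4 -> one row per point
pts = pts[:, [0, 1, 2, 3]]        # use_dim=4 is expanded to [0, 1, 2, 3]
pts[:, 3] = np.tanh(pts[:, 3])    # only applied if norm_intensity=True
```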

LoadAnnotations3D

Loads instance masks and semantic masks of points and encapsulates the items into the related fields.

Put simply, it loads the ground-truth box information.

Annotation fields:

| key | description |
| --- | --- |
| gt_bboxes_3d (LiDARInstance3DBoxes \| DepthInstance3DBoxes \| CameraInstance3DBoxes) | GT 3D bounding boxes. Only when with_bbox_3d is True. |
| gt_labels_3d (np.int64) | GT 3D labels. Only when with_label_3d is True. |
| gt_bboxes (np.float32) | GT 2D bounding boxes. Only when with_bbox is True. |
| gt_labels (np.ndarray) | GT labels. Only when with_label is True. |
| depths (np.ndarray) | Depths. Only when with_bbox_depth is True. |
| centers_2d (np.ndarray) | 2D centers. Only when with_bbox_depth is True. |
| attr_labels (np.ndarray) | Attribute labels of instances. Only when with_attr_label is True. |
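For a LiDAR-only detector, the corresponding pipeline entry usually only switches on the 3D boxes and labels; a sketch in the style of the common KITTI configs:

```python
load_anns = dict(
    type='LoadAnnotations3D',
    with_bbox_3d=True,    # adds 'gt_bboxes_3d' to results
    with_label_3d=True,   # adds 'gt_labels_3d' to results
    with_bbox=False,      # 2D boxes are only needed for multi-modality models
    with_label=False)
```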

Source code

```python
@TRANSFORMS.register_module()
class LoadAnnotations3D(LoadAnnotations):
    """Load Annotations3D.

    Load instance mask and semantic mask of points and
    encapsulate the items into related fields.
    """

    def _load_bboxes_3d(self, results: dict) -> dict:
        """Private function to move the 3D bounding box annotation from
        `ann_info` field to the root of `results`.

        Args:
            results (dict): Result dict from :obj:`mmdet3d.CustomDataset`.

        Returns:
            dict: The dict containing loaded 3D bounding box annotations.
        """
        results['gt_bboxes_3d'] = results['ann_info']['gt_bboxes_3d']
        return results

    def _load_bboxes_depth(self, results: dict) -> dict:
        """Private function to load 2.5D bounding box annotations.

        Args:
            results (dict): Result dict from :obj:`mmdet3d.CustomDataset`.

        Returns:
            dict: The dict containing loaded 2.5D bounding box annotations.
        """
        results['depths'] = results['ann_info']['depths']
        results['centers_2d'] = results['ann_info']['centers_2d']
        return results

    def _load_labels_3d(self, results: dict) -> dict:
        """Private function to load label annotations.

        Args:
            results (dict): Result dict from :obj:`mmdet3d.CustomDataset`.

        Returns:
            dict: The dict containing loaded label annotations.
        """
        results['gt_labels_3d'] = results['ann_info']['gt_labels_3d']
        return results

    def _load_attr_labels(self, results: dict) -> dict:
        """Private function to load label annotations.

        Args:
            results (dict): Result dict from :obj:`mmdet3d.CustomDataset`.

        Returns:
            dict: The dict containing loaded label annotations.
        """
        results['attr_labels'] = results['ann_info']['attr_labels']
        return results

    def _load_masks_3d(self, results: dict) -> dict:
        """Private function to load 3D mask annotations.

        Args:
            results (dict): Result dict from :obj:`mmdet3d.CustomDataset`.

        Returns:
            dict: The dict containing loaded 3D mask annotations.
        """
        pts_instance_mask_path = results['pts_instance_mask_path']

        try:
            mask_bytes = get(
                pts_instance_mask_path, backend_args=self.backend_args)
            pts_instance_mask = np.frombuffer(mask_bytes, dtype=np.int64)
        except ConnectionError:
            mmengine.check_file_exist(pts_instance_mask_path)
            pts_instance_mask = np.fromfile(
                pts_instance_mask_path, dtype=np.int64)

        results['pts_instance_mask'] = pts_instance_mask
        # 'eval_ann_info' will be passed to evaluator
        if 'eval_ann_info' in results:
            results['eval_ann_info']['pts_instance_mask'] = pts_instance_mask
        return results

    def _load_semantic_seg_3d(self, results: dict) -> dict:
        """Private function to load 3D semantic segmentation annotations.

        Args:
            results (dict): Result dict from :obj:`mmdet3d.CustomDataset`.

        Returns:
            dict: The dict containing the semantic segmentation annotations.
        """
        pts_semantic_mask_path = results['pts_semantic_mask_path']

        try:
            mask_bytes = get(
                pts_semantic_mask_path, backend_args=self.backend_args)
            # add .copy() to fix read-only bug
            pts_semantic_mask = np.frombuffer(
                mask_bytes, dtype=self.seg_3d_dtype).copy()
        except ConnectionError:
            mmengine.check_file_exist(pts_semantic_mask_path)
            pts_semantic_mask = np.fromfile(
                pts_semantic_mask_path, dtype=np.int64)

        if self.dataset_type == 'semantickitti':
            pts_semantic_mask = pts_semantic_mask.astype(np.int64)
            pts_semantic_mask = pts_semantic_mask % self.seg_offset
        # nuScenes loads semantic and panoptic labels from different files.

        results['pts_semantic_mask'] = pts_semantic_mask
        # 'eval_ann_info' will be passed to evaluator
        if 'eval_ann_info' in results:
            results['eval_ann_info']['pts_semantic_mask'] = pts_semantic_mask
        return results

    def _load_panoptic_3d(self, results: dict) -> dict:
        """Private function to load 3D panoptic segmentation annotations.

        Args:
            results (dict): Result dict from :obj:`mmdet3d.CustomDataset`.

        Returns:
            dict: The dict containing the panoptic segmentation annotations.
        """
        pts_panoptic_mask_path = results['pts_panoptic_mask_path']

        try:
            mask_bytes = get(
                pts_panoptic_mask_path, backend_args=self.backend_args)
            # add .copy() to fix read-only bug
            pts_panoptic_mask = np.frombuffer(
                mask_bytes, dtype=self.seg_3d_dtype).copy()
        except ConnectionError:
            mmengine.check_file_exist(pts_panoptic_mask_path)
            pts_panoptic_mask = np.fromfile(
                pts_panoptic_mask_path, dtype=np.int64)

        if self.dataset_type == 'semantickitti':
            pts_semantic_mask = pts_panoptic_mask.astype(np.int64)
            pts_semantic_mask = pts_semantic_mask % self.seg_offset
        elif self.dataset_type == 'nuscenes':
            pts_semantic_mask = pts_panoptic_mask // self.seg_offset

        results['pts_semantic_mask'] = pts_semantic_mask

        # We can directly take panoptic labels as instance ids.
        pts_instance_mask = pts_panoptic_mask.astype(np.int64)
        results['pts_instance_mask'] = pts_instance_mask

        # 'eval_ann_info' will be passed to evaluator
        if 'eval_ann_info' in results:
            results['eval_ann_info']['pts_semantic_mask'] = pts_semantic_mask
            results['eval_ann_info']['pts_instance_mask'] = pts_instance_mask
        return results

    def _load_bboxes(self, results: dict) -> None:
        """Private function to load bounding box annotations.

        The only difference is that it removes the processing of
        `ignore_flag`.

        Args:
            results (dict): Result dict from :obj:`mmcv.BaseDataset`.

        Returns:
            dict: The dict contains loaded bounding box annotations.
        """
        results['gt_bboxes'] = results['ann_info']['gt_bboxes']

    def _load_labels(self, results: dict) -> None:
        """Private function to load label annotations.

        Args:
            results (dict): Result dict from :obj:`mmcv.BaseDataset`.

        Returns:
            dict: The dict contains loaded label annotations.
        """
        results['gt_bboxes_labels'] = results['ann_info']['gt_bboxes_labels']

    def transform(self, results: dict) -> dict:
        """Function to load multiple types annotations.

        Args:
            results (dict): Result dict from :obj:`mmdet3d.CustomDataset`.

        Returns:
            dict: The dict containing loaded 3D bounding box, label, mask and
            semantic segmentation annotations.
        """
        results = super().transform(results)
        if self.with_bbox_3d:
            results = self._load_bboxes_3d(results)
        if self.with_bbox_depth:
            results = self._load_bboxes_depth(results)
        if self.with_label_3d:
            results = self._load_labels_3d(results)
        if self.with_attr_label:
            results = self._load_attr_labels(results)
        if self.with_panoptic_3d:
            results = self._load_panoptic_3d(results)
        if self.with_mask_3d:
            results = self._load_masks_3d(results)
        if self.with_seg_3d:
            results = self._load_semantic_seg_3d(results)
        return results
```

ObjectSample

A data augmentation transform.

The create-data step builds a ground-truth object database for the dataset, i.e. db_infos.pkl. When the current sample contains few objects, ObjectSample draws some GT objects from this database and pastes them into the current sample.
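A db_sampler config in the style of the stock KITTI configs; `info_path` points at the db_infos.pkl produced by create data, and `sample_groups` sets the per-class target counts. The values here are the usual KITTI ones; treat them as a sketch to adapt per dataset.

```python
db_sampler = dict(
    data_root='data/kitti/',
    info_path='data/kitti/kitti_dbinfos_train.pkl',  # built by create data
    rate=1.0,
    prepare=dict(
        filter_by_difficulty=[-1],  # drop objects marked difficulty == -1
        filter_by_min_points=dict(Car=5, Pedestrian=10, Cyclist=10)),
    classes=['Pedestrian', 'Cyclist', 'Car'],
    sample_groups=dict(Car=12, Pedestrian=6, Cyclist=6))

object_sample = dict(type='ObjectSample', db_sampler=db_sampler)
```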

mmdet3d/datasets/transforms/transforms_3d.py

Source code

```python
@TRANSFORMS.register_module()
class ObjectSample(BaseTransform):
    """Sample GT objects to the data.

    Required Keys:

    - points
    - ann_info
    - gt_bboxes_3d
    - gt_labels_3d
    - img (optional)
    - gt_bboxes (optional)

    Modified Keys:

    - points
    - gt_bboxes_3d
    - gt_labels_3d
    - img (optional)
    - gt_bboxes (optional)

    Added Keys:

    - plane (optional)

    Args:
        db_sampler (dict): Config dict of the database sampler.
        sample_2d (bool): Whether to also paste 2D image patch to the images.
            This should be true when applying multi-modality cut-and-paste.
            Defaults to False.
        use_ground_plane (bool): Whether to use ground plane to adjust the
            3D labels. Defaults to False.
    """

    def __init__(self,
                 db_sampler: dict,
                 sample_2d: bool = False,
                 use_ground_plane: bool = False) -> None:
        self.sampler_cfg = db_sampler
        self.sample_2d = sample_2d
        if 'type' not in db_sampler.keys():
            db_sampler['type'] = 'DataBaseSampler'
        self.db_sampler = TRANSFORMS.build(db_sampler)
        self.use_ground_plane = use_ground_plane
        self.disabled = False

    @staticmethod
    def remove_points_in_boxes(points: BasePoints,
                               boxes: np.ndarray) -> np.ndarray:
        """Remove the points in the sampled bounding boxes.

        Args:
            points (:obj:`BasePoints`): Input point cloud array.
            boxes (np.ndarray): Sampled ground truth boxes.

        Returns:
            np.ndarray: Points with those in the boxes removed.
        """
        masks = box_np_ops.points_in_rbbox(points.coord.numpy(), boxes)
        points = points[np.logical_not(masks.any(-1))]
        return points

    def transform(self, input_dict: dict) -> dict:
        """Transform function to sample ground truth objects to the data.

        Args:
            input_dict (dict): Result dict from loading pipeline.

        Returns:
            dict: Results after object sampling augmentation,
            'points', 'gt_bboxes_3d', 'gt_labels_3d' keys are updated
            in the result dict.
        """
        if self.disabled:
            return input_dict

        gt_bboxes_3d = input_dict['gt_bboxes_3d']
        gt_labels_3d = input_dict['gt_labels_3d']

        if self.use_ground_plane:
            ground_plane = input_dict.get('plane', None)
            assert ground_plane is not None, '`use_ground_plane` is True ' \
                'but find plane is None'
        else:
            ground_plane = None
        # change to float for blending operation
        points = input_dict['points']
        if self.sample_2d:
            img = input_dict['img']
            gt_bboxes_2d = input_dict['gt_bboxes']
            # Assume for now 3D & 2D bboxes are the same
            sampled_dict = self.db_sampler.sample_all(
                gt_bboxes_3d.numpy(),
                gt_labels_3d,
                gt_bboxes_2d=gt_bboxes_2d,
                img=img)
        else:
            sampled_dict = self.db_sampler.sample_all(
                gt_bboxes_3d.numpy(),
                gt_labels_3d,
                img=None,
                ground_plane=ground_plane)

        if sampled_dict is not None:
            sampled_gt_bboxes_3d = sampled_dict['gt_bboxes_3d']
            sampled_points = sampled_dict['points']
            sampled_gt_labels = sampled_dict['gt_labels_3d']

            gt_labels_3d = np.concatenate([gt_labels_3d, sampled_gt_labels],
                                          axis=0)
            gt_bboxes_3d = gt_bboxes_3d.new_box(
                np.concatenate([gt_bboxes_3d.numpy(), sampled_gt_bboxes_3d]))

            points = self.remove_points_in_boxes(points, sampled_gt_bboxes_3d)
            # check the points dimension
            points = points.cat([sampled_points, points])

            if self.sample_2d:
                sampled_gt_bboxes_2d = sampled_dict['gt_bboxes_2d']
                gt_bboxes_2d = np.concatenate(
                    [gt_bboxes_2d, sampled_gt_bboxes_2d]).astype(np.float32)

                input_dict['gt_bboxes'] = gt_bboxes_2d
                input_dict['img'] = sampled_dict['img']

        input_dict['gt_bboxes_3d'] = gt_bboxes_3d
        input_dict['gt_labels_3d'] = gt_labels_3d.astype(np.int64)
        input_dict['points'] = points

        return input_dict
```

The actual sampling is delegated to the db_sampler (a DataBaseSampler instance):

sample_all: samples boxes for all classes;

sample_class_v2: samples boxes of one specific class.

  1. Initialize the db_sampler; during initialization it filters out some GT objects by difficulty and min_points.
  2. Depending on sample_2d, call the corresponding db_sampler.sample_all.
  3. Inside db_sampler.sample_all():
    1. Compute how many objects of each class need to be sampled: the requested count minus the number of that class already present in the current GT labels.
    2. If the class already has enough GT objects (i.e. the number to sample is <= 0), no sampling happens and the sample result is None.
    3. If the class has few GT objects, sample the required number from the matching class in db_infos, i.e. borrow objects from other frames to pad the current frame's GT data. For example, paste some object point clouds from 000005.bin into 000003.bin and append the corresponding GT boxes.
  4. After sampling, check the sampled boxes for collisions against the existing GT boxes and previously accepted samples (note that one class's samples are checked against all boxes of every class accepted so far); only collision-free samples are kept as that class's result.
  5. Concatenate the sampled results with the original GT labels and boxes, and replace the points of the original cloud that fall inside the sampled boxes with the sampled object points (a rough sketch of the per-class count logic follows this list).
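A rough numpy sketch of steps 3.1-3.2, the per-class shortfall computation. This is an illustration only, not the actual DataBaseSampler code; `num_to_sample` and `class_ids` are hypothetical names.

```python
import numpy as np


def num_to_sample(gt_labels_3d, class_ids, sample_groups):
    """Hypothetical helper: how many objects per class to draw from db_infos."""
    counts = {}
    for name, target in sample_groups.items():
        # number of this class already present in the current frame
        existing = int(np.sum(gt_labels_3d == class_ids[name]))
        counts[name] = max(0, target - existing)  # 0 means: skip this class
    return counts


# e.g. sample_groups asks for 12 cars and the frame already holds 9,
# so 3 extra Car objects would be drawn from db_infos.pkl:
print(num_to_sample(np.array([2] * 9), {'Car': 2}, {'Car': 12}))  # {'Car': 3}
```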

ObjectNoise

Adds some noise to each GT object in the scene.

Input parameters

```python
translation_std: List[float] = [0.25, 0.25, 0.25],  # std of the translation noise
global_rot_range: List[float] = [0.0, 0.0],  # global rotation range
rot_range: List[float] = [-0.15707963267, 0.15707963267],  # per-object rotation range
num_try: int = 100  # number of attempts
```
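As a pipeline entry this typically looks like the following; the values come from the common KITTI SECOND-style configs (treat them as a sketch), and note that the defaults above keep rotation within roughly ±9° (±0.157 rad).

```python
object_noise = dict(
    type='ObjectNoise',
    num_try=100,                          # attempts per object to find a valid pose
    translation_std=[1.0, 1.0, 0.5],      # per-axis std of the translation noise
    global_rot_range=[0.0, 0.0],          # disable the global rotation component
    rot_range=[-0.78539816, 0.78539816])  # per-object yaw noise, about ±pi/4
```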

Source code

ObjectNoise calls noise_per_object_v3_ to rotate and translate each GT object, producing perturbed GT objects:

```python
def noise_per_object_v3_(gt_boxes,
                         points=None,
                         valid_mask=None,
                         rotation_perturb=np.pi / 4,
                         center_noise_std=1.0,
                         global_random_rot_range=np.pi / 4,
                         num_try=100):
    ...
```

RandomFlip3D

Randomly flips the point cloud and/or boxes horizontally (left-right) or vertically (front-back).

Parameters

| args | description |
| --- | --- |
| sync_2d (bool) | Whether to flip in sync with the 2D image. If True, the point cloud gets the same flip as the 2D image. If False, the decision of whether to flip the 2D image is made randomly and independently. Defaults to True. |
| flip_ratio_bev_horizontal (float) | Probability of flipping in the horizontal direction. Defaults to 0.0. |
| flip_ratio_bev_vertical (float) | Probability of flipping in the vertical direction. Defaults to 0.0. |
| flip_box3d (bool) | Whether to flip the bounding boxes. In most cases boxes should be flipped. In camera-based BEV detection this is set to False, since flipping the 2D image does not affect the 3D boxes. Defaults to True. |
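A typical LiDAR-only entry (a sketch following the common configs): sync_2d is irrelevant without images, and usually only the horizontal BEV flip is enabled.

```python
random_flip = dict(
    type='RandomFlip3D',
    sync_2d=False,                  # no 2D image to stay in sync with
    flip_ratio_bev_horizontal=0.5)  # flip left-right with probability 0.5
```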

Source code

```python
    def random_flip_data_3d(self,
                            input_dict: dict,
                            direction: str = 'horizontal') -> None:
        """Flip 3D data randomly.

        `random_flip_data_3d` should take these situations into consideration:

        - 1. LIDAR-based 3d detection
        - 2. LIDAR-based 3d segmentation
        - 3. vision-only detection
        - 4. multi-modality 3d detection.

        Args:
            input_dict (dict): Result dict from loading pipeline.
            direction (str): Flip direction. Defaults to 'horizontal'.

        Returns:
            dict: Flipped results, 'points', 'bbox3d_fields' keys are
            updated in the result dict.
        """
        assert direction in ['horizontal', 'vertical']
        if self.flip_box3d:
            if 'gt_bboxes_3d' in input_dict:
                if 'points' in input_dict:
                    input_dict['points'] = input_dict['gt_bboxes_3d'].flip(
                        direction, points=input_dict['points'])
                else:
                    # vision-only detection
                    input_dict['gt_bboxes_3d'].flip(direction)
            else:
                input_dict['points'].flip(direction)

        if 'centers_2d' in input_dict:
            assert self.sync_2d is True and direction == 'horizontal', \
                'Only support sync_2d=True and horizontal flip with images'
            w = input_dict['img_shape'][1]
            input_dict['centers_2d'][..., 0] = \
                w - input_dict['centers_2d'][..., 0]
            # need to modify the horizontal position of camera center
            # along u-axis in the image (flip like centers2d)
            # ['cam2img'][0][2] = c_u
            # see more details and examples at
            # https://github.com/open-mmlab/mmdetection3d/pull/744
            input_dict['cam2img'][0][2] = w - input_dict['cam2img'][0][2]

    def _flip_on_direction(self, results: dict) -> None:
        """Function to flip images, bounding boxes, semantic segmentation map
        and keypoints.

        Add the override feature that if 'flip' is already in results, use it
        to do the augmentation.
        """
        if 'flip' not in results:
            cur_dir = self._choose_direction()
        else:
            # `flip_direction` works only when `flip` is True.
            # For example, in `MultiScaleFlipAug3D`, `flip_direction` is
            # 'horizontal' but `flip` is False.
            if results['flip']:
                assert 'flip_direction' in results, \
                    'flip and flip_direction must exist simultaneously'
                cur_dir = results['flip_direction']
            else:
                cur_dir = None

        if cur_dir is None:
            results['flip'] = False
            results['flip_direction'] = None
        else:
            results['flip'] = True
            results['flip_direction'] = cur_dir
            self._flip(results)

    def transform(self, input_dict: dict) -> dict:
        """Call function to flip points, values in the ``bbox3d_fields`` and
        also flip 2D image and its annotations.

        Args:
            input_dict (dict): Result dict from loading pipeline.

        Returns:
            dict: Flipped results, 'flip', 'flip_direction',
            'pcd_horizontal_flip' and 'pcd_vertical_flip' keys are added
            into result dict.
        """
        # flip 2D image and its annotations
        if 'img' in input_dict:
            super(RandomFlip3D, self).transform(input_dict)

        if self.sync_2d and 'img' in input_dict:
            input_dict['pcd_horizontal_flip'] = input_dict['flip']
            input_dict['pcd_vertical_flip'] = False
        else:
            if 'pcd_horizontal_flip' not in input_dict:
                flip_horizontal = True if np.random.rand() \
                    < self.flip_ratio_bev_horizontal else False
                input_dict['pcd_horizontal_flip'] = flip_horizontal
            if 'pcd_vertical_flip' not in input_dict:
                flip_vertical = True if np.random.rand() \
                    < self.flip_ratio_bev_vertical else False
                input_dict['pcd_vertical_flip'] = flip_vertical

        if 'transformation_3d_flow' not in input_dict:
            input_dict['transformation_3d_flow'] = []

        if input_dict['pcd_horizontal_flip']:
            self.random_flip_data_3d(input_dict, 'horizontal')
            input_dict['transformation_3d_flow'].extend(['HF'])
        if input_dict['pcd_vertical_flip']:
            self.random_flip_data_3d(input_dict, 'vertical')
            input_dict['transformation_3d_flow'].extend(['VF'])
        return input_dict
```

The actual flipping is done by the flip method, which negates the corresponding coordinate:

```python
class LiDARPoints(BasePoints):
    """Points of instances in LIDAR coordinates."""

    # ...

    def flip(self, bev_direction: str = 'horizontal') -> None:
        """Flip the points along given BEV direction.

        Args:
            bev_direction (str): Flip direction (horizontal or vertical).
                Defaults to 'horizontal'.
        """
        assert bev_direction in ('horizontal', 'vertical')
        if bev_direction == 'horizontal':
            self.tensor[:, 1] = -self.tensor[:, 1]
        elif bev_direction == 'vertical':
            self.tensor[:, 0] = -self.tensor[:, 0]
```
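A tiny demo of the flip semantics (assuming LiDARPoints is importable from mmdet3d.structures, as in mmdet3d 1.1): a horizontal BEV flip negates y, a vertical one negates x.

```python
import torch
from mmdet3d.structures import LiDARPoints

pts = LiDARPoints(torch.tensor([[1.0, 2.0, 0.5, 0.3]]), points_dim=4)
pts.flip('horizontal')  # negates y: (1, 2, 0.5) -> (1, -2, 0.5)
print(pts.tensor)       # tensor([[ 1.0000, -2.0000,  0.5000,  0.3000]])
```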

GlobalRotScaleTrans

Another augmentation: applies a random global rotation, scaling, and translation to the whole 3D scene (points and boxes).

Parameters

| args | description |
| --- | --- |
| rot_range (list[float]) | Range of the rotation angle. Defaults to [-0.78539816, 0.78539816] (close to [-pi/4, pi/4]). |
| scale_ratio_range (list[float]) | Range of the scale ratio. Defaults to [0.95, 1.05]. |
| translation_std (list[float]) | Standard deviation of the translation noise applied to the scene, sampled from a Gaussian with this std per axis. Defaults to [0, 0, 0]. |
| shift_height (bool) | Whether to also shift the height (the 4th dimension of indoor points) when scaling. Defaults to False. |
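A typical entry (values as in the common KITTI configs; a sketch). Note from the source below that transform() applies rotation, then scaling, then translation, and records ['R', 'S', 'T'] in transformation_3d_flow.

```python
global_rot_scale_trans = dict(
    type='GlobalRotScaleTrans',
    rot_range=[-0.78539816, 0.78539816],  # about ±pi/4 around the z-axis
    scale_ratio_range=[0.95, 1.05],
    translation_std=[0, 0, 0])            # translation disabled
```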

Source code

```python
    def _trans_bbox_points(self, input_dict: dict) -> None:
        """Private function to translate bounding boxes and points.

        Args:
            input_dict (dict): Result dict from loading pipeline.

        Returns:
            dict: Results after translation, 'points', 'pcd_trans'
            and `gt_bboxes_3d` is updated in the result dict.
        """
        translation_std = np.array(self.translation_std, dtype=np.float32)
        trans_factor = np.random.normal(scale=translation_std, size=3).T

        input_dict['points'].translate(trans_factor)
        input_dict['pcd_trans'] = trans_factor
        if 'gt_bboxes_3d' in input_dict:
            input_dict['gt_bboxes_3d'].translate(trans_factor)

    def _rot_bbox_points(self, input_dict: dict) -> None:
        """Private function to rotate bounding boxes and points.

        Args:
            input_dict (dict): Result dict from loading pipeline.

        Returns:
            dict: Results after rotation, 'points', 'pcd_rotation'
            and `gt_bboxes_3d` is updated in the result dict.
        """
        rotation = self.rot_range
        noise_rotation = np.random.uniform(rotation[0], rotation[1])

        if 'gt_bboxes_3d' in input_dict and \
                len(input_dict['gt_bboxes_3d'].tensor) != 0:
            # rotate points with bboxes
            points, rot_mat_T = input_dict['gt_bboxes_3d'].rotate(
                noise_rotation, input_dict['points'])
            input_dict['points'] = points
        else:
            # if no bbox in input_dict, only rotate points
            rot_mat_T = input_dict['points'].rotate(noise_rotation)

        input_dict['pcd_rotation'] = rot_mat_T
        input_dict['pcd_rotation_angle'] = noise_rotation

    def _scale_bbox_points(self, input_dict: dict) -> None:
        """Private function to scale bounding boxes and points.

        Args:
            input_dict (dict): Result dict from loading pipeline.

        Returns:
            dict: Results after scaling, 'points' and
            `gt_bboxes_3d` is updated in the result dict.
        """
        scale = input_dict['pcd_scale_factor']
        points = input_dict['points']
        points.scale(scale)
        if self.shift_height:
            assert 'height' in points.attribute_dims.keys(), \
                'setting shift_height=True but points have no height attribute'
            points.tensor[:, points.attribute_dims['height']] *= scale
        input_dict['points'] = points

        if 'gt_bboxes_3d' in input_dict and \
                len(input_dict['gt_bboxes_3d'].tensor) != 0:
            input_dict['gt_bboxes_3d'].scale(scale)

    def _random_scale(self, input_dict: dict) -> None:
        """Private function to randomly set the scale factor.

        Args:
            input_dict (dict): Result dict from loading pipeline.

        Returns:
            dict: Results after scaling, 'pcd_scale_factor'
            are updated in the result dict.
        """
        scale_factor = np.random.uniform(self.scale_ratio_range[0],
                                         self.scale_ratio_range[1])
        input_dict['pcd_scale_factor'] = scale_factor

    def transform(self, input_dict: dict) -> dict:
        """Private function to rotate, scale and translate bounding boxes and
        points.

        Args:
            input_dict (dict): Result dict from loading pipeline.

        Returns:
            dict: Results after scaling, 'points', 'pcd_rotation',
            'pcd_scale_factor', 'pcd_trans' and `gt_bboxes_3d` are updated
            in the result dict.
        """
        if 'transformation_3d_flow' not in input_dict:
            input_dict['transformation_3d_flow'] = []

        self._rot_bbox_points(input_dict)

        if 'pcd_scale_factor' not in input_dict:
            self._random_scale(input_dict)
        self._scale_bbox_points(input_dict)

        self._trans_bbox_points(input_dict)

        input_dict['transformation_3d_flow'].extend(['R', 'S', 'T'])
        return input_dict
```

PointsRangeFilter

Given a point cloud range, PointsRangeFilter filters the points and their masks (semantic and instance).
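A typical entry with the usual KITTI range (a sketch):

```python
# x_min, y_min, z_min, x_max, y_max, z_max
point_cloud_range = [0, -40, -3, 70.4, 40, 1]
points_range_filter = dict(
    type='PointsRangeFilter', point_cloud_range=point_cloud_range)
```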

Source code

```python
    def transform(self, input_dict: dict) -> dict:
        """Transform function to filter points by the range.

        Args:
            input_dict (dict): Result dict from loading pipeline.

        Returns:
            dict: Results after filtering, 'points', 'pts_instance_mask'
            and 'pts_semantic_mask' keys are updated in the result dict.
        """
        points = input_dict['points']
        points_mask = points.in_range_3d(self.pcd_range)
        clean_points = points[points_mask]
        input_dict['points'] = clean_points
        points_mask = points_mask.numpy()

        pts_instance_mask = input_dict.get('pts_instance_mask', None)
        pts_semantic_mask = input_dict.get('pts_semantic_mask', None)

        if pts_instance_mask is not None:
            input_dict['pts_instance_mask'] = pts_instance_mask[points_mask]

        if pts_semantic_mask is not None:
            input_dict['pts_semantic_mask'] = pts_semantic_mask[points_mask]

        return input_dict
```

It builds a mask of the points that fall inside the range and filters by it. At its core this is a simple coordinate range comparison:

```python
    def in_range_3d(
        self, box_range: Union[Tensor, np.ndarray,
                               Sequence[float]]) -> Tensor:
        """Check whether the boxes are in the given range.

        Args:
            box_range (Tensor or np.ndarray or Sequence[float]): The range of
                box (x_min, y_min, z_min, x_max, y_max, z_max).

        Note:
            In the original implementation of SECOND, checking whether a box in
            the range checks whether the points are in a convex polygon, we try
            to reduce the burden for simpler cases.

        Returns:
            Tensor: A binary vector indicating whether each point is inside the
            reference range.
        """
        in_range_flags = ((self.tensor[:, 0] > box_range[0])
                          & (self.tensor[:, 1] > box_range[1])
                          & (self.tensor[:, 2] > box_range[2])
                          & (self.tensor[:, 0] < box_range[3])
                          & (self.tensor[:, 1] < box_range[4])
                          & (self.tensor[:, 2] < box_range[5]))
        return in_range_flags
```

ObjectRangeFilter

ObjectRangeFilter filters the 3D bounding boxes.

Source code

```python
class ObjectRangeFilter(BaseTransform):
    """Filter objects by the range.

    Required Keys:

    - gt_bboxes_3d

    Modified Keys:

    - gt_bboxes_3d

    Args:
        point_cloud_range (list[float]): Point cloud range.
    """

    def __init__(self, point_cloud_range: List[float]) -> None:
        self.pcd_range = np.array(point_cloud_range, dtype=np.float32)

    def transform(self, input_dict: dict) -> dict:
        """Transform function to filter objects by the range.

        Args:
            input_dict (dict): Result dict from loading pipeline.

        Returns:
            dict: Results after filtering, 'gt_bboxes_3d', 'gt_labels_3d'
            keys are updated in the result dict.
        """
        # Check points instance type and initialise bev_range
        if isinstance(input_dict['gt_bboxes_3d'],
                      (LiDARInstance3DBoxes, DepthInstance3DBoxes)):
            bev_range = self.pcd_range[[0, 1, 3, 4]]
        elif isinstance(input_dict['gt_bboxes_3d'], CameraInstance3DBoxes):
            bev_range = self.pcd_range[[0, 2, 3, 5]]

        gt_bboxes_3d = input_dict['gt_bboxes_3d']
        gt_labels_3d = input_dict['gt_labels_3d']
        mask = gt_bboxes_3d.in_range_bev(bev_range)
        gt_bboxes_3d = gt_bboxes_3d[mask]
        # mask is a torch tensor but gt_labels_3d is still numpy array
        # using mask to index gt_labels_3d will cause bug when
        # len(gt_labels_3d) == 1, where mask=1 will be interpreted
        # as gt_labels_3d[1] and cause out of index error
        gt_labels_3d = gt_labels_3d[mask.numpy().astype(bool)]

        # limit rad to [-pi, pi]
        gt_bboxes_3d.limit_yaw(offset=0.5, period=2 * np.pi)
        input_dict['gt_bboxes_3d'] = gt_bboxes_3d
        input_dict['gt_labels_3d'] = gt_labels_3d

        return input_dict
```

Boxes and labels are filtered by checking whether the gt_bboxes_3d fall inside the BEV range:

```python
    def in_range_bev(
        self, box_range: Union[Tensor, np.ndarray,
                               Sequence[float]]) -> Tensor:
        """Check whether the boxes are in the given range.

        Args:
            box_range (Tensor or np.ndarray or Sequence[float]): The range of
                box in order of (x_min, y_min, x_max, y_max).

        Note:
            The original implementation of SECOND checks whether boxes in a
            range by checking whether the points are in a convex polygon, we
            reduce the burden for simpler cases.

        Returns:
            Tensor: A binary vector indicating whether each box is inside the
            reference range.
        """
        in_range_flags = ((self.bev[:, 0] > box_range[0])
                          & (self.bev[:, 1] > box_range[1])
                          & (self.bev[:, 0] < box_range[2])
                          & (self.bev[:, 1] < box_range[3]))
        return in_range_flags
```

ObjectNameFilter

Filters GT objects by class name: given the list of classes kept for training, it filters all input GT boxes accordingly.

Source code

```python
@TRANSFORMS.register_module()
class ObjectNameFilter(BaseTransform):
    """Filter GT objects by their names.

    Required Keys:

    - gt_labels_3d

    Modified Keys:

    - gt_labels_3d

    Args:
        classes (list[str]): List of class names to be kept for training.
    """

    def __init__(self, classes: List[str]) -> None:
        self.classes = classes
        self.labels = list(range(len(self.classes)))

    def transform(self, input_dict: dict) -> dict:
        """Transform function to filter objects by their names.

        Args:
            input_dict (dict): Result dict from loading pipeline.

        Returns:
            dict: Results after filtering, 'gt_bboxes_3d', 'gt_labels_3d'
            keys are updated in the result dict.
        """
        gt_labels_3d = input_dict['gt_labels_3d']
        # filter the labels here to build the mask
        gt_bboxes_mask = np.array([n in self.labels for n in gt_labels_3d],
                                  dtype=bool)
        input_dict['gt_bboxes_3d'] = input_dict['gt_bboxes_3d'][gt_bboxes_mask]
        input_dict['gt_labels_3d'] = input_dict['gt_labels_3d'][gt_bboxes_mask]

        return input_dict
```

PointShuffle

Randomly permutes the input points.

Source code

```python
@TRANSFORMS.register_module()
class PointShuffle(BaseTransform):
    """Shuffle input points."""

    def transform(self, input_dict: dict) -> dict:
        """Call function to shuffle points.

        Args:
            input_dict (dict): Result dict from loading pipeline.

        Returns:
            dict: Results after filtering, 'points', 'pts_instance_mask'
            and 'pts_semantic_mask' keys are updated in the result dict.
        """
        idx = input_dict['points'].shuffle()
        idx = idx.numpy()

        pts_instance_mask = input_dict.get('pts_instance_mask', None)
        pts_semantic_mask = input_dict.get('pts_semantic_mask', None)

        if pts_instance_mask is not None:
            input_dict['pts_instance_mask'] = pts_instance_mask[idx]

        if pts_semantic_mask is not None:
            input_dict['pts_semantic_mask'] = pts_semantic_mask[idx]

        return input_dict
```

Implemented by calling the points' shuffle method:

```python
    def shuffle(self) -> Tensor:
        """Shuffle the points.

        Returns:
            Tensor: The shuffled index.
        """
        idx = torch.randperm(self.__len__(), device=self.tensor.device)
        self.tensor = self.tensor[idx]
        return idx
```

Pack3DDetInputs

Packs the input data. When a value in the dict is a list, it usually means the sample is going through test-time augmentation.

In recent versions of the code, DefaultFormatBundle3D and Collect3D have been replaced by Pack3DDetInputs, which packs the data into the element format the model consumes as input.
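A typical LiDAR-only entry (a sketch): only the chosen keys end up as model inputs and ground truth, while meta_keys populate the sample's metainfo.

```python
pack_inputs = dict(
    type='Pack3DDetInputs',
    keys=['points', 'gt_bboxes_3d', 'gt_labels_3d'])
```

The packed result then exposes `packed['inputs']['points']` for the model forward and `packed['data_samples']`, a Det3DDataSample carrying gt_instances_3d and the meta info.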

Source code

```python
    def transform(self,
                  results: Union[dict, List[dict]]) -> Union[dict, List[dict]]:
        """Method to pack the input data. When the value in this dict is a
        list, it usually is in Augmentations Testing.

        Args:
            results (dict | list[dict]): Result dict from the data pipeline.

        Returns:
            dict | List[dict]:

            - 'inputs' (dict): The forward data of models. It usually contains
              following keys:

                - points
                - img

            - 'data_samples' (:obj:`Det3DDataSample`): The annotation info of
              the sample.
        """
        # augtest
        if isinstance(results, list):
            if len(results) == 1:
                # simple test
                return self.pack_single_results(results[0])
            pack_results = []
            for single_result in results:
                pack_results.append(self.pack_single_results(single_result))
            return pack_results
        # norm training and simple testing
        elif isinstance(results, dict):
            return self.pack_single_results(results)
        else:
            raise NotImplementedError

    def pack_single_results(self, results: dict) -> dict:
        """Method to pack the single input data. When the value in this dict
        is a list, it usually is in Augmentations Testing.

        Args:
            results (dict): Result dict from the data pipeline.

        Returns:
            dict: A dict contains

            - 'inputs' (dict): The forward data of models. It usually contains
              following keys:

                - points
                - img

            - 'data_samples' (:obj:`Det3DDataSample`): The annotation info
              of the sample.
        """
        # Format 3D data
        if 'points' in results:
            if isinstance(results['points'], BasePoints):
                results['points'] = results['points'].tensor

        if 'img' in results:
            if isinstance(results['img'], list):
                # process multiple imgs in single frame
                imgs = np.stack(results['img'], axis=0)
                if imgs.flags.c_contiguous:
                    imgs = to_tensor(imgs).permute(0, 3, 1, 2).contiguous()
                else:
                    imgs = to_tensor(
                        np.ascontiguousarray(imgs.transpose(0, 3, 1, 2)))
                results['img'] = imgs
            else:
                img = results['img']
                if len(img.shape) < 3:
                    img = np.expand_dims(img, -1)
                # To improve the computational speed by 3-5 times, apply:
                # `torch.permute()` rather than `np.transpose()`.
                # Refer to https://github.com/open-mmlab/mmdetection/pull/9533
                # for more details
                if img.flags.c_contiguous:
                    img = to_tensor(img).permute(2, 0, 1).contiguous()
                else:
                    img = to_tensor(
                        np.ascontiguousarray(img.transpose(2, 0, 1)))
                results['img'] = img

        for key in [
                'proposals', 'gt_bboxes', 'gt_bboxes_ignore', 'gt_labels',
                'gt_bboxes_labels', 'attr_labels', 'pts_instance_mask',
                'pts_semantic_mask', 'centers_2d', 'depths', 'gt_labels_3d'
        ]:
            if key not in results:
                continue
            if isinstance(results[key], list):
                results[key] = [to_tensor(res) for res in results[key]]
            else:
                results[key] = to_tensor(results[key])
        if 'gt_bboxes_3d' in results:
            if not isinstance(results['gt_bboxes_3d'], BaseInstance3DBoxes):
                results['gt_bboxes_3d'] = to_tensor(results['gt_bboxes_3d'])

        if 'gt_semantic_seg' in results:
            results['gt_semantic_seg'] = to_tensor(
                results['gt_semantic_seg'][None])
        if 'gt_seg_map' in results:
            results['gt_seg_map'] = results['gt_seg_map'][None, ...]

        data_sample = Det3DDataSample()
        gt_instances_3d = InstanceData()
        gt_instances = InstanceData()
        gt_pts_seg = PointData()

        data_metas = {}
        for key in self.meta_keys:
            if key in results:
                data_metas[key] = results[key]
            elif 'images' in results:
                if len(results['images'].keys()) == 1:
                    cam_type = list(results['images'].keys())[0]
                    # single-view image
                    if key in results['images'][cam_type]:
                        data_metas[key] = results['images'][cam_type][key]
                else:
                    # multi-view image
                    img_metas = []
                    cam_types = list(results['images'].keys())
                    for cam_type in cam_types:
                        if key in results['images'][cam_type]:
                            img_metas.append(results['images'][cam_type][key])
                    if len(img_metas) > 0:
                        data_metas[key] = img_metas
            elif 'lidar_points' in results:
                if key in results['lidar_points']:
                    data_metas[key] = results['lidar_points'][key]
        data_sample.set_metainfo(data_metas)

        inputs = {}
        for key in self.keys:
            if key in results:
                if key in self.INPUTS_KEYS:
                    inputs[key] = results[key]
                elif key in self.INSTANCEDATA_3D_KEYS:
                    gt_instances_3d[self._remove_prefix(key)] = results[key]
                elif key in self.INSTANCEDATA_2D_KEYS:
                    if key == 'gt_bboxes_labels':
                        gt_instances['labels'] = results[key]
                    else:
                        gt_instances[self._remove_prefix(key)] = results[key]
                elif key in self.SEG_KEYS:
                    gt_pts_seg[self._remove_prefix(key)] = results[key]
                else:
                    raise NotImplementedError(f'Please modified '
                                              f'`Pack3DDetInputs` '
                                              f'to put {key} to '
                                              f'corresponding field')

        data_sample.gt_instances_3d = gt_instances_3d
        data_sample.gt_instances = gt_instances
        data_sample.gt_pts_seg = gt_pts_seg
        if 'eval_ann_info' in results:
            data_sample.eval_ann_info = results['eval_ann_info']
        else:
            data_sample.eval_ann_info = None

        packed_results = dict()
        packed_results['data_samples'] = data_sample
        packed_results['inputs'] = inputs

        return packed_results
```

Using the preset keys and meta_keys, it organizes the results and emits the final dict produced by the data pipeline:

```python
keys = ['proposals', 'gt_bboxes', 'gt_bboxes_ignore', 'gt_labels',
        'gt_bboxes_labels', 'attr_labels', 'pts_instance_mask',
        'pts_semantic_mask', 'centers_2d', 'depths', 'gt_labels_3d']

meta_keys: tuple = ('img_path', 'ori_shape', 'img_shape', 'lidar2img',
                    'depth2img', 'cam2img', 'pad_shape',
                    'scale_factor', 'flip', 'pcd_horizontal_flip',
                    'pcd_vertical_flip', 'box_mode_3d', 'box_type_3d',
                    'img_norm_cfg', 'num_pts_feats', 'pcd_trans',
                    'sample_idx', 'pcd_scale_factor', 'pcd_rotation',
                    'pcd_rotation_angle', 'lidar_path',
                    'transformation_3d_flow', 'trans_mat',
                    'affine_aug', 'sweep_img_metas', 'ori_cam2img',
                    'cam2global', 'crop_offset', 'img_crop_offset',
                    'resize_img_shape', 'lidar2cam', 'ori_lidar2img',
                    'num_ref_frames', 'num_views', 'ego2global')
```
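Putting it all together, a typical KITTI LiDAR-only train pipeline using the transforms covered above. The values follow the common mmdet3d KITTI configs and `db_sampler` is the dict sketched earlier; treat this as a sketch rather than the one canonical setting.

```python
point_cloud_range = [0, -40, -3, 70.4, 40, 1]
class_names = ['Pedestrian', 'Cyclist', 'Car']

train_pipeline = [
    dict(type='LoadPointsFromFile', coord_type='LIDAR', load_dim=4, use_dim=4),
    dict(type='LoadAnnotations3D', with_bbox_3d=True, with_label_3d=True),
    dict(type='ObjectSample', db_sampler=db_sampler),  # db_sampler as above
    dict(type='ObjectNoise', num_try=100, translation_std=[1.0, 1.0, 0.5],
         global_rot_range=[0.0, 0.0], rot_range=[-0.78539816, 0.78539816]),
    dict(type='RandomFlip3D', flip_ratio_bev_horizontal=0.5),
    dict(type='GlobalRotScaleTrans', rot_range=[-0.78539816, 0.78539816],
         scale_ratio_range=[0.95, 1.05]),
    dict(type='PointsRangeFilter', point_cloud_range=point_cloud_range),
    dict(type='ObjectRangeFilter', point_cloud_range=point_cloud_range),
    dict(type='ObjectNameFilter', classes=class_names),
    dict(type='PointShuffle'),
    dict(type='Pack3DDetInputs', keys=['points', 'gt_bboxes_3d', 'gt_labels_3d'])
]
```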

