mmtrack.apis¶
mmtrack.datasets¶
datasets¶
api_wrappers¶
- class mmtrack.datasets.api_wrappers.CocoVID(*args: Any, **kwargs: Any)[source]¶
Inherits the official COCO class in order to parse the annotations of bbox-related video tasks.
- Parameters
annotation_file (str) – location of annotation file. Defaults to None.
load_img_as_vid (bool) – If True, convert image data to video data, which means each image is converted to a video. Defaults to False.
- get_img_ids_from_ins_id(insId)[source]¶
Get image ids from given instance id.
- Parameters
insId (int) – The given instance id.
- Returns
Image ids of given instance id.
- Return type
list[int]
- get_img_ids_from_vid(vidId)[source]¶
Get image ids from given video id.
- Parameters
vidId (int) – The given video id.
- Returns
Image ids of given video id.
- Return type
list[int]
- get_ins_ids_from_vid(vidId)[source]¶
Get instance ids from given video id.
- Parameters
vidId (int) – The given video id.
- Returns
Instance ids of given video id.
- Return type
list[int]
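The three lookups above behave like inverted indexes over the annotation file. The following is a minimal pure-Python sketch of that idea, not the CocoVID implementation: the helper `build_indexes` and the field names `video_id`, `image_id` and `instance_id` are assumptions made for illustration.

```python
from collections import defaultdict


def build_indexes(images, annotations):
    """Build video -> images, instance -> images and video -> instances maps.

    `images` and `annotations` are plain dicts with assumed field names.
    """
    vid_to_imgs = defaultdict(list)
    ins_to_imgs = defaultdict(list)
    vid_to_ins = defaultdict(list)
    img_to_vid = {}

    for img in images:
        vid_to_imgs[img["video_id"]].append(img["id"])
        img_to_vid[img["id"]] = img["video_id"]

    for ann in annotations:
        ins_id = ann["instance_id"]
        img_id = ann["image_id"]
        # Record each image once per instance.
        if img_id not in ins_to_imgs[ins_id]:
            ins_to_imgs[ins_id].append(img_id)
        # Record each instance once per video.
        vid = img_to_vid[img_id]
        if ins_id not in vid_to_ins[vid]:
            vid_to_ins[vid].append(ins_id)

    return vid_to_imgs, ins_to_imgs, vid_to_ins


images = [
    {"id": 1, "video_id": 10},
    {"id": 2, "video_id": 10},
    {"id": 3, "video_id": 11},
]
annotations = [
    {"image_id": 1, "instance_id": 100},
    {"image_id": 2, "instance_id": 100},
    {"image_id": 2, "instance_id": 101},
    {"image_id": 3, "instance_id": 102},
]
vid_to_imgs, ins_to_imgs, vid_to_ins = build_indexes(images, annotations)
```

Here `vid_to_imgs[10]` plays the role of `get_img_ids_from_vid(10)`, `ins_to_imgs[100]` that of `get_img_ids_from_ins_id(100)`, and `vid_to_ins[10]` that of `get_ins_ids_from_vid(10)`.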
samplers¶
- class mmtrack.datasets.samplers.EntireVideoBatchSampler(sampler: torch.utils.data.sampler.Sampler, batch_size: int = 1, drop_last: bool = False)[source]¶
A sampler wrapper for grouping images from one video into the same batch.
- Parameters
sampler (Sampler) – Base sampler.
batch_size (int) – Size of mini-batch. Here, we take a video as a batch. Defaults to 1.
drop_last (bool) – If True, the sampler will drop the last batch if its size would be less than batch_size. Defaults to False.
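The grouping idea can be sketched in plain Python. This is not the sampler's source: it assumes the base sampler yields `(video_idx, frame_idx)` pairs with the frames of one video arriving contiguously, and the helper name `group_by_video` is hypothetical.

```python
def group_by_video(indices):
    """Yield one batch per video from (video_idx, frame_idx) pairs.

    Assumes frames belonging to the same video are contiguous in `indices`.
    """
    batch = []
    for video_idx, frame_idx in indices:
        # A change of video closes the current batch.
        if batch and batch[-1][0] != video_idx:
            yield batch
            batch = []
        batch.append((video_idx, frame_idx))
    # Emit the trailing batch, if any.
    if batch:
        yield batch


indices = [(0, 0), (0, 1), (0, 2), (1, 0), (1, 1)]
batches = list(group_by_video(indices))
```

Each yielded batch holds every frame of exactly one video, which is why a video is treated as a batch.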
- class mmtrack.datasets.samplers.QuotaSampler(dataset: Sized, samples_per_epoch: int, replacement: bool = False, seed: int = 0)[source]¶
Sampler that draws a fixed number of samples per epoch.
It is especially useful in conjunction with torch.nn.parallel.DistributedDataParallel. In such a case, each process can pass a DistributedSampler instance as a DataLoader sampler, and load a subset of the original dataset that is exclusive to it.
Note: The dataset is assumed to be of constant size.
- Parameters
dataset (Sized) – Dataset used for sampling.
samples_per_epoch (int) – The number of samples per epoch.
replacement (bool) – Samples are drawn with replacement if True. Defaults to False.
seed (int, optional) – Random seed used to shuffle the sampler if shuffle=True. This number should be identical across all processes in the distributed group. Defaults to 0.
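A fixed per-epoch quota can be illustrated with the standard library alone. The sketch below is an assumption-laden stand-in rather than QuotaSampler itself: the function `quota_indices`, the `epoch` argument, and the choice to repeat shuffled passes when the quota exceeds the dataset size are all hypothetical.

```python
import random


def quota_indices(dataset_size, samples_per_epoch, replacement=False,
                  seed=0, epoch=0):
    """Return exactly `samples_per_epoch` dataset indices for one epoch."""
    # Seeding with seed + epoch keeps every process in agreement (assumption).
    rng = random.Random(seed + epoch)
    if replacement:
        return [rng.randrange(dataset_size) for _ in range(samples_per_epoch)]

    # Without replacement: concatenate shuffled passes until the quota is met.
    indices = []
    while len(indices) < samples_per_epoch:
        perm = list(range(dataset_size))
        rng.shuffle(perm)
        indices.extend(perm)
    return indices[:samples_per_epoch]


small_quota = quota_indices(10, 4, seed=0)
large_quota = quota_indices(10, 25, seed=0)
with_replacement = quota_indices(10, 25, replacement=True, seed=0)
```

Whatever the dataset size, every epoch sees the same number of samples, which keeps epoch length constant.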
transforms¶
mmtrack.engine¶
hooks¶
- class mmtrack.engine.hooks.SiamRPNBackboneUnfreezeHook(backbone_start_train_epoch: int = 10, backbone_train_layers: List = ['layer2', 'layer3', 'layer4'])[source]¶
Start to train the backbone of SiamRPN++ from a certain epoch.
- Parameters
backbone_start_train_epoch (int) – Start to train the backbone at backbone_start_train_epoch-th epoch. Note the epoch in this class counts from 0, while the epoch in the log file counts from 1.
backbone_train_layers (list(str)) – List of str denoting the stages of the backbone that need to be trained.
- class mmtrack.engine.hooks.TrackVisualizationHook(draw: bool = False, interval: int = 30, score_thr: float = 0.3, show: bool = False, wait_time: float = 0.0, test_out_dir: Optional[str] = None, file_client_args: dict = {'backend': 'disk'})[source]¶
Tracking Visualization Hook. Used to visualize validation and testing process prediction results.
In the testing phase:
- If show is True, only the prediction results are visualized without storing data, so vis_backends needs to be excluded.
- If test_out_dir is specified, the prediction results are saved to test_out_dir. To avoid vis_backends also storing data, vis_backends needs to be excluded.
- vis_backends takes effect if the user specifies neither show nor test_out_dir. You can set vis_backends to WandbVisBackend or TensorboardVisBackend to store the prediction results in Wandb or Tensorboard.
- Parameters
draw (bool) – whether to draw prediction results. If it is False, it means that no drawing will be done. Defaults to False.
interval (int) – The interval of visualization. Defaults to 30.
score_thr (float) – The threshold to visualize the bboxes and masks. Defaults to 0.3.
show (bool) – Whether to display the drawn image. Defaults to False.
wait_time (float) – The interval of show (s). Defaults to 0.
test_out_dir (str, optional) – directory where painted images will be saved in testing process.
file_client_args (dict) – Arguments to instantiate a FileClient. See mmcv.fileio.FileClient for details. Defaults to dict(backend='disk').
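The testing-phase rules described for this hook can be summarised as a small decision function. This sketch only mirrors the documented rules; it is not taken from the hook's source, and the function `resolve_test_output` and its dictionary keys are hypothetical.

```python
from typing import Optional


def resolve_test_output(show: bool, test_out_dir: Optional[str]) -> dict:
    """Summarise which outputs the documented rules call for in testing."""
    return {
        # show=True: results are displayed on screen.
        "display": show,
        # test_out_dir given: painted images are written to that directory.
        "save_to_dir": test_out_dir is not None,
        # vis_backends are meant to apply only when neither option is set.
        "use_vis_backends": (not show) and test_out_dir is None,
    }


only_show = resolve_test_output(show=True, test_out_dir=None)
only_dir = resolve_test_output(show=False, test_out_dir="work_dirs/vis")
neither = resolve_test_output(show=False, test_out_dir=None)
```

The path `work_dirs/vis` is a placeholder chosen for the example.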
- after_test_iter(runner: mmengine.runner.runner.Runner, batch_idx: int, data_batch: dict, outputs: Sequence[mmtrack.structures.track_data_sample.TrackDataSample]) → None[source]¶
Run after every testing iteration.
- Parameters
runner (Runner) – The runner of the testing process.
batch_idx (int) – The index of the current batch in the test loop.
data_batch (dict) – Data from dataloader.
outputs (Sequence[TrackDataSample]) – Outputs from model.
- after_val_iter(runner: mmengine.runner.runner.Runner, batch_idx: int, data_batch: dict, outputs: Sequence[mmtrack.structures.track_data_sample.TrackDataSample]) → None[source]¶
Run after every self.interval validation iterations.
- Parameters
runner (Runner) – The runner of the validation process.
batch_idx (int) – The index of the current batch in the val loop.
data_batch (dict) – Data from dataloader.
outputs (Sequence[TrackDataSample]) – Outputs from model.
- class mmtrack.engine.hooks.YOLOXModeSwitchHook(num_last_epochs: int = 15, skip_type_keys: Sequence[str] = ('Mosaic', 'RandomAffine', 'MixUp'))[source]¶
Switch the mode of YOLOX during training.
This hook turns off the mosaic and mixup data augmentation and switches to use L1 loss in bbox_head.
The difference between this class and the one in mmdet is that the mmdet class uses model.bbox_head.use_l1=True to switch mode, while this class first checks whether the model contains a detector module, then uses model.detector.bbox_head.use_l1=True or model.bbox_head.use_l1=True accordingly.
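The detector check described above can be sketched with stand-in objects. This is an illustration under assumptions, not the hook's code: `enable_l1` is a hypothetical helper and `SimpleNamespace` instances replace real models.

```python
from types import SimpleNamespace


def enable_l1(model):
    """Turn on L1 loss on whichever bbox_head the model exposes."""
    # Tracking models wrap the detector; plain detectors expose the head directly.
    if hasattr(model, "detector"):
        head = model.detector.bbox_head
    else:
        head = model.bbox_head
    head.use_l1 = True
    return head


# A tracker-style model that nests the detector.
tracker = SimpleNamespace(
    detector=SimpleNamespace(bbox_head=SimpleNamespace(use_l1=False)))
# A detector-style model with the head at the top level.
detector = SimpleNamespace(bbox_head=SimpleNamespace(use_l1=False))

enable_l1(tracker)
enable_l1(detector)
```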
schedulers¶
- class mmtrack.engine.schedulers.SiamRPNExpLR(optimizer, *args, **kwargs)[source]¶
Decays the parameter value of each parameter group by an exponentially changing small multiplicative factor until the number of epochs reaches a pre-defined milestone: end. Notice that such decay can happen simultaneously with other changes to the parameter value from outside this scheduler.
\[X_{t} = X_{t-1} \times (\frac{end}{begin})^{\frac{1}{epochs}}\]
- Parameters
optimizer (Optimizer) – Wrapped optimizer.
start_factor (float) – The number we multiply the parameter value by in the first epoch. The multiplication factor changes towards end_factor in the following epochs. Defaults to 0.1.
end_factor (float) – The number we multiply the parameter value by at the end of the changing process. Defaults to 1.0.
begin (int) – Step at which to start updating the parameters. Defaults to 0.
end (int) – Step at which to stop updating the parameters. Defaults to INF.
endpoint (bool) – If True, end_factor is included in the end. Otherwise, it is not included. Defaults to True.
last_step (int) – The index of last step. Used for resume without state dict. Defaults to -1.
by_epoch (bool) – Whether the scheduled parameters are updated by epochs. Defaults to True.
verbose (bool) – Whether to print the value for each update. Defaults to False.
- class mmtrack.engine.schedulers.SiamRPNExpParamScheduler(optimizer: torch.optim.optimizer.Optimizer, param_name: str, start_factor: float = 0.1, end_factor: float = 1.0, begin: int = 0, end: int = 1000000000, endpoint: bool = True, last_step: int = - 1, by_epoch: bool = True, verbose: bool = False)[source]¶
Decays the parameter value of each parameter group by an exponentially changing small multiplicative factor until the number of epochs reaches a pre-defined milestone: end. Notice that such decay can happen simultaneously with other changes to the parameter value from outside this scheduler.
\[X_{t} = X_{t-1} \times (\frac{end}{begin})^{\frac{1}{epochs}}\]
- Parameters
optimizer (Optimizer) – Wrapped optimizer.
param_name (str) – Name of the parameter to be adjusted, such as lr, momentum.
start_factor (float) – The number we multiply the parameter value by in the first epoch. The multiplication factor changes towards end_factor in the following epochs. Defaults to 0.1.
end_factor (float) – The number we multiply the parameter value by at the end of the changing process. Defaults to 1.0.
begin (int) – Step at which to start updating the parameters. Defaults to 0.
end (int) – Step at which to stop updating the parameters. Defaults to INF.
endpoint (bool) – If True, end_factor is included in the end. Otherwise, it is not included. Defaults to True.
last_step (int) – The index of last step. Used for resume without state dict. Defaults to -1.
by_epoch (bool) – Whether the scheduled parameters are updated by epochs. Defaults to True.
verbose (bool) – Whether to print the value for each update. Defaults to False.
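The update rule multiplies the value by a constant ratio at every step, so the factors form a geometric sequence. The sketch below works that out in plain Python under two assumptions that the text does not confirm: the ratio in the formula is read as end_factor / start_factor, and the end point is included, so `num_steps` values are separated by `num_steps - 1` multiplications. The function `exp_factors` is hypothetical.

```python
def exp_factors(start_factor=0.1, end_factor=1.0, num_steps=5):
    """Return factors moving geometrically from start_factor to end_factor."""
    if num_steps < 2:
        raise ValueError("num_steps must be at least 2")
    # Constant per-step ratio: (end / start) ** (1 / number_of_updates).
    ratio = (end_factor / start_factor) ** (1.0 / (num_steps - 1))
    factors = [start_factor]
    for _ in range(num_steps - 1):
        factors.append(factors[-1] * ratio)
    return factors


factors = exp_factors(start_factor=0.1, end_factor=1.0, num_steps=5)
# Consecutive factors differ by the same multiplicative ratio.
ratios = [b / a for a, b in zip(factors, factors[1:])]
```

Because the ratio is constant, the schedule is a straight line on a logarithmic axis, in contrast to a linear warm-up.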
- classmethod build_iter_from_epoch(*args, begin: int = 0, end: int = 1000000000, by_epoch: bool = True, epoch_length: Optional[int] = None, **kwargs)[source]¶
Build an iter-based instance of this scheduler from an epoch-based config.
- Parameters
begin (int, optional) – Step at which to start updating the parameters. Defaults to 0.
end (int, optional) – Step at which to stop updating the parameters. Defaults to INF.
by_epoch (bool, optional) – Whether the scheduled parameters are updated by epochs. Defaults to True.
epoch_length (Optional[int], optional) – The length of each epoch. Defaults to None.
- Returns
The instantiated object of SiamRPNExpParamScheduler.
- Return type
Object
mmtrack.evaluation¶
functional¶
metrics¶
mmtrack.models¶
aggregators¶
backbones¶
data_preprocessors¶
filter¶
layers¶
losses¶
mot¶
motion¶
reid¶
roi_heads¶
sot¶
task_modules¶
track_heads¶
trackers¶
vid¶
vis¶
mmtrack.structures¶
structures¶
- class mmtrack.structures.ReIDDataSample(*, metainfo: Optional[dict] = None, **kwargs)[source]¶
A data structure interface of ReID task.
It is used as an interface between different components.
- Meta field:
- img_shape (Tuple): The shape of the corresponding input image. Used for visualization.
- ori_shape (Tuple): The original shape of the corresponding image. Used for visualization.
- num_classes (int): The number of all categories. Used for label format conversion.
- Data field:
- gt_label (LabelData): The ground truth label.
- pred_label (LabelData): The predicted label.
- scores (torch.Tensor): The outputs of model.
- set_gt_label(value: Union[numpy.ndarray, torch.Tensor, Sequence[numbers.Number], numbers.Number]) → mmtrack.structures.reid_data_sample.ReIDDataSample[source]¶
Set label of gt_label.
- set_gt_score(value: torch.Tensor) → mmtrack.structures.reid_data_sample.ReIDDataSample[source]¶
Set score of gt_label.
- class mmtrack.structures.TrackDataSample(*, metainfo: Optional[dict] = None, **kwargs)[source]¶
A data structure interface of MMTracking. It is used as an interface between different components.
The attributes in TrackDataSample are divided into several parts:
- gt_instances (InstanceData): Ground truth of instance annotations in key frames.
- ignored_instances (InstanceData): Instances to be ignored during training/testing in key frames.
- proposals (InstanceData): Region proposals used in two-stage detectors in key frames.
- ref_gt_instances (InstanceData): Ground truth of instance annotations in reference frames.
- ref_ignored_instances (InstanceData): Instances to be ignored during training/testing in reference frames.
- ref_proposals (InstanceData): Region proposals used in two-stage detectors in reference frames.
- pred_det_instances (InstanceData): Detection instances of model predictions in key frames.
- pred_track_instances (InstanceData): Tracking instances of model predictions in key frames.
bbox¶
mmtrack.utils¶
utils¶
- class mmtrack.utils.DataLoaderBenchmark(cfg: mmengine.config.config.Config, distributed: bool, dataset_type: str, max_iter: int = 2000, log_interval: int = 50, num_warmup: int = 5, logger: Optional[mmengine.logging.logger.MMLogger] = None)[source]¶
The dataloader benchmark class. It reports statistics on inference FPS and CPU memory usage.
- Parameters
cfg (mmengine.Config) – config.
distributed (bool) – distributed testing flag.
dataset_type (str) – benchmark data type, only supports train, val and test.
max_iter (int) – maximum iterations of benchmark. Defaults to 2000.
log_interval (int) – interval of logging. Defaults to 50.
num_warmup (int) – Number of warmup iterations. Defaults to 5.
logger (MMLogger, optional) – Formatted logger used to record messages.
- class mmtrack.utils.DatasetBenchmark(cfg: mmengine.config.config.Config, dataset_type: str, max_iter: int = 2000, log_interval: int = 50, num_warmup: int = 5, logger: Optional[mmengine.logging.logger.MMLogger] = None)[source]¶
The dataset benchmark class. It reports statistics on inference FPS, per-transform FPS and CPU memory usage.
- Parameters
cfg (mmengine.Config) – config.
dataset_type (str) – benchmark data type, only supports train, val and test.
max_iter (int) – maximum iterations of benchmark. Defaults to 2000.
log_interval (int) – interval of logging. Defaults to 50.
num_warmup (int) – Number of warmup iterations. Defaults to 5.
logger (MMLogger, optional) – Formatted logger used to record messages.
- class mmtrack.utils.InferenceBenchmark(cfg: mmengine.config.config.Config, checkpoint: str, distributed: bool, is_fuse_conv_bn: bool, max_iter: int = 2000, log_interval: int = 50, num_warmup: int = 5, logger: Optional[mmengine.logging.logger.MMLogger] = None)[source]¶
The inference benchmark class. It reports statistics on inference FPS, CUDA memory and CPU memory usage.
- Parameters
cfg (mmengine.Config) – config.
checkpoint (str) – Accepts a local filepath, URL, torchvision://xxx, or open-mmlab://xxx.
distributed (bool) – distributed testing flag.
is_fuse_conv_bn (bool) – Whether to fuse conv and bn. This will slightly increase the inference speed.
max_iter (int) – maximum iterations of benchmark. Defaults to 2000.
log_interval (int) – interval of logging. Defaults to 50.
num_warmup (int) – Number of warmup iterations. Defaults to 5.
logger (MMLogger, optional) – Formatted logger used to record messages.
- mmtrack.utils.convert_data_sample_type(data_sample: mmtrack.structures.track_data_sample.TrackDataSample, num_ref_imgs: int = 1) → Tuple[List[mmtrack.structures.track_data_sample.TrackDataSample], List[dict]][source]¶
Convert the type of data_sample from dict[list] to list[dict].
Note: This function is mainly used to be compatible with the interface of MMDetection. It makes sure that the information of each reference image can be independently packed into data_sample in which all the keys are without prefix “ref_”.
- Parameters
data_sample (TrackDataSample) – Data sample input.
num_ref_imgs (int, optional) – The number of reference images in the data_sample. Defaults to 1.
- Returns
The first element is the list of TrackDataSample objects. The second element is the list of meta information of reference images.
- Return type
Tuple[List[TrackDataSample], List[dict]]
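The dict[list] to list[dict] conversion can be illustrated with plain dictionaries standing in for TrackDataSample. This is a sketch of the general idea only: the helper `split_ref_fields` and the example keys are hypothetical, and the real function also returns data samples rather than bare dicts.

```python
def split_ref_fields(sample: dict, num_ref_imgs: int = 1):
    """Turn {'ref_x': [a, b]} into [{'x': a}, {'x': b}].

    Keys without the 'ref_' prefix belong to the key frame and are skipped.
    """
    per_ref = [dict() for _ in range(num_ref_imgs)]
    for key, values in sample.items():
        if not key.startswith("ref_"):
            continue
        new_key = key[len("ref_"):]  # strip the prefix
        for i in range(num_ref_imgs):
            per_ref[i][new_key] = values[i]
    return per_ref


sample = {"img_id": 7, "ref_img_id": [8, 9], "ref_frame_id": [1, 2]}
refs = split_ref_fields(sample, num_ref_imgs=2)
```

After the call, each reference image has its own dictionary whose keys no longer carry the “ref_” prefix.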
- mmtrack.utils.crop_image(image, crop_region, crop_size, padding=(0, 0, 0))[source]¶
Crop image based on crop_region and crop_size.
- Parameters
image (ndarray) – of shape (H, W, 3).
crop_region (ndarray) – of shape (4, ) in [x1, y1, x2, y2] format.
crop_size (int) – Crop size.
padding (tuple | ndarray) – of shape (3, ) denoting the padding values.
- Returns
Cropped image of shape (crop_size, crop_size, 3).
- Return type
ndarray
- mmtrack.utils.format_video_level_show(video_names: List, eval_results: List[numpy.ndarray], sort_by_first_metric: bool = True, show_indices: Optional[Tuple[int, List]] = None) → List[List][source]¶
Format video-level performance show.
- Parameters
video_names (List) – The names of the videos.
eval_results (List[np.ndarray]) – The evaluation results.
sort_by_first_metric (bool, optional) – Whether to sort the results by the first metric. Defaults to True.
show_indices (Optional[Tuple[int, List]], optional) – The video indices to be shown. Defaults to None, i.e., all videos.
- Returns
The formatted video-level evaluation results. For example:
[[video-2, 48.2, 49.2, 51.9],
[video-1, 46.2, 48.2, 50.2]]
- Return type
List[List]
- mmtrack.utils.gauss_blur(image: torch.Tensor, kernel_size: Sequence, sigma: Sequence) → torch.Tensor[source]¶
The gauss blur transform.
- Parameters
image (Tensor) – of shape (n, c, h, w)
kernel_size (Sequence) – The argument kernel size for gauss blur.
sigma (Sequence) – The argument sigma for gauss blur.
- Returns
The blurred image.
- Return type
Tensor
- mmtrack.utils.imrenormalize(img: Union[torch.Tensor, numpy.ndarray], img_norm_cfg: dict, new_img_norm_cfg: dict) → Union[torch.Tensor, numpy.ndarray][source]¶
Re-normalize the image.
- Parameters
img (Tensor | ndarray) – Input image. If the input is a Tensor, the shape is (1, C, H, W). If the input is a ndarray, the shape is (H, W, C).
img_norm_cfg (dict) – Original configuration for the normalization.
new_img_norm_cfg (dict) – New configuration for the normalization.
- Returns
Output image with the same type and shape of the input.
- Return type
Tensor | ndarray
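Re-normalization amounts to undoing the original normalization and applying the new one. The sketch below shows the arithmetic on a single pixel in plain Python. It assumes each config carries per-channel `mean` and `std` entries and ignores any channel-order option; the helper `renormalize` and the numeric values are chosen for the example only.

```python
def renormalize(pixel, old_cfg, new_cfg):
    """Map a pixel normalized with old_cfg to the normalization of new_cfg."""
    out = []
    for value, o_mean, o_std, n_mean, n_std in zip(
            pixel, old_cfg["mean"], old_cfg["std"],
            new_cfg["mean"], new_cfg["std"]):
        raw = value * o_std + o_mean        # undo the old normalization
        out.append((raw - n_mean) / n_std)  # apply the new normalization
    return out


old_cfg = {"mean": [123.675, 116.28, 103.53], "std": [58.395, 57.12, 57.375]}
new_cfg = {"mean": [0.0, 0.0, 0.0], "std": [255.0, 255.0, 255.0]}

raw_pixel = [200.0, 100.0, 50.0]
normalized = [(p - m) / s
              for p, m, s in zip(raw_pixel, old_cfg["mean"], old_cfg["std"])]
renormalized = renormalize(normalized, old_cfg, new_cfg)
```

The result equals what normalizing the raw pixel directly with the new config would give.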
- mmtrack.utils.imshow_mot_errors(*args, backend: str = 'cv2', **kwargs)[source]¶
Show the wrong tracks on the input image.
- Parameters
backend (str, optional) – Backend of visualization. Defaults to ‘cv2’.
- mmtrack.utils.max_last2d(input: torch.Tensor) → Tuple[torch.Tensor, torch.Tensor][source]¶
Computes the value and position of maximum in the last two dimensions.
- Parameters
input (Tensor) – of shape (…, H, W)
- Returns
max_val (Tensor): The maximum value.
argmax (Tensor): The position of maximum in [row, col] format.
- Return type
Tuple[Tensor, Tensor]
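One common way to find a 2-D maximum is to flatten the last two dimensions, take the argmax, and convert the flat index back to a row and column. The sketch below does this for a nested list; it illustrates the idea only and is not the tensor implementation. The helper `max_last2d_list` is hypothetical.

```python
def max_last2d_list(grid):
    """Return (max_value, [row, col]) for a 2-D list of numbers."""
    width = len(grid[0])
    # Flatten row by row so that flat_idx = row * width + col.
    flat = [value for row in grid for value in row]
    flat_idx = max(range(len(flat)), key=flat.__getitem__)
    row, col = divmod(flat_idx, width)
    return flat[flat_idx], [row, col]


value, position = max_last2d_list([[1, 5, 2],
                                   [7, 3, 9],
                                   [4, 8, 6]])
```

For the grid above, the maximum 9 sits at flat index 5, which `divmod(5, 3)` maps to row 1, column 2.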
- mmtrack.utils.plot_norm_precision_curve(norm_precision: numpy.ndarray, tracker_names: List, plot_opts: Optional[dict] = None, plot_save_path: Optional[str] = None, show: bool = False)[source]¶
Plot curves of Norm Precision for SOT.
- Parameters
norm_precision (np.ndarray) – The content of visualized indicators. It has shape (N, M), where N is the number of trackers and M is the number of Norm Precision values corresponding to the X axis.
tracker_names (List) – The names of trackers.
plot_opts (Optional[dict], optional) – The options for plot. Defaults to None.
plot_save_path (Optional[str], optional) – The saved path of the figure. Defaults to None.
show (bool, optional) – Whether to show. Defaults to False.
- mmtrack.utils.plot_precision_curve(precision: numpy.ndarray, tracker_names: List, plot_opts: Optional[dict] = None, plot_save_path: Optional[str] = None, show: bool = False)[source]¶
Plot curves of Precision for SOT.
- Parameters
precision (np.ndarray) – The content of visualized indicators. It has shape (N, M), where N is the number of trackers and M is the number of Precision values corresponding to the X axis.
tracker_names (List) – The names of trackers.
plot_opts (Optional[dict], optional) – The options for plot. Defaults to None.
plot_save_path (Optional[str], optional) – The saved path of the figure. Defaults to None.
show (bool, optional) – Whether to show. Defaults to False.
- mmtrack.utils.plot_success_curve(success: numpy.ndarray, tracker_names: List, plot_opts: Optional[dict] = None, plot_save_path: Optional[str] = None, show: bool = False)[source]¶
Plot curves of Success for SOT.
- Parameters
success (np.ndarray) – The content of visualized indicators. It has shape (N, M), where N is the number of trackers and M is the number of Success values corresponding to the X axis.
tracker_names (List) – The names of trackers.
plot_opts (Optional[dict], optional) – The options for plot. Defaults to None.
plot_save_path (Optional[str], optional) – The saved path of the figure. Defaults to None.
show (bool, optional) – Whether to show. Defaults to False.
- mmtrack.utils.register_all_modules(init_default_scope: bool = True) → None[source]¶
Register all modules in mmtrack into the registries.
- Parameters
init_default_scope (bool) – Whether to initialize the mmtrack default scope. When init_default_scope=True, the global default scope will be set to mmtrack, and all registries will build modules from mmtrack’s registry node. Defaults to True. To understand more about the registry, please refer to https://github.com/open-mmlab/mmengine/blob/main/docs/en/tutorials/registry.md
- mmtrack.utils.stack_batch(tensors: List[torch.Tensor], pad_size_divisor: int = 0, pad_value: Union[int, float] = 0) → torch.Tensor[source]¶
Stack multiple tensors to form a batch, padding the images to the maximum shape among them using right-bottom padding. If pad_size_divisor > 0, add padding to ensure the common height and width are divisible by pad_size_divisor.
- Parameters
tensors (List[Tensor]) – The input tensors. Each is a TCHW 4D-tensor. T denotes the number of key/reference frames.
pad_size_divisor (int) – If pad_size_divisor > 0, add padding to ensure the common height and width are divisible by pad_size_divisor. This depends on the model, and many models need a divisibility of 32. Defaults to 0.
pad_value (int, float) – The padding value. Defaults to 0.
- Returns
The NTCHW 5D-tensor. N denotes the batch size.
- Return type
Tensor
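The padded size follows from the largest height and width, rounded up to a multiple of pad_size_divisor. The sketch below computes only the shapes and padding amounts in plain Python; it does not touch tensors, and the helpers `padded_size` and `right_bottom_padding` are hypothetical.

```python
import math


def padded_size(shapes, pad_size_divisor=0):
    """Return the common (height, width) every image is padded to."""
    max_h = max(h for h, _ in shapes)
    max_w = max(w for _, w in shapes)
    if pad_size_divisor > 0:
        # Round each side up to the next multiple of the divisor.
        max_h = math.ceil(max_h / pad_size_divisor) * pad_size_divisor
        max_w = math.ceil(max_w / pad_size_divisor) * pad_size_divisor
    return max_h, max_w


def right_bottom_padding(shape, target):
    """Return (pad_bottom, pad_right) needed to grow `shape` to `target`."""
    h, w = shape
    return target[0] - h, target[1] - w


shapes = [(100, 150), (120, 130)]
target = padded_size(shapes, pad_size_divisor=32)
pads = [right_bottom_padding(s, target) for s in shapes]
```

With a divisor of 32, heights 100 and 120 are padded to 128 and widths 150 and 130 to 160, with all padding added on the bottom and right edges.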