Pose Annotations





Included in this dataset are 15,000 pairs of top- and front-view frames from videos of pairs of interacting mice. Each frame has been manually annotated by five individuals for nine (top-view) or thirteen (front-view) keypoints per mouse: the nose, ears, base of neck, hips, base of tail, middle of tail, and end of tail, plus the four paws (front-view only), for a total of 3,300,000 labels (15,000 frame pairs x (9 + 13) keypoints x 2 mice x 5 annotators).
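
As a quick, illustrative check of that count (a sketch, not code shipped with the dataset):

    # Illustrative arithmetic only: reproduces the 3,300,000-label total.
    frame_pairs = 15_000   # top/front frame pairs
    keypoints = 9 + 13     # 9 top-view + 13 front-view keypoints per mouse
    mice = 2               # resident and intruder
    annotators = 5         # independent annotators per frame
    print(frame_pairs * keypoints * mice * annotators)  # 3300000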

Frames were sampled from 64 videos collected over several years of experimental projects. Of the extracted frames, 5,000 include resident mice with a head-attached fiberoptic cable or a cabled, head-mounted microendoscope, which introduces an additional source of occlusion.

Note that annotations of the middle-of-tail and end-of-tail keypoints were quite noisy; these keypoints were therefore omitted when training the original MARS models.


Behavior Annotations





Included in this dataset are approximately 14 hours of top-view video of pairs of interacting mice, accompanied by MARS pose estimator output (keypoint locations, bounding boxes, and derived features) and manual frame-by-frame behavior annotations. These data were used to train the attack, mount, and close investigation classifiers in the end-user version of MARS.

The videos are split into train, validation, test-1, and test-2 sets, which reflect the train/test splits used in the MARS paper.

Each video folder contains top- and front-view videos in .seq and .avi format, behavior annotations, and an output_v# folder. Some videos include two versions of behavior annotations: one in which only aggression, investigation, and mounting are annotated, and another that annotates nine behaviors in total.
The output_v# folder holds the MARS output, where # is the MARS version number. It contains bounding boxes and poses extracted from both views, as well as features extracted in three configurations: top view only, top view with front-view pixel change features (pcf), and front view only.
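
As a rough sketch of how one video folder could be enumerated (the folder path below is hypothetical; only the .seq/.avi extensions and the output_v# naming come from the description above):

    from pathlib import Path

    video_dir = Path("videos/some_video")  # hypothetical path to one video folder

    # Raw top- and front-view videos are provided as .seq and .avi files.
    raw_videos = sorted(video_dir.glob("*.seq")) + sorted(video_dir.glob("*.avi"))

    # MARS output lives in an output_v# folder, where # is the MARS version.
    for output_dir in sorted(video_dir.glob("output_v*")):
        for item in sorted(output_dir.rglob("*")):
            print(item.relative_to(video_dir))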

Pose files (.json) contain (see the loading sketch below):
— bounding boxes [2 x num_frames x 4]
— pose [2 x num_frames x num_keypoints_view]
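
A minimal loading sketch, assuming the two arrays are stored under keys named "bbox" and "keypoints"; the actual key names may differ by MARS version, so inspect the file and adjust as needed:

    import json
    import numpy as np

    # Hypothetical filename; the key names are assumptions.
    with open("path/to/pose_top.json") as f:
        pose = json.load(f)
    print(pose.keys())  # inspect the actual field names

    bboxes = np.asarray(pose["bbox"])          # documented shape: 2 x num_frames x 4
    keypoints = np.asarray(pose["keypoints"])  # documented shape: 2 x num_frames x num_keypoints_view
    print(bboxes.shape, keypoints.shape)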


Inter-Annotator Variability





Included in this dataset are ten pairs of top- and front-view videos of pairs of interacting mice, accompanied by MARS pose estimator output (keypoint locations, bounding boxes, and derived features) and manual frame-by-frame behavior annotations from each of eight trained individuals. All annotations cover attack, mount, and close investigation behaviors.
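
As an illustration of how such annotations might be compared, here is a sketch of frame-wise pairwise agreement between annotators, assuming each annotator's labels have already been loaded as a per-frame sequence of behavior names (the loading step depends on the annotation file format and is omitted):

    from itertools import combinations
    import numpy as np

    def framewise_agreement(labels_a, labels_b):
        # Fraction of frames on which two annotators chose the same behavior label.
        a, b = np.asarray(labels_a), np.asarray(labels_b)
        return float(np.mean(a == b))

    # Hypothetical per-frame labels; in practice there are eight annotators
    # and one label per video frame.
    annotations = {
        "annotator_1": ["other", "attack", "attack", "investigation"],
        "annotator_2": ["other", "attack", "other", "investigation"],
    }

    for (name_a, seq_a), (name_b, seq_b) in combinations(annotations.items(), 2):
        print(name_a, "vs", name_b, framewise_agreement(seq_a, seq_b))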

Two of the ten videos were annotated a second time by each individual for quantification of intra-annotator variability.