OptiFlex: video-based animal pose estimation using deep learning enhanced by optical flow

BioRxiv: the Preprint Server for Biology
X. Liu, Chris I De Zeeuw


Deep-learning-based animal pose estimation tools have greatly improved the quantification of animal behaviour. However, these tools all make predictions on individual video frames and do not account for variability in animal body shape in their model designs. Here, we introduce the first video-based animal pose estimation architecture, OptiFlex, which integrates a flexible base model, to account for variability in animal body shape, with an optical flow model, to incorporate temporal context from nearby video frames. This approach can be combined with multi-view information, enhancing predictions using all four dimensions (3D space and time). To evaluate OptiFlex, we adopted datasets of four different lab animal species (mouse, fruit fly, zebrafish and monkey) and proposed a more intuitive evaluation metric: adjusted percentage of correct key points (aPCK). Our evaluations show that OptiFlex provides the best prediction accuracy among current deep-learning-based tools, and that it can be readily applied to analyse a wide range of behaviours.
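
The abstract names the aPCK metric without defining it. As a point of reference only, the sketch below computes the standard percentage of correct key points (PCK): the fraction of predicted key points that fall within a distance threshold of their ground-truth positions. It is not the paper's aPCK; the function name `pck`, the pixel threshold, and the coordinates are illustrative assumptions.

```python
import math


def pck(predicted, ground_truth, threshold):
    """Return the fraction of predicted key points lying within
    `threshold` (Euclidean distance, same units as the coordinates)
    of the matching ground-truth key point.

    This is the standard PCK, shown for illustration; it is an
    assumption-laden sketch and not the paper's aPCK.
    """
    if len(predicted) != len(ground_truth):
        raise ValueError("predicted and ground_truth must have the same length")
    if not predicted:
        raise ValueError("at least one key point is required")

    # Count key points whose prediction error does not exceed the threshold.
    correct = sum(
        1
        for (px, py), (gx, gy) in zip(predicted, ground_truth)
        if math.hypot(px - gx, py - gy) <= threshold
    )
    return correct / len(predicted)


# Hypothetical (x, y) pixel coordinates for four key points in one frame.
predicted = [(10.0, 10.0), (52.0, 48.0), (90.0, 120.0), (30.0, 70.0)]
ground_truth = [(11.0, 10.0), (50.0, 50.0), (100.0, 120.0), (30.0, 74.0)]

score = pck(predicted, ground_truth, threshold=5.0)
print(score)  # 0.75: three of the four predictions are within 5 pixels
```

With a threshold-based metric of this kind, the score depends strongly on the chosen threshold, so scores are only comparable across tools when the threshold and its units are held fixed.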

