Hierarchical Deep Stereo Matching on High-resolution Images

CVPR 2019

Gengshan Yang¹, Joshua Manela², Michael Happold², Deva Ramanan¹,²
¹Robotics Institute, Carnegie Mellon University
²Argo AI


We explore the problem of real-time stereo matching on high-res imagery. Many state-of-the-art (SOTA) methods struggle to process high-res imagery because of memory constraints or fail to meet real-time needs. To address this issue, we propose an end-to-end framework that searches for correspondences incrementally over a coarse-to-fine hierarchy. Because high-res stereo datasets are relatively rare, we introduce a large-scale dataset of high-res stereo pairs for both training and evaluation. At the time of submission, our approach achieved SOTA performance on Middlebury-v3 and KITTI-15 while running significantly faster than its competitors. The hierarchical design also naturally allows for anytime on-demand reports of disparity by capping intermediate coarse results, allowing us to accurately predict disparity for near-range structures with low latency (30ms). We demonstrate that the performance-vs-speed tradeoff afforded by on-demand hierarchies may address sensing needs for time-critical applications such as autonomous driving.
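To make the coarse-to-fine, on-demand idea concrete, here is a minimal sketch. It is not the network described above: it swaps the learned features and cost volumes for a classical per-pixel absolute-difference match, and every name in it (`downsample`, `best_disparity`, `coarse_to_fine`, `max_disp`, `levels`) is a hypothetical choice made for illustration. It only shows the search pattern: a full disparity search at the coarsest scale, then a small residual search (here ±1 pixel) at each finer scale, with every intermediate result yielded so that a caller can stop early.

```python
import numpy as np


def downsample(img, factor):
    """Average-pool a 2D image by an integer factor (dimensions must divide evenly)."""
    h, w = img.shape
    return img.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))


def best_disparity(left, right, candidates):
    """Per pixel, pick the candidate disparity with the lowest absolute difference.

    candidates is an integer array of shape (K, H, W); left pixel (y, x) is
    compared with right pixel (y, x - d) for each candidate d.
    """
    h, w = left.shape
    rows = np.arange(h)[None, :, None]
    cols = np.arange(w)[None, None, :]
    src = np.clip(cols - candidates, 0, w - 1)      # clamp at the image border
    cost = np.abs(left[None] - right[rows, src])    # (K, H, W) matching cost
    winner = cost.argmin(axis=0)
    return np.take_along_axis(candidates, winner[None], axis=0)[0]


def coarse_to_fine(left, right, max_disp, levels=3):
    """Yield (downsampling factor, disparity map) from coarsest to finest level."""
    top = 2 ** (levels - 1)
    # Crop so that every pyramid level has integer dimensions.
    h, w = (s - s % top for s in left.shape)
    left, right = left[:h, :w], right[:h, :w]
    disp = None
    for level in reversed(range(levels)):
        factor = 2 ** level
        l, r = downsample(left, factor), downsample(right, factor)
        limit = max_disp // factor
        if disp is None:
            # Coarsest level: exhaustive search over the (small) disparity range.
            base = np.arange(limit + 1)[:, None, None]
            candidates = np.broadcast_to(base, (limit + 1,) + l.shape)
        else:
            # Finer level: upsample the previous estimate and search +/- 1 around it.
            up = 2 * disp.repeat(2, axis=0).repeat(2, axis=1)
            offsets = np.array([-1, 0, 1])[:, None, None]
            candidates = np.clip(up[None] + offsets, 0, limit)
        disp = best_disparity(l, r, candidates)
        yield factor, disp
```

Because `coarse_to_fine` is a generator, a time-critical caller could take only the first yielded map (coarse, but cheap) and skip the finer levels, which is the trade-off the on-demand design is meant to expose. Per-pixel matching without cost aggregation is far too noisy for real images; it is used here only to keep the sketch short.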


  • On-demand depth estimation on a coarse-to-fine hierarchy (high-res version on YouTube):

  • Handles large view changes in high-res photos (used as a submodule in Open4D, Bansal et al., CVPR 2020):

  • Results on the high-res Middlebury dataset (higher-res version on YouTube):
  • BibTeX

    @inproceedings{yang2019hierarchical,
      title={Hierarchical deep stereo matching on high-resolution images},
      author={Yang, Gengshan and Manela, Joshua and Happold, Michael and Ramanan, Deva},
      booktitle={CVPR},
      pages={5515--5524},
      year={2019}
    }


    This work was supported by the CMU Argo AI Center for Autonomous Vehicle Research.