Corners for Layout: End-to-End Layout Recovery from 360 Images

Clara Fernandez-Labrador *,1,2
José M. Fácil *,1
Alejandro Perez-Yus1
Cédric Demonceaux2
Javier Civera1
José J. Guerrero1
* Equal contribution
1 University of Zaragoza
2 Université Bourgogne Franche-Comté
[Paper]
[Dataset]
[Code]
[Video]
The problem of 3D layout recovery in indoor scenes has been a core research topic for over a decade, yet several major challenges remain unsolved. Among the most relevant, most state-of-the-art methods make implicit or explicit assumptions about the scene, e.g. box-shaped or Manhattan layouts. In addition, current methods are computationally expensive and unsuitable for real-time applications such as robot navigation and AR/VR. In this work we present CFL (Corners for Layout), the first end-to-end model for 3D layout recovery on 360° images. Our experimental results show that we outperform the state of the art while relaxing assumptions about the scene and at a lower computational cost. We also show that our model generalizes better to camera position variations than conventional approaches by using EquiConvs, a novel type of convolution applied directly on the spherical projection and hence invariant to equirectangular distortions.


SUN360 Dataset for Layout Recovery


In this work we use ∼500 panoramas from the SUN360 dataset that were labeled by Zhang et al. Since all panoramas were labeled as box-type rooms, we hand-label and substitute some of them so that they represent the actual room shapes more faithfully. We apply line thickening and Gaussian blur to the ground-truth maps for easier convergence during training, since it makes the loss progression continuous rather than binary. The figure below shows an example.
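The ground-truth smoothing described above can be sketched as follows. This is a minimal illustration, not the released code: the function name, the impulse-rendering strategy, and the sigma value are assumptions made for the example.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def make_corner_map(corners, height, width, sigma=4.0):
    """Render point annotations as a smooth ground-truth map.

    corners: iterable of (row, col) pixel coordinates.
    Returns a float map with a blurred peak (max 1.0) at each corner,
    so the training loss varies continuously instead of being binary.
    """
    gt = np.zeros((height, width), dtype=np.float32)
    for r, c in corners:
        gt[int(r), int(c)] = 1.0        # one impulse per annotated corner
    gt = gaussian_filter(gt, sigma=sigma)  # Gaussian blur spreads each impulse
    if gt.max() > 0:
        gt /= gt.max()                  # rescale peaks back to 1.0
    return gt
```

The same idea applies to the edge maps, with line drawing plus thickening in place of single-pixel impulses.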

[Download Dataset]



Source Code



We have released the TensorFlow-based implementation on our GitHub page. Try our code and learn everything about EquiConvs, a novel implementation of the convolution for 360° images that adapts the shape of the kernel according to the distortions of the equirectangular projection.
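To convey the idea behind EquiConvs, the sketch below computes the sampling locations for one kernel position: a regular grid is defined on the tangent plane at the kernel centre and projected back onto the equirectangular image via the inverse gnomonic projection. This is a NumPy illustration under our own assumptions (function name, kernel field of view), not the released TensorFlow implementation.

```python
import numpy as np

def equiconv_offsets(row, col, height, width, k=3, fov=np.radians(5.0)):
    """Sampling locations for a k x k EquiConv kernel centred at (row, col).

    The kernel lives on the tangent plane of the sphere; projecting it
    back yields a distortion-adapted sampling pattern in pixel space,
    with horizontal wrap-around at the 360° border.
    """
    # Kernel centre in spherical coordinates.
    lat0 = (0.5 - (row + 0.5) / height) * np.pi        # latitude in (-pi/2, pi/2)
    lon0 = ((col + 0.5) / width - 0.5) * 2.0 * np.pi   # longitude in (-pi, pi)

    # Regular grid on the tangent plane spanning the kernel's field of view.
    r = np.tan(fov / 2.0)
    xs = np.linspace(-r, r, k)
    x, y = np.meshgrid(xs, xs)

    # Inverse gnomonic projection: tangent plane -> sphere.
    rho = np.sqrt(x ** 2 + y ** 2)
    c = np.arctan(rho)
    with np.errstate(invalid="ignore", divide="ignore"):
        lat = np.arcsin(np.cos(c) * np.sin(lat0)
                        + np.where(rho > 0, y * np.sin(c) * np.cos(lat0) / rho, 0.0))
    lon = lon0 + np.arctan2(x * np.sin(c),
                            rho * np.cos(lat0) * np.cos(c) - y * np.sin(lat0) * np.sin(c))

    # Sphere -> equirectangular pixel coordinates; wrap longitude.
    rows = (0.5 - lat / np.pi) * height - 0.5
    cols = ((lon / (2.0 * np.pi) + 0.5) * width - 0.5) % width
    return rows, cols
```

Near the equator the pattern stays close to a regular grid; towards the poles it spreads horizontally, matching how the equirectangular projection stretches those regions.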

[GitHub]





Video

Padding effect using StdConvs and EquiConvs


This video shows, for each tested panorama, the predictions obtained with our two CFL models, StdConvs and EquiConvs, as the panorama is rotated from 0° to 360°.
With EquiConvs, we do not use padding when the kernel reaches the image border, since the offsets take the sampling points to their correct position on the other side of the 360° image. This allows the model to understand the continuity of the scene, producing much more consistent predictions across all rotations. StdConvs, by contrast, use zero-padding and lose this consistency. As a consequence, when corners approach the borders, StdConvs often predict them twice, i.e. once at each end of the image.
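In EquiConvs the wrap-around falls out of the sampling offsets themselves. For standard convolutions, the same continuity can be approximated by circularly padding the feature map along the longitude axis before convolving; the sketch below is a minimal NumPy illustration of that idea (the function name is ours, and this is not part of the released code).

```python
import numpy as np

def circular_pad_width(x, pad):
    """Pad an equirectangular feature map by wrapping horizontally.

    x: array of shape (H, W, C). The left and right borders of a 360°
    image are physically adjacent, so we copy columns from the opposite
    side instead of padding with zeros.
    """
    left = x[:, -pad:, :]    # last columns wrap around to the left edge
    right = x[:, :pad, :]    # first columns wrap around to the right edge
    return np.concatenate([left, x, right], axis=1)
```

A convolution with `'VALID'` padding applied after this wrap sees the scene as continuous at the border, which avoids the duplicated-corner artifact that zero-padding produces.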


Paper and Bibtex

@article{fernandez2019CFL,
  title={Corners for Layout: End-to-End Layout Recovery from 360 Images},
  author={Fernandez-Labrador, Clara and Fácil, José M and Perez-Yus, Alejandro and 
  Demonceaux, Cédric and Civera, Javier and Guerrero, José J},
  journal={arXiv preprint arXiv:1903.08094},
  year={2019}
}