Description
Multimodal Scene Understanding: Algorithms, Applications and Deep Learning presents recent advances in multi-modal computing, with a focus on computer vision and photogrammetry. It provides the latest algorithms and applications that combine multiple sources of information, and it describes the role and approaches of multi-sensory data and multi-modal deep learning. The book is ideal for researchers from the fields of computer vision, remote sensing, robotics, and photogrammetry, helping to foster interdisciplinary interaction and collaboration between these realms.

Researchers collecting and analyzing multi-sensory data collections (for example, the KITTI benchmark, which combines stereo and laser data) from different platforms, such as autonomous vehicles, surveillance cameras, UAVs, planes, and satellites, will find this book to be very useful.

- Contains state-of-the-art developments on multi-modal computing
- Shines a focus on algorithms and applications
- Presents novel deep learning topics on multi-sensor fusion and multi-modal deep learning
Table of Contents
1. Introduction to Multimodal Scene Understanding
Michael Ying Yang, Bodo Rosenhahn and Vittorio Murino
2. Multi-modal Deep Learning for Multi-sensory Data Fusion
Asako Kanezaki, Ryohei Kuga, Yusuke Sugano and Yasuyuki Matsushita
3. Multi-Modal Semantic Segmentation: Fusion of RGB and Depth Data in Convolutional Neural Networks
Zoltan Koppanyi, Dorota Iwaszczuk, Bing Zha, Can Jozef Saul, Charles K. Toth and Alper Yilmaz
4. Learning Convolutional Neural Networks for Object Detection with Very Little Training Data
Christoph Reinders, Hanno Ackermann, Michael Ying Yang and Bodo Rosenhahn
5. Multi-modal Fusion Architectures for Pedestrian Detection
Dayan Guan, Jiangxin Yang, Yanlong Cao, Michael Ying Yang and Yanpeng Cao
6. ThermalGAN: Multimodal Color-to-Thermal Image Translation for Person Re-Identification in Multispectral Dataset
Vladimir A. Knyaz and Vladimir V. Kniaz
7. A Review and Quantitative Evaluation of Direct Visual-Inertial Odometry
Lukas von Stumberg, Vladyslav Usenko and Daniel Cremers
8. Multimodal Localization for Embedded Systems: A Survey
Imane Salhi, Martyna Poreba, Erwan Piriou, Valerie Gouet-Brunet and Maroun Ojail
9. Self-Supervised Learning from Web Data for Multimodal Retrieval
Raul Gomez, Lluis Gomez, Jaume Gibert and Dimosthenis Karatzas
10. 3D Urban Scene Reconstruction and Interpretation from Multi-sensor Imagery
Hai Huang, Andreas Kuhn, Mario Michelini, Matthias Schmitz and Helmut Mayer
11. Decision Fusion of Remote Sensing Data for Land Cover Classification
Arnaud Le Bris, Nesrine Chehata, Walid Ouerghemmi, Cyril Wendl, Clement Mallet, Tristan Postadjian and Anne Puissant
12. Cross-modal Learning by Hallucinating Missing Modalities in RGB-D Vision
Nuno Garcia, Pietro Morerio and Vittorio Murino
