Review on Scene Semantics Extraction for Decision Making System in Autonomous Vehicles
Abstract
It is widely expected that traditional manual driving will be superseded by autonomous vehicles (AVs) in the coming years, and AVs are among the most anticipated developments in the automotive industry. This shift requires a decision-making system (DMS) that enables an AV to interpret the real-time situation around it. Recognizing street scenes and extracting the relevant semantics from them is a particularly challenging task. Image classification and object detection techniques based on deep convolutional neural networks (DCNNs) are therefore likely to play a vital role in most methodologies designed for scene semantics extraction. Based on the extracted scene semantics, the DMS actuates the devices that control the vehicle's speed and steering angle. The extraction of information from road scene images, covering all aspects needed to make sound decisions, therefore has a strong bearing on the overall performance of AVs.
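To make the last step of this pipeline concrete, the following is a minimal sketch of how extracted scene semantics might be mapped to speed and steering commands. It is an illustration only and is not the method of any work reviewed here: the label names, the speed caps, the proportional steering gain, and the names `Command` and `decide` are all assumptions introduced for this example, and the scene labels are assumed to come from an upstream DCNN classifier or detector.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Command:
    """Actuation request: target speed and steering angle."""
    target_speed_kmh: float
    steering_angle_deg: float


# Hypothetical scene labels and speed caps (assumed values, not from the article).
SPEED_CAP_KMH = {
    "pedestrian_crossing": 10.0,
    "school_zone": 20.0,
    "wet_road": 30.0,
}


def decide(labels: set[str], cruise_speed_kmh: float, lane_offset_m: float) -> Command:
    """Map scene labels and lateral lane offset to a speed/steering command."""
    if "obstacle_ahead" in labels:
        # Hypothetical rule: stop and hold the wheel straight.
        return Command(0.0, 0.0)
    # Apply the most restrictive speed cap among the recognised labels.
    caps = [SPEED_CAP_KMH[label] for label in labels if label in SPEED_CAP_KMH]
    speed = min([cruise_speed_kmh, *caps])
    # Proportional steering back towards the lane centre (assumed gain of
    # 5 degrees per metre), clamped to +/- 15 degrees.
    steering = max(-15.0, min(15.0, -5.0 * lane_offset_m))
    return Command(speed, steering)
```

For example, `decide({"wet_road", "school_zone"}, 50.0, 0.5)` selects the lower of the two caps and steers gently back towards the lane centre. A real DMS would replace these hand-written rules with calibrated or learned policies and would also account for the confidence of each detection.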