My name is Lin-Zhuo Chen. I am a graduate student at the College of Computer Science, Nankai University, and I am currently a computer vision intern at ByteDance AI Lab. My research interests are computer vision, SLAM, robotics, and recommendation systems. My CV is available.
School of Electronic Engineering, Xidian University, Xi’an, Sept. 2014 - Jul. 2018
College of Computer Science, Nankai University, Tianjin, Sept. 2018 - Jul. 2021 (expected)
3D spatial information is known to benefit semantic segmentation. Most existing methods take 3D spatial data as an additional input, leading to a two-stream segmentation network that processes RGB and 3D spatial information separately. This design greatly increases inference time and severely limits its use in real-time applications. To solve this problem, we propose Spatial information guided Convolution (S-Conv), which integrates RGB features and 3D spatial information efficiently. S-Conv infers the sampling offsets of the convolution kernel from the 3D spatial information, helping the convolutional layer adjust its receptive field and adapt to geometric transformations. S-Conv also incorporates geometric information into feature learning by generating spatially adaptive convolutional weights. This substantially improves the layer's ability to perceive geometry with little effect on the number of parameters or the computational cost. We further embed S-Conv into a semantic segmentation network, called Spatial information Guided convolutional Network (SGNet), which achieves real-time inference and state-of-the-art performance on the NYUDv2 and SUNRGBD datasets.
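To make the idea of spatially adaptive weights more concrete, here is a minimal pure-Python sketch. It is not the S-Conv implementation: the function name `depth_guided_conv_at`, the Gaussian weighting, and the `sigma` parameter are assumptions made for illustration. A hand-written depth-difference weight stands in for the learned weight generator, and the offset inference is omitted entirely.

```python
import math


def depth_guided_conv_at(feat, depth, kernel, row, col, sigma=0.5):
    """Apply a 3x3 kernel at an interior pixel (row, col).

    Each kernel tap is scaled by exp(-(d_neighbor - d_center)^2 / (2 sigma^2)),
    a hand-written stand-in (assumption) for a learned, spatially adaptive weight.
    No padding is handled, so (row, col) must not lie on the border.
    """
    center_depth = depth[row][col]
    out = 0.0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            r, c = row + dr, col + dc
            diff = depth[r][c] - center_depth
            spatial_w = math.exp(-(diff * diff) / (2.0 * sigma * sigma))
            out += kernel[dr + 1][dc + 1] * spatial_w * feat[r][c]
    return out


# Toy input: the right column lies on a different surface (depth 5.0 vs 1.0)
# and carries a very different feature value (9.0 vs 1.0).
feat = [[1.0, 1.0, 9.0], [1.0, 1.0, 9.0], [1.0, 1.0, 9.0]]
depth = [[1.0, 1.0, 5.0], [1.0, 1.0, 5.0], [1.0, 1.0, 5.0]]
kernel = [[1.0 / 9.0] * 3 for _ in range(3)]  # plain box filter

# Ordinary convolution mixes both surfaces.
plain = sum(kernel[i][j] * feat[i][j] for i in range(3) for j in range(3))
# Depth-guided version nearly ignores the far-away column.
guided = depth_guided_conv_at(feat, depth, kernel, 1, 1)
```

In this toy case the guided response is dominated by the six pixels that share the center's depth, while the plain box filter blends in the distant column. The weights are not renormalized here, which keeps the sketch short but would matter in a real layer.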
The structures of S-Conv and SGNet are shown below:
We propose a new network layer, named the Local Spatial Aware (LSA) layer, to model the geometric structure of a local region accurately and robustly. Each feature extraction operation in the LSA layer is tied to Spatial Distribution Weights (SDWs), which are learned from the spatial distribution of points in the local region, to establish a strong link with the inherent geometric shape. Experiments show that our LSA-based network, named LSANet, achieves performance on par with or better than state-of-the-art methods on challenging benchmark datasets. The network architecture of LSANet and the LSA module are shown below.
The visualization of SDWs is shown below:
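Separately from the figures, the role of SDWs in the LSA description above can be sketched in a few lines of pure Python. This is only an illustration under stated assumptions, not the LSANet code: the function names `spatial_distribution_weights` and `weighted_max_pool` are hypothetical, and a fixed linear map followed by a sigmoid stands in for the learned mapping from local coordinates to weights.

```python
import math


def spatial_distribution_weights(center, neighbors, w, b):
    """Return one weight per neighbor, computed from its offset to the center.

    w and b are placeholders (assumptions) for parameters that would be learned.
    """
    weights = []
    for p in neighbors:
        rel = [pi - ci for pi, ci in zip(p, center)]  # relative coordinates
        z = sum(wi * ri for wi, ri in zip(w, rel)) + b
        weights.append(1.0 / (1.0 + math.exp(-z)))  # sigmoid keeps weights in (0, 1)
    return weights


def weighted_max_pool(features, weights):
    """Scale each neighbor's feature vector by its weight, then max-pool per channel."""
    scaled = [[wt * f for f in feat] for feat, wt in zip(features, weights)]
    return [max(channel) for channel in zip(*scaled)]


# Toy local region: a center point and three neighbors with 2-channel features.
center = (0.0, 0.0, 0.0)
neighbors = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, -1.0, 0.0)]
features = [[1.0, 2.0], [2.0, 1.0], [4.0, 4.0]]
w, b = (1.0, 1.0, 0.0), 0.0  # illustrative values, not trained

sdw = spatial_distribution_weights(center, neighbors, w, b)
pooled = weighted_max_pool(features, sdw)
```

The point of the sketch is that the pooled feature depends on where each neighbor sits relative to the center, not only on the neighbor's feature values.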
Interactive Image Segmentation with First Click Attention, Zheng Lin, Zhao Zhang, Lin-Zhuo Chen, Ming-Ming Cheng, Shao-Ping Lu, IEEE CVPR, 2020
Spatial Information Guided Convolution for Real-Time RGBD Semantic Segmentation, Lin-Zhuo Chen, Zheng Lin, Ziqin Wang, Yong-Liang Yang, Ming-Ming Cheng, under review at IEEE TIP
LSANet: Feature Learning on Point Sets by Local Spatial Aware Layer, Lin-Zhuo Chen, Xuan-Yi Li, Deng-Ping Fan, Kai Wang, Shao-Ping Lu, Ming-Ming Cheng, arXiv preprint arXiv:1905.05442