This paper develops an end-to-end sharpening mixture of experts (SMoE) fusion framework to improve the robustness and accuracy of perception systems for CAEVs under complex illumination and weather conditions. Three original contributions distinguish our work from the existing literature. First, we introduce the Complex KITTI dataset, which consists of 7481 pairs of modified KITTI RGB images and generated LiDAR dense depth maps; the dataset is finely annotated at the instance level with our proposed semi-automatic annotation method. Second, we devise the SMoE fusion approach to adaptively learn robust kernels from complementary modalities. Finally, we conduct comprehensive comparative experiments, whose results show that the proposed SMoE framework yields significant improvements over other fusion techniques in adverse environmental conditions.