Enhancers are short genomic segments located in non-coding regions in a genome that help to increase the expressions of the target genes. Despite their significance in transcription regulation, effective methods for classifying enhancer categories and regulatory strengths remain limited. To address the issue, we propose a novel end-to-end deep learning architecture named DeepEnhancerPPO. The model integrates ResNet and Transformer modules to extract local, hierarchical, and long-range contextual features. Following feature fusion, we employ the proximal policy optimization (PPO), a reinforcement learning technique, to reduce the dimensionality of the fused features, retaining the most relevant ones for downstream classification. We evaluate the performance of DeepEnhancerPPO from multiple perspectives, including ablation analysis, independent tests, and interpretability of classification results. Each of these modules contributes positively to the model's performance, with ResNet and PPO being the top contributors. Overall, DeepEnhancerPPO exhibits superb performance on independent datasets compared to other models, outperforming the second-best model by 6.7% in accuracy for enhancer category classification. The model also ranks within the top five classifiers out of 25 in enhancer strength classification without the need to re-optimize the hyperparameters, indicating that the DeepEnhancerPPO framework is highly robust for enhancer classification. Additionally, the inclusion of PPO enhances the interpretability of the classification results. The source code is openly accessible at https://github.com/Mxc666/DeepEnhancerPPO.git.