To alleviate the traffic congestion caused by urbanization and the growing number of vehicles, researchers have proposed various reinforcement learning methods for traffic signal optimization. However, most of these methods focus on small-scale, homogeneous road networks. To address more complex road networks, we introduce Gen-CenLight, a centralized control method based on the Actor-Critic architecture. To achieve scalability and coordination in massive-scale networks, we design a Spatial Representation Extraction module that learns a representation of the network from a high-dimensional space. To handle heterogeneity in the network, we propose an Action Selection module that selects the appropriate traffic signal phases according to the specific structure of each intersection. To the best of our knowledge, our model is the first centralized control model implemented in a road network comprising thousands of intersections. Experimental results demonstrate that our approach effectively manages massive-scale, heterogeneous road networks.
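
To make the idea of structure-aware phase selection concrete, the following is a minimal sketch of action masking, one common way to let a single centralized policy serve intersections with different phase sets. It is an illustration under stated assumptions, not the Action Selection module of Gen-CenLight: the function name `masked_phase_selection`, the fixed-width logit matrix, and the boolean validity mask are all hypothetical choices made for this example.

```python
import numpy as np


def masked_phase_selection(logits, valid_mask):
    """Pick one signal phase per intersection, ignoring phases it lacks.

    Hypothetical helper for illustration only.
    logits:     (n_intersections, max_phases) scores from a shared actor.
    valid_mask: same shape, True where the phase exists at that intersection.
    """
    logits = np.asarray(logits, dtype=float)
    valid_mask = np.asarray(valid_mask, dtype=bool)
    if not valid_mask.any(axis=1).all():
        raise ValueError("every intersection needs at least one valid phase")

    # Invalid phases get -inf so they receive zero probability.
    masked = np.where(valid_mask, logits, -np.inf)

    # Numerically stable softmax over the remaining (valid) phases.
    shifted = masked - masked.max(axis=1, keepdims=True)
    weights = np.exp(shifted)
    probs = weights / weights.sum(axis=1, keepdims=True)

    # Greedy choice; a stochastic policy would sample from probs instead.
    return probs.argmax(axis=1), probs


# Assumed toy data: the first intersection has only three phases,
# the second has four.
logits = [[0.2, 1.5, 0.1, 3.0],
          [2.0, 0.5, 0.3, 0.9]]
valid_mask = [[True, True, True, False],
              [True, True, True, True]]
phases, probs = masked_phase_selection(logits, valid_mask)
```

In this toy input the first intersection's highest raw score belongs to a phase it does not have, so the mask forces the choice onto its best valid phase. Padding every intersection to a common `max_phases` width is a design convenience of the sketch that keeps the policy output rectangular.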