Artificial Neural Networks (ANNs) have emerged as powerful models in machine learning, loosely mimicking the neural connections of the human brain. By processing and transmitting information across layers of interconnected units, ANNs learn complex patterns, make predictions, classify data, and solve a wide range of problems. Their applications extend across many domains, particularly Artificial Intelligence (AI), where ANNs serve as fundamental frameworks for developing intelligent systems. They excel in computer vision, natural language processing, robotics, and recommendation systems. Deep Neural Networks (DNNs), composed of multiple layers, are especially effective at capturing intricate relationships and extracting meaningful representations, which has led to breakthroughs in image recognition, speech synthesis, and natural language understanding. ANNs also possess remarkable generalization capabilities, adapting to new data and making accurate predictions on unseen examples; this adaptability makes them invaluable in dynamic and uncertain environments. They are widely adopted in supervised, unsupervised, and reinforcement learning, enabling them to handle diverse problem domains and objectives. In this paper, we explore the influence of ANN size on AI training and performance, investigating which configurations achieve superior results in different scenarios [1-2].

The size of an ANN holds significant importance in the context of AI training. The number of neurons and layers greatly influences a network's learning capacity and overall performance. Larger ANNs, with higher neuron and layer counts, have greater representational power and can capture more intricate patterns and relationships within the data. However, a balance must be struck: overly large ANNs are prone to overfitting, where the network becomes so specialized to the training data that it fails to generalize to unseen examples. Regularization techniques such as dropout address this challenge. Dropout randomly deactivates a fraction of neurons during training, mitigating overfitting and enhancing the network's ability to generalize (a minimal sketch of this mechanism is given below). By temporarily removing neurons, dropout compels the network to rely on different subsets of neurons for each training sample, encouraging it to learn more diverse features and improving performance on unseen data. The choice of learning rate during training is another critical factor: a low learning rate supports effective exploration of the action space and reduces the chance of the ANN becoming trapped in a policy with exceedingly low overall reward. A judicious combination of appropriate ANN size, dropout regularization, and a suitable learning rate is therefore essential for achieving optimal performance and effective exploration in AI training scenarios [5-6].

In pursuing this balance between performance and ANN size in reinforcement learning (RL), identifying the smallest working network size that still exhibits the desired performance characteristics holds immense potential for the AI landscape.
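To make the dropout mechanism described above concrete, the following minimal sketch shows a small network with a dropout layer. The framework (PyTorch), the dropout rate, and the layer sizes are illustrative assumptions, not the exact configuration used in our experiments.

```python
import torch
import torch.nn as nn

# Minimal network with dropout: during training, each hidden activation is
# zeroed with probability p, so the network cannot rely on any single neuron.
model = nn.Sequential(
    nn.Linear(125, 64),   # (125,) input, matching the observation shape used later
    nn.ReLU(),
    nn.Dropout(p=0.5),    # illustrative dropout rate
    nn.Linear(64, 9),     # 9 action outputs, as in our Atari setup
)

x = torch.randn(8, 125)

model.train()             # dropout active: different neurons dropped per sample
y_train = model(x)

model.eval()              # dropout inactive: the full network is used at evaluation
y_eval = model(x)
```

PyTorch implements inverted dropout, scaling the kept activations by 1/(1-p) during training, so no manual correction is needed at evaluation time.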
Such a discovery would not only yield substantial resource savings but also enable the deployment of AI algorithms on resource-constrained platforms with limited computational power and memory. The ability to achieve strong performance with minimal computational requirements could disrupt various domains, including embedded systems, mobile devices, and Internet of Things (IoT) applications, making AI feasible in scenarios that resource constraints previously ruled out. This opens new avenues for efficient AI systems, increasing productivity and enabling breakthroughs in deploying AI in resource-limited environments [7-8].

Finding this balance between performance and ANN size in RL is the central motivation of our research. To this end, we use the A2C (Advantage Actor-Critic) algorithm, which combines policy-based and value-based methods and enables a fine-tuned balance between exploration and exploitation. In our experiments, the actor and critic networks share a common body (see the sketch below). This shared architecture facilitates effective communication between the two networks, improving performance and stability during learning: the critic provides detailed feedback on the value of states, enhancing the actor's decision-making without drastically altering its outputs. This setup maintains a delicate equilibrium between exploration and exploitation, leading to robust learning outcomes. A2C's suitability for environments with numerous input attributes also makes it an excellent choice for tasks requiring complex sensory processing. By leveraging this shared-body architecture, our research aims to uncover the relationship between ANN size, AI training, and performance, ultimately providing insights that optimize the deployment of AI algorithms [9-10].

Atari simulations were chosen as the testbed for evaluating ANN performance because of their significance in AI research. These simulations provide a rich and diverse set of environments with varying rules and dynamics, offering a comprehensive evaluation platform. Each Atari game presents unique challenges, ranging from simple, intuitive tasks with immediate rewards to complex, strategic gameplay demanding long-term planning and decision-making. This diversity lets us examine how ANNs perform under varied conditions and test their ability to learn and generalize across environments. Some simulations reward quick reflexes and immediate actions, demonstrating the AI's capacity to respond in real-time scenarios; others involve more complex gameplay mechanics that challenge the AI's strategic thinking, long-term planning, and ability to recognize patterns and make informed decisions.
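As referenced above, the following is a minimal sketch of a shared-body actor-critic network. PyTorch, the single shared layer, and the hidden width are assumptions for illustration; the exact networks we trained are described in the experimental setup.

```python
import torch
import torch.nn as nn

class SharedA2CNet(nn.Module):
    """Actor and critic heads on one shared body: both heads read the same
    learned features, so critic feedback shapes the representation the
    actor relies on without directly overwriting the actor's outputs."""

    def __init__(self, obs_dim: int = 125, n_actions: int = 9, hidden: int = 128):
        super().__init__()
        self.body = nn.Sequential(        # shared feature extractor
            nn.Linear(obs_dim, hidden),
            nn.ReLU(),
        )
        self.policy_head = nn.Linear(hidden, n_actions)  # actor: action logits
        self.value_head = nn.Linear(hidden, 1)           # critic: V(s)

    def forward(self, obs: torch.Tensor):
        features = self.body(obs)
        return self.policy_head(features), self.value_head(features)

# One forward pass on a batch of (125,)-shaped observations:
net = SharedA2CNet()
logits, value = net(torch.randn(4, 125))
action = torch.distributions.Categorical(logits=logits).sample()
```

Because the two heads share gradients through the body, value-learning signals and policy-learning signals both refine the same features, which is the communication channel referred to above.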
By exploring this wide range of simulations, our research aims to assess the adaptability and generalization capabilities of AI algorithms across different tasks and rule sets [11-14].

Drawing inspiration from previous studies such as "Human-level control through deep reinforcement learning" and "Playing Atari with Deep Reinforcement Learning" [24-25], we adopted a similar ANN structure, with a larger hidden layer scaling down toward the output layer. The layer sizes were adjusted to the requirements of the "rom" version of the simulation, which has an input shape of (125,) rather than pixel input. Multiple episodes and simulations were conducted for each ANN size, ranging from 64-32-16-9 to 512-256-128-64-32-16-9, to evaluate their performance comprehensively (a sketch of how such halving-layer stacks can be constructed is given below). To ensure a diverse set of challenges, we selected Atari games with substantially different input setups, allowing us to assess the adaptability, generalization capabilities, and overall performance of the ANNs across a range of scenarios. Recognizing the dynamic nature of these games, we trained the networks after every single action with one epoch of training, so that they could continually adapt to the evolving game dynamics and make more informed decisions. With this experimental setup, we aimed to characterize the performance of ANNs across different sizes and the relationship between size, AI training, and performance.

The analysis of performance across ANN sizes revealed several key findings. Both the smallest and the largest sizes tended to perform suboptimally. Small ANNs struggled to capture the complexity of the data, resulting in limited learning capacity and lower rewards. Larger ANNs initially performed better thanks to their greater capacity to store and process information, but beyond a certain threshold the drawbacks of overfitting became prominent and performance declined. These findings align with the expected bell-like curve: small ANNs yield low rewards, larger ANNs yield better rewards, and an optimal point exists somewhere in between. This highlights the importance of finding the sweet spot where the ANN's capacity matches the complexity of the problem at hand.

Beyond the performance distribution, we investigated other factors that could contribute to the observed behavior: the runtime of each episode, the number of actions taken per episode, the average reward per episode for specific ANN sizes, and the largest reward collected across episodes for different ANN sizes. By considering these factors together, we aimed to uncover patterns in the relationships between ANN size, runtime, action frequency, and reward outcomes. Interestingly, the distribution of performance was not always as expected: in some cases, ANNs with the fewest neurons yielded surprisingly high results, challenging the conventional understanding. To explain such occurrences, we thoroughly examined all collected patterns of factors, aiming to uncover the underlying mechanisms and provide a comprehensive account of the observed behavior [15-18]. The impact of simulation responsiveness on ANN performance proved to be one such crucial aspect.
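As noted above, the evaluated architectures are stacks of hidden layers that halve in width down to the output. The helper below is a hypothetical reconstruction of how such stacks can be built, assuming PyTorch, ReLU activations, and the (125,) input and 9-action output from our setup; the function name and the cutoff width of 16 are illustrative.

```python
import torch.nn as nn

def build_halving_mlp(widest: int, obs_dim: int = 125, n_actions: int = 9) -> nn.Sequential:
    """Stack hidden layers that halve in width down to 16, then map to the
    action outputs, e.g. 64-32-16-9 or 512-256-128-64-32-16-9."""
    layers, in_dim, width = [], obs_dim, widest
    while width >= 16:
        layers += [nn.Linear(in_dim, width), nn.ReLU()]
        in_dim, width = width, width // 2
    layers.append(nn.Linear(in_dim, n_actions))
    return nn.Sequential(*layers)

small = build_halving_mlp(widest=64)    # smallest tested size: 64-32-16-9
large = build_halving_mlp(widest=512)   # largest tested size: 512-256-128-64-32-16-9
```

Under the per-action training regime described above, each environment step would then be followed by a single update (one epoch) on the latest transition; the exact optimizer and loss used in our runs are not reproduced here.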
We observed that simulations requiring a higher number of moves before rewards are given tend to need larger ANN sizes to achieve better performance. This correlation highlights the need for greater capacity within the ANN to effectively capture the complex patterns and strategies present in such environments. The behavior of the agent, the type of reward provided for each action, and the RL algorithm employed also play significant roles. In systems where the agent is rewarded for every single action and the best policy is the one that yields the next significant reward, a relatively simple equation can represent the optimal policy, so smaller ANNs can handle these tasks effectively. For more complex tasks, involving intricate policies and a more nuanced relationship between actions and rewards, larger and more complex ANNs become necessary. This observation holds particularly for A2C-type algorithms, where discovering the best state and reward requires solving a complex sequence of actions. Although complex equations can be optimized and simplified, it may be possible to leverage the capacity of smaller ANNs to approximate the functions performed by larger ANNs effectively [19-23].

Smaller ANNs demonstrated superior performance in certain simulations, outperforming larger sizes in highly responsive environments. To explain this behavior, we explored the theory that small ANNs possess exactly the right density to hold information rather than approximate it, unlike significantly larger ANNs. To test this theory, we examined the performance outliers across all types and sizes of ANNs. We found that the better-performing ANNs had smaller gradients applied to them, where the gradient scales with the reward combined with the difference between V(t) and V(t+1) in A2C algorithms. This suggests that small ANNs can hold information effectively: once a successful policy is discovered, through luck or chance, it is retained because the network's density suits it, and less crucial information is forgotten during further learning. This finding challenges the notion that larger ANNs are always superior in performance. It emphasizes the role of density and the balance between retaining crucial strategies and avoiding overfitting. By exploring the capabilities of small ANNs and their capacity to hold effective strategies, we gain valuable insights into the potential advantages of these compact architectures [3-4].
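For reference, the quantity described above, the reward combined with the difference between V(t) and V(t+1), corresponds to the standard one-step advantage used in A2C, A_t = r_t + g*V(s_{t+1}) - V(s_t), where the discount factor g is our notational assumption rather than a value stated in this section. A minimal sketch, again assuming PyTorch:

```python
import torch

def one_step_advantage(reward: torch.Tensor,
                       value_t: torch.Tensor,
                       value_next: torch.Tensor,
                       done: torch.Tensor,
                       gamma: float = 0.99) -> torch.Tensor:
    """One-step A2C advantage: reward plus the discounted next-state value,
    minus the current value estimate. Policy and value gradients scale with
    this quantity, so accurate value estimates imply small applied updates."""
    bootstrap = gamma * value_next * (1.0 - done)  # no bootstrapping past terminal states
    return reward + bootstrap - value_t

# Example: a well-calibrated critic yields a near-zero advantage and hence a
# small gradient, consistent with the low gradients observed for the
# better-performing small networks (here: 1.0 + 0.99 * 4.1 - 5.0 = 0.059).
adv = one_step_advantage(torch.tensor([1.0]), torch.tensor([5.0]),
                         torch.tensor([4.1]), torch.tensor([0.0]))
```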