The binary exponential back-off (BEB) algorithm of the IEEE 802.11 standard is the prevailing method for collision avoidance. Under BEB, the contention window is doubled after each collision, which lengthens the expected back-off period and reduces the likelihood of subsequent collisions. However, BEB yields sub-optimal results, degrading network performance and wasting bandwidth, especially in dense and dynamic networks. To overcome these drawbacks, this paper proposes a decentralized approach based on deep reinforcement learning, namely Deep Q-Network (DQN) and Deep Deterministic Policy Gradient (DDPG), to optimize the contention window value, maximizing throughput while minimizing collisions. Simulations with the NS-3 simulator and the NS3-gym toolkit show that DQN and DDPG outperform BEB in both static and dynamic scenarios, achieving up to a 37.16% network throughput improvement in dense networks while maintaining high, stable throughput as the number of stations increases.
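For context, the following is a minimal sketch of the BEB contention-window update that the proposed approach replaces, assuming the standard 802.11 defaults aCWmin = 15 and aCWmax = 1023; the function and variable names are illustrative and not taken from the paper.

```python
# Minimal sketch of 802.11 binary exponential back-off (BEB).
# Assumes standard defaults aCWmin = 15, aCWmax = 1023; names are illustrative.
import random

CW_MIN = 15    # aCWmin: initial contention window (in slots)
CW_MAX = 1023  # aCWmax: cap on the contention window (in slots)

def next_cw(cw: int, collided: bool) -> int:
    """Double the contention window (as 2*(cw+1)-1) after a collision,
    capped at CW_MAX; reset it to CW_MIN after a successful transmission."""
    if collided:
        return min(2 * (cw + 1) - 1, CW_MAX)
    return CW_MIN

def backoff_slots(cw: int) -> int:
    """Draw a uniform random back-off counter from [0, cw]."""
    return random.randint(0, cw)

# Example: successive collisions walk CW through 31, 63, 127, ... up to 1023.
cw = CW_MIN
for _ in range(8):
    cw = next_cw(cw, collided=True)
    print(cw, backoff_slots(cw))
```

Because every station restarts from CW_MIN after a success regardless of network density, the window repeatedly collapses and re-grows in dense networks; the paper's RL agents instead learn a contention window value suited to the observed contention level.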