Neural circuits support rapid visual learning. However, due to technical roadblocks, it is not known how visual circuits represent multiple features or how behaviorally relevant representations are selected for long-term memory. Here we developed Moculus, a head-mounted virtual reality platform for mice that covers the entire visual field and allows binocular depth perception and full immersion. This controllable environment, combined with fast acousto-optical imaging, affords rapid visual learning and uncovers novel circuit substrates: the neuronal assemblies coding both the control and the reinforcement-associated visual cues are transiently extended to a near-saturating level. They form partially orthogonal and overlapping clusters centered around hub cells with higher and earlier ramp-like responses, as well as locally increased connectivity. This temporarily maximizes computational capability and allows competition between assemblies that encode behaviorally relevant information through stochastic trial-to-trial fluctuation. The coding competition is driven by reinforcement feedback at the level of individual neurons.