Co-design involves simultaneously optimizing the controller and the agent’s physical design. Its inherent bi-level optimization formulation necessitates an outer-loop design optimization driven by an inner-loop control optimization. This is challenging when the design space is large and each design evaluation involves a data-intensive reinforcement learning process for control optimization. To improve sample efficiency, we propose a multi-fidelity-based design exploration strategy in which we tie the controllers learned across the design space together through a universal policy learner, which warm-starts subsequent controller learning problems. Experiments performed on a wide range of agent design problems demonstrate the superiority of our method over the baselines. Additionally, analysis of the optimized designs shows interesting design alterations, including design simplifications and non-intuitive alterations that have also emerged in the biological world.
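To make the bi-level structure concrete, the following is a minimal, self-contained sketch of such a loop: an outer random search over a scalar design parameter, and an inner control optimization warm-started from a single design-conditioned ("universal") policy shared across designs. The toy task, the evolution-strategies inner loop, and all names (`rollout_return`, `inner_loop`, `co_design`) are illustrative assumptions, not the paper's implementation.

```python
# Illustrative sketch of a bi-level co-design loop with a shared
# design-conditioned policy; the task and update rules are toy assumptions.
import numpy as np

rng = np.random.default_rng(0)


def rollout_return(design, policy_w, n_steps=20):
    """Toy stand-in for an RL rollout: reward is highest when the
    action matches a design-dependent target."""
    target = np.sin(design)               # how the optimum shifts with design
    total = 0.0
    for _ in range(n_steps):
        obs = rng.normal()
        # Policy is conditioned on both the observation and the design.
        action = policy_w[0] * obs + policy_w[1] * design + policy_w[2]
        total += -(action - target) ** 2  # negative squared error as reward
    return total / n_steps


def inner_loop(design, policy_w, iters=50, sigma=0.1, lr=0.05):
    """Inner control optimization: an evolution-strategies-style gradient
    estimate, warm-started from the shared (universal) policy weights."""
    w = policy_w.copy()
    for _ in range(iters):
        eps = rng.normal(size=w.shape)
        grad = (rollout_return(design, w + sigma * eps)
                - rollout_return(design, w - sigma * eps)) / (2 * sigma) * eps
        w += lr * grad
    return w


def co_design(n_outer=30):
    universal_w = np.zeros(3)             # one policy tied across the design space
    best_design, best_ret = None, -np.inf
    for _ in range(n_outer):              # outer loop: design exploration
        design = rng.uniform(-2.0, 2.0)   # sample a candidate design
        w = inner_loop(design, universal_w)  # warm-started controller learning
        ret = rollout_return(design, w)
        # Fold the tuned controller back into the universal policy so later
        # designs start from a better initialization (illustrative rule).
        universal_w = 0.9 * universal_w + 0.1 * w
        if ret > best_ret:
            best_design, best_ret = design, ret
    return best_design, best_ret


if __name__ == "__main__":
    d, r = co_design()
    print(f"best design {d:.3f} with return {r:.3f}")
```

The key design choice the sketch mirrors is that the inner loop never starts from scratch: each candidate design inherits the universal policy's weights, which is what drives the sample-efficiency gain claimed above.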