The human mind metaphorically represents objects from the physical world in an abstract, high-dimensional object space, in which a finite number of orthogonal axes encode critical object features. However, little is known about which features serve as axes of this object space and thereby critically affect object recognition. Here we asked whether objects’ real-world size constitutes an axis of object space, using deep convolutional neural networks (DCNNs) and three criteria (sensitivity, independence, and necessity) that are impractical to examine together with traditional approaches. A principal component analysis (PCA) on features extracted by the DCNNs showed that objects’ real-world size was encoded along an independent axis, and removing this axis significantly impaired the DCNNs’ object-recognition performance. Using a paradigm in which computational modeling and biological observation mutually informed each other, we found that object shape, rather than retinal size, co-occurrence, task demands, or texture, was necessary for DCNNs and humans to represent objects’ real-world size. In short, our study provided the first evidence that objects’ real-world size serves as an axis of object space, and devised a novel paradigm for future exploration of the structure of object space.
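The PCA-based logic of the sensitivity and necessity criteria can be sketched as follows. This is a minimal illustration on synthetic data, not the paper's actual analysis: the feature matrix, the injected "size" direction, and the thresholds are all hypothetical stand-ins for real DCNN activations and real-world size ratings.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical DCNN feature matrix: 200 objects x 50 feature dimensions.
# In the actual study these would be activations from a trained DCNN.
features = rng.normal(size=(200, 50))

# Inject a strong direction correlated with a hypothetical size variable,
# so that one principal component comes to encode "real-world size".
size = rng.normal(size=200)
features += 2.0 * np.outer(size, rng.normal(size=50))

# PCA via SVD on mean-centered features.
X = features - features.mean(axis=0)
U, S, Vt = np.linalg.svd(X, full_matrices=False)
scores = U * S  # object scores on each principal component (axis)

# Sensitivity: find the axis whose scores correlate most with size.
corrs = [abs(np.corrcoef(scores[:, k], size)[0, 1]) for k in range(10)]
size_axis = int(np.argmax(corrs))

# Necessity (sketch): zero out the size axis and reconstruct the features;
# recognition performance would then be re-evaluated on X_removed.
S_removed = S.copy()
S_removed[size_axis] = 0.0
X_removed = (U * S_removed) @ Vt
```

Because the axes produced by PCA are orthogonal by construction, zeroing one component removes the size information while leaving the remaining axes untouched, which is what makes the independence and necessity tests separable in this framework.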