Unmanned aerial vehicles (UAVs) are frequently adopted in disaster management, where the aerial views they provide are extremely valuable to rescuers. However, they suffer from severe stability problems in actual disaster scenarios, as the images captured by the on-board sensors cannot consistently provide enough information for deep learning models to make accurate decisions. In many cases, UAVs have to capture multiple images from different views before they can output final recognition results. In this paper, we formulate the flight path task for UAVs with these actual perception needs in mind. We propose a new convolutional neural network (CNN) model to detect and localize objects, such as buildings, together with an optimization method that finds the flight path which accurately recognizes as many objects as possible at minimum time cost. The simulation results demonstrate that the proposed method is effective and efficient, and that it can address the actual scene understanding and path planning problems for UAVs in the real world.