3D human pose estimation is a crucial task in computer vision. It aims to estimate the spatial coordinates of key joints of the human body from monocular images. However, occlusion remains a challenging problem that hinders practical application. To address this, a novel method based on an attention mechanism and a distillation learning framework is proposed for unsupervised 3D human pose estimation. Because occluded joints often admit multiple plausible motion solutions, a pose filling network based on an attention mechanism is proposed to accurately predict the coordinates of the occluded joints. To better establish the relationship between the 3D pose and occluded 2D poses, a distillation learning framework is employed. More precisely, the teacher network takes the complete 2D pose as input and can therefore produce a more accurate 3D pose. The student network takes occluded 2D poses as input and uses the teacher network's output as its target, which establishes a more robust dependency. Moreover, a simple yet effective data augmentation method is incorporated, which improves performance by increasing data diversity. Experimental results show that, without occlusion, the proposed method achieves PA-MPJPE values 2.3 and 5.2 lower than those of the benchmark method on the Human3.6M and MPI-INF-3DHP datasets, respectively. Moreover, when 5 out of 17 joints are occluded, the PA-MPJPE of our method is 30.3, compared with 72.7 for a competing method. Extensive experiments demonstrate the effectiveness of the proposed method.
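The teacher-student data flow described above can be illustrated with a minimal sketch. It is not the authors' implementation: the names `occlude_joints`, `mean_joint_distance`, `distillation_loss`, and `toy_lifter` are hypothetical, occlusion is assumed to be encoded by zeroing the coordinates of randomly chosen joints, the networks are replaced by a trivial placeholder, and the attention-based pose filling network is omitted.

```python
import math
import random

NUM_JOINTS = 17  # joint count mentioned in the text


def occlude_joints(pose_2d, num_occluded, rng):
    """Zero out randomly chosen joints (an assumed occlusion encoding).

    Returns the occluded pose and a visibility mask (True = visible).
    """
    hidden = set(rng.sample(range(len(pose_2d)), num_occluded))
    occluded = [(0.0, 0.0) if j in hidden else xy for j, xy in enumerate(pose_2d)]
    visible = [j not in hidden for j in range(len(pose_2d))]
    return occluded, visible


def mean_joint_distance(pose_a, pose_b):
    """Mean Euclidean distance between corresponding 3D joints."""
    return sum(math.dist(a, b) for a, b in zip(pose_a, pose_b)) / len(pose_a)


def distillation_loss(teacher, student, pose_2d, num_occluded, rng):
    """Student loss for one sample: distance to the teacher's 3D prediction."""
    target_3d = teacher(pose_2d)  # teacher sees the complete 2D pose
    occluded_2d, _ = occlude_joints(pose_2d, num_occluded, rng)
    predicted_3d = student(occluded_2d)  # student sees the occluded 2D pose
    return mean_joint_distance(predicted_3d, target_3d)


def toy_lifter(pose_2d):
    """Placeholder for a lifting network: assigns a constant depth to each joint."""
    return [(x, y, 1.0) for x, y in pose_2d]


rng = random.Random(0)
pose = [(float(j), float(2 * j)) for j in range(NUM_JOINTS)]
loss = distillation_loss(toy_lifter, toy_lifter, pose, num_occluded=5, rng=rng)
```

In an actual training loop, `teacher` and `student` would be learned networks and the loss would be minimized with respect to the student's parameters only; here the non-zero `loss` simply reflects the information removed by occlusion.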