Towards Face Recognition with Imbalanced Training Data: From Loss Function Design to Deep Generative Models