Tags: aggregation (1), attention (1), backpropagation (2), batchnorm (2), classification (3), generative (1), layernorm (2), loss function (3), self-supervised training (1), semi-supervised training (1), supervised training (4), unsupervised (1)