Knowledge distillation is a compression technique in which a small model (the student) is trained to reproduce the behavior of a larger model (the teacher).
See - https://medium.com/huggingface/distilbert-8cf3380435b5
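The blog post describes training the student against the teacher's softened output distribution. Below is a minimal sketch of such a distillation loss in PyTorch; the `temperature` and `alpha` hyperparameters and the exact loss weighting are illustrative assumptions, not the precise recipe used for DistilBERT.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=2.0, alpha=0.5):
    """Combine a soft-target loss (student mimics the teacher's softened
    distribution) with the usual hard-label cross-entropy.
    `temperature` and `alpha` are illustrative hyperparameters."""
    # Soft targets: KL divergence between temperature-scaled distributions.
    # Multiplying by T**2 keeps gradient magnitudes comparable across temperatures.
    soft_loss = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * (temperature ** 2)

    # Hard targets: standard cross-entropy against the ground-truth labels.
    hard_loss = F.cross_entropy(student_logits, labels)

    return alpha * soft_loss + (1.0 - alpha) * hard_loss
```

In training, the teacher runs in evaluation mode under `torch.no_grad()` to produce `teacher_logits`, and only the student's parameters are updated with this loss.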