This seminar explores advanced techniques that enhance computer vision performance in agricultural settings. Starting with a quick review of CNNs, evaluation metrics, and image representations, it introduces state-of-the-art methods such as transformer backbones, multi-scale feature fusion, and self-supervised learning.
Participants then learn how to apply knowledge distillation to create lightweight models for edge deployment, and explore the potential of foundation models such as DINOv2 and multimodal models such as CLIP for zero-shot recognition, retrieval, and rapid fine-tuning on Smart Droplets datasets.
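
As a rough illustration of the knowledge distillation idea mentioned above, the sketch below computes a common form of the distillation loss for a single example: a temperature-softened KL term between teacher and student outputs, blended with the usual cross-entropy on the true label. It uses only the Python standard library so it runs anywhere; a real training setup would typically use a deep learning framework instead. The function names, the temperature of 4.0, and the weight `alpha` of 0.5 are illustrative assumptions, not values taken from the seminar material.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax of logits divided by a temperature."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in scaled))
    return [z - log_sum_exp for z in scaled]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=4.0, alpha=0.5):
    """Blend of soft-target KL divergence and hard-label cross-entropy.

    temperature and alpha are hypothetical defaults chosen for illustration.
    """
    log_p_teacher = log_softmax(teacher_logits, temperature)
    log_p_student = log_softmax(student_logits, temperature)

    # Soft term: KL(teacher || student) on temperature-softened distributions.
    kl = sum(math.exp(lt) * (lt - ls)
             for lt, ls in zip(log_p_teacher, log_p_student))

    # Hard term: cross-entropy of the student (temperature 1) on the true label.
    ce = -log_softmax(student_logits)[label]

    # The T^2 factor keeps the soft term's gradient scale comparable
    # across different temperatures.
    return alpha * (temperature ** 2) * kl + (1.0 - alpha) * ce


if __name__ == "__main__":
    teacher = [2.0, 0.0, -1.0]   # made-up logits for a 3-class problem
    student = [0.5, 0.2, -0.3]
    print(distillation_loss(student, teacher, label=0))
```

When the student reproduces the teacher's logits exactly, the soft term vanishes and only the weighted cross-entropy remains, which is a quick sanity check for any implementation of this loss.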