Automating Cricket Narration Using LLMs and Multi-Modal Analysis
Designed a novel framework to generate human-like cricket commentary by integrating computer vision and large language models. The system uses a hybrid deep learning pipeline (YOLOv9, MediaPipe, GRU) to classify complex gameplay events, such as batting shots and fielding actions, and to detect player emotions. These visual insights are combined with real-time OCR scoreboard data and historical statistics, then fed into a multi-agent LLM architecture that produces context-aware, stylistically diverse match narratives, as sketched below.
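A minimal sketch of how the fusion and narration stages might be wired together, assuming the vision and OCR stages emit one fused record per delivery; the field names, agent styles, and prompt wording are illustrative assumptions, not the project's exact schema or prompts:

```python
from dataclasses import dataclass

@dataclass
class FrameInsights:
    """Fused per-delivery signals from the vision and OCR stages
    (field names are illustrative, not the project's exact schema)."""
    shot: str      # from the YOLOv9 + MediaPipe + GRU event classifier
    emotion: str   # detected player emotion
    score: str     # OCR'd scoreboard text
    history: str   # relevant historical statistic

def build_prompt(insights: FrameInsights, style: str) -> str:
    """Compose a style-conditioned prompt for one commentary agent."""
    return (
        f"You are a cricket commentator with a {style} style.\n"
        f"Event: batsman plays a {insights.shot}, looking {insights.emotion}.\n"
        f"Scoreboard: {insights.score}. Context: {insights.history}.\n"
        f"Produce one or two sentences of live commentary."
    )

def narrate(insights: FrameInsights, llm_generate) -> dict:
    """Fan the same insights out to several style agents; a downstream
    step can then select or blend the candidate narrations."""
    styles = ["excitable", "analytical", "dry-humoured"]  # illustrative
    return {s: llm_generate(build_prompt(insights, s)) for s in styles}

# Usage with a stand-in generator (swap in a real LLM client here).
fake_llm = lambda prompt: f"[LLM output for: {prompt[:40]}...]"
insights = FrameInsights("cover drive", "focused",
                         "142/3 in 16.4 overs", "50th ball faced today")
print(narrate(insights, fake_llm)["excitable"])
```

In the real system the stand-in generator would be replaced by calls to an actual LLM, and the per-style outputs could be ranked or blended before being surfaced as commentary.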
Cricket Shot Classification via Body Pose Landmarks
Developed a deep learning system that classifies cricket shots (e.g., cut, drive, flick) from video footage with 93.47% accuracy. The pipeline uses YOLOv8 for batsman detection and MediaPipe for extracting 3D body pose landmarks. To handle real-world challenges, it employs spline interpolation to recover missing landmark data and SMOTE to balance the shot classes. Classification is performed by Gated Recurrent Units (GRUs), sequence models that capture the temporal dynamics of batting movements; a sketch follows.
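A minimal sketch of the landmark-sequence classifier under stated assumptions: the 33-landmark layout follows MediaPipe Pose, while the class count, hidden size, clip length, and cubic-spline gap filling are illustrative choices rather than the project's exact configuration:

```python
import numpy as np
import torch
import torch.nn as nn
from scipy.interpolate import CubicSpline

NUM_LANDMARKS = 33             # MediaPipe Pose emits 33 landmarks
FEATURES = NUM_LANDMARKS * 3   # (x, y, z) per landmark
NUM_CLASSES = 4                # e.g. cut, drive, flick, pull (illustrative)

def interpolate_missing(seq: np.ndarray) -> np.ndarray:
    """Fill frames where pose detection failed (NaN rows) by
    cubic-spline interpolation along the time axis."""
    t = np.arange(seq.shape[0])
    out = seq.copy()
    for f in range(seq.shape[1]):
        col = seq[:, f]
        valid = ~np.isnan(col)
        if valid.sum() >= 4 and (~valid).any():  # spline needs enough points
            spline = CubicSpline(t[valid], col[valid])
            out[~valid, f] = spline(t[~valid])
    return out

class ShotClassifier(nn.Module):
    """GRU over per-frame pose vectors; last hidden state -> shot class."""
    def __init__(self, hidden=128, layers=2):
        super().__init__()
        self.gru = nn.GRU(FEATURES, hidden, num_layers=layers,
                          batch_first=True)
        self.head = nn.Linear(hidden, NUM_CLASSES)

    def forward(self, x):          # x: (batch, frames, FEATURES)
        _, h = self.gru(x)         # h: (layers, batch, hidden)
        return self.head(h[-1])    # logits: (batch, NUM_CLASSES)

# Example: classify one 60-frame clip of pose landmark sequences.
clip = np.random.rand(60, FEATURES).astype(np.float32)
clip[10:13] = np.nan                    # simulate dropped detections
clip = interpolate_missing(clip)
logits = ShotClassifier()(torch.from_numpy(clip).unsqueeze(0))
print(logits.argmax(dim=1))             # predicted shot index
```

In practice, YOLOv8 would first crop the batsman so MediaPipe tracks the correct player, and SMOTE (e.g., imblearn's SMOTE applied to flattened training sequences) would rebalance the shot classes before training; both steps are omitted here for brevity.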