Balanced View Temporal Contrastive Learning (BV-TCLR) for Improved Video Representation Learning

dc.contributor.advisor: Menore Tekeba (PhD)
dc.contributor.author: Ayantu Tesema
dc.date.accessioned: 2025-06-19T14:35:44Z
dc.date.available: 2025-06-19T14:35:44Z
dc.date.issued: 2025-01
dc.description.abstract: Understanding video data is crucial for tasks such as action recognition, event detection, and video classification. However, traditional methods often struggle to capture both the spatial and temporal aspects of video. To address this challenge, we introduce Balanced View Temporal Contrastive Learning (BV-TCLR), an approach that improves video representation learning by addressing temporal imbalance. "Balanced View" means that the model is exposed to both frequent and rare temporal events during training, so that it does not over-focus on common events while overlooking rare but important ones. This is achieved by combining balanced sampling with data augmentation to diversify the temporal patterns the model learns from. We evaluated BV-TCLR on benchmark datasets such as UCF101 and UCF10. In linear evaluation, BV-TCLR improves accuracy by 2.2 percentage points (from 91% to 93.2%) and F1-score by 2.5 percentage points (from 90% to 92.5%) over standard Temporal Contrastive Learning (TCLR). In nearest-neighbor retrieval, BV-TCLR improves accuracy by 0.8 percentage points (91.8% vs. 91%) and F1-score by 1.2 percentage points (91.2% vs. 90%). These results indicate that BV-TCLR is both more accurate and more adaptable than TCLR, which makes it well suited to real-world video analysis.
dc.identifier.uri: https://etd.aau.edu.et/handle/123456789/5604
dc.language.iso: en_US
dc.publisher: Addis Ababa University
dc.subject: Video Representation Learning
dc.subject: Temporal Contrastive Learning
dc.subject: Balanced Sampling
dc.subject: Data Augmentation
dc.title: Balanced View Temporal Contrastive Learning (BV-TCLR) for Improved Video Representation Learning
dc.type: Thesis
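
The abstract names balanced sampling as one ingredient of BV-TCLR but does not say how it is implemented. As a rough illustration only, the sketch below shows one common way to balance frequent and rare classes: sampling each clip with probability inversely proportional to the frequency of its label. The function name `balanced_sample`, the labels "frequent" and "rare", and the toy clip names are hypothetical and are not taken from the thesis.

```python
import random
from collections import Counter


def balanced_sample(clips, labels, k, seed=0):
    """Draw k (clip, label) pairs, weighting each clip by 1 / label frequency.

    Assumption: this inverse-frequency scheme is an illustrative stand-in,
    not the sampling procedure used in the thesis.
    """
    counts = Counter(labels)
    # Every label class ends up with the same total weight, so a rare
    # class is drawn about as often as a frequent one.
    weights = [1.0 / counts[label] for label in labels]
    rng = random.Random(seed)
    return rng.choices(list(zip(clips, labels)), weights=weights, k=k)


# Hypothetical toy data: 90 clips of a frequent event, 10 of a rare one.
labels = ["frequent"] * 90 + ["rare"] * 10
clips = [f"clip_{i:03d}" for i in range(len(labels))]

batch = balanced_sample(clips, labels, k=2000)
rare_share = sum(1 for _, label in batch if label == "rare") / len(batch)
print(f"rare share in sampled batch: {rare_share:.2f}")
```

With uniform sampling the rare event would make up about 10% of a batch; under this weighting it makes up roughly half. Sampling with replacement means rare clips are repeated, which is one reason such schemes are usually paired with data augmentation.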

Files

Original bundle
Name: Ayantu Tesema.pdf
Size: 1.43 MB
Format: Adobe Portable Document Format