Computer Vision · Deep Learning · PyTorch · Object Detection

Occluded Pet Detection (Computer Vision)

Computer vision system for detecting pets in occluded scenarios using deep learning and attention mechanisms.

Overview

A computer vision model that detects and localizes pets in images with significant occlusion. It combines attention mechanisms with targeted data augmentation to handle scenes where pets are only partially visible.

The project addresses real-world pet detection, where occlusion is common (pets hidden behind furniture or only partially in frame), and demonstrates a practical ML application with attention to deployment constraints and robustness.

Your Role

What I Built

  • Custom YOLO-based architecture with attention modules
  • Data augmentation pipeline for occlusion scenarios (see the augmentation sketch after this list)
  • Training infrastructure with distributed data loading
  • Inference optimization for real-time performance
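
The occlusion handling relies partly on augmentation that hides parts of the training images. A minimal sketch of what such a pipeline can look like, assuming torchvision-style transforms; the name `occlusion_augment` and the parameter values are illustrative, not the project's tuned settings.

```python
import torchvision.transforms as T

# Sketch: simulate partial occlusion with Random Erasing (Cutout-style patches).
# Only photometric and erasing transforms are used here so bounding-box labels
# stay valid without geometric adjustment; values are illustrative.
occlusion_augment = T.Compose([
    T.ColorJitter(brightness=0.2, contrast=0.2),
    T.ToTensor(),
    # Erase up to ~30% of the image area to mimic furniture hiding part of the pet.
    T.RandomErasing(p=0.7, scale=(0.05, 0.3), ratio=(0.3, 3.3), value="random"),
])

# Usage: augmented = occlusion_augment(pil_image)
```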

What I Owned End-to-End

  • Model architecture design and hyperparameter tuning
  • Dataset curation and annotation quality control
  • Evaluation metrics and benchmarking (see the IoU sketch after this list)
  • Model deployment and serving infrastructure
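
Benchmarking detection quality comes down to box overlap, so here is a minimal, self-contained sketch of the pairwise IoU computation that mAP-style metrics and the occlusion-aware NMS build on. The function name and the `[x1, y1, x2, y2]` box format are assumptions for illustration.

```python
import torch

def box_iou(boxes_a: torch.Tensor, boxes_b: torch.Tensor) -> torch.Tensor:
    """Pairwise IoU between two sets of boxes in [x1, y1, x2, y2] format.

    Returns a (len(boxes_a), len(boxes_b)) matrix of IoU values.
    """
    area_a = (boxes_a[:, 2] - boxes_a[:, 0]) * (boxes_a[:, 3] - boxes_a[:, 1])
    area_b = (boxes_b[:, 2] - boxes_b[:, 0]) * (boxes_b[:, 3] - boxes_b[:, 1])

    # Intersection corners, broadcast across all (a, b) pairs.
    top_left = torch.max(boxes_a[:, None, :2], boxes_b[None, :, :2])
    bottom_right = torch.min(boxes_a[:, None, 2:], boxes_b[None, :, 2:])
    wh = (bottom_right - top_left).clamp(min=0)
    intersection = wh[..., 0] * wh[..., 1]

    union = area_a[:, None] + area_b[None, :] - intersection
    return intersection / union.clamp(min=1e-6)
```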

Technical Highlights

Architecture Decisions

  • Modified YOLOv8 backbone with spatial attention modules (see the attention sketch after this list)
  • Multi-scale feature pyramid for small object detection
  • Focal loss for handling class imbalance
  • Non-maximum suppression with occlusion-aware IoU
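
The exact attention design isn't documented here, but a CBAM-style spatial attention block is a reasonable mental model for what the spatial attention module does: re-weight feature-map locations using pooled channel statistics. The module name and kernel size below are illustrative.

```python
import torch
import torch.nn as nn

class SpatialAttention(nn.Module):
    """CBAM-style spatial attention: weight each spatial location by a mask
    computed from channel-wise average- and max-pooled features."""

    def __init__(self, kernel_size: int = 7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        avg_pool = x.mean(dim=1, keepdim=True)   # (N, 1, H, W)
        max_pool = x.amax(dim=1, keepdim=True)   # (N, 1, H, W)
        mask = torch.sigmoid(self.conv(torch.cat([avg_pool, max_pool], dim=1)))
        return x * mask                          # up-weight informative regions
```

A block like this can be dropped in after backbone stages so that features from the visible parts of a pet are emphasized relative to the occluder.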

Algorithms / Protocols / Constraints

  • Spatial attention mechanism for occlusion handling
  • Hard negative mining for difficult examples
  • Mixup and CutMix augmentation strategies (see the MixUp sketch after this list)
  • Learning rate scheduling with warm restarts
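
For detection, MixUp blends two images and keeps both images' box annotations rather than mixing labels. A minimal sketch under that assumption; the helper name and beta parameter are illustrative, not the project's tuned values.

```python
import numpy as np
import torch

def detection_mixup(img_a, boxes_a, img_b, boxes_b, alpha: float = 8.0):
    """MixUp for object detection: blend two same-sized (C, H, W) image tensors
    and concatenate their (N, 5) [class, x1, y1, x2, y2] box tensors."""
    lam = float(np.random.beta(alpha, alpha))
    mixed_img = lam * img_a + (1.0 - lam) * img_b
    mixed_boxes = torch.cat([boxes_a, boxes_b], dim=0)  # keep all targets
    return mixed_img, mixed_boxes
```

The warm-restart schedule itself can be handled by PyTorch's built-in `torch.optim.lr_scheduler.CosineAnnealingWarmRestarts`.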

Optimization Strategies

  • Model quantization for edge deployment
  • TensorRT optimization for inference acceleration (see the export sketch after this list)
  • Batch processing pipeline for throughput
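
A TensorRT deployment typically starts from an ONNX export of the trained model, with the engine built offline afterwards. A minimal sketch of that export step; the stand-in model, file names, input size, and the `trtexec` invocation are assumptions, not the project's actual pipeline.

```python
import torch

# Stand-in for the trained detector; replace with the real model checkpoint.
model = torch.nn.Sequential(torch.nn.Conv2d(3, 16, 3))
model.eval()

dummy_input = torch.randn(1, 3, 640, 640)
torch.onnx.export(
    model,
    dummy_input,
    "pet_detector.onnx",
    input_names=["images"],
    output_names=["detections"],
    opset_version=17,                          # requires a recent PyTorch
    dynamic_axes={"images": {0: "batch"}},     # allow batched inference
)

# The engine can then be built offline, e.g. with the trtexec CLI:
#   trtexec --onnx=pet_detector.onnx --saveEngine=pet_detector.engine --fp16
```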

Tech Stack

Python · PyTorch · OpenCV · TensorRT · FastAPI · Docker

Results / Learnings

What Worked

  • Achieved 87% mAP on an occluded-pet detection benchmark
  • Reduced false positives by 40% compared to a baseline YOLO model
  • Sustained 30 FPS inference on edge devices

What I Learned

  • The importance of domain-specific data augmentation
  • How significantly attention mechanisms improve occlusion handling
  • The tradeoffs between model complexity and inference speed

Tradeoffs Considered

  • Chose accuracy over inference speed for the initial version
  • Accepted a larger model size for better occlusion handling
  • Prioritized precision over recall to reduce false alarms