Computer Vision · OpenCV · MediaPipe

Touchless Air Mouse

Translating physical hand gestures into digital cursor commands with near-zero latency.

Touchless Air Mouse Real-Time Demo

Objective

Traditional human-computer interaction relies heavily on physical peripherals, which creates friction in sterile environments and poses accessibility challenges. The objective was to architect a robust, vision-based system that bypasses dedicated hardware entirely, using a standard webcam feed for real-time spatial computing.

Technical Implementation

Frame Processing

OpenCV captures the live video stream and pre-processes each frame, keeping per-frame matrix transformations lightweight to ensure high-speed inference and minimal overhead.
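A minimal sketch of this stage, assuming the standard `cv2.VideoCapture` webcam interface; the `preprocess` and `capture_loop` names are illustrative, not from the project's source. The mirror flip makes cursor motion match the user's hand, and the BGR-to-RGB conversion prepares frames for the landmark model.

```python
import numpy as np


def preprocess(frame_bgr: np.ndarray) -> np.ndarray:
    """Mirror the frame horizontally and convert BGR -> RGB.

    Pure-NumPy equivalent of cv2.flip(frame, 1) followed by
    cv2.cvtColor(frame, cv2.COLOR_BGR2RGB), done in one slice
    to keep per-frame overhead minimal.
    """
    return frame_bgr[:, ::-1, ::-1]


def capture_loop(camera_index: int = 0):
    """Yield preprocessed RGB frames from the default webcam."""
    import cv2  # imported lazily so the pure helper above stays hardware-free

    cap = cv2.VideoCapture(camera_index)
    try:
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            yield preprocess(frame)
    finally:
        cap.release()
```

In a real loop, `capture_loop()` would feed each yielded RGB frame directly into the landmark detector.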

Landmark Detection

The stream is processed by the MediaPipe framework, which extracts 21 distinct 3D hand landmarks per frame in real time for high-fidelity tracking.
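A sketch of the detection step using MediaPipe's `Hands` solution; the helper names (`extract_landmarks`, `to_pixel`) and the confidence threshold are assumptions for illustration. Each landmark arrives with normalized `x`, `y`, `z` coordinates, which must later be scaled to pixel space.

```python
def to_pixel(norm_x: float, norm_y: float, width: int, height: int) -> tuple:
    """Convert a normalized MediaPipe landmark coordinate to pixel space."""
    return int(norm_x * width), int(norm_y * height)


def extract_landmarks(rgb_frame):
    """Run MediaPipe Hands on one RGB frame and return the 21 landmarks
    of the first detected hand as (x, y, z) normalized tuples, or None."""
    import mediapipe as mp  # lazy import: model loading needs local assets

    with mp.solutions.hands.Hands(
        static_image_mode=False,
        max_num_hands=1,
        min_detection_confidence=0.7,
    ) as hands:
        results = hands.process(rgb_frame)
        if not results.multi_hand_landmarks:
            return None
        hand = results.multi_hand_landmarks[0]
        return [(lm.x, lm.y, lm.z) for lm in hand.landmark]
```

For a live loop, the `Hands` object should be constructed once and reused across frames rather than recreated per call as in this compressed sketch.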

Coordinate Scaling

Custom mapping logic scales index-finger coordinates to the display's 2D pixel space, applying a low-pass filter to suppress high-frequency jitter.
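The mapping and smoothing described above can be sketched as follows; the dead-zone `margin` and smoothing `factor` values are illustrative assumptions. The low-pass filter here is a first-order exponential smoother, a common choice for attenuating high-frequency jitter in raw landmark coordinates.

```python
import numpy as np


def map_to_screen(cam_x, cam_y, cam_size, screen_size, margin=100):
    """Map a fingertip position in camera pixels to display pixels.

    A dead-zone margin around the frame edge lets the cursor reach the
    full screen without the hand leaving the camera's field of view.
    """
    cam_w, cam_h = cam_size
    screen_w, screen_h = screen_size
    x = np.interp(cam_x, (margin, cam_w - margin), (0, screen_w))
    y = np.interp(cam_y, (margin, cam_h - margin), (0, screen_h))
    return x, y


def smooth(prev, target, factor=5.0):
    """One step of a first-order low-pass filter: the cursor moves only
    a fraction of the way toward the raw target, damping jitter."""
    return prev + (target - prev) / factor
```

Larger `factor` values give a steadier but laggier cursor, so it trades responsiveness against stability.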

Gesture Engine

Employed deterministic Euclidean distance calculations between landmarks to reliably trigger click and scroll events while managing signal noise.
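A minimal sketch of a distance-based click trigger. The landmark indices follow MediaPipe's hand topology (4 = thumb tip, 8 = index fingertip); the `is_pinch` name and the `threshold` value are illustrative assumptions rather than the project's exact parameters.

```python
import math

# MediaPipe hand-landmark indices: 4 = thumb tip, 8 = index fingertip.
THUMB_TIP, INDEX_TIP = 4, 8


def distance(a, b):
    """Euclidean distance between two (x, y) landmark points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def is_pinch(landmarks, threshold=0.05):
    """Deterministic click trigger: fire when thumb and index fingertips
    are closer than `threshold` in normalized coordinates."""
    return distance(landmarks[THUMB_TIP], landmarks[INDEX_TIP]) < threshold
```

Because the test is a plain distance comparison, the trigger is fully deterministic; noise handling then reduces to choosing a threshold wide enough to ignore tracking jitter.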

The Outcome

The application demonstrates a seamless interface on consumer-grade hardware, showing that an optimized computer vision loop can deliver fluid navigation while handling edge cases such as signal noise and repeated triggers, the latter suppressed via debounce cooldowns.
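The debounce cooldown mentioned above can be sketched as a small gate; the class name, the 0.3 s cooldown, and the injectable clock are illustrative assumptions. Without it, a pinch held for several frames would emit a burst of clicks instead of one.

```python
import time


class Debounce:
    """Let an event fire at most once per `cooldown` seconds.

    The clock is injectable so the gate can be tested without sleeping.
    """

    def __init__(self, cooldown: float = 0.3, clock=time.monotonic):
        self.cooldown = cooldown
        self.clock = clock
        self._last = -float("inf")

    def fire(self) -> bool:
        """Return True (and start the cooldown) only if enough time
        has passed since the last accepted event."""
        now = self.clock()
        if now - self._last >= self.cooldown:
            self._last = now
            return True
        return False
```

In the main loop, a click is dispatched only when both the pinch test and `debounce.fire()` return true.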