Machine Learning, OpenCV, Scikit-Learn

Image Color Extractor

Extracting mathematically dominant color palettes from high-resolution images using K-Means clustering.

The Objective

Finding the "dominant" colors in a photograph isn't as simple as counting the most common pixel values. Lighting, shadows, and slight gradients mean that thousands of pixels can look nearly identical yet have technically different hex codes, so an exact-match count splits one perceived color across many buckets. I wanted to build a data science pipeline that groups these pixels mathematically to extract a clean, usable UI palette from any image.

The Python Pipeline

Image Processing

The script uses OpenCV (cv2) to read images and convert them from BGR, OpenCV's default channel order, to RGB. To reduce the computational load for high-resolution photos, the image is heavily downscaled and then flattened into a 2D array with one row per pixel and one column per RGB channel.

K-Means Clustering

This is where the heavy lifting happens. Using Scikit-Learn, a K-Means clustering model groups the thousands of pixel vectors into a specified number of clusters (typically 5) and finds the centroid of each color family, that is, the mean RGB value of the pixels assigned to it.
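
A minimal sketch of the clustering step, assuming the pixels are already in an (N, 3) array. The function name `find_centroids`, the fixed `seed`, and `n_init=10` are illustrative choices rather than settings confirmed by the original script.

```python
import numpy as np
from sklearn.cluster import KMeans


def find_centroids(pixels: np.ndarray, n_colors: int = 5, seed: int = 0):
    """Cluster RGB pixels and return (centroids, labels).

    centroids has shape (n_colors, 3) and holds float RGB values;
    labels has shape (N,) and gives the cluster index of each pixel.
    """
    # A fixed random_state (an assumption here) makes the palette reproducible.
    model = KMeans(n_clusters=n_colors, n_init=10, random_state=seed)
    # Cast to float so the centroid means are not computed on uint8 data.
    labels = model.fit_predict(pixels.astype(np.float64))
    return model.cluster_centers_, labels
```

Because each centroid is a mean, it can be a color that no single pixel in the image has exactly, which is what lets one swatch stand in for a whole gradient.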

Data Sorting

Once the centroids are established, Counter from Python's collections module counts the pixels assigned to each cluster. Sorting by that count orders the final palette from most dominant to least dominant.
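
The sorting step can be sketched as follows. `order_by_dominance` is a hypothetical helper name, and returning each color's share as a fraction is an assumption made so that the result is easy to chart.

```python
from collections import Counter

import numpy as np


def order_by_dominance(centroids: np.ndarray, labels: np.ndarray):
    """Return (centroid, share) pairs sorted from most to least dominant."""
    # Count how many pixels were assigned to each cluster index.
    counts = Counter(int(label) for label in labels)
    total = sum(counts.values())
    # most_common() yields (cluster_index, pixel_count), largest count first.
    return [(centroids[index], count / total) for index, count in counts.most_common()]
```

The cluster indices returned by K-Means carry no meaning on their own, so the ordering has to come from these counts.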

Visualization

The RGB centroids are converted into standard hexadecimal strings and passed to Matplotlib, which renders a pie chart showing each color's share of the image.
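
One way the hex conversion and chart could be written is shown below. The helper names, the rounding and clamping of centroid values, and the `palette.png` output path are assumptions made for illustration.

```python
import matplotlib.pyplot as plt


def rgb_to_hex(rgb) -> str:
    """Round and clamp an (R, G, B) triple to 0-255 and format it as '#rrggbb'."""
    r, g, b = (max(0, min(255, int(round(float(channel))))) for channel in rgb)
    return f"#{r:02x}{g:02x}{b:02x}"


def plot_palette(colors, shares, out_path: str = "palette.png"):
    """Save a pie chart with one slice per color and return the hex codes."""
    hex_codes = [rgb_to_hex(color) for color in colors]
    fig, ax = plt.subplots(figsize=(5, 5))
    # Each slice is filled with the color it represents and labeled with its hex code.
    ax.pie(shares, labels=hex_codes, colors=hex_codes, startangle=90, counterclock=False)
    ax.set_aspect("equal")
    fig.savefig(out_path, dpi=150, bbox_inches="tight")
    plt.close(fig)
    return hex_codes
```

Centroids are floating-point means, so they need rounding before they can be written as two-digit hex channels.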