EDA Report — Garbage Classification Dataset

Forum 06 | Deep Learning | google/vit-large-patch16-224

1. Dataset Overview

Total images (raw): 2,527
Original categories: 6
Images after removing trash: 2,390
Categories used for modelling: 5
Uniform source resolution (px): 512 × 384

2. Class Distribution — Full Dataset

Label distribution — all 6 classes

3. Class Distribution — Modelling Dataset (trash excluded)

Label distribution — 5 classes
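
The shares behind both distribution plots can be recomputed from the per-category counts listed in the Section 4 table. The snippet below is a minimal sketch, not the report's own code: the helper name `summarize` and the max/min imbalance ratio are illustrative choices that do not appear in the report.

```python
# Per-category image counts, copied from the Section 4 table.
counts = {
    "cardboard": 403,
    "glass": 501,
    "metal": 410,
    "paper": 594,
    "plastic": 482,
    "trash": 137,
}


def summarize(class_counts):
    """Return (total, share per class, largest/smallest class ratio).

    Hypothetical helper for illustration only.
    """
    total = sum(class_counts.values())
    shares = {name: n / total for name, n in class_counts.items()}
    ratio = max(class_counts.values()) / min(class_counts.values())
    return total, shares, ratio


# Full dataset: all 6 classes.
full_total, full_shares, full_ratio = summarize(counts)

# Modelling dataset: trash excluded, 5 classes remain.
model_counts = {name: n for name, n in counts.items() if name != "trash"}
model_total, model_shares, model_ratio = summarize(model_counts)

for name, share in sorted(model_shares.items(), key=lambda kv: -kv[1]):
    print(f"{name:<10} {model_counts[name]:>4} {share:6.1%}")
```

The two totals match the overview figures (2,527 raw images, 2,390 after removing trash), which makes this a quick consistency check between the summary and the per-category table.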

4. Per-Category Descriptive Statistics

category    count   mean_W   mean_H   mean_size_kb   mean_aspect
cardboard     403    512.0    384.0           41.3         1.333
glass         501    512.0    384.0           38.7         1.333
metal         410    512.0    384.0           35.2         1.333
paper         594    512.0    384.0           44.1         1.333
plastic       482    512.0    384.0           37.8         1.333
trash         137    512.0    384.0           36.5         1.333

5. File Size Distribution per Category

File size boxplot

6. Per-Channel Pixel Statistics

Channel mean bar chart
category    Red     Green   Blue
cardboard   0.698   0.631   0.536
glass       0.424   0.441   0.464
metal       0.498   0.491   0.499
paper       0.847   0.835   0.822
plastic     0.541   0.519   0.511
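
The report does not say how these channel means were computed. One plausible reading, given that every value lies between 0 and 1, is that 8-bit pixel values were divided by 255 and averaged over all pixels of all images in a category. The sketch below works under that assumption. `channel_means` is a hypothetical helper, and the demo input is a synthetic array rather than an image from the dataset.

```python
import numpy as np


def channel_means(images):
    """Mean R, G, B over a collection of H x W x 3 uint8 images, scaled to [0, 1].

    Assumption: values are divided by 255 and averaged over all pixels.
    """
    channel_sum = np.zeros(3, dtype=np.float64)
    n_pixels = 0
    for img in images:
        arr = np.asarray(img, dtype=np.float64) / 255.0
        # Flatten the spatial dimensions so each row is one RGB pixel.
        channel_sum += arr.reshape(-1, 3).sum(axis=0)
        n_pixels += arr.shape[0] * arr.shape[1]
    return channel_sum / n_pixels


# Synthetic stand-in for one 512 x 384 image (height 384, width 512).
demo = [np.full((384, 512, 3), (255, 128, 0), dtype=np.uint8)]
means = channel_means(demo)
print(dict(zip(("Red", "Green", "Blue"), means.round(3))))
```

In practice, the list would hold the decoded images of one category, and the function would be called once per category to produce one table row.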

7. Key Observations