A service that detects whether a palm fruit bunch in a given image is ripe. An R&D initiative for Swopt.


# 🌴 Palm Oil FFB Management System (YOLO26)

A production-ready AI system for detecting the ripeness of Palm Oil Fresh Fruit Bunches (FFB). Built on a custom-trained YOLO26 model (YOLOv8 architecture fork) with a dual-engine inference backend (ONNX + PyTorch), a FastAPI server, and a full-featured Streamlit dashboard. The entire backend is architected with Domain-Driven Design (DDD) for maximum scalability and n8n workflow integration.


## 🚀 Project Overview

| Component | Technology | Purpose |
| --- | --- | --- |
| Vision Engine | YOLO26 (custom-trained on MPOB-standard datasets) | FFB ripeness detection |
| ONNX Runtime | `onnxruntime` + `best.onnx` | Low-latency, NMS-free edge inference (~39 ms) |
| PyTorch Runtime | `ultralytics` + `best.pt` | High-resolution auditing inference |
| Benchmark Engine | YOLOv8-Sawit (`sawit_tbs.pt`) | Third-party model comparison |
| Inference Server | FastAPI (Python) | REST API for n8n and mobile integration |
| Visual Fingerprinting | Vertex AI Multimodal Embedding (`multimodalembedding@001`) | 1408-D vector generation |
| Cloud Archival | MongoDB Atlas Vector Search | Similarity-based semantic recall |
| Local History | SQLite (`palm_history.db`) | Offline audit log, zero cloud dependency |
| Demo Dashboard | Streamlit (`demo_app.py`) | 5-tab production operations UI |

## 🛠 Prerequisites

  • Python 3.10+
  • An NVIDIA GPU (recommended, but not required; CPU inference is supported)
  • n8n (Desktop or Self-hosted) for workflow automation
  • MongoDB Atlas Account (optional; required only for cloud archival and semantic search)
  • Google Cloud Platform with Vertex AI API enabled (optional; required only for vectorization)

## 📦 Setup Instructions

### 1. Environment Setup

```bash
# Clone and enter the repository
git clone <your-repo-url>
cd palm-oil-ai

# Create and activate a virtual environment
python -m venv venv
.\venv\Scripts\activate        # Windows; on macOS/Linux use: source venv/bin/activate

# Install dependencies
pip install -r requirements.txt
```

> **Note:** `onnxruntime` and `fpdf2` are required but not yet listed in `requirements.txt`. Install them manually if needed:
>
> ```bash
> pip install onnxruntime fpdf2
> ```

### 2. Dataset & Training

1. Download the dataset from [Roboflow](https://universe.roboflow.com/assignment-vvtq7/oil-palm-ripeness/dataset/5/download/yolov8) or source your own (ensure consistent YOLO `.yaml` structure).
2. Extract into `/datasets`.
3. **Train the model:**
    ```bash
    python train_palm.py
    ```
4. Copy the resulting `best.pt` from `runs/detect/train/weights/` to the project root.
5. **Export to ONNX** for high-speed inference:
    ```bash
    python export_raw_tflite.py  # or use yolo export
    ```
    Copy the resulting `best.onnx` to the project root.

### 3. Configuration (`.env`)

Populate your `.env` file. Cloud services (Vertex AI, MongoDB) are **optional**: the system degrades gracefully to local-only mode if they are unavailable.

```env
# Required for Cloud Archival & Semantic Search
MONGO_URI=mongodb+srv://<user>:<password>@<cluster>.mongodb.net/
PROJECT_ID=your-gcp-project-id
LOCATION=us-central1
DB_NAME=palm_oil_db
COLLECTION_NAME=ffb_records

# Path to your GCP Service Account key JSON
GOOGLE_APPLICATION_CREDENTIALS=gemini-embedding-service-key.json
```
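
As a rough illustration of the local-only fallback described above, the sketch below checks whether the cloud variables are set and wraps a connection attempt so that a failure disables cloud features instead of stopping the service. The names `cloud_settings`, `connect_or_none`, and the `connect` callable are hypothetical and are not taken from the project's code.

```python
import logging
import os
from typing import Callable, Mapping, Optional, TypeVar

T = TypeVar("T")
logger = logging.getLogger("palm_oil_ai")

# Variables listed in the .env example above.
CLOUD_KEYS = ("MONGO_URI", "PROJECT_ID", "LOCATION", "DB_NAME", "COLLECTION_NAME")


def cloud_settings(env: Mapping[str, str] = os.environ) -> Optional[dict]:
    """Return the cloud settings, or None if any of them is missing or empty."""
    values = {key: env.get(key, "") for key in CLOUD_KEYS}
    if not all(values.values()):
        return None
    return values


def connect_or_none(connect: Callable[[], T], name: str) -> Optional[T]:
    """Run a connection factory; on failure log a warning and return None."""
    try:
        return connect()
    except Exception as exc:  # broad on purpose: any failure means local-only mode
        logger.warning("%s unavailable, continuing in local-only mode: %s", name, exc)
        return None
```

A caller would treat a `None` result as "cloud features disabled" and return an error only from the endpoints that need the missing service.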

## 🚦 How to Run

### Start the FastAPI Backend

The API server is required; the Streamlit dashboard will not function without it.

```bash
# Start the FastAPI server (root-level wrapper)
python main.py
```

The server will be available at http://localhost:8000. Interactive API docs are at http://localhost:8000/docs.

Alternatively, run it as a module: `python -m src.api.main`

### Start the Streamlit Dashboard

Open a second terminal and run:

```bash
streamlit run demo_app.py
```

The dashboard connects to the backend automatically and displays an error with a retry button if the API is offline.


## 🔌 API Endpoints

| Endpoint | Method | Description |
| --- | --- | --- |
| `/analyze` | POST | Single Analysis: Runs inference on one image; auto-archives to local SQLite vault. Accepts `model_type` form field (`onnx`, `pytorch`, `yolov8_sawit`). |
| `/process_batch` | POST | Batch Processor: Processes multiple images; generates a `manifest.json` data contract in `batch_outputs/`. Accepts `model_type` and `metadata` (JSON string). |
| `/vectorize_and_store` | POST | Cloud Archival: Vectorizes a single detection and saves to MongoDB Atlas. Requires active GCP billing. |
| `/search_hybrid` | POST | Semantic Search: Visual similarity (upload image) or natural language query via Vertex AI embeddings. |
| `/get_history` | GET | History Vault: Returns all records from the local SQLite audit log, ordered by most recent. |
| `/get_image/{record_id}` | GET | Image Retrieval: Returns the Base64-encoded image for a specific MongoDB record. |
| `/get_model_info` | GET | Returns the available detection categories and description for the specified `model_type`. |
| `/get_confidence` | GET | Retrieves the current global AI confidence threshold. |
| `/set_confidence` | POST | Updates the AI confidence threshold globally (live, no restart required). |
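
For orientation, here is a minimal client sketch for `/analyze` that uses only the standard library. The `model_type` form field and the base URL come from this README; the upload field name (`file`) and the assumption that the response body is JSON are guesses, so check http://localhost:8000/docs for the actual request shape.

```python
import json
import os
import uuid
from urllib import request

API_BASE = "http://localhost:8000"  # default address from "How to Run"


def build_analyze_request(image_bytes: bytes, filename: str,
                          model_type: str = "onnx",
                          file_field: str = "file") -> request.Request:
    """Build a multipart/form-data POST for /analyze without sending it.

    `file_field` is an assumption; the real upload field name may differ.
    """
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="model_type"\r\n\r\n'
        f"{model_type}\r\n"
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{file_field}"; filename="{filename}"\r\n'
        f"Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return request.Request(
        f"{API_BASE}/analyze",
        data=head + image_bytes + tail,
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
        method="POST",
    )


def analyze(image_path: str, model_type: str = "onnx") -> dict:
    """Send one image to /analyze and decode the response, assumed to be JSON."""
    with open(image_path, "rb") as fh:
        req = build_analyze_request(fh.read(), os.path.basename(image_path), model_type)
    with request.urlopen(req, timeout=30) as resp:
        return json.load(resp)
```

With the server running, `analyze("sample.jpg", model_type="pytorch")` would submit the image to the PyTorch engine; `sample.jpg` is a placeholder path.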

๐Ÿ–ฅ๏ธ Streamlit Dashboard Tabs

The dashboard (demo_app.py) features a 5-tab production operations UI:

Tab Feature Description
Single Analysis Live Detection Drag-and-drop a single image for auto-detection. Includes an interactive Plotly overlay viewer, a Manager's Dashboard (metrics), raw tensor inspector, harvest quality pie chart, OER yield-loss insights, cloud archival button, and misclassification flagging.
Batch Processing Bulk Analysis Upload multiple images and configure production metadata (Estate, Block ID, Harvester ID, Priority) via a modal dialog. Displays a batch quality dashboard (bar chart), annotated evidence gallery, performance timeline (start/end/duration), and generates a downloadable PDF executive report.
Similarity Search Semantic Search Search the MongoDB Atlas vector index by uploading a reference image (visual similarity) or typing a natural language query (text-to-vector).
History Vault Local Audit Log SQLite-backed audit log of every /analyze call. Supports a list view (filterable dataframe) and a "Deep Dive" detail view with interactive Plotly + static annotated image views and the raw mathematical tensor.
Batch Reviewer Manifest Auditor Browses batches saved in the batch_outputs/ directory. Loads manifest.json data contracts, displays the full batch metadata audit (Job ID, venue, engine, threshold, performance timeline), a quality overview chart, and a per-image inventory with interactive detection overlays and Subscriber Payloads (clean ERP-ready JSON).

### Sidebar Controls

  • Confidence Threshold: Live slider (0.1–1.0) that updates the backend globally in real time.
  • Model Engine Selector: Switch between YOLO26 (ONNX), YOLO26 (PyTorch), and YOLOv8-Sawit (Benchmark). Switching engines automatically clears the current analysis canvas.
  • Model Capabilities Panel: Dynamically shows the detection categories for the selected engine.
  • AI Interpretation Guide: A built-in dialog explaining the raw tensor format, coordinate systems (normalized vs. absolute pixels), and the confidence scoring mechanism.
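
The effect of the global confidence threshold can be sketched as follows. This is an illustration only: `ConfidenceState` is a hypothetical helper, the 0.25 default is borrowed from the `manifest.json` example in this README, and clamping to the slider range is an assumption about how out-of-range values would be handled.

```python
from dataclasses import dataclass
from typing import Dict, List

MIN_CONF, MAX_CONF = 0.1, 1.0  # slider range from the sidebar


@dataclass
class ConfidenceState:
    """Holds one threshold shared by every request (hypothetical helper)."""
    threshold: float = 0.25  # assumed default

    def set(self, value: float) -> float:
        # Clamp to the slider range so an out-of-range value cannot disable filtering.
        self.threshold = min(MAX_CONF, max(MIN_CONF, float(value)))
        return self.threshold

    def filter(self, detections: List[Dict]) -> List[Dict]:
        # Keep only detections whose confidence meets the current threshold.
        return [d for d in detections if d["confidence"] >= self.threshold]
```

Because the state is shared, raising the threshold affects every subsequent request, which matches the "live, no restart required" behavior described for `/set_confidence`.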

## 📦 Batch Output Contract (manifest.json)

Each batch job produces a portable data bundle under `batch_outputs/BATCH_<ID>/`:

```
batch_outputs/
└── BATCH_<ID>/
    ├── manifest.json   # The Data Contract
    └── raw/            # Original uploaded images
        ├── <uid>_image1.jpg
        └── <uid>_image2.jpg
```

The `manifest.json` schema:

```json
{
  "job_id": "BATCH_XXXXXXXX",
  "timestamp": "2026-03-30T...",
  "source_context": { "estate": "...", "block": "...", "harvester": "...", "priority": "..." },
  "engine": { "name": "YOLO26", "type": "onnx", "threshold": 0.25 },
  "performance": { "start_time": "...", "end_time": "...", "duration_seconds": 1.23 },
  "industrial_summary": { "Ripe": 5, "Unripe": 1, "Underripe": 2, "Abnormal": 0, "Empty_Bunch": 0, "Overripe": 0 },
  "inventory": [
    {
      "image_id": "abc123",
      "filename": "abc123_image.jpg",
      "inference_ms": 38.5,
      "raw_tensor": [...],
      "detections": [
        {
          "bunch_id": 1, "class": "Ripe", "confidence": 0.92,
          "is_health_alert": false,
          "box": [x1, y1, x2, y2],
          "norm_box": [0.1, 0.2, 0.5, 0.8]
        }
      ]
    }
  ]
}
```
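
To show how a subscriber might consume this contract, the sketch below recounts detections per class from `inventory` and compares the result with `industrial_summary`. It assumes the summary is a plain per-class detection count, which the schema suggests but does not state; `count_classes` and `summary_matches` are hypothetical helper names.

```python
from collections import Counter

# Class names as they appear in the industrial_summary example.
CLASSES = ("Ripe", "Unripe", "Underripe", "Abnormal", "Empty_Bunch", "Overripe")


def count_classes(manifest: dict) -> dict:
    """Count detections per class across every image in the inventory."""
    counts = Counter(
        det["class"]
        for item in manifest.get("inventory", [])
        for det in item.get("detections", [])
    )
    return {name: counts.get(name, 0) for name in CLASSES}


def summary_matches(manifest: dict) -> bool:
    """True if the stored industrial_summary equals the recomputed counts."""
    return manifest.get("industrial_summary") == count_classes(manifest)
```

A manifest loaded with `json.load` can be passed to `summary_matches` as a cheap integrity check before the data is forwarded downstream.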

Note: `norm_box` stores resolution-agnostic normalized coordinates (0.0–1.0), so the Batch Reviewer can re-render detections at any image resolution without data loss.
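
The conversion between `box` and `norm_box` can be sketched as below, assuming x values are divided by the image width and y values by the image height (the usual convention, but not confirmed by the source). The function names are illustrative.

```python
from typing import List, Sequence


def normalize_box(box: Sequence[float], width: int, height: int) -> List[float]:
    """Convert absolute [x1, y1, x2, y2] pixels to 0.0-1.0 ratios."""
    if width <= 0 or height <= 0:
        raise ValueError("image dimensions must be positive")
    x1, y1, x2, y2 = box
    return [x1 / width, y1 / height, x2 / width, y2 / height]


def denormalize_box(norm_box: Sequence[float], width: int, height: int) -> List[int]:
    """Scale 0.0-1.0 ratios back to whole pixels for a target resolution."""
    x1, y1, x2, y2 = norm_box
    return [round(x1 * width), round(y1 * height),
            round(x2 * width), round(y2 * height)]
```

For example, a box of `[64, 128, 320, 512]` on a 640x640 image normalizes to `[0.1, 0.2, 0.5, 0.8]`, and the same ratios map to `[128, 192, 640, 768]` on a 1280x960 rendering.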


๐Ÿ—๏ธ Architecture (DDD)

palm-oil-ai/
โ”œโ”€โ”€ src/
โ”‚   โ”œโ”€โ”€ api/
โ”‚   โ”‚   โ””โ”€โ”€ main.py             # FastAPI routes, ModelManager (ONNX + PyTorch), SQLite auto-archival
โ”‚   โ”œโ”€โ”€ application/
โ”‚   โ”‚   โ””โ”€โ”€ analyze_bunch.py    # Use Cases: AnalyzeBunchUseCase, AnalyzeBatchUseCase, SearchSimilarUseCase
โ”‚   โ”œโ”€โ”€ domain/
โ”‚   โ”‚   โ””โ”€โ”€ models.py           # PalmOilBunch dataclass (core business entity)
โ”‚   โ””โ”€โ”€ infrastructure/
โ”‚       โ”œโ”€โ”€ repository.py       # MongoPalmOilRepository (Atlas Vector Search, CRUD)
โ”‚       โ””โ”€โ”€ vision_service.py   # VertexVisionService (1408-D embeddings, Base64 encoding)
โ”œโ”€โ”€ demo_app.py                 # Streamlit 5-tab dashboard
โ”œโ”€โ”€ main.py                     # Root-level uvicorn launcher (DDD wrapper)
โ”œโ”€โ”€ train_palm.py               # YOLO training script
โ”œโ”€โ”€ export_raw_tflite.py        # ONNX/TFLite export utility
โ”œโ”€โ”€ best.onnx                   # YOLO26 ONNX weights (primary engine)
โ”œโ”€โ”€ best.pt                     # YOLO26 PyTorch weights
โ”œโ”€โ”€ sawit_tbs.pt                # YOLOv8-Sawit benchmark weights
โ”œโ”€โ”€ palm_history.db             # Local SQLite audit log
โ”œโ”€โ”€ batch_outputs/              # Batch job data bundles (manifest + raw images)
โ”œโ”€โ”€ history_archive/            # Archived images for History Vault
โ”œโ”€โ”€ feedback/                   # Misclassification feedback data (Human-in-the-Loop)
โ”œโ”€โ”€ datasets/                   # Labeled training images (Train/Valid/Test)
โ”œโ”€โ”€ runs/                       # YOLO training logs and output weights
โ”œโ”€โ”€ requirements.txt            # Python dependencies
โ”œโ”€โ”€ .env                        # Configuration (secrets, GCP, MongoDB)
โ””โ”€โ”€ README.md                   # You are here
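
The layering above can be illustrated with a small sketch. Only the names `PalmOilBunch` and `AnalyzeBunchUseCase` come from this README; the entity fields, the `BunchRepository` interface, the `InMemoryRepository` stand-in, and the `execute` method are assumptions made for illustration and may not match the real code.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Protocol


@dataclass
class PalmOilBunch:
    """Domain entity; these fields are illustrative guesses."""
    bunch_id: int
    ripeness: str
    confidence: float


class BunchRepository(Protocol):
    """Port that the application layer depends on (hypothetical interface)."""
    def save(self, bunch: PalmOilBunch) -> None: ...


@dataclass
class InMemoryRepository:
    """Infrastructure stand-in for a database-backed repository."""
    items: List[PalmOilBunch] = field(default_factory=list)

    def save(self, bunch: PalmOilBunch) -> None:
        self.items.append(bunch)


class AnalyzeBunchUseCase:
    """Application layer: turns raw detections into entities and stores them."""

    def __init__(self, repository: BunchRepository) -> None:
        self.repository = repository

    def execute(self, detections: List[Dict]) -> List[PalmOilBunch]:
        bunches = [
            PalmOilBunch(d["bunch_id"], d["class"], d["confidence"])
            for d in detections
        ]
        for bunch in bunches:
            self.repository.save(bunch)
        return bunches
```

The use case only knows the repository interface, so a MongoDB-backed implementation can be swapped for the in-memory one in tests or in local-only mode.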

### Detection Classes (MPOB Standard)

| Class | Description | Health Alert |
| --- | --- | --- |
| Ripe | Prime harvest condition; maximum OER | ❌ |
| Underripe | Harvested before peak; reduces OER | ❌ |
| Unripe | Harvested too early; significant yield loss | ❌ |
| Overripe | Past peak; potential quality degradation | ❌ |
| Abnormal | Disease or structural defect detected | ✅ CRITICAL |
| Empty_Bunch | No fruit present; waste indicator | ✅ Warning |
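
The Health Alert column can be expressed as a simple lookup. This sketch assumes the `is_health_alert` flag in `manifest.json` follows the same rule; the source does not say so explicitly, and the function names are illustrative.

```python
from typing import Optional

# Alert level per class, transcribed from the table above; None means no alert.
HEALTH_ALERTS = {
    "Ripe": None,
    "Underripe": None,
    "Unripe": None,
    "Overripe": None,
    "Abnormal": "CRITICAL",
    "Empty_Bunch": "Warning",
}


def alert_level(class_name: str) -> Optional[str]:
    """Return the alert level for a class; raises KeyError for unknown classes."""
    return HEALTH_ALERTS[class_name]


def is_health_alert(class_name: str) -> bool:
    """True for classes that should raise a health alert."""
    return alert_level(class_name) is not None
```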

## 🔑 Key Design Decisions

  • Dual-Engine Inference: ONNX runtime is the primary engine for its ~39 ms NMS-free speed. PyTorch (.pt) is retained for high-resolution auditing where standard NMS post-processing is preferred.
  • Coordinate Normalization: The batch pipeline stores norm_box (0.0–1.0 ratios) alongside absolute pixel box coordinates. This makes the data contract resolution-agnostic for downstream ERP or vectorization subscribers.
  • Graceful Degradation: MongoDB Atlas and Vertex AI connections are established at startup. If they fail (e.g., no billing, no network), the system logs a warning and continues operating in local-only mode. Only cloud-dependent endpoints return errors.
  • Human-in-the-Loop: The "Flag Misclassification" feature in the Single Analysis tab saves flagged images and their detection metadata to a local feedback/ folder as data for future model retraining.
  • SQLite Auto-Archival: Every call to /analyze is automatically logged to palm_history.db with the image, detections, engine used, inference/processing latency, and the raw mathematical tensor, which enables a full offline audit trail.
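
The auto-archival idea can be sketched with the standard `sqlite3` module. The table name, columns, and function names below are hypothetical and do not reflect the actual `palm_history.db` schema; the image and raw tensor are left out to keep the example short.

```python
import json
import sqlite3
from datetime import datetime, timezone
from typing import Dict, List

SCHEMA = """
CREATE TABLE IF NOT EXISTS history (
    id           INTEGER PRIMARY KEY AUTOINCREMENT,
    created_at   TEXT NOT NULL,
    filename     TEXT NOT NULL,
    engine       TEXT NOT NULL,
    inference_ms REAL NOT NULL,
    detections   TEXT NOT NULL  -- JSON-encoded list
)
"""


def open_history(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the audit log; ':memory:' keeps it out of the filesystem."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def archive(conn: sqlite3.Connection, filename: str, engine: str,
            inference_ms: float, detections: List[Dict]) -> int:
    """Insert one analysis record and return its row id."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "INSERT INTO history (created_at, filename, engine, inference_ms, detections) "
            "VALUES (?, ?, ?, ?, ?)",
            (datetime.now(timezone.utc).isoformat(), filename, engine,
             inference_ms, json.dumps(detections)),
        )
    return cur.lastrowid


def recent(conn: sqlite3.Connection, limit: int = 50) -> List[Dict]:
    """Return the newest records first, with detections decoded from JSON."""
    rows = conn.execute(
        "SELECT id, filename, engine, inference_ms, detections "
        "FROM history ORDER BY id DESC LIMIT ?", (limit,)
    ).fetchall()
    return [
        {"id": r[0], "filename": r[1], "engine": r[2],
         "inference_ms": r[3], "detections": json.loads(r[4])}
        for r in rows
    ]
```

Ordering by the auto-incrementing id gives the "most recent first" listing described for `/get_history` without relying on clock precision.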

## 📜 License

This project is licensed under the MIT License; see the LICENSE file for details.