- /**
- * Generates a sample of 3 processed FFB Production records to validate:
- * - New nested schema shape (site/phase/block objects)
- * - Ollama qwen3:0.6b remark generation quality
- *
- * Run: npx ts-node --transpile-only scripts/sample-ffb-processed.ts
- */
- import * as dotenv from 'dotenv';
- import * as path from 'path';
- import * as fs from 'fs';
- import { MongoClient, ObjectId } from 'mongodb';
- dotenv.config({ path: path.resolve(__dirname, '../.env') });
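- // Fail fast with a clear message instead of passing `undefined` to the Mongo driver
- function requireEnv(name: string): string {
- const value = process.env[name];
- if (!value) throw new Error(`Missing required environment variable: ${name}`);
- return value;
- }
- const MONGO_URI = requireEnv('MONGO_URI');
- const MONGO_DB_NAME = requireEnv('MONGO_DB_NAME');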
- const OLLAMA_BASE_URL = 'http://localhost:11434';
- const REMARK_MODEL = 'qwen3:0.6b';
- const MONGO_STUFF = path.resolve(__dirname, '../../mongo stuff');
- const SAMPLE_SIZE = 3;
- // ─── Source interfaces ────────────────────────────────────────────────────────
- interface RawPhase { phaseID: number; phaseCode: string; phaseName: string; phaseDesc: string; }
- interface RawBlock {
- blockID: number; blockCode: string; blockDesc: string; loc_type: string;
- numOfTreesPlanted: string | number | null; totalPlantedArea: string | number | null;
- loc_soil_condition: string; plantedLocUOM: string;
- }
- interface RawFFB {
- activityId: number; productionDate: string; siteId: string;
- phaseId: number; blockId: number;
- net_weight: string; act_uom: string; no_of_bunches: number; qty_uom: string;
- }
- // ─── Ollama generate (non-streaming) ─────────────────────────────────────────
- async function generateRemark(blockCode: string, soilCondition: string, phaseName: string): Promise<string> {
- const prompt = `You are an oil palm plantation field supervisor writing a brief harvest observation note.
- Write ONE short sentence (max 25 words) about field conditions observed during FFB harvesting today.
- Context: Block ${blockCode}, Phase: ${phaseName}, Soil type: ${soilCondition || 'mineral'}.
- Your sentence must mention one of: soil/ground conditions, weather, worker performance, equipment, or pest/disease observation.
- Reply with ONLY the observation sentence. No quotes, no labels, no preamble. /no_think`;
- const res = await fetch(`${OLLAMA_BASE_URL}/api/generate`, {
- method: 'POST',
- headers: { 'Content-Type': 'application/json' },
- body: JSON.stringify({ model: REMARK_MODEL, prompt, stream: false }),
- });
- if (!res.ok) throw new Error(`Ollama generate failed: ${res.status} ${res.statusText}`);
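- const json = (await res.json()) as { response?: string };
- // qwen3 may still emit an empty <think>…</think> block despite /no_think; strip it before trimming
- return (json.response ?? '').replace(/<think>[\s\S]*?<\/think>/g, '').trim();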
- }
- // ─── Main ─────────────────────────────────────────────────────────────────────
- async function main() {
- console.log('\n═══════════════════════════════════════════════');
- console.log(` FFB Processed JSON — Sample Preview (${SAMPLE_SIZE} recs)`);
- console.log('═══════════════════════════════════════════════\n');
- // Load source files
- const rawPhases: RawPhase[] = JSON.parse(fs.readFileSync(path.join(MONGO_STUFF, 'phaseData.json'), 'utf-8'));
- const rawBlocks: RawBlock[] = JSON.parse(fs.readFileSync(path.join(MONGO_STUFF, 'blockData.json'), 'utf-8'));
- const rawFFBs: RawFFB[] = JSON.parse(fs.readFileSync(path.join(MONGO_STUFF, 'FFBProductionData.json'), 'utf-8'));
- // In-memory lookup maps (integer ID → raw data)
- const phaseById = new Map<number, RawPhase>(rawPhases.map(p => [p.phaseID, p]));
- const blockById = new Map<number, RawBlock>(rawBlocks.map(b => [b.blockID, b]));
- // Connect to Atlas to get actual _id ObjectIds from Phase and Block collections
- console.log('🔗 Connecting to Atlas to resolve ObjectIds...');
- const client = new MongoClient(MONGO_URI);
- await client.connect();
- const db = client.db(MONGO_DB_NAME);
- // Fetch all Phase and Block docs (small collections — 13 phases, 598 blocks)
- const phaseDocs = await db.collection('Phase').find({}, { projection: { _id: 1, locId: 1, phaseCode: 1 } }).toArray();
- const blockDocs = await db.collection('Block').find({}, { projection: { _id: 1, locId: 1, blockCode: 1 } }).toArray();
- await client.close();
- // Map locId (== original phaseID / blockID) → MongoDB _id
- const phaseLocIdToMongoId = new Map<number, ObjectId>(phaseDocs.map(d => [d.locId as number, d._id as ObjectId]));
- const blockLocIdToMongoId = new Map<number, ObjectId>(blockDocs.map(d => [d.locId as number, d._id as ObjectId]));
- console.log(` Phase ObjectIds resolved: ${phaseLocIdToMongoId.size}`);
- console.log(` Block ObjectIds resolved: ${blockLocIdToMongoId.size}\n`);
- // Take first SAMPLE_SIZE records that are fully resolvable
- const sample: RawFFB[] = [];
- for (const raw of rawFFBs) {
- if (sample.length >= SAMPLE_SIZE) break;
- if (phaseById.has(raw.phaseId) && blockById.has(raw.blockId) &&
- phaseLocIdToMongoId.has(raw.phaseId) && blockLocIdToMongoId.has(raw.blockId)) {
- sample.push(raw);
- }
- }
- console.log(`📋 Generating remarks for ${sample.length} sample records via ${REMARK_MODEL}...\n`);
- const output: object[] = [];
- for (let i = 0; i < sample.length; i++) {
- const raw = sample[i];
- const rawPhase = phaseById.get(raw.phaseId)!;
- const rawBlock = blockById.get(raw.blockId)!;
- const phaseMongoId = phaseLocIdToMongoId.get(raw.phaseId)!;
- const blockMongoId = blockLocIdToMongoId.get(raw.blockId)!;
- process.stdout.write(` [${i + 1}/${sample.length}] activityId=${raw.activityId} → generating remark...`);
- const remark = await generateRemark(rawBlock.blockCode, rawBlock.loc_soil_condition, rawPhase.phaseName);
- console.log(` ✅`);
- console.log(` "${remark}"\n`);
- output.push({
- activityId: raw.activityId,
- productionDate: new Date(raw.productionDate).toISOString(),
- site: {
- _id: null, // Site collection not yet seeded; placeholder
- siteId: raw.siteId,
- },
- phase: {
- id: phaseMongoId.toHexString(), // actual ObjectId from Phase collection
- phaseId: raw.phaseId,
- },
- block: {
- id: blockMongoId.toHexString(), // actual ObjectId from Block collection
- blockId: raw.blockId,
- },
- weight: parseFloat(raw.net_weight) || 0,
- weightUom: raw.act_uom,
- quantity: raw.no_of_bunches,
- quantityUom: raw.qty_uom,
- remarks: remark,
- vector: [], // to be filled during full seed run
- });
- }
- // Pretty-print to console and write sample file
- const outPath = path.join(MONGO_STUFF, 'FFBProductionData_sample.json');
- const pretty = JSON.stringify(output, null, 2);
- fs.writeFileSync(outPath, pretty, 'utf-8');
- console.log('═══════════════════════════════════════════════');
- console.log(' SAMPLE OUTPUT');
- console.log('═══════════════════════════════════════════════\n');
- console.log(pretty);
- console.log(`\n✅ Written to: ${outPath}`);
- }
- main().catch(err => {
- console.error('\n❌ Sample failed:', err.message || err);
- process.exit(1);
- });
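The generation prompt asks for a single sentence of at most 25 words with no quotes, labels, or preamble, but the script only prints each remark for manual inspection. A minimal sketch of an automated check for those constraints is shown below. `checkRemark`, `MAX_WORDS`, and the issue strings are hypothetical names introduced here, not part of the script, and the label and sentence-count rules are rough heuristics rather than a definitive quality test.

```typescript
// Hypothetical helper (not part of the original script): flags remarks that
// break the constraints stated in the generation prompt.
const MAX_WORDS = 25; // mirrors "max 25 words" in the prompt

function checkRemark(remark: string): string[] {
  const issues: string[] = [];
  const text = remark.trim();

  if (text.length === 0) {
    issues.push('empty');
    return issues;
  }
  // Leftover reasoning tags from the model
  if (/<\/?think>/i.test(text)) issues.push('contains <think> tag');
  // Starts or ends with a straight or curly double quote
  if (/^["“]/.test(text) || /["”]$/.test(text)) issues.push('quoted');
  // Assumed heuristic: a short "Label:" prefix such as "Observation:" or "Note:"
  if (/^[A-Za-z ]{1,20}:\s/.test(text)) issues.push('has label prefix');
  // Word count by whitespace splitting
  const words = text.split(/\s+/).length;
  if (words > MAX_WORDS) issues.push(`too long (${words} words)`);
  // Assumed heuristic: sentence-ending punctuation followed by more text
  if (/[.!?]\s+\S/.test(text)) issues.push('more than one sentence');

  return issues;
}

console.log(checkRemark('Ground in block B12 was soft after overnight rain, slowing wheelbarrow movement.'));
// → []
console.log(checkRemark('Observation: "Soil is wet."'));
// → ['quoted', 'has label prefix']
```

In the sample loop, the result could be printed next to each remark so that a failing record is visible without reading every sentence. Abbreviations such as "approx. 5 cm" would trip the sentence heuristic, so the output is best treated as a hint for review.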