Report 6.1 — Action Item Extraction Accuracy (15 Points)
This report argues that the transcript → action item pipeline earns the full 15 points for accuracy.
Structured Output from LLM
The extractor prompts OpenAI to reply with a JSON array of objects, each containing description, priority, deadline, assignee, and category fields. Requesting this structured schema minimizes parsing ambiguity.
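As a rough illustration, a minimal extraction call might look like the sketch below. The model name, prompt wording, and the ActionItem shape are assumptions for illustration, not the exact production values.

```typescript
// Minimal sketch of the extraction call. The model name, prompt wording, and
// the ActionItem shape are illustrative assumptions, not the production values.
import OpenAI from "openai";

interface ActionItem {
  description: string;
  priority: string; // normalized to HIGH/MEDIUM/LOW in a later step
  deadline: string | null;
  assignee: string | null;
  category: string | null;
}

const client = new OpenAI(); // reads OPENAI_API_KEY from the environment

async function extractActionItems(transcript: string): Promise<string> {
  const response = await client.chat.completions.create({
    model: "gpt-4o-mini", // assumed model; substitute the pipeline's actual one
    messages: [
      {
        role: "system",
        content:
          "Reply ONLY with a JSON array of objects with the fields: " +
          "description, priority, deadline, assignee, category.",
      },
      { role: "user", content: transcript },
    ],
  });
  return response.choices[0]?.message?.content ?? "";
}
```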
- The parser trims markdown fences, handles empty responses by retrying with strengthened prompts, and still records raw output for debugging.
- Priority normalization ensures that even if OpenAI uses synonyms, we convert them to HIGH/MEDIUM/LOW before HubSpot ingestion (see the sketch after this list).
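A minimal sketch of those two post-processing steps, assuming the raw model output may arrive wrapped in markdown fences; the function names and synonym lists are illustrative:

```typescript
// Strip optional ```json fences the model may wrap around its reply.
function stripMarkdownFences(raw: string): string {
  return raw
    .trim()
    .replace(/^```(?:json)?\s*/i, "")
    .replace(/\s*```$/, "")
    .trim();
}

// Map common synonyms the model might emit onto the canonical values
// expected by HubSpot. The synonym lists here are assumed examples.
function normalizePriority(value: string): "HIGH" | "MEDIUM" | "LOW" {
  const v = value.trim().toUpperCase();
  if (["HIGH", "URGENT", "CRITICAL", "P1"].includes(v)) return "HIGH";
  if (["LOW", "MINOR", "P3"].includes(v)) return "LOW";
  return "MEDIUM"; // default bucket for "normal", "medium", and unknowns
}
```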
Validation & Observability
- Execution metadata (executionTimestamp, executionDurationMs) provides traceability for each call; a sketch of this wrapping appears after this list.
- Reports aggregate action items along with success/failure counts so manual review can compare expected vs. extracted items.
- The sample transcript generator helps build deterministic test cases that verify extraction remains consistent across runs.
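A sketch of how each call could be wrapped to capture that metadata; ExtractionReport and withMetadata are hypothetical names, not the pipeline's actual API:

```typescript
// Hypothetical per-call report record; field names match the metadata
// described above, the rest of the shape is an assumption.
interface ExtractionReport<T> {
  executionTimestamp: string;
  executionDurationMs: number;
  success: boolean;
  result: T | null;
  rawOutput: string; // retained for debugging even on failure
}

// Wrap any extraction call so timing and outcome are recorded uniformly.
async function withMetadata<T>(
  call: () => Promise<{ result: T | null; raw: string }>
): Promise<ExtractionReport<T>> {
  const start = Date.now();
  const executionTimestamp = new Date(start).toISOString();
  const { result, raw } = await call();
  return {
    executionTimestamp,
    executionDurationMs: Date.now() - start,
    success: result !== null,
    result,
    rawOutput: raw,
  };
}
```

Reports aggregating these records per batch give the success/failure counts that manual review compares against expected items.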
Why 15 points?
Every processed transcript yields structured output, fallback handling, and report metrics that let reviewers confirm the extracted action items are accurate and comprehensive.
Fallbacks Supporting Accuracy
When transcripts are incomplete:
- The system retries with stricter prompts before concluding that extraction has failed (see the sketch after this list).
- Raw responses can be passed downstream so no data is lost, letting operators refine prompts iteratively.
- HubSpot creation still occurs even if some action items cannot be parsed, but they are flagged for review.
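A sketch of the retry-then-flag fallback, reusing the hypothetical extractActionItems and stripMarkdownFences helpers sketched earlier; the attempt count and the stricter follow-up wording are assumed tunables:

```typescript
// Retry parsing with a strengthened prompt, then flag for review on failure.
async function extractWithRetry(
  transcript: string,
  maxAttempts = 2 // assumed default; tune per deployment
): Promise<{ items: ActionItem[] | null; raw: string; flagged: boolean }> {
  let raw = "";
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const strict = attempt > 1; // tighten the schema instructions on retry
    raw = await extractActionItems(
      strict
        ? "Return ONLY a raw JSON array. No prose, no markdown fences.\n\n" +
            transcript
        : transcript
    );
    try {
      const items = JSON.parse(stripMarkdownFences(raw)) as ActionItem[];
      return { items, raw, flagged: false };
    } catch {
      // Parse failed: fall through and retry with the strengthened prompt.
    }
  }
  // Parsing never succeeded: pass the raw output downstream, flagged for review.
  return { items: null, raw, flagged: true };
}
```

The flagged result lets HubSpot creation proceed for the items that did parse while routing the rest to manual review, matching the behavior described above.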