Report 4.1.1 — AI/LLM Integration Detailed Guide
This write-up documents how AI/LLM capabilities are integrated into the project to intelligently extract action items from meeting transcripts.
End-to-End Flow
The pipeline for transcript-based action-item extraction is:
- OpenAITestController receives the transcript (via REST request) and forwards the text to ActionItemExtractorService.
- ActionItemExtractorService crafts a prompt that asks OpenAI to emit a JSON array of objects with keys: description, priority, deadline, assignee, category. This structured output ensures HubSpot receives usable data.
- OpenAIService posts the prompt to https://api.openai.com/v1/chat/completions using gpt-4o-mini. The method captures HTTP timing (start/end Instant) to calculate execution duration and annotates the response map with metadata (executionTimestamp, executionDurationMs).
- Responses are returned to the extractor, which trims markdown fences (handles triple-backtick JSON) and parses the string via Jackson into a list of action-item maps.
- For malformed or empty responses, the extractor captures the raw string under a raw_output key so the downstream caller can still report partial data.
- HubSpotTaskService receives the parsed action items, maps them onto the HubSpot property schema (ai_systems_description, ai_systems_deadline, etc.), and POSTs each as part of the /crm/v3/objects/deals flow.
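The mapping step in the last bullet can be sketched as follows. This is a minimal illustration, not the project's HubSpotTaskService: the class name HubSpotPropertyMapper and its methods are hypothetical, and only the two property names quoted in this report are mapped, since the remaining schema is not listed here.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: maps one parsed action item onto HubSpot property names.
final class HubSpotPropertyMapper {

    // Only ai_systems_description and ai_systems_deadline appear in the report;
    // any further properties are deliberately left out rather than guessed.
    static Map<String, String> toProperties(Map<String, Object> actionItem) {
        Map<String, String> props = new LinkedHashMap<>();
        putIfPresent(props, "ai_systems_description", actionItem.get("description"));
        putIfPresent(props, "ai_systems_deadline", actionItem.get("deadline"));
        return props;
    }

    // Skips null or blank values so an empty property is never sent.
    private static void putIfPresent(Map<String, String> target, String key, Object value) {
        if (value != null && !value.toString().isBlank()) {
            target.put(key, value.toString());
        }
    }
}
```

Skipping absent values is an assumption made for the sketch; the actual service may send empty properties instead.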
Prompt & Parsing Details
Prompt Strategy
- Include explicit fields and format instructions to coax consistent JSON output.
- Request priority as one of the canonical values HIGH, MEDIUM, or LOW for easy mapping.
- Ask for deadlines and assignees to minimize downstream interpretation logic.
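A prompt following these three points might look like the sketch below. The wording is illustrative only and is not the prompt used by ActionItemExtractorService; the class name ActionItemPrompt and the instruction to use null for unstated values are assumptions.

```java
// Hypothetical prompt builder; the exact wording in the project may differ.
final class ActionItemPrompt {

    static String build(String transcript) {
        return String.join("\n",
            "Extract the action items from the meeting transcript below.",
            "Respond with a JSON array only, with no prose and no Markdown fences.",
            "Each element must be an object with exactly these keys:",
            "description, priority, deadline, assignee, category.",
            "priority must be one of: HIGH, MEDIUM, LOW.",
            // Assumption: null is requested for values the transcript does not state.
            "Use null for a deadline or assignee that the transcript does not state.",
            "",
            "Transcript:",
            transcript);
    }
}
```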
Parsing Strategy
- Strip Markdown code fences that OpenAI sometimes wraps around JSON.
- Attempt JSON parsing via Jackson; on failure, store the raw response so operators can inspect it.
- The fallback path ensures downstream HubSpot creation still runs but records the entire payload (even if unparsable) for debugging.
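The fence stripping and the raw_output fallback can be sketched with the standard library alone. The class ResponseParsing and both method names are hypothetical. The JSON parser is passed in as a function so the sketch stays self-contained; in the project that role is played by Jackson, whose checked parse exception would need to be wrapped in an unchecked one to fit this signature.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical helper illustrating the parsing strategy described above.
final class ResponseParsing {

    // Removes an opening fence line (with optional language tag) and a closing fence.
    static String stripFences(String raw) {
        String text = raw == null ? "" : raw.strip();
        if (text.startsWith("```")) {
            int firstNewline = text.indexOf('\n');
            // With a newline, drop the whole fence line; otherwise drop only the backticks.
            text = firstNewline >= 0 ? text.substring(firstNewline + 1) : text.substring(3);
        }
        if (text.endsWith("```")) {
            text = text.substring(0, text.length() - 3);
        }
        return text.strip();
    }

    // Parses the cleaned text; on empty input or parser failure, returns a
    // single-element list holding the original string under "raw_output".
    static List<Map<String, Object>> parseOrFallback(
            String raw, Function<String, List<Map<String, Object>>> jsonParser) {
        String cleaned = stripFences(raw);
        if (!cleaned.isEmpty()) {
            try {
                return jsonParser.apply(cleaned);
            } catch (RuntimeException e) {
                // Fall through to the raw_output fallback below.
            }
        }
        Map<String, Object> fallback = new LinkedHashMap<>();
        fallback.put("raw_output", raw == null ? "" : raw);
        return List.of(fallback);
    }
}
```

Returning the fallback in the same list-of-maps shape means the caller does not need a separate code path for unparsable responses.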
Observability & Metadata
Every OpenAI interaction records metadata by augmenting the returned map with:
- executionTimestamp: ISO instant captured after the API call.
- executionDurationMs: Milliseconds between start and finish for SLA tracking.
This metadata allows the automation runner and any monitoring dashboards to surface latency spikes in the AI extraction path.
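A minimal sketch of this annotation step, assuming the API call is wrapped in a Supplier that returns the response map. The class TimedCall and the method withMetadata are hypothetical names; only the two metadata keys come from this report.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical wrapper that times a call and annotates its response map.
final class TimedCall {

    static Map<String, Object> withMetadata(Supplier<Map<String, Object>> apiCall) {
        Instant start = Instant.now();
        Map<String, Object> response = apiCall.get();
        Instant end = Instant.now();

        // Copy so the caller's map is not mutated (it may be immutable).
        Map<String, Object> annotated = new LinkedHashMap<>(response);
        annotated.put("executionTimestamp", end.toString()); // ISO-8601, taken after the call
        annotated.put("executionDurationMs", Duration.between(start, end).toMillis());
        return annotated;
    }
}
```

Instant.now() reads the wall clock, so a clock adjustment during the call could skew the duration. If that matters for SLA tracking, System.nanoTime() is a monotonic alternative for the duration, with Instant kept for the timestamp.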
Error Handling & Robustness
- Retries: The controller retries completions with stronger instructions if the primary response is empty.
- JSON Fallbacks: The parser handles partial data by logging errors and preserving raw strings.
- Fallback DTOs: Unexpected responses are still wrapped as action items (carrying the raw output) so the same processing pipeline executes.
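The retry behavior in the first bullet can be sketched as below. This is an illustration under assumptions, not the controller's code: the class EmptyResponseRetry, the attempt limit, and the wording of the stronger instruction are all hypothetical, and the completion call is abstracted as a function from prompt to response text.

```java
import java.util.function.Function;

// Hypothetical retry helper: re-asks with a stronger instruction on empty output.
final class EmptyResponseRetry {

    // Assumed wording; the project's actual follow-up instruction is not documented here.
    static final String STRONGER_SUFFIX =
        "\n\nReturn ONLY a JSON array. Use [] if there are no action items.";

    static String completeWithRetry(
            String prompt, Function<String, String> completion, int maxAttempts) {
        String current = prompt;
        String response = "";
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            response = completion.apply(current);
            if (response != null && !response.isBlank()) {
                return response;
            }
            // Subsequent attempts append the stronger instruction to the original prompt.
            current = prompt + STRONGER_SUFFIX;
        }
        // Still empty after all attempts: return an empty string for the fallback path.
        return response == null ? "" : response;
    }
}
```

An empty result after the final attempt is returned as-is, so the raw_output fallback described earlier can take over.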