✅ MANUELA — COMPLETE TECHNICAL ASSESSMENT

Senior AI/ML + Blockchain + Cloud Architect.

Dataset used: US Border Crossing Port Activity Data (stored in MySQL us_border_crossing_data.border_crossing_entry_data).

Sample rows (same schema as CSV)

Port Name | State | Port Code | Border | Date | Measure | Value | Latitude | Longitude | Point
Jackman | Maine | 0104 | US-Canada Border | Jan-24 | Trucks | 6556 | 45.806 | -70.397 | POINT (-70.396722 45.805661)
Porthill | Idaho | 3308 | US-Canada Border | Apr-24 | Trucks | 98 | 49 | -116.499 | POINT (-116.49925 48.999861)
San Luis | Arizona | 2608 | US-Mexico Border | Apr-24 | Buses | 10 | 32.485 | -114.782 | POINT (-114.7822222 32.485)
Warroad | Minnesota | 3423 | US-Canada Border | Jan-24 | Personal Vehicle Passengers | 9266 | 48.999 | -95.377 | POINT (-95.376555 48.999)
Ysleta | Texas | 2401 | US-Mexico Border | Jan-24 | Personal Vehicle Passengers | 521714 | 31.673 | -106.335 | POINT (-106.335449846028 31.6731261376859)

Use this table directly when preparing data for QLoRA and RAG. The Point column is stored as a spatial POINT, with longitude first and latitude second.
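As a starting point for that data preparation, the sketch below parses one pipe-delimited sample row, splits the POINT text into longitude and latitude, and renders a one-sentence text record that could feed either a fine-tuning set or a retrieval index. The function names (parse_point, parse_row, row_to_text) and the sentence template are illustrative assumptions, not part of the assessment. It assumes the POINT has already been exported as text, for example with ST_AsText in MySQL.

```python
import re
from typing import Tuple

# Column order taken from the sample table header.
COLUMNS = ["Port Name", "State", "Port Code", "Border", "Date",
           "Measure", "Value", "Latitude", "Longitude", "Point"]

_POINT_RE = re.compile(
    r"POINT\s*\(\s*(-?\d+(?:\.\d+)?)\s+(-?\d+(?:\.\d+)?)\s*\)"
)


def parse_point(wkt: str) -> Tuple[float, float]:
    """Return (longitude, latitude) from text such as 'POINT (-70.39 45.80)'."""
    match = _POINT_RE.fullmatch(wkt.strip())
    if match is None:
        raise ValueError(f"not a POINT value: {wkt!r}")
    return float(match.group(1)), float(match.group(2))


def parse_row(line: str) -> dict:
    """Split one pipe-delimited row into a dict keyed by column name."""
    cells = [cell.strip() for cell in line.split("|")]
    if len(cells) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} cells, got {len(cells)}")
    row = dict(zip(COLUMNS, cells))
    row["Value"] = int(row["Value"])
    row["Point"] = parse_point(row["Point"])
    # Port Code stays a string so the leading zero in '0104' is preserved.
    return row


def row_to_text(row: dict) -> str:
    """Render a row as one sentence (hypothetical template)."""
    lon, lat = row["Point"]
    return (
        f"In {row['Date']}, the {row['Port Name']} port ({row['State']}, "
        f"port code {row['Port Code']}) on the {row['Border']} recorded "
        f"{row['Value']} {row['Measure']} at longitude {lon}, latitude {lat}."
    )


if __name__ == "__main__":
    sample = ("Jackman | Maine | 0104 | US-Canada Border | Jan-24 | Trucks | "
              "6556 | 45.806 | -70.397 | POINT (-70.396722 45.805661)")
    print(row_to_text(parse_row(sample)))
```

Keeping Port Code as text matters for joins back to the MySQL table, since casting it to an integer would drop the leading zero.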

SECTION 1 — AI/ML + MODERN LLM ENGINEERING

1.1 Practical Conceptual Questions (Short Technical Answers Required)

These map directly to real production AI work.

Compare pretraining, SFT, RLHF, DPO, and LoRA fine-tuning for LLMs.
Provide concise, technical differences and when to use each.
Explain how RAG reduces hallucinations when answering questions about border-crossing activity.
Be precise about retrieval, grounding, and context-aware scoring.
Explain tokenizers, positional encodings, and attention mechanisms using the border dataset as input examples.
Map how each component processes tabular text into model-ready tensors.
Explain KV-cache, quantization (4-bit/8-bit), and speculative decoding, and how each speeds up inference for your RAG agent.
Highlight memory reuse, reduced-precision math, and fewer sequential decoding steps.
Compare vLLM vs TensorRT-LLM vs HuggingFace Transformers for high-throughput border analytics, small-batch low-latency inference, and GPU optimization.
List criteria per workload and GPU tuning differences.
Explain how to protect your LLM API from prompt injection, jailbreak attempts, and training data extraction (border records).
Describe guardrails, filters, and watermarking.

Each answer must be short (5–7 lines), highly technical, and precise.
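For calibration on the quantization question, the sketch below shows the basic idea behind reduced-precision storage: symmetric absmax quantization of a small weight list to 8-bit integer codes, then reconstruction. It is a uniform toy scheme written for illustration. The 4-bit NF4 format typically paired with QLoRA uses a non-uniform code book and block-wise scales, which this sketch does not implement. The function names are assumptions.

```python
def quantize_int8(values):
    """Symmetric absmax quantization: map floats to integer codes in [-127, 127]."""
    scale = max(abs(v) for v in values) / 127.0
    if scale == 0.0:
        scale = 1.0  # all-zero input: any scale reconstructs it exactly
    codes = [round(v / scale) for v in values]
    return codes, scale


def dequantize(codes, scale):
    """Reconstruct approximate floats from integer codes and the shared scale."""
    return [c * scale for c in codes]


if __name__ == "__main__":
    weights = [0.5, -1.27, 0.0, 1.0]
    codes, scale = quantize_int8(weights)
    restored = dequantize(codes, scale)
    worst = max(abs(w - r) for w, r in zip(weights, restored))
    print(codes, scale, worst)
```

Each weight is stored as one small integer plus a shared scale, and the rounding error per weight is bounded by half the scale.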

1.2 Hands-On ML Task

Task: Build a Border-Crossing QA Model (QLoRA + RAG + API)

A. QLoRA Fine-Tuning (Llama-3-8B)

B. Evaluation Pipeline
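One possible core for the evaluation pipeline is sketched below: a normalized exact-match score plus a value-match score that checks whether the expected count appears in the model's answer. Both metrics, the function names, and the tuple layout of the evaluation pairs are assumptions made for illustration. The value-match check is deliberately lenient (any integer in the answer can match), so a stricter pipeline would want to tie the number to the right port, date, and measure.

```python
import re


def normalize(text):
    """Lowercase, replace punctuation with spaces, collapse whitespace."""
    text = re.sub(r"[^a-z0-9\s]", " ", text.lower())
    return " ".join(text.split())


def exact_match(prediction, reference):
    """True when the two strings are equal after normalization."""
    return normalize(prediction) == normalize(reference)


def extract_integers(text):
    """All integers in the text, with thousands separators removed."""
    return [int(m.replace(",", "")) for m in re.findall(r"\d[\d,]*", text)]


def value_match(prediction, expected, rel_tol=0.0):
    """True when some integer in the prediction is within rel_tol of expected."""
    return any(abs(n - expected) <= rel_tol * abs(expected)
               for n in extract_integers(prediction))


def evaluate(pairs):
    """pairs: iterable of (prediction, reference_text, reference_value)."""
    pairs = list(pairs)
    total = len(pairs)
    em = sum(exact_match(p, r) for p, r, _ in pairs)
    vm = sum(value_match(p, v) for p, _, v in pairs)
    return {"exact_match": em / total, "value_match": vm / total}


if __name__ == "__main__":
    # Predictions are made up; reference values come from the sample rows.
    results = evaluate([
        ("6556 trucks", "6556 Trucks", 6556),
        ("About 521,714 passengers.", "521714 Personal Vehicle Passengers", 521714),
        ("I do not know", "98 Trucks", 98),
    ])
    print(results)
```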

C. RAG Pipeline (FAISS)
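The retrieve-then-ground flow can be sketched without any dependencies, as below. This is not FAISS: the brute-force cosine scan stands in for a FAISS index (for example IndexFlatIP over normalized dense vectors), and the bag-of-words embed function stands in for a real embedding model. The document strings, function names, and prompt wording are assumptions. The counts come from the sample rows.

```python
import math
import re
from collections import Counter

DOCS = [
    "Jackman, Maine, US-Canada Border, Jan-24, Trucks: 6556",
    "Porthill, Idaho, US-Canada Border, Apr-24, Trucks: 98",
    "San Luis, Arizona, US-Mexico Border, Apr-24, Buses: 10",
    "Warroad, Minnesota, US-Canada Border, Jan-24, Personal Vehicle Passengers: 9266",
    "Ysleta, Texas, US-Mexico Border, Jan-24, Personal Vehicle Passengers: 521714",
]


def embed(text):
    """Toy bag-of-words vector; a real pipeline would use an embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def retrieve(query, docs, k=2):
    """Return the k documents most similar to the query (brute-force scan)."""
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]


def build_prompt(query, docs, k=2):
    """Ground the question in retrieved records only."""
    context = "\n".join(f"- {d}" for d in retrieve(query, docs, k))
    return (
        "Answer using only the context below. "
        "If the answer is not in the context, say so.\n"
        f"Context:\n{context}\n"
        f"Question: {query}\nAnswer:"
    )


if __name__ == "__main__":
    print(build_prompt("How many trucks crossed at Jackman in Jan-24?", DOCS))
```

Restricting the model to retrieved records is what makes the answer checkable: the count in the response can be traced back to a specific row.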

D. Inference API (FastAPI)

SECTION 2 — AI/ML ENGINEERING (PRODUCTION)

2.1 Architecture Challenge — Real-Time LLM Microservice

Design a production inference system for border analytics covering:

Deliverables: Architecture Diagram, Component Responsibilities, Scaling Strategy, Failure Modes & Mitigations.

SECTION 3 — BLOCKCHAIN ENGINEERING

Use the border-crossing data as the economic asset: every record becomes an on-chain data point, users stake tokens on predictions, rewards depend on accuracy.
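One way to make rewards depend on accuracy is sketched below in Python as an off-chain model of the payout rule, not as contract code. The scoring rule (linear falloff with relative error), the refund when nobody scores, and all names are assumptions. An on-chain version would need integer or fixed-point arithmetic instead of floats.

```python
def accuracy_score(predicted, actual):
    """1.0 for a perfect prediction, falling linearly to 0.0 at 100% relative error."""
    if actual == 0:
        return 1.0 if predicted == 0 else 0.0
    return max(0.0, 1.0 - abs(predicted - actual) / actual)


def distribute_rewards(stakes, actual):
    """Split the staked pool in proportion to stake * accuracy.

    stakes: {user: (staked_tokens, predicted_value)}
    Returns {user: payout}. If every prediction scores zero, stakes are refunded.
    """
    pool = sum(amount for amount, _ in stakes.values())
    weights = {user: amount * accuracy_score(pred, actual)
               for user, (amount, pred) in stakes.items()}
    total_weight = sum(weights.values())
    if total_weight == 0:
        return {user: float(amount) for user, (amount, _) in stakes.items()}
    return {user: pool * w / total_weight for user, w in weights.items()}


if __name__ == "__main__":
    # Actual value is the Jackman Jan-24 truck count from the sample rows.
    payouts = distribute_rewards(
        {"alice": (100, 6556), "bob": (100, 3278)}, actual=6556
    )
    print(payouts)
```

With equal stakes, the exact prediction earns twice the weight of the prediction that is off by half, so the 200-token pool splits roughly 133 to 67.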

3.1 Smart Contract Assignment

3.2 DApp Frontend

SECTION 4 — BACKEND ENGINEERING (REST + GraphQL)

SECTION 5 — CLOUD + DEVOPS (AWS)

SECTION 6 — FULL-STACK FRONTEND

SECTION 7 — FINAL SYSTEM DESIGN CHALLENGE

Design a full real-time AI-powered border analytics platform that:

Deliverables: System sequence diagram, data flow + storage schema, API service layout, scaling strategy, and security considerations.