Automation saved 25% of developers' time
Revolgy
AI & Automation, Custom Development

25% time saved
CI/CD process optimization
0 human errors
Detail
AI & Automation, Custom Development
A large Czech industrial enterprise's domain experts spent up to five hours a week manually classifying unstructured B2B inquiries into more than 30 technical parameters. We built an LLM pipeline that handles it in minutes, with over 99% accuracy, achieved entirely through prompt engineering, not fine-tuning.

01
Unstructured Document Processing
An OCR and LLM pipeline that handles PDFs, scanned documents, and handwritten specifications at scale, regardless of format or language variation.
02
Multi-Parameter Extraction Engine
One dedicated LLM call per parameter, each returning a structured JSON value and a verbatim citation from the source document for expert verification.
03
Confidence-Based Routing
Semi-deterministic confidence scoring per parameter, routing high-confidence extractions to review-ready status and flagging uncertain values for expert attention.
04
Self-Improving Feedback Loop
Expert corrections and explanations feed back into the prompt context, improving classification accuracy over time without model retraining.
05
Kanban Review Board
A web interface with email notifications, per-expert assignment, and Excel export that fits into existing sales workflows.
A large Czech industrial enterprise's sales department handles a high volume of B2B inquiries every week. Each arrives in a different format: some as structured PDFs, others as scanned images of handwritten technical specifications, some as plain text with mixed-language annotations.
Every inquiry lands with a team of domain experts who know the field in depth. They classify each submission into more than 30 parameters, including material dimensions, chemical composition, mechanical properties, surface treatment, and packaging specifications. Before our involvement, experts spent up to five hours a week on this classification.
The bottleneck had a direct commercial cost. In B2B heavy industry, response speed is one of the key competitive factors. Every hour of classification delay was a disadvantage: not because the quotes were worse, but because they arrived later.
Off-the-shelf OCR tools weren't reliable enough across the diverse input formats. Standard document parsing couldn't handle the domain specificity: a mislabeled material parameter meant either rework or an incorrect quote. The company needed a system that was both accurate enough to trust and fast enough to matter.
We built an LLM workflow that processes each inquiry from raw input to structured output, ready for the sales team in minutes.
Document parsing. The pipeline starts with OCR on the visual input, then splits multi-item inquiries into individual line items. A single inquiry may cover ten different material variants with different specifications. The system handles each item independently.
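The parsing flow can be sketched as a small orchestration function. This is a minimal sketch under stated assumptions, not the production pipeline: `run_ocr`, `split_items`, and `extract_item` are hypothetical callables standing in for the OCR step, the line-item splitter, and the per-item extraction, none of which the case study specifies in detail.

```python
from typing import Callable


def process_inquiry(
    raw_document: bytes,
    run_ocr: Callable[[bytes], str],          # hypothetical OCR step
    split_items: Callable[[str], list[str]],  # hypothetical line-item splitter
    extract_item: Callable[[str], dict],      # hypothetical per-item extraction
) -> list[dict]:
    """Run OCR once, then process every line item independently."""
    text = run_ocr(raw_document)
    results = []
    for index, item_text in enumerate(split_items(text)):
        try:
            parameters = extract_item(item_text)
            results.append({"item": index, "status": "ok", "parameters": parameters})
        except Exception as exc:
            # A failure in one line item is recorded and does not stop the others.
            results.append({"item": index, "status": "failed", "error": str(exc)})
    return results
```

Catching errors per item is one way to keep items independent: a single unreadable variant does not block the other variants in the same inquiry.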
One call per parameter. Rather than extracting all parameters in a single LLM call, the system uses a dedicated extraction call for each parameter. This prevents cross-parameter contamination: the model's reasoning stays focused on a single attribute, and an error in one parameter cannot cascade into others. Each call returns a structured JSON value plus a verbatim citation from the source document, so experts can verify any extraction without re-reading the original.
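The per-parameter pattern might look like the sketch below. It is illustrative only: `call_llm` is a hypothetical function that takes a prompt and returns the model's raw text, and the prompt wording, the schema dictionary, and the JSON keys `value` and `citation` are assumptions rather than the prompts used in the project.

```python
import json
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class Extraction:
    parameter: str
    value: Any
    citation: Optional[str]
    citation_verified: bool  # True only if the quote occurs verbatim in the source


def build_prompt(parameter: str, description: str, item_text: str) -> str:
    # Assumed prompt wording; each prompt covers exactly one parameter.
    return (
        f"Extract only the parameter '{parameter}' ({description}) "
        "from the inquiry item below.\n"
        'Reply with JSON: {"value": ..., "citation": "<verbatim quote>"}.\n\n'
        f"Inquiry item:\n{item_text}"
    )


def extract_parameter(
    parameter: str,
    description: str,
    item_text: str,
    call_llm: Callable[[str], str],  # hypothetical LLM client
) -> Extraction:
    raw = call_llm(build_prompt(parameter, description, item_text))
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        data = None
    if not isinstance(data, dict):
        # Malformed output affects this parameter only.
        return Extraction(parameter, None, None, False)
    citation = data.get("citation")
    verified = isinstance(citation, str) and citation != "" and citation in item_text
    return Extraction(parameter, data.get("value"), citation if isinstance(citation, str) else None, verified)


def extract_all(
    schema: dict[str, str],
    item_text: str,
    call_llm: Callable[[str], str],
) -> dict[str, Extraction]:
    """One dedicated call per parameter; results never share a response."""
    return {
        name: extract_parameter(name, description, item_text, call_llm)
        for name, description in schema.items()
    }
```

Because each call is independent, the calls could also be issued concurrently. Checking that the citation appears verbatim in the source text is an added safeguard in this sketch, not something the case study states.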
No fine-tuning required. The model we selected already had strong knowledge of the domain: material properties, technical notation, industry standards. The challenge lay in how we framed each extraction task, not in what the model knew. The team reached >99% classification accuracy through prompt engineering, structured output schemas, and iterative evaluation against a benchmark of 60 real inquiries with expert-prepared expected outputs.
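A benchmark of this kind can be scored with a few lines of code. The sketch below assumes that accuracy means exact match per parameter after trimming whitespace and ignoring case; the case study does not say how matches were counted, so the `normalize` rule and the function names are hypothetical.

```python
from collections import defaultdict


def normalize(value) -> str:
    # Assumed matching rule: compare trimmed, case-insensitive strings.
    return "" if value is None else str(value).strip().casefold()


def score_benchmark(
    predicted: list[dict], expected: list[dict]
) -> tuple[float, dict[str, float]]:
    """Return overall accuracy and accuracy per parameter."""
    if len(predicted) != len(expected):
        raise ValueError("predicted and expected must have the same length")
    hits: dict[str, int] = defaultdict(int)
    totals: dict[str, int] = defaultdict(int)
    for pred, gold in zip(predicted, expected):
        for parameter, gold_value in gold.items():
            totals[parameter] += 1
            if normalize(pred.get(parameter)) == normalize(gold_value):
                hits[parameter] += 1
    per_parameter = {p: hits[p] / totals[p] for p in totals}
    total = sum(totals.values())
    overall = sum(hits.values()) / total if total else 0.0
    return overall, per_parameter
```

Tracking accuracy per parameter, not only overall, shows which individual prompt needs another iteration.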
Confidence scoring. The system computes a confidence score for each parameter semi-deterministically from the model's output. We avoided asking the model to rate its own confidence, because self-reported scores tend to be poorly calibrated on specialized domain tasks. High-confidence results proceed directly to review-ready status; the system flags low-confidence results for expert attention.
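The case study does not describe which signals feed the score, so the following is only one way such scoring could work. The three signals, their weights, and the 0.8 threshold are all hypothetical; the point is that the score is derived from checks on the output and not from the model's own rating.

```python
from dataclasses import dataclass


@dataclass
class Signals:
    citation_in_source: bool    # quote found verbatim in the source (assumed signal)
    value_in_allowed_set: bool  # value passes a schema or range check (assumed signal)
    agreement: float            # share of repeated calls with the same value, 0..1 (assumed signal)


def confidence(signals: Signals) -> float:
    """Combine deterministic checks into a score between 0 and 1."""
    if not 0.0 <= signals.agreement <= 1.0:
        raise ValueError("agreement must be between 0 and 1")
    # Hypothetical weights; in practice they would be tuned on a benchmark.
    return (
        0.4 * signals.citation_in_source
        + 0.3 * signals.value_in_allowed_set
        + 0.3 * signals.agreement
    )


def route(score: float, threshold: float = 0.8) -> str:
    """Send confident extractions to review-ready, the rest to an expert."""
    return "review_ready" if score >= threshold else "needs_expert_attention"
```

For example, an extraction whose citation cannot be found in the source scores at most 0.6 under these weights and is always flagged for expert attention.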
Feedback loop. When an expert corrects a misclassification, they record the corrected value and a short explanation. The pipeline incorporates these corrections into its prompt context for subsequent extractions. Accuracy improves incrementally as experts use the system, without any retraining.
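The feedback loop can be pictured as a small store of corrections that is rendered into the prompt for the matching parameter. This is a sketch under assumptions: the `Correction` fields, the cap on corrections per parameter, and the rendered text format are hypothetical, and the case study does not say how corrections are selected or limited.

```python
from collections import defaultdict, deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Correction:
    parameter: str
    source_snippet: str
    corrected_value: str
    explanation: str


class CorrectionStore:
    """Keeps the most recent expert corrections for each parameter."""

    def __init__(self, max_per_parameter: int = 5) -> None:
        # Assumed cap so the prompt context cannot grow without bound.
        self._by_parameter: dict[str, deque] = defaultdict(
            lambda: deque(maxlen=max_per_parameter)
        )

    def add(self, correction: Correction) -> None:
        self._by_parameter[correction.parameter].append(correction)

    def prompt_context(self, parameter: str) -> str:
        """Render stored corrections as text to prepend to the extraction prompt."""
        corrections = self._by_parameter.get(parameter)
        if not corrections:
            return ""
        lines = ["Earlier expert corrections for this parameter:"]
        for c in corrections:
            lines.append(
                f'- Source: "{c.source_snippet}" -> correct value: '
                f"{c.corrected_value} (reason: {c.explanation})"
            )
        return "\n".join(lines)
```

Because corrections only change the prompt text, the model itself stays untouched, which matches the no-retraining approach described above.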
The interface provides a Kanban board: Processing → Waiting for Review → In Review → Done. Each expert receives an email notification with a direct link to their review items. Finalized outputs export to an Excel spreadsheet for the sales team to use in quoting.
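The board's columns map naturally onto a small state machine. The sketch below assumes items only move forward one column at a time; whether the real board allows moving items back is not stated, and the `Status` and `advance` names are hypothetical.

```python
from enum import Enum


class Status(Enum):
    PROCESSING = "Processing"
    WAITING_FOR_REVIEW = "Waiting for Review"
    IN_REVIEW = "In Review"
    DONE = "Done"


# Assumed forward-only transitions, one column at a time.
NEXT_STATUS = {
    Status.PROCESSING: Status.WAITING_FOR_REVIEW,
    Status.WAITING_FOR_REVIEW: Status.IN_REVIEW,
    Status.IN_REVIEW: Status.DONE,
}


def advance(status: Status) -> Status:
    """Move an item to the next column; Done is terminal."""
    if status not in NEXT_STATUS:
        raise ValueError(f"{status.value} has no next status")
    return NEXT_STATUS[status]
```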
Classification work that previously consumed up to five hours a week now runs in minutes. Sales teams can respond to inquiries within hours instead of days.
We validated the >99% accuracy on a benchmark of 60 real inquiries with expected outputs prepared by experts. Feedback from the team was positive, and several members described the result as extremely positive.
You can directly reuse the core pattern of parallel parameter extraction, confidence-based routing, and expert feedback loops across other industrial classification problems. Any domain where humans classify unstructured inputs against a defined schema is a candidate.