CCA Prep | Q54 / 60 | Prompt Engineering & Structured Output
Structured Data Extraction

A structured data extraction pipeline processes 150 regulatory filings per night, each averaging 35,000 tokens. Engineers report that Claude consistently extracts accurate data from the first few sections of each filing but misses or misattributes key values — such as penalty amounts and effective dates — that appear in dense boilerplate near the middle of the document. The current prompt places the extraction instructions and field definitions first, followed by the full filing text. What change would most directly improve extraction accuracy for fields buried in the middle of long documents?
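For reference, the baseline described in the scenario can be sketched roughly as follows. This is only an illustration of the current prompt ordering (instructions and field definitions first, filing text last) and of the nightly volume implied by the stated figures. The field names, the instruction wording, the `<filing>` wrapper tags, and the function names are all hypothetical and are not taken from the actual pipeline.

```python
# Hypothetical field definitions; the scenario only names penalty amounts
# and effective dates as examples of affected fields.
FIELD_DEFINITIONS = {
    "penalty_amount": "Monetary penalty stated in the filing",
    "effective_date": "Date on which the action takes effect",
}

# Figures taken from the scenario.
FILINGS_PER_NIGHT = 150
AVG_TOKENS_PER_FILING = 35_000


def build_baseline_prompt(filing_text: str) -> str:
    """Assemble the prompt in the order the scenario describes:
    instructions and field definitions first, full filing text last."""
    field_lines = "\n".join(
        f"- {name}: {description}"
        for name, description in FIELD_DEFINITIONS.items()
    )
    instructions = (
        "Extract the following fields from the regulatory filing "
        "and return them as JSON.\n" + field_lines
    )
    # The <filing> wrapper is an assumption made for readability.
    return f"{instructions}\n\n<filing>\n{filing_text}\n</filing>"


def nightly_filing_tokens() -> int:
    """Approximate filing tokens per night, excluding instruction tokens."""
    return FILINGS_PER_NIGHT * AVG_TOKENS_PER_FILING


if __name__ == "__main__":
    print(build_baseline_prompt("...full filing text..."))
    print(nightly_filing_tokens())  # 5250000
```

With this layout, roughly 35,000 tokens of filing text sit between the field definitions and the end of the prompt, which is the arrangement the question asks you to reconsider.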