Document AI, FLATTEN, and Dynamic Tables in Snowflake

Modern finance and operations teams work with PDF invoices and logistics documents every day. These files contain rich information (header details, nested line items, shipment summaries), but that information is locked in an unstructured format.

In this post, we’ll walk through an example that combines:

- Snowflake Document AI to extract structured JSON from PDF invoices
- A lateral FLATTEN + array indexing pattern to handle nested line items cleanly
- Dynamic Tables to materialize header-level and line-level facts continuously

The pattern is designed to handle a variable number of line items per invoice and the incremental ingestion of new PDFs.

The Use Case: Logistics Invoice with Nested Line Items

Our scenario is a logistics company (Acme Logistics Ltd) generating invoices for a retail customer. Each invoice is a PDF with:

Header-level fields
- Invoice number: INV-2045
- Invoice date: 16-May-2025
- Vendor name: ACME LOGISTICS LTD
- PO number: PO-4500123
- Currency: INR
Nested line items – one line per service, e.g.
- SRV-ZONEA Line Haul Charges – Zone A
- SRV-WH-MAY Warehouse Storage – May 2025
- SRV-FUEL Fuel Surcharge & Handling
Quantities and prices per line, and invoice-level totals.
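
The invoice layout above maps naturally onto two record types: one header row per invoice and one row per line item. The following is a minimal Python sketch of that target shape, using only the sample values listed above. The class and field names (InvoiceHeader, InvoiceLine, service_code, line_index and so on) are illustrative assumptions rather than names taken from the Document AI model, and the quantity and price fields are left empty because no example amounts are given here.

```python
from dataclasses import dataclass
from datetime import date, datetime
from typing import Optional


@dataclass
class InvoiceHeader:
    """One row per invoice (hypothetical field names)."""
    invoice_number: str
    invoice_date: date
    vendor_name: str
    po_number: str
    currency: str


@dataclass
class InvoiceLine:
    """One row per nested line item (hypothetical field names)."""
    invoice_number: str                 # key back to the header row
    line_index: int                     # zero-based position of the line on the invoice
    service_code: str
    description: str
    quantity: Optional[float] = None    # not given in the example, left empty
    unit_price: Optional[float] = None  # not given in the example, left empty


header = InvoiceHeader(
    invoice_number="INV-2045",
    # Normalize the printed date format (16-May-2025) to a real date value.
    invoice_date=datetime.strptime("16-May-2025", "%d-%b-%Y").date(),
    vendor_name="ACME LOGISTICS LTD",
    po_number="PO-4500123",
    currency="INR",
)

# Service code and description pairs as they appear on the sample invoice.
raw_lines = [
    ("SRV-ZONEA", "Line Haul Charges – Zone A"),
    ("SRV-WH-MAY", "Warehouse Storage – May 2025"),
    ("SRV-FUEL", "Fuel Surcharge & Handling"),
]

lines = [
    InvoiceLine(header.invoice_number, index, code, description)
    for index, (code, description) in enumerate(raw_lines)
]
```

Keeping the invoice number on every line row is what lets the header-level and line-level tables be joined back together later.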

Technical:

We use Snowflake Document AI to build a model that extracts invoice metadata and returns it as a JSON payload.

Step 1: We developed the extraction model using Document AI.

Note that the Description field is returned as a nested array. Our goal is to convert such nested arrays into a clean relational structure.
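
To make the nested-array problem concrete, here is a small Python sketch of the flatten-plus-index idea. It assumes the payload follows a shape in which every extracted field maps to a list of score/value objects. The field names (service_code, description), the score values, and the helper functions are all hypothetical placeholders chosen for illustration. In Snowflake the same idea would be expressed with LATERAL FLATTEN over one array, using the INDEX column it returns to look up the matching element in the sibling arrays.

```python
# Hypothetical payload: field names and scores are placeholders, not real model output.
payload = {
    "invoice_number": [{"score": 0.99, "value": "INV-2045"}],
    "service_code": [
        {"score": 0.99, "value": "SRV-ZONEA"},
        {"score": 0.99, "value": "SRV-WH-MAY"},
        {"score": 0.99, "value": "SRV-FUEL"},
    ],
    "description": [
        {"score": 0.99, "value": "Line Haul Charges – Zone A"},
        {"score": 0.99, "value": "Warehouse Storage – May 2025"},
        {"score": 0.99, "value": "Fuel Surcharge & Handling"},
    ],
}


def first_value(doc, field):
    """Return the single value of a header-level field, or None if absent."""
    items = doc.get(field) or []
    return items[0]["value"] if items else None


def value_at(doc, field, index):
    """Return the value at a given array position, or None if the array is shorter."""
    items = doc.get(field) or []
    return items[index]["value"] if index < len(items) else None


def flatten_lines(doc, driver="description", siblings=("service_code",)):
    """Emit one row per element of the driver array (the flatten step),
    then pick the element at the same index from each sibling array."""
    rows = []
    for index, item in enumerate(doc.get(driver) or []):
        row = {
            "invoice_number": first_value(doc, "invoice_number"),
            "line_index": index,  # zero-based, like the INDEX column of FLATTEN
            driver: item["value"],
        }
        for field in siblings:
            row[field] = value_at(doc, field, index)
        rows.append(row)
    return rows


rows = flatten_lines(payload)
```

Because the loop is driven by the length of one array, the same code handles invoices with any number of line items. A sibling array that is shorter than the driver simply yields None for the missing positions instead of failing.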

AJITH'S TECH TIPS
