Automated information extraction with Azure AI Document Intelligence.

A presentation at Dev Up Conference 2023 in August 2023 in St Charles, MO, USA by Sam Gomez

Slide 1

Automated information extraction with Azure AI Document Intelligence.

Slide 2

Originally from Mexico. Tech Lead at Geneca. I enjoy spending time with family, movies, video games, and soccer. @thesoccerdev drkclw samueljgomez

Slide 3

Thank you to our sponsors!

Slide 4

Agenda. • Azure AI offerings • Cognitive Services • Applied AI services • Azure AI Document Intelligence • Model options • Input requirements • Data privacy • Demo

Slide 5

Cognitive services. • Speech • Language • Vision • Decision

Slide 6

Applied Services API. Services for common business problems. • Document Intelligence (Form Recognizer) • Immersive Reader • Metrics Advisor • Video Indexer

Slide 7

Applied Services API. • Cognitive Search • Bot Service

Slide 8

Model options. • Document analysis models. • Read • Layout • General document • Prebuilt models. • Custom models.

Slide 9

Document analysis models

Slide 10

Read model. • Extract text from documents (print and handwritten). • Detects paragraphs, text lines, words, locations and languages. • Higher resolution than Vision Read API. • Underlying OCR engine for other models.

Slide 11

Read model development options.
Model: Read model
Resources: • Document Intelligence Studio • REST API • C# SDK • Python SDK • Java SDK • JavaScript SDK
Model ID: prebuilt-read
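
As a rough illustration of the REST API option, the sketch below builds (but does not send) an analyze request for the prebuilt-read model ID. The endpoint host, key placeholder, and sample document URL are made up for this example, and the API version string is an assumption chosen to match the 3.1 documentation; check the current REST reference before relying on it.

```python
import json
import urllib.request

# Assumption: "2023-07-31" is the API version that corresponds to the 3.1 docs.
API_VERSION = "2023-07-31"


def build_analyze_request(endpoint, key, model_id, document_url):
    """Build a POST request asking the service to analyze a document by URL."""
    url = (
        f"{endpoint.rstrip('/')}/formrecognizer/documentModels/"
        f"{model_id}:analyze?api-version={API_VERSION}"
    )
    body = json.dumps({"urlSource": document_url}).encode("utf-8")
    headers = {
        "Ocp-Apim-Subscription-Key": key,
        "Content-Type": "application/json",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


# Placeholder values; nothing is sent over the network here.
req = build_analyze_request(
    "https://example.cognitiveservices.azure.com/",
    "<your-key>",
    "prebuilt-read",
    "https://example.com/sample.pdf",
)
print(req.get_method(), req.full_url)
```

Sending the request with urllib.request.urlopen(req) should return an Operation-Location header, which is then polled with the same key until the analysis finishes. Swapping the model ID is the only change needed to target a different model.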

Slide 12

Layout model. • Extract text and layout to return structured data representations. • Text roles in documents. • Geometric roles: Text, tables, and selection marks are examples of geometric roles. • Logical roles: Titles, headings, and footers are examples of logical roles. • Combines OCR with deep learning models.
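
To make the idea of logical roles concrete, here is a small sketch that groups paragraphs by role. It assumes the analyze result has already been loaded as a Python dict whose "paragraphs" entries carry "content" and an optional "role"; the sample data is hand-written for illustration, and the "body" label for paragraphs without a role is this sketch's own convention.

```python
from collections import defaultdict


def group_paragraphs_by_role(analyze_result):
    """Map each logical role to the list of paragraph texts that carry it."""
    groups = defaultdict(list)
    for paragraph in analyze_result.get("paragraphs", []):
        # Paragraphs with no logical role are filed under our own "body" label.
        role = paragraph.get("role", "body")
        groups[role].append(paragraph["content"])
    return dict(groups)


# Hand-written stand-in for a layout result, not real service output.
sample = {
    "paragraphs": [
        {"role": "title", "content": "Purchase Agreement"},
        {"role": "sectionHeading", "content": "1. Parties"},
        {"content": "This agreement is made between the buyer and the seller."},
        {"role": "pageFooter", "content": "Page 1 of 3"},
    ]
}

roles = group_paragraphs_by_role(sample)
for role, texts in roles.items():
    print(role, texts)
```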

Slide 13

Slide 14

Layout model development options.
Feature: Layout model
Resources: • Document Intelligence Studio • REST API • C# SDK • Python SDK • Java SDK • JavaScript SDK
Model ID: prebuilt-layout
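
For the Python SDK option, a call with this model ID might look like the sketch below. It assumes the azure-ai-formrecognizer package is installed and that you supply your own endpoint, key, and file path; the SDK imports sit inside the function so the rest of the example runs without the package. The table helper is illustrative and is exercised here with stand-in objects rather than a real result.

```python
from types import SimpleNamespace


def analyze_layout(endpoint, key, path):
    """Run prebuilt-layout on a local file and return the analysis result."""
    # Local imports: only needed when the service is actually called.
    from azure.ai.formrecognizer import DocumentAnalysisClient
    from azure.core.credentials import AzureKeyCredential

    client = DocumentAnalysisClient(endpoint, AzureKeyCredential(key))
    with open(path, "rb") as f:
        poller = client.begin_analyze_document("prebuilt-layout", document=f)
    return poller.result()


def table_to_grid(table):
    """Turn a table's flat cell list into a list of rows of cell text."""
    grid = [["" for _ in range(table.column_count)] for _ in range(table.row_count)]
    for cell in table.cells:
        grid[cell.row_index][cell.column_index] = cell.content
    return grid


# Stand-in object mimicking the attributes used above (not real output).
fake_table = SimpleNamespace(
    row_count=2,
    column_count=2,
    cells=[
        SimpleNamespace(row_index=0, column_index=0, content="Item"),
        SimpleNamespace(row_index=0, column_index=1, content="Price"),
        SimpleNamespace(row_index=1, column_index=0, content="Pen"),
        SimpleNamespace(row_index=1, column_index=1, content="1.50"),
    ],
)
grid = table_to_grid(fake_table)
print(grid)
```

With a real result, the same helper would be applied to each entry of result.tables.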

Slide 15

General document model. • Extract text, layout and key-value pairs from documents. • Pretrained model. • Supports structured, semi-structured and unstructured documents.

Slide 16

Key-value pairs. • For structured documents, the label and value of a field. • For unstructured documents, they are based on the text in a paragraph (for example, a date in a contract).
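
A short sketch of reading key-value pairs out of a result, assuming the result is available as a dict with a "keyValuePairs" list whose entries hold "key", an optional "value", and a "confidence" score. The sample data and the 0.5 confidence cutoff are assumptions made for this example.

```python
def extract_key_value_pairs(analyze_result, min_confidence=0.5):
    """Return {key text: value text or None}, skipping low-confidence pairs."""
    pairs = {}
    for pair in analyze_result.get("keyValuePairs", []):
        if pair.get("confidence", 0.0) < min_confidence:
            continue
        key_text = pair["key"]["content"]
        # A key can be detected without a matching value.
        value_text = (pair.get("value") or {}).get("content")
        pairs[key_text] = value_text
    return pairs


# Hand-written stand-in for a general document result, not real output.
sample = {
    "keyValuePairs": [
        {"key": {"content": "Name:"}, "value": {"content": "Jane Doe"}, "confidence": 0.93},
        {"key": {"content": "Effective date"}, "value": {"content": "2023-08-01"}, "confidence": 0.81},
        {"key": {"content": "Signature"}, "confidence": 0.77},
        {"key": {"content": "Ref"}, "value": {"content": "??"}, "confidence": 0.12},
    ]
}

fields = extract_key_value_pairs(sample)
print(fields)
```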

Slide 17

General document model development options.
Feature: General document model
Resources: • Document Intelligence Studio • REST API • C# SDK • Python SDK • Java SDK • JavaScript SDK
Model ID: prebuilt-document

Slide 18

Prebuilt models

Slide 19

Available models. • Invoice. • Extract customer and vendor details. • Receipt. • Extract sales transaction details. • Identity. • Insurance card.

Slide 20

Available models. • W2. • Extract taxable compensation details. • Business card. • Extract business contact details. • Contract. • Extract agreement and party details. • US Tax 1098-E Form. • Extract student loan interest details.

Slide 21

Available models. • US Tax 1098 form. • Extract mortgage interest details. • US Tax 1098-T form. • Extract qualified tuition details.

Slide 22

Development options.
Feature: Prebuilt models
Resources: • Document Intelligence Studio • REST API • C# SDK • Python SDK • Java SDK* • JavaScript SDK*
Model IDs:
prebuilt-invoice
prebuilt-receipt
prebuilt-idDocument
prebuilt-healthInsuranceCard.us
prebuilt-tax.us.w2
prebuilt-contract
prebuilt-businessCard
prebuilt-tax.us.1098
prebuilt-tax.us.1098E
prebuilt-tax.us.1098T
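
One way to use these model IDs in application code is a simple lookup from a document type to the ID passed to the service. The dictionary keys below are labels invented for this sketch, and falling back to prebuilt-document for unknown types is an illustrative choice, not something the slides prescribe.

```python
# Keys are hypothetical labels; values are the prebuilt model IDs.
PREBUILT_MODEL_IDS = {
    "invoice": "prebuilt-invoice",
    "receipt": "prebuilt-receipt",
    "identity": "prebuilt-idDocument",
    "insurance card": "prebuilt-healthInsuranceCard.us",
    "w2": "prebuilt-tax.us.w2",
    "contract": "prebuilt-contract",
    "business card": "prebuilt-businessCard",
    "1098": "prebuilt-tax.us.1098",
    "1098-e": "prebuilt-tax.us.1098E",
    "1098-t": "prebuilt-tax.us.1098T",
}


def model_id_for(document_type, default="prebuilt-document"):
    """Look up the model ID for a document type, ignoring case and padding."""
    return PREBUILT_MODEL_IDS.get(document_type.strip().lower(), default)


print(model_id_for("Invoice"))
print(model_id_for(" 1098-E "))
print(model_id_for("memo"))
```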

Slide 23

Custom models

Slide 24

Custom template. • Extract data from static layouts. • Capabilities. • Form fields. • Selection marks. • Tabular fields (tables). • Signature. • Selected regions.

Slide 25

Custom neural. • Extract data from mixed-type documents. • Supports structured, semi-structured and unstructured data.

Slide 26

Custom composed. • Extract data using a collection of models.

Slide 27

Classification model. • Identify document type before calling extraction model.
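
The classify-then-extract flow can be sketched as a small router. Everything here is a stand-in: the keyword classifier and the lambda extractors are hypothetical placeholders for a trained classification model and the extraction models it would route to.

```python
def route_document(document, classify, extractors, fallback=None):
    """Classify the document, then run the extractor registered for that type."""
    doc_type = classify(document)
    extractor = extractors.get(doc_type, fallback)
    if extractor is None:
        raise ValueError(f"no extraction model registered for {doc_type!r}")
    return doc_type, extractor(document)


def classify_by_keyword(text):
    """Toy classifier standing in for a real classification model."""
    lowered = text.lower()
    if "invoice" in lowered:
        return "invoice"
    if "receipt" in lowered:
        return "receipt"
    return "unknown"


# Hypothetical extractors; real ones would call an extraction model.
extractors = {
    "invoice": lambda text: {"model": "hypothetical-invoice-model", "chars": len(text)},
    "receipt": lambda text: {"model": "hypothetical-receipt-model", "chars": len(text)},
}

doc_type, fields = route_document("INVOICE #1001 for services", classify_by_keyword, extractors)
print(doc_type, fields)
```

Keeping the classifier and extractors as plain callables makes it easy to swap the toy versions for real service calls later.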

Slide 28

Input requirements.

Slide 29

File formats.
Model              PDF    Image    Office and HTML
Read               Yes    Yes      Yes
Layout             Yes    Yes      No
General Document   Yes    Yes      No
Prebuilt           Yes    Yes      No
Custom             Yes    Yes      No
Image: JPEG/JPG, PNG, BMP, and TIFF. Office and HTML: Microsoft Office Word (DOCX), Excel (XLSX), PowerPoint (PPTX), and HTML.
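
The format table can be encoded as a lookup and checked before a file is uploaded. The file extensions chosen for each column are this sketch's assumption (for example, only ".tiff" is listed for TIFF), so adjust them to the files you actually handle.

```python
import os

# Extension sets are assumptions that mirror the table's three columns.
PDF = {".pdf"}
IMAGE = {".jpeg", ".jpg", ".png", ".bmp", ".tiff"}
OFFICE_HTML = {".docx", ".xlsx", ".pptx", ".html"}

SUPPORTED = {
    "read": PDF | IMAGE | OFFICE_HTML,
    "layout": PDF | IMAGE,
    "general document": PDF | IMAGE,
    "prebuilt": PDF | IMAGE,
    "custom": PDF | IMAGE,
}


def is_supported(model, filename):
    """Return True if the file's extension is accepted by the given model family."""
    extension = os.path.splitext(filename)[1].lower()
    return extension in SUPPORTED[model.lower()]


print(is_supported("Read", "report.docx"))
print(is_supported("Layout", "report.docx"))
print(is_supported("Custom", "scan.PNG"))
```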

Slide 30

Model restrictions. • For PDF and TIFF, up to 2000 pages can be processed (2 for free tier). • File size must be less than 500 MB (4 MB for free tier). • Image dimensions must be between 50 x 50 pixels and 10,000 x 10,000 pixels.

Slide 31

Model restrictions. • PDFs can’t be password protected. • Minimum text size is 8-point at 150 DPI.
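
The numeric restrictions lend themselves to a pre-upload check. The sketch below covers only the page, file size, and image dimension limits; treating "MB" as 2**20 bytes and reporting extra pages as a problem (rather than silently processing only the first ones) are assumptions of this example.

```python
from dataclasses import dataclass

MB = 1024 * 1024  # assumption: the slide's "MB" is treated as 2**20 bytes


@dataclass(frozen=True)
class TierLimits:
    max_pages: int  # PDF/TIFF pages that can be processed
    max_bytes: int  # file size must stay below this value


PAID_TIER = TierLimits(max_pages=2000, max_bytes=500 * MB)
FREE_TIER = TierLimits(max_pages=2, max_bytes=4 * MB)

MIN_SIDE_PX, MAX_SIDE_PX = 50, 10_000


def check_input(size_bytes, pages=None, image_size=None, limits=PAID_TIER):
    """Return a list of problems; an empty list means the input passes."""
    problems = []
    if size_bytes >= limits.max_bytes:
        problems.append(f"file is {size_bytes} bytes, must be under {limits.max_bytes}")
    if pages is not None and pages > limits.max_pages:
        problems.append(f"{pages} pages, only {limits.max_pages} can be processed")
    if image_size is not None:
        width, height = image_size
        in_range = (MIN_SIDE_PX <= width <= MAX_SIDE_PX
                    and MIN_SIDE_PX <= height <= MAX_SIDE_PX)
        if not in_range:
            problems.append(f"image is {width} x {height} px, outside the allowed range")
    return problems


ok = check_input(size_bytes=3 * MB, pages=2, limits=FREE_TIER)
too_big = check_input(size_bytes=5 * MB, pages=10, limits=FREE_TIER)
tiny_image = check_input(size_bytes=10_000, image_size=(40, 300))
print(ok, too_big, tiny_image, sep="\n")
```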

Slide 32

Data privacy. • Authentication. • Data secured in transit. • Encrypted input data.

Slide 33

Data privacy. • Data stored. • Data and extracted results are stored temporarily. • Output for custom trained models and models themselves are stored temporarily as well. • Input data and results are deleted within 24 hours and not used for other purposes. • Containers.

Slide 34

DEMO

Slide 35

https://www.rambli.com/2016/06/the-prayer-of-the-demo-gods/

Slide 36

Questions?

Slide 37

Useful links.
• https://learn.microsoft.com/en-us/azure/ai-services/document-intelligence/?view=doc-intel-3.1.0
• https://learn.microsoft.com/en-us/legal/cognitive-services/document-intelligence/data-privacy-security
• https://formrecognizer.appliedai.azure.com/studio

Slide 38

Useful links.

Slide 39

@thesoccerdev drkclw samueljgomez