Automated information extraction with Azure AI Document Intelligence.

A presentation at Azure Cleveland User Group in September 2023 by Sam Gomez

Slide 1

Automated information extraction with Azure AI Document Intelligence.

Slide 2

Originally from Mexico. Tech Lead at Geneca. Spend time with family, movies, videogames, soccer. @thesoccerdev drkclw samueljgomez

Slide 3

Agenda. • Azure AI offerings • Cognitive Services • Applied AI services • Azure AI Document Intelligence • Model options • Input requirements • Data privacy • Demo

Slide 4

Cognitive services. • Speech • Language • Vision • Decision

Slide 5

Applied Services API. Services for common business problems. • Document Intelligence (formerly Form Recognizer) • Immersive Reader • Metrics Advisor • Video Indexer

Slide 6

Applied Services API. • Cognitive Search • Bot Service

Slide 7

Model options. • Document analysis models. • Read • Layout • General document • Prebuilt models. • Custom models.

Slide 8

Document analysis models

Slide 9

Read model. • Extract text from documents (print and handwritten). • Detects paragraphs, text lines, words, locations and languages. • Higher resolution than Vision Read API. • Underlying OCR engine for other models.
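
As a rough illustration of what the Read model returns, the sketch below walks a result shaped like the service's analyzeResult JSON (pages with lines and words, plus detected languages). The sample payload is hand-written for this example, not real service output, and the helper name summarize_read_result is hypothetical.

```python
# Hypothetical, hand-written fragment shaped like an analyzeResult payload.
SAMPLE_READ_RESULT = {
    "content": "Hello world\nBonjour",
    "pages": [
        {
            "pageNumber": 1,
            "lines": [{"content": "Hello world"}, {"content": "Bonjour"}],
            "words": [
                {"content": "Hello", "confidence": 0.99},
                {"content": "world", "confidence": 0.98},
                {"content": "Bonjour", "confidence": 0.95},
            ],
        }
    ],
    "languages": [
        {"locale": "en", "confidence": 0.9},
        {"locale": "fr", "confidence": 0.8},
    ],
}


def summarize_read_result(result):
    """Collect text lines, the total word count, and detected locales."""
    pages = result.get("pages", [])
    lines = [line["content"] for page in pages for line in page.get("lines", [])]
    word_count = sum(len(page.get("words", [])) for page in pages)
    locales = sorted({lang["locale"] for lang in result.get("languages", [])})
    return {"lines": lines, "word_count": word_count, "locales": locales}


if __name__ == "__main__":
    print(summarize_read_result(SAMPLE_READ_RESULT))
```

Each line and word in a real response also carries a bounding polygon, which is left out here to keep the sample short.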

Slide 10

Read model development options. Model: Read model. Resources: • Document Intelligence Studio • REST API • C# SDK • Python SDK • Java SDK • JavaScript SDK. Model ID: prebuilt-read.
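
For the REST API option, a minimal sketch of building the analyze request for prebuilt-read with only the Python standard library could look like this. It assumes the v3.1 API version string 2023-07-31 and key-based authentication; the endpoint, key, and document URL in the usage lines are placeholders, and build_analyze_request is a hypothetical helper name. The request is only constructed here, not sent.

```python
import json
import urllib.request

API_VERSION = "2023-07-31"  # assumption: the v3.1 REST API version


def build_analyze_request(endpoint, api_key, document_url, model_id="prebuilt-read"):
    """Build (but do not send) the POST request that starts an analysis."""
    url = (
        f"{endpoint.rstrip('/')}/formrecognizer/documentModels/"
        f"{model_id}:analyze?api-version={API_VERSION}"
    )
    body = json.dumps({"urlSource": document_url}).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        "Ocp-Apim-Subscription-Key": api_key,
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


if __name__ == "__main__":
    request = build_analyze_request(
        "https://example-resource.cognitiveservices.azure.com",  # placeholder
        "<your-key>",  # placeholder
        "https://example.com/sample.pdf",  # placeholder
    )
    print(request.get_method(), request.full_url)
```

Sending the request with urllib.request.urlopen should start an asynchronous analysis; the response carries an Operation-Location header whose URL is polled for the result. The same helper works for the other model IDs by changing model_id.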

Slide 11

Layout model. • Extract text and layout to return structured data representations. • Text roles in documents. • Geometric roles: Text, tables, and selection marks are examples of geometric roles. • Logical roles: Titles, headings, and footers are examples of logical roles. • Combines OCR with deep learning models.

Slide 12

Slide 13

Layout model development options. Feature: Layout model. Resources: • Document Intelligence Studio • REST API • C# SDK • Python SDK • Java SDK • JavaScript SDK. Model ID: prebuilt-layout.
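
To make the geometric and logical roles more concrete, here is a small sketch that turns one table from a layout result into a row/column grid and groups paragraphs by their logical role. The sample data is hand-written in the shape of the analyzeResult JSON; the helper names and the "body" label for paragraphs without a role are assumptions made for this example.

```python
# Hypothetical, hand-written fragment shaped like a prebuilt-layout result.
SAMPLE_LAYOUT_RESULT = {
    "paragraphs": [
        {"role": "title", "content": "Quarterly order"},
        {"content": "Please ship the items below."},
        {"role": "pageFooter", "content": "Page 1 of 1"},
    ],
    "tables": [
        {
            "rowCount": 2,
            "columnCount": 2,
            "cells": [
                {"rowIndex": 0, "columnIndex": 0, "content": "Item"},
                {"rowIndex": 0, "columnIndex": 1, "content": "Qty"},
                {"rowIndex": 1, "columnIndex": 0, "content": "Pen"},
                {"rowIndex": 1, "columnIndex": 1, "content": "3"},
            ],
        }
    ],
}


def table_to_grid(table):
    """Place each cell's text at its (row, column) position."""
    grid = [["" for _ in range(table["columnCount"])] for _ in range(table["rowCount"])]
    for cell in table["cells"]:
        grid[cell["rowIndex"]][cell["columnIndex"]] = cell["content"]
    return grid


def paragraphs_by_role(result):
    """Group paragraph text by logical role; 'body' is our own label for no role."""
    grouped = {}
    for paragraph in result.get("paragraphs", []):
        grouped.setdefault(paragraph.get("role", "body"), []).append(paragraph["content"])
    return grouped


if __name__ == "__main__":
    print(table_to_grid(SAMPLE_LAYOUT_RESULT["tables"][0]))
    print(paragraphs_by_role(SAMPLE_LAYOUT_RESULT))
```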

Slide 14

General document model. • Extract text, layout and key-value pairs from documents. • Pretrained model. • Supports structured, semi-structured and unstructured documents.

Slide 15

Key-value pairs. • For structured documents, they are the label and value of a field. • For unstructured documents, they are based on the text in a paragraph (like the date in a contract).

Slide 16

General document model development options. Feature: General document model. Resources: • Document Intelligence Studio • REST API • C# SDK • Python SDK • Java SDK • JavaScript SDK. Model ID: prebuilt-document.
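
The key-value pairs returned by prebuilt-document can be flattened into a plain dictionary, as in the sketch below. The sample payload is hand-written in the shape of the analyzeResult JSON, and the helper name and the 0.5 confidence cutoff are assumptions chosen for illustration only.

```python
# Hypothetical, hand-written fragment shaped like a prebuilt-document result.
SAMPLE_DOCUMENT_RESULT = {
    "keyValuePairs": [
        {"key": {"content": "Name"}, "value": {"content": "Ada Lovelace"}, "confidence": 0.97},
        {"key": {"content": "Date"}, "value": {"content": "2023-09-12"}, "confidence": 0.91},
        {"key": {"content": "Signature"}, "confidence": 0.88},  # key found, no value
        {"key": {"content": "Notes"}, "value": {"content": "???"}, "confidence": 0.20},
    ]
}


def extract_key_value_pairs(result, min_confidence=0.5):
    """Return {key text: value text or None}, skipping low-confidence pairs."""
    pairs = {}
    for pair in result.get("keyValuePairs", []):
        if pair.get("confidence", 0.0) < min_confidence:
            continue
        value = (pair.get("value") or {}).get("content")
        pairs[pair["key"]["content"]] = value
    return pairs


if __name__ == "__main__":
    print(extract_key_value_pairs(SAMPLE_DOCUMENT_RESULT))
```

A key can come back without a value (for example, an empty form field), which is why the value is read defensively.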

Slide 17

Prebuilt models

Slide 18

Available models. • Invoice. • Extract customer and vendor details. • Receipt. • Extract sales transaction details. • Identity. • Insurance card.

Slide 19

Available models. • W2. • Extract taxable compensation details. • Business card. • Extract business contact details. • Contract. • Extract agreement and party details. • US Tax 1098-E Form. • Extract student loan interest details.

Slide 20

Available models. • US Tax 1098 form. • Extract mortgage interest details. • US Tax 1098-T form. • Extract qualified tuition details.

Slide 21

Development options. Feature: Prebuilt models. Resources: • Document Intelligence Studio • REST API • C# SDK • Python SDK • Java SDK* • JavaScript SDK*. Model IDs: • prebuilt-invoice • prebuilt-receipt • prebuilt-idDocument • prebuilt-healthInsuranceCard.us • prebuilt-tax.us.w2 • prebuilt-contract • prebuilt-businessCard • prebuilt-tax.us.1098 • prebuilt-tax.us.1098E • prebuilt-tax.us.1098T
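
One simple way to use these model IDs in application code is a lookup table from a document type to the model ID. The IDs below are the ones listed on the slide; the dictionary keys and the helper name prebuilt_model_id are hypothetical labels picked for this sketch.

```python
# Model IDs as listed on the slide; the keys are our own labels.
PREBUILT_MODEL_IDS = {
    "invoice": "prebuilt-invoice",
    "receipt": "prebuilt-receipt",
    "identity": "prebuilt-idDocument",
    "insurance card": "prebuilt-healthInsuranceCard.us",
    "w2": "prebuilt-tax.us.w2",
    "contract": "prebuilt-contract",
    "business card": "prebuilt-businessCard",
    "1098": "prebuilt-tax.us.1098",
    "1098-e": "prebuilt-tax.us.1098E",
    "1098-t": "prebuilt-tax.us.1098T",
}


def prebuilt_model_id(document_type):
    """Return the prebuilt model ID for a document type label."""
    key = document_type.strip().lower()
    if key not in PREBUILT_MODEL_IDS:
        raise ValueError(f"No prebuilt model known for {document_type!r}")
    return PREBUILT_MODEL_IDS[key]


if __name__ == "__main__":
    print(prebuilt_model_id("Invoice"))
```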

Slide 22

Custom models

Slide 23

Custom template. • Extract data from static layouts. • Capabilities. • Form fields. • Selection marks. • Tabular fields (tables). • Signature. • Selected regions.

Slide 24

Custom neural. • Extract data from mixed-type documents. • Supports structured, semi-structured and unstructured data.

Slide 25

Custom composed. • Extract data using a collection of models.

Slide 26

Classification model. • Identify document type before calling extraction model.
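
The classify-then-extract flow could be sketched as a small routing function: take the classifier's best guess and pick the extraction model to call next. The sample result is hand-written (assuming each classified document carries a docType and a confidence), the custom model ID is made up, and the 0.8 threshold and the prebuilt-document fallback are arbitrary choices for this example.

```python
# Hypothetical routing table: classifier docType -> extraction model ID.
ROUTES = {
    "invoice": "prebuilt-invoice",
    "receipt": "prebuilt-receipt",
    "lease": "my-custom-lease-model",  # made-up custom model ID
}

# Hypothetical, hand-written classifier output.
SAMPLE_CLASSIFIER_RESULT = {
    "documents": [
        {"docType": "invoice", "confidence": 0.93},
        {"docType": "receipt", "confidence": 0.04},
    ]
}


def choose_extraction_model(classifier_result, routes, min_confidence=0.8,
                            fallback="prebuilt-document"):
    """Pick the extraction model for the most confident document type."""
    documents = classifier_result.get("documents", [])
    if not documents:
        return fallback
    best = max(documents, key=lambda doc: doc.get("confidence", 0.0))
    if best.get("confidence", 0.0) < min_confidence:
        return fallback
    return routes.get(best.get("docType"), fallback)


if __name__ == "__main__":
    print(choose_extraction_model(SAMPLE_CLASSIFIER_RESULT, ROUTES))
```

Falling back to a general model when the classifier is unsure keeps the pipeline from sending a document to the wrong extractor.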

Slide 27

Input requirements.

Slide 28

File formats.
Model: PDF / Image (JPEG/JPG, PNG, BMP, and TIFF) / Microsoft Office: Word (DOCX), Excel (XLSX), PowerPoint (PPTX), and HTML
Read: Yes / Yes / Yes
Layout: Yes / Yes / No
General Document: Yes / Yes / No
Prebuilt: Yes / Yes / No
Custom: Yes / Yes / No
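
The support matrix above can be encoded as a lookup so that unsupported files are rejected before any call is made. This is only a sketch: the model family labels and the helper name is_supported are hypothetical, and the Office extensions follow the DOCX/XLSX/PPTX naming used in Microsoft's documentation, which is an assumption on top of the slide.

```python
PDF_AND_IMAGES = {"pdf", "jpeg", "jpg", "png", "bmp", "tiff"}
OFFICE_AND_HTML = {"docx", "xlsx", "pptx", "html"}  # assumption: modern Office formats

# Only the Read model accepts Office and HTML input, per the slide.
SUPPORTED_EXTENSIONS = {
    "read": PDF_AND_IMAGES | OFFICE_AND_HTML,
    "layout": PDF_AND_IMAGES,
    "general-document": PDF_AND_IMAGES,
    "prebuilt": PDF_AND_IMAGES,
    "custom": PDF_AND_IMAGES,
}


def is_supported(model_family, filename):
    """Check a file name's extension against the model family's formats."""
    extension = filename.rsplit(".", 1)[-1].lower() if "." in filename else ""
    return extension in SUPPORTED_EXTENSIONS[model_family]


if __name__ == "__main__":
    print(is_supported("read", "report.docx"), is_supported("layout", "report.docx"))
```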

Slide 29

Model restrictions. • For PDF and TIFF, up to 2,000 pages can be processed (2 for the free tier). • File size must be less than 500 MB (4 MB for the free tier). • Image dimensions must be between 50 x 50 pixels and 10,000 x 10,000 pixels.

Slide 30

Model restrictions. • PDFs can’t be password protected. • Minimum size of text is 8-point text at 150 DPI.
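
The numeric restrictions from these two slides lend themselves to a pre-flight check before uploading. The sketch below encodes the page, file size, and image dimension limits as stated on the slides; the names are hypothetical, 1 MB is assumed to mean 1024 x 1024 bytes, and password protection and text size are not checked because they need a PDF or image library.

```python
from dataclasses import dataclass

MB = 1024 * 1024  # assumption: binary megabytes


@dataclass(frozen=True)
class TierLimits:
    max_pages: int
    max_bytes: int


PAID_TIER = TierLimits(max_pages=2000, max_bytes=500 * MB)
FREE_TIER = TierLimits(max_pages=2, max_bytes=4 * MB)

MIN_IMAGE_SIDE = 50       # pixels
MAX_IMAGE_SIDE = 10_000   # pixels


def check_input(size_bytes, page_count=1, image_size=None, limits=PAID_TIER):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if size_bytes > limits.max_bytes:
        problems.append(f"file is larger than {limits.max_bytes // MB} MB")
    if page_count > limits.max_pages:
        problems.append(f"only the first {limits.max_pages} pages can be processed")
    if image_size is not None:
        width, height = image_size
        if min(width, height) < MIN_IMAGE_SIDE or max(width, height) > MAX_IMAGE_SIDE:
            problems.append("image dimensions must be between 50 x 50 and 10,000 x 10,000 pixels")
    return problems


if __name__ == "__main__":
    print(check_input(5 * MB, page_count=3, limits=FREE_TIER))
```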

Slide 31

Data privacy. • Authentication. • Data secured in transit. • Encrypted input data.

Slide 32

Data privacy. • Data stored. • Data and extracted results are stored temporarily. • Output for custom trained models and models themselves are stored temporarily as well. • Input data and results are deleted within 24 hours and not used for other purposes. • Containers.

Slide 33

DEMO

Slide 34

https://www.rambli.com/2016/06/the-prayer-of-the-demo-gods/

Slide 35

Questions?

Slide 36

Useful links. • https://learn.microsoft.com/en-us/azure/ai-services/document-intelligence/?view=doc-intel-3.1.0 • https://learn.microsoft.com/en-us/legal/cognitive-services/document-intelligence/data-privacy-security • https://formrecognizer.appliedai.azure.com/studio

Slide 37

@thesoccerdev drkclw samueljgomez