Automated information extraction with Azure AI Document Intelligence.
Slide 2
Originally from Mexico.
Tech Lead at Geneca.
Spend time with family, movies, videogames, soccer.
@thesoccerdev drkclw samueljgomez
Slide 3
Thank you to our sponsors!
Slide 4
Agenda.
• Azure AI offerings
  • Cognitive Services
  • Applied AI services
• Azure AI Document Intelligence
  • Model options
  • Input requirements
  • Data privacy
• Demo
Slide 5
Cognitive services.
Speech
Language
Vision
Decision
Slide 6
Applied Services API.
Services for common business problems
Document intelligence (Form recognizer)
Immersive reader
Metrics advisor
Video indexer
Slide 7
Applied Services API.
Cognitive search
Bot service
Slide 8
Model options.
• Document analysis models.
  • Read
  • Layout
  • General document
• Prebuilt models.
• Custom models.
Slide 9
Document analysis models
Slide 10
Read model.
• Extracts text from documents (print and handwritten).
• Detects paragraphs, text lines, words, locations and languages.
• Higher resolution than Vision Read API.
• Underlying OCR engine for other models.
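As a rough illustration of what the Read model returns, the sketch below walks a trimmed-down result. The `sample_result` dictionary is hand-written for this example and only assumed to follow the shape of the REST `analyzeResult` payload (`content`, `pages`, `lines`, `words`, `styles`); it is not real service output. `summarize_read_result` and its 0.95 threshold are hypothetical choices.

```python
# Hand-written stand-in for an analyzeResult payload (assumed shape, not real output).
sample_result = {
    "content": "Invoice 1234\nThank you!",
    "pages": [
        {
            "pageNumber": 1,
            "lines": [{"content": "Invoice 1234"}, {"content": "Thank you!"}],
            "words": [
                {"content": "Invoice", "confidence": 0.99},
                {"content": "1234", "confidence": 0.98},
                {"content": "Thank", "confidence": 0.91},
                {"content": "you!", "confidence": 0.90},
            ],
        }
    ],
    # Styles point back into "content" through offset/length spans.
    "styles": [
        {"isHandwritten": True, "confidence": 0.8, "spans": [{"offset": 13, "length": 10}]}
    ],
}


def summarize_read_result(result, min_confidence=0.95):
    """Collect text lines, low-confidence words, and handwritten snippets."""
    pages = result.get("pages", [])
    lines = [line["content"] for page in pages for line in page.get("lines", [])]
    low_confidence_words = [
        word["content"]
        for page in pages
        for word in page.get("words", [])
        if word.get("confidence", 1.0) < min_confidence
    ]
    text = result.get("content", "")
    handwritten = [
        text[span["offset"]: span["offset"] + span["length"]]
        for style in result.get("styles", [])
        if style.get("isHandwritten")
        for span in style.get("spans", [])
    ]
    return {
        "lines": lines,
        "low_confidence_words": low_confidence_words,
        "handwritten": handwritten,
    }


summary = summarize_read_result(sample_result)
print(summary)
```

Flagging low-confidence words like this is one way to decide which pages need a human review pass.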
Slide 11
Read model development options.
Model | Resources | Model ID
Read model | Document Intelligence Studio, REST API, C# SDK, Python SDK, Java SDK, JavaScript SDK | prebuilt-read
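For the REST API option, a minimal sketch of the request that starts an analysis with the `prebuilt-read` model ID might look like the following. It uses only the Python standard library and builds the request without sending it. The endpoint host, key, and document URL are placeholders, and the `/formrecognizer/...` path with `api-version=2023-07-31` is an assumption about which REST version the resource exposes; newer versions use a different path, so check the reference for your resource.

```python
import json
import urllib.request

# Assumption: a GA REST version; confirm which versions your resource supports.
API_VERSION = "2023-07-31"


def build_analyze_request(endpoint, key, model_id, document_url):
    """Build (but do not send) the POST that starts a document analysis."""
    url = (
        f"{endpoint.rstrip('/')}/formrecognizer/documentModels/"
        f"{model_id}:analyze?api-version={API_VERSION}"
    )
    body = json.dumps({"urlSource": document_url}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Ocp-Apim-Subscription-Key": key,
        },
    )


# Placeholder values for illustration only.
request = build_analyze_request(
    "https://example.cognitiveservices.azure.com/",
    "<your-key>",
    "prebuilt-read",
    "https://example.com/sample.pdf",
)
print(request.get_method(), request.full_url)
```

Sending this request is expected to return an `Operation-Location` header; the analysis result is then fetched by polling that URL with the same key. Swapping the model ID (for example to `prebuilt-layout`) reuses the same request shape.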
Slide 12
Layout model.
• Extracts text and layout to return structured data representations.
• Text roles in documents.
  • Geometric roles: text, tables, and selection marks.
  • Logical roles: titles, headings, and footers.
• Combines OCR with deep learning models.
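To make the geometric and logical roles concrete, here is a small sketch that groups paragraphs by role and rebuilds a table grid. The `sample_layout` dictionary is hand-written and only assumed to mirror the layout result shape (`paragraphs` with an optional `role`, `tables` with `rowCount`, `columnCount`, and `cells`). The `"body"` label for paragraphs without a role is a choice made for this example, not a service value.

```python
# Hand-written stand-in for a layout analyzeResult (assumed shape, not real output).
sample_layout = {
    "paragraphs": [
        {"role": "title", "content": "Quarterly report"},
        {"role": "sectionHeading", "content": "Sales"},
        {"content": "Sales grew in both regions."},
        {"role": "pageFooter", "content": "Page 1"},
    ],
    "tables": [
        {
            "rowCount": 2,
            "columnCount": 2,
            "cells": [
                {"rowIndex": 0, "columnIndex": 0, "content": "Region"},
                {"rowIndex": 0, "columnIndex": 1, "content": "Total"},
                {"rowIndex": 1, "columnIndex": 0, "content": "North"},
                {"rowIndex": 1, "columnIndex": 1, "content": "42"},
            ],
        }
    ],
}


def paragraphs_by_role(result):
    """Group paragraph text by logical role; role-less paragraphs go under 'body'."""
    grouped = {}
    for paragraph in result.get("paragraphs", []):
        role = paragraph.get("role", "body")  # "body" is this example's own label
        grouped.setdefault(role, []).append(paragraph["content"])
    return grouped


def table_to_grid(table):
    """Rebuild a row-major grid of cell text from the flat cell list."""
    grid = [[""] * table["columnCount"] for _ in range(table["rowCount"])]
    for cell in table["cells"]:
        grid[cell["rowIndex"]][cell["columnIndex"]] = cell["content"]
    return grid


roles = paragraphs_by_role(sample_layout)
grid = table_to_grid(sample_layout["tables"][0])
print(roles)
print(grid)
```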
Slide 13
Slide 14
Layout model development options.
Feature | Resources | Model ID
Layout model | Document Intelligence Studio, REST API, C# SDK, Python SDK, Java SDK, JavaScript SDK | prebuilt-layout
Slide 15
General document model.
• Extract text, layout and key-value pairs from documents.
• Pretrained model.
• Supports structured, semi-structured and unstructured documents.
Slide 16
Key-value pairs.
• For structured documents, the label and value of a field.
• For unstructured documents, they are based on the text in a paragraph (such as the date in a contract).
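A short sketch of how key-value pairs might be flattened into a dictionary follows. The `sample_result` is hand-written and only assumed to match the `keyValuePairs` shape of the result (each entry with a `key`, an optional `value`, and a `confidence`). The helper name and the 0.6 threshold are hypothetical.

```python
# Hand-written stand-in for the keyValuePairs part of a result (assumed shape).
sample_result = {
    "keyValuePairs": [
        {"key": {"content": "Name:"}, "value": {"content": "Ada Lovelace"}, "confidence": 0.97},
        {"key": {"content": "Effective date"}, "value": {"content": "1 March 2024"}, "confidence": 0.88},
        {"key": {"content": "Signature"}, "confidence": 0.90},  # key found, no value
        {"key": {"content": "Notes"}, "confidence": 0.30},      # too uncertain to keep
    ]
}


def key_value_pairs(result, min_confidence=0.6):
    """Return {key text: value text or None}, skipping low-confidence pairs."""
    pairs = {}
    for pair in result.get("keyValuePairs", []):
        if pair.get("confidence", 0.0) < min_confidence:
            continue
        key = pair["key"]["content"].rstrip(":").strip()
        # A key can be detected without a value, for example an empty form field.
        pairs[key] = pair.get("value", {}).get("content")
    return pairs


fields = key_value_pairs(sample_result)
print(fields)
```

Keeping keys with a `None` value makes empty form fields visible instead of silently dropping them.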
Slide 17
General document model development options.
Feature | Resources | Model ID
General document model | Document Intelligence Studio, REST API, C# SDK, Python SDK, Java SDK, JavaScript SDK | prebuilt-document
Available models.
• W2.
  • Extract taxable compensation details.
• Business card.
  • Extract business contact details.
• Contract.
  • Extract agreement and party details.
• US Tax 1098-E form.
  • Extract student loan interest details.
Slide 21
Available models.
• US Tax 1098 form.
  • Extract mortgage interest details.
• US Tax 1098-T form.
  • Extract qualified tuition details.
Slide 22
Development options.
Feature | Resources | Model ID
Prebuilt models | Document Intelligence Studio, REST API, C# SDK, Python SDK, Java SDK*, JavaScript SDK* | prebuilt-invoice, prebuilt-receipt, prebuilt-idDocument, prebuilt-healthInsuranceCard.us, prebuilt-tax.us.w2, prebuilt-contract, prebuilt-businessCard, prebuilt-tax.us.1098, prebuilt-tax.us.1098E, prebuilt-tax.us.1098T
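Since every prebuilt model is selected purely by its model ID, a small lookup table can keep those strings in one place. The sketch below is one way to do that; the dictionary keys (such as `"business card"`) are labels invented for this example, while the values are the model IDs listed above.

```python
# Keys are this example's own labels; values are the prebuilt model IDs.
PREBUILT_MODEL_IDS = {
    "invoice": "prebuilt-invoice",
    "receipt": "prebuilt-receipt",
    "id document": "prebuilt-idDocument",
    "health insurance card": "prebuilt-healthInsuranceCard.us",
    "w2": "prebuilt-tax.us.w2",
    "contract": "prebuilt-contract",
    "business card": "prebuilt-businessCard",
    "1098": "prebuilt-tax.us.1098",
    "1098-e": "prebuilt-tax.us.1098E",
    "1098-t": "prebuilt-tax.us.1098T",
}


def model_id_for(document_type):
    """Look up the prebuilt model ID for a document type label."""
    label = document_type.strip().lower()
    try:
        return PREBUILT_MODEL_IDS[label]
    except KeyError:
        known = ", ".join(sorted(PREBUILT_MODEL_IDS))
        raise ValueError(f"no prebuilt model for {document_type!r}; known: {known}") from None


print(model_id_for("W2"))
print(model_id_for(" Business Card "))
```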
Slide 23
Custom models
Slide 24
Custom template.
• Extract data from static layouts.
• Capabilities.
  • Form fields.
  • Selection marks.
  • Tabular fields (tables).
  • Signature.
  • Selected regions.
Slide 25
Custom neural.
• Extract data from mixed-type documents.
• Supports structured, semi-structured and unstructured data.
Slide 26
Custom composed.
• Extract data using a collection of models.
Slide 27
Classification model.
• Identify document type before calling extraction model.
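The classify-then-extract flow could be sketched as a simple routing step. This example assumes the classifier result exposes a `documents` list whose entries carry a `docType` and a `confidence`. The document type names, the `my-purchase-order-model` ID, and the 0.8 threshold are all hypothetical.

```python
# Hypothetical routing table: classifier docType -> extraction model ID.
ROUTES = {
    "invoice": "prebuilt-invoice",
    "receipt": "prebuilt-receipt",
    "purchase-order": "my-purchase-order-model",  # hypothetical custom model ID
}


def pick_extraction_model(classifier_result, routes, min_confidence=0.8):
    """Return the model ID for the most confident document type, or None."""
    documents = classifier_result.get("documents", [])
    if not documents:
        return None
    best = max(documents, key=lambda doc: doc.get("confidence", 0.0))
    if best.get("confidence", 0.0) < min_confidence:
        return None  # too uncertain: send to manual review instead
    return routes.get(best["docType"])


# Hand-written classifier output for illustration (assumed shape).
classified = {
    "documents": [
        {"docType": "receipt", "confidence": 0.65},
        {"docType": "invoice", "confidence": 0.93},
    ]
}
print(pick_extraction_model(classified, ROUTES))
```

Returning `None` for uncertain or unknown types keeps the decision about a fallback (manual review, a general model) outside the routing function.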
Slide 28
Input requirements.
Slide 29
File formats.
Model | PDF | Image (JPEG/JPG, PNG, BMP, and TIFF) | Microsoft Office: Word (DOCX), Excel (XLSX), PowerPoint (PPTX), and HTML
Read | Yes | Yes | Yes
Layout | Yes | Yes | No
General Document | Yes | Yes | No
Prebuilt | Yes | Yes | No
Custom | Yes | Yes | No
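The support matrix above can be encoded as a quick pre-submission check. This is a sketch only: the model family names are labels chosen for the example, the check looks at the file extension rather than the actual content, and the Office extensions assume the Open XML formats (`.docx`, `.xlsx`, `.pptx`).

```python
from pathlib import Path

PDF = {".pdf"}
IMAGES = {".jpeg", ".jpg", ".png", ".bmp", ".tiff"}
OFFICE_AND_HTML = {".docx", ".xlsx", ".pptx", ".html"}  # assumed extensions

# Model family -> accepted extensions, following the table above.
SUPPORTED = {
    "read": PDF | IMAGES | OFFICE_AND_HTML,
    "layout": PDF | IMAGES,
    "general document": PDF | IMAGES,
    "prebuilt": PDF | IMAGES,
    "custom": PDF | IMAGES,
}


def is_supported(model_family, filename):
    """Check a file name's extension against the model family's accepted formats."""
    extensions = SUPPORTED[model_family.strip().lower()]
    return Path(filename).suffix.lower() in extensions


print(is_supported("Read", "report.docx"))
print(is_supported("Layout", "report.docx"))
```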
Slide 30
Model restrictions.
• For PDF and TIFF, up to 2000 pages can be processed (2 for free tier).
• File size must be less than 500 MB (4 MB for free tier).
• Image dimensions must be between 50 x 50 pixels and 10,000 x 10,000 pixels.
Slide 31
Model restrictions.
• PDFs can’t be password protected.
• Minimum text size is 8 points at 150 DPI.
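The numeric limits above lend themselves to a pre-flight check before uploading. The sketch below covers file size, page count, and image dimensions only; password protection and text size would need the file to be parsed. It assumes 1 MB = 1024 x 1024 bytes and treats each limit as inclusive, both of which are assumptions of this example, and the names `Limits` and `check_input` are hypothetical.

```python
from dataclasses import dataclass

MB = 1024 * 1024  # assumption: binary megabytes


@dataclass(frozen=True)
class Limits:
    max_pages: int
    max_bytes: int
    min_side_px: int = 50
    max_side_px: int = 10_000


PAID = Limits(max_pages=2000, max_bytes=500 * MB)
FREE = Limits(max_pages=2, max_bytes=4 * MB)


def check_input(size_bytes, pages=1, width_px=None, height_px=None, limits=PAID):
    """Return a list of human-readable problems; an empty list means no issue found."""
    problems = []
    if size_bytes > limits.max_bytes:
        problems.append(f"file is {size_bytes} bytes; limit is {limits.max_bytes}")
    if pages > limits.max_pages:
        problems.append(f"{pages} pages; limit is {limits.max_pages}")
    # Dimensions only apply to images, so they are optional.
    for name, side in (("width", width_px), ("height", height_px)):
        if side is not None and not limits.min_side_px <= side <= limits.max_side_px:
            problems.append(
                f"{name} {side}px is outside {limits.min_side_px}-{limits.max_side_px}px"
            )
    return problems


print(check_input(5 * MB, pages=3, limits=FREE))
```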
Slide 32
Data privacy.
• Authentication.
• Data secured in transit.
• Encrypted input data.
Slide 33
Data privacy.
• Data stored.
  • Data and extracted results are stored temporarily.
  • Output for custom trained models and models themselves are stored temporarily as well.
  • Input data and results are deleted within 24 hours and not used for other purposes.
• Containers.