Automated information extraction with Azure AI Document Intelligence.
Slide 2
Originally from Mexico.
Tech Lead at Geneca.
Enjoys time with family, movies, video games, and football.
@thesoccerdev drkclw samueljgomez
Slide 3
Agenda.
• Azure AI offerings
  • Cognitive Services
  • Applied AI services
• Azure AI Document Intelligence
  • Model options
  • Input requirements
  • Data privacy
• Demo
Slide 4
Cognitive services.
• Speech
• Language
• Vision
• Decision
Slide 5
Applied Services API.
Services for common business problems
Document Intelligence (formerly Form Recognizer)
Immersive reader
Metrics advisor
Video indexer
Slide 6
Applied Services API.
Cognitive search
Bot service
Slide 7
Model options.
• Document analysis models.
  • Read
  • Layout
  • General document
• Prebuilt models.
• Custom models.
Slide 8
Document analysis models
Slide 9
Read model.
• Extracts text from documents (print and handwritten).
• Detects paragraphs, text lines, words, locations and languages.
• Higher resolution than Vision Read API.
• Underlying OCR engine for other models.
Slide 10
Read model development options.
Model: Read model
Resources: Document Intelligence Studio, REST API, C# SDK, Python SDK, Java SDK, JavaScript SDK
Model ID: prebuilt-read
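To make the Python SDK option concrete, here is a minimal sketch. It assumes the `azure-ai-formrecognizer` package and its `DocumentAnalysisClient`; the endpoint, key, and file path are placeholders, and the helper names `analyze_read` and `lines_by_page` are invented for this example. A stub object stands in for a real result so the helper can be tried offline.

```python
from types import SimpleNamespace


def analyze_read(endpoint, key, path):
    """Send a local file to the prebuilt-read model and wait for the result."""
    # Imported lazily so the rest of this sketch runs without the SDK installed.
    from azure.ai.formrecognizer import DocumentAnalysisClient
    from azure.core.credentials import AzureKeyCredential

    client = DocumentAnalysisClient(endpoint, AzureKeyCredential(key))
    with open(path, "rb") as f:
        poller = client.begin_analyze_document("prebuilt-read", document=f)
        return poller.result()


def lines_by_page(result):
    """Map each page number to the list of text lines found on that page."""
    return {
        page.page_number: [line.content for line in page.lines]
        for page in result.pages
    }


# Stand-in for a real analysis result (assumption: same attribute names).
fake_result = SimpleNamespace(
    pages=[
        SimpleNamespace(
            page_number=1,
            lines=[SimpleNamespace(content="Hello"), SimpleNamespace(content="World")],
        )
    ]
)
print(lines_by_page(fake_result))
```

With a real resource, `lines_by_page(analyze_read(endpoint, key, "scan.pdf"))` would be the equivalent call.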
Slide 11
Layout model.
• Extracts text and layout to return structured data representations.
• Text roles in documents.
  • Geometric roles: text, tables, and selection marks.
  • Logical roles: titles, headings, and footers.
• Combines OCR with deep learning models.
Slide 12
Slide 13
Layout model development options.
Feature: Layout model
Resources: Document Intelligence Studio, REST API, C# SDK, Python SDK, Java SDK, JavaScript SDK
Model ID: prebuilt-layout
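The layout model returns tables as a flat list of cells, so a common first step is rebuilding the grid. The sketch below assumes the cell attributes exposed by the `azure-ai-formrecognizer` Python SDK (`row_count`, `column_count`, `row_index`, `column_index`, `content`); the helper names and the sample table are made up for illustration.

```python
from types import SimpleNamespace


def table_to_rows(table):
    """Rebuild a 2-D grid of strings from a table's flat cell list."""
    grid = [[""] * table.column_count for _ in range(table.row_count)]
    for cell in table.cells:
        grid[cell.row_index][cell.column_index] = cell.content
    return grid


def make_cell(row, col, text):
    """Hypothetical helper that builds a stub cell for the offline demo."""
    return SimpleNamespace(row_index=row, column_index=col, content=text)


# Stand-in for one entry of result.tables from a prebuilt-layout analysis.
fake_table = SimpleNamespace(
    row_count=2,
    column_count=2,
    cells=[
        make_cell(0, 0, "Item"),
        make_cell(0, 1, "Qty"),
        make_cell(1, 0, "Pen"),
        make_cell(1, 1, "2"),
    ],
)
rows = table_to_rows(fake_table)
print(rows)
```

Cells missing from the list stay as empty strings, which keeps every row the same length.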
Slide 14
General document model.
• Extracts text, layout and key-value pairs from documents.
• Pretrained model.
• Supports structured, semi-structured and unstructured documents.
Slide 15
Key-value pairs.
• For structured documents: the label and value of a field.
• For unstructured documents: inferred from the text in a paragraph (such as the date in a contract).
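As a rough illustration, key-value pairs can be flattened into a plain dictionary. This sketch assumes the shape used by the `azure-ai-formrecognizer` Python SDK, where a result exposes `key_value_pairs` and each pair has a `key` and an optional `value` carrying `content`; the function name and the sample pairs are invented.

```python
from types import SimpleNamespace


def key_value_pairs_to_dict(result):
    """Flatten detected key-value pairs into {key text: value text or None}."""
    pairs = {}
    for kv in result.key_value_pairs or []:
        if kv.key is None:
            continue  # nothing to index the pair by
        pairs[kv.key.content] = kv.value.content if kv.value else None
    return pairs


# Stand-in for a prebuilt-document result; a key may have no detected value.
fake_result = SimpleNamespace(
    key_value_pairs=[
        SimpleNamespace(
            key=SimpleNamespace(content="Effective date"),
            value=SimpleNamespace(content="2023-01-15"),
        ),
        SimpleNamespace(key=SimpleNamespace(content="Signature"), value=None),
    ]
)
extracted = key_value_pairs_to_dict(fake_result)
print(extracted)
```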
Slide 16
General document model development options.
Feature: General document model
Resources: Document Intelligence Studio, REST API, C# SDK, Python SDK, Java SDK, JavaScript SDK
Model ID: prebuilt-document
Available models.
• W2.
  • Extract taxable compensation details.
• Business card.
  • Extract business contact details.
• Contract.
  • Extract agreement and party details.
• US Tax 1098-E form.
  • Extract student loan interest details.
Slide 20
Available models.
• US Tax 1098 form.
  • Extract mortgage interest details.
• US Tax 1098-T form.
  • Extract qualified tuition details.
Slide 21
Development options.
Feature: Prebuilt models
Resources: Document Intelligence Studio, REST API, C# SDK, Python SDK, Java SDK*, JavaScript SDK*
Model IDs: prebuilt-invoice, prebuilt-receipt, prebuilt-idDocument, prebuilt-healthInsuranceCard.us, prebuilt-tax.us.w2, prebuilt-contract, prebuilt-businessCard, prebuilt-tax.us.1098, prebuilt-tax.us.1098E, prebuilt-tax.us.1098T
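Prebuilt models return named fields with a confidence score, so a caller usually decides which fields to trust. The sketch below assumes the field shape of the `azure-ai-formrecognizer` Python SDK (`fields` mapping names to objects with `value` and `confidence`); the 0.8 threshold, the function name, and the sample values are arbitrary choices for illustration.

```python
from types import SimpleNamespace


def confident_fields(document, min_confidence=0.8):
    """Keep only fields whose confidence meets the (assumed) threshold."""
    return {
        name: field.value
        for name, field in document.fields.items()
        if field.confidence is not None and field.confidence >= min_confidence
    }


# Stand-in for result.documents[0] from a prebuilt-invoice analysis.
fake_invoice = SimpleNamespace(
    fields={
        "VendorName": SimpleNamespace(value="Contoso", confidence=0.97),
        "InvoiceTotal": SimpleNamespace(value=110.0, confidence=0.55),
    }
)
trusted = confident_fields(fake_invoice)
print(trusted)
```

Fields that fall below the threshold are good candidates for manual review rather than silent rejection.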
Slide 22
Custom models
Slide 23
Custom template.
• Extract data from static layouts.
• Capabilities.
  • Form fields.
  • Selection marks.
  • Tabular fields (tables).
  • Signature.
  • Selected regions.
Slide 24
Custom neural.
• Extract data from mixed-type documents.
• Supports structured, semi-structured and unstructured data.
Slide 25
Custom composed. • Extract data using a collection of models.
Slide 26
Classification model. • Identify document type before calling extraction model.
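The classify-then-extract flow can be sketched as a small routing step. Everything here is an assumption for illustration: the document type labels are whatever a classifier was trained with, the 0.7 threshold is arbitrary, and falling back to `prebuilt-document` is only one possible design.

```python
# Hypothetical mapping from classifier labels to extraction model IDs.
EXTRACTION_MODELS = {
    "invoice": "prebuilt-invoice",
    "receipt": "prebuilt-receipt",
    "w2": "prebuilt-tax.us.w2",
}


def pick_extraction_model(doc_type, confidence, min_confidence=0.7,
                          fallback="prebuilt-document"):
    """Choose the extraction model for a classified document.

    Unknown or low-confidence classifications go to the fallback model.
    """
    if confidence < min_confidence:
        return fallback
    return EXTRACTION_MODELS.get(doc_type, fallback)


print(pick_extraction_model("invoice", 0.93))
print(pick_extraction_model("invoice", 0.41))
```

The chosen ID would then be passed to the analysis call for that document.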
Slide 27
Input requirements.
Slide 28
File formats.
Model              PDF   Image (JPEG/JPG, PNG, BMP, TIFF)   Microsoft Office (Word DOCX, Excel XLSX, PowerPoint PPTX) and HTML
Read               Yes   Yes                                Yes
Layout             Yes   Yes                                No
General Document   Yes   Yes                                No
Prebuilt           Yes   Yes                                No
Custom             Yes   Yes                                No
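The format table can be turned into a quick pre-flight check. This is a sketch only: the model family names are made up for the example, and it assumes the Office formats are the Open XML ones (.docx, .xlsx, .pptx) and that the file extension reflects the real file type.

```python
from pathlib import Path

PDF = {".pdf"}
IMAGE = {".jpg", ".jpeg", ".png", ".bmp", ".tif", ".tiff"}
OFFICE_HTML = {".docx", ".xlsx", ".pptx", ".html"}  # assumed extensions

# Only the Read model accepts Office and HTML input, per the table.
SUPPORTED = {
    "read": PDF | IMAGE | OFFICE_HTML,
    "layout": PDF | IMAGE,
    "general-document": PDF | IMAGE,
    "prebuilt": PDF | IMAGE,
    "custom": PDF | IMAGE,
}


def is_supported(model_family, filename):
    """Return True if the file extension is accepted by the model family."""
    return Path(filename).suffix.lower() in SUPPORTED[model_family]


print(is_supported("read", "report.DOCX"))
print(is_supported("layout", "report.docx"))
```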
Slide 29
Model restrictions.
• For PDF and TIFF, up to 2,000 pages can be processed (2 for the free tier).
• File size must be less than 500 MB (4 MB for the free tier).
• Image dimensions must be between 50 x 50 pixels and 10,000 x 10,000 pixels.
Slide 30
Model restrictions.
• PDFs can’t be password protected.
• Minimum text size is 8 points at 150 DPI.
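The size, page, and dimension limits can be checked before uploading. A minimal sketch, assuming 1 MB means 1024 x 1024 bytes and that the caller already knows the page count and image size; the tier names and function name are invented for this example.

```python
MB = 1024 * 1024  # assumption: binary megabytes

LIMITS = {
    "paid": {"max_bytes": 500 * MB, "max_pages": 2000},
    "free": {"max_bytes": 4 * MB, "max_pages": 2},
}
MIN_SIDE, MAX_SIDE = 50, 10_000  # pixels


def check_input(size_bytes, pages=1, image_size=None, tier="paid"):
    """Return a list of limit violations (empty when the input looks fine)."""
    limits = LIMITS[tier]
    problems = []
    if size_bytes >= limits["max_bytes"]:
        problems.append(f"file must be smaller than {limits['max_bytes'] // MB} MB")
    if pages > limits["max_pages"]:
        problems.append(f"only {limits['max_pages']} pages can be processed")
    if image_size is not None:
        width, height = image_size
        if not (MIN_SIDE <= width <= MAX_SIDE and MIN_SIDE <= height <= MAX_SIDE):
            problems.append("image sides must be between 50 and 10,000 pixels")
    return problems


print(check_input(5 * MB, tier="free"))
```

Password protection and text size are not covered here, since both need the file contents to be inspected.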
Slide 31
Data privacy. • Authentication.
• Data secured in transit.
• Encrypted input data.
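For the REST API, key-based authentication and transport security come down to a header and an HTTPS endpoint. The sketch below only builds the request and does not send it. The URL path and the `2023-07-31` API version are assumptions based on the v3.1 REST API, and the endpoint and key are placeholders.

```python
import urllib.request

API_VERSION = "2023-07-31"  # assumed REST API version


def build_analyze_request(endpoint, key, model_id, document_bytes):
    """Build (but do not send) an authenticated analyze request."""
    if not endpoint.lower().startswith("https://"):
        raise ValueError("endpoint must use HTTPS so data is encrypted in transit")
    url = (
        f"{endpoint.rstrip('/')}/formrecognizer/documentModels/"
        f"{model_id}:analyze?api-version={API_VERSION}"
    )
    return urllib.request.Request(
        url,
        data=document_bytes,
        method="POST",
        headers={
            "Ocp-Apim-Subscription-Key": key,  # resource key authenticates the call
            "Content-Type": "application/octet-stream",
        },
    )


request = build_analyze_request(
    "https://example.cognitiveservices.azure.com/", "secret", "prebuilt-read", b"%PDF"
)
print(request.full_url)
```

Microsoft Entra ID tokens are an alternative to resource keys; that variant is not shown here.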
Slide 32
Data privacy.
• Data stored.
  • Data and extracted results are stored temporarily.
  • Output for custom-trained models, and the models themselves, are stored temporarily as well.
  • Input data and results are deleted within 24 hours and not used for other purposes.
• Containers.