文档智能入门
重要
- Azure 认知服务表单识别器现称为 Azure AI 文档智能。
- 某些平台仍在等待命名更新。
- 我们的文档中提及的所有表单识别器或文档智能均指同一项 Azure 服务。
Azure AI 文档智能/表单识别器是一款基于云的 Azure AI 服务,它使用机器学习从文档中提取键值对、文本、表和关键数据。
可以使用编程语言 SDK 或调用 REST API 轻松将文档处理模型集成到工作流和应用程序中。
对于此快速入门,我们建议你在学习该技术时使用免费服务。 请记住,每月的免费页数限于 500。
若要详细了解 API 功能和开发选项,请访问我们的概述页。
客户端库 | SDK 参考 | REST API 参考 | 包 | 示例 |支持的 REST API 版本
在本快速入门中,使用以下功能来分析和提取表单和文档中的数据和值:
先决条件
Azure 订阅 - 创建试用订阅。
Visual Studio IDE 的当前版本。
Azure AI 服务或表单识别器资源。 获得 Azure 订阅后,在 Azure 门户中创建单服务或 Azure AI 多服务资源以获取密钥和终结点。
可以使用免费定价层 (
F0
) 试用该服务,然后再升级到付费层进行生产。
提示
如果计划通过一个终结点/密钥访问多个 Azure AI 服务,请创建 Azure AI 服务资源。 若要仅访问表单识别器,请创建表单识别器资源。 请注意,如果你打算使用 Microsoft Entra 身份验证,则需要单一服务资源。
部署资源后,选择“转到资源”。 需从创建的资源获取密钥和终结点,以便将应用程序连接到表单识别器 API。 稍后需要在本快速入门中将密钥和终结点粘贴到代码中:
设置
启动 Visual Studio。
在“开始”页上,选择“创建新项目”。
在“创建新项目”页面上,在搜索框中输入“控制台”。 选择“控制台应用程序”模板,然后选择“下一步”。
- 在“配置新项目”对话框中,在项目名称框中输入
form_recognizer_quickstart
。 然后选择“下一步”。
在“附加信息”对话框窗口中,选择“.NET 8.0 (长期支持)”,然后选择“创建”。
使用 NuGet 安装客户端库
右键单击 form_recognizer_quickstart 项目,然后选择“管理 NuGet 包...”。
选择“浏览”选项卡,并输入 Azure.AI.FormRecognizer。 从下拉菜单中选择版本 4.1.0
右键单击 form_recognizer_quickstart 项目,然后选择“管理 NuGet 包...”。
选择“浏览”选项卡,并输入 Azure.AI.FormRecognizer。 从下拉菜单中选择版本 4.0.0
生成应用程序
若要与表单识别器服务交互,需要创建 DocumentAnalysisClient
类的实例。 为此,你将使用 Azure 门户的 key
创建一个 AzureKeyCredential
,使用 AzureKeyCredential
和表单识别器 endpoint
创建一个 DocumentAnalysisClient
实例。
注意
- 从 .NET 6 开始,使用
console
模板的新项目将生成与以前版本不同的新程序样式。 - 新的输出使用最新的 C# 功能,这些功能简化了你需要编写的代码。
- 使用较新版本时,只需编写
Main
方法的主体。 无需包括顶级语句、全局 using 指令或隐式 using 指令。 - 有关详细信息,请参阅新的 C# 模板生成顶级语句。
打开 Program.cs 文件。
删除预先存在的代码(包括行
Console.Writeline("Hello World!")
)并选择以下代码示例之一,以复制和粘贴到应用程序的 Program.cs 文件中:
重要
完成后,请记住将密钥从代码中删除,并且永远不要公开发布该密钥。 对于生产来说,请使用安全的方式存储和访问凭据,例如 Azure Key Vault。 有关详细信息,请参阅 Azure AI 服务安全性。
布局模型
从文档中提取文本、选择标记、文本样式、表结构和边界区域坐标。
- 对于此示例,需要 URI 中的一个文档文件。 在本快速入门中,可使用示例文档。
- 我们已将文件 URI 值添加到脚本顶部的
Uri fileUri
变量中。 - 若要从 URI 中的特定文件中提取布局,请使用
StartAnalyzeDocumentFromUri
方法,并传递prebuilt-layout
作为模型 ID。 返回的值是一个AnalyzeResult
对象,其中包含来自已提交文档的数据。
将以下代码示例添加到 Program.cs 文件。 请确保使用 Azure 门户中表单识别器实例中的值更新密钥和终结点变量:
using Azure;
using Azure.AI.FormRecognizer.DocumentAnalysis;
//set `<your-endpoint>` and `<your-key>` variables with the values from the Azure portal to create your `AzureKeyCredential` and `DocumentAnalysisClient` instance
string endpoint = "<your-endpoint>";
string key = "<your-key>";
AzureKeyCredential credential = new AzureKeyCredential(key);
DocumentAnalysisClient client = new DocumentAnalysisClient(new Uri(endpoint), credential);
//sample document
Uri fileUri = new Uri ("https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/sample-layout.pdf");
AnalyzeDocumentOperation operation = await client.AnalyzeDocumentFromUriAsync(WaitUntil.Completed, "prebuilt-layout", fileUri);
AnalyzeResult result = operation.Value;
foreach (DocumentPage page in result.Pages)
{
Console.WriteLine($"Document Page {page.PageNumber} has {page.Lines.Count} line(s), {page.Words.Count} word(s),");
Console.WriteLine($"and {page.SelectionMarks.Count} selection mark(s).");
for (int i = 0; i < page.Lines.Count; i++)
{
DocumentLine line = page.Lines[i];
Console.WriteLine($" Line {i} has content: '{line.Content}'.");
Console.WriteLine($" Its bounding box is:");
Console.WriteLine($" Upper left => X: {line.BoundingPolygon[0].X}, Y= {line.BoundingPolygon[0].Y}");
Console.WriteLine($" Upper right => X: {line.BoundingPolygon[1].X}, Y= {line.BoundingPolygon[1].Y}");
Console.WriteLine($" Lower right => X: {line.BoundingPolygon[2].X}, Y= {line.BoundingPolygon[2].Y}");
Console.WriteLine($" Lower left => X: {line.BoundingPolygon[3].X}, Y= {line.BoundingPolygon[3].Y}");
}
for (int i = 0; i < page.SelectionMarks.Count; i++)
{
DocumentSelectionMark selectionMark = page.SelectionMarks[i];
Console.WriteLine($" Selection Mark {i} is {selectionMark.State}.");
Console.WriteLine($" Its bounding box is:");
Console.WriteLine($" Upper left => X: {selectionMark.BoundingPolygon[0].X}, Y= {selectionMark.BoundingPolygon[0].Y}");
Console.WriteLine($" Upper right => X: {selectionMark.BoundingPolygon[1].X}, Y= {selectionMark.BoundingPolygon[1].Y}");
Console.WriteLine($" Lower right => X: {selectionMark.BoundingPolygon[2].X}, Y= {selectionMark.BoundingPolygon[2].Y}");
Console.WriteLine($" Lower left => X: {selectionMark.BoundingPolygon[3].X}, Y= {selectionMark.BoundingPolygon[3].Y}");
}
}
foreach (DocumentStyle style in result.Styles)
{
// Check the style and style confidence to see if text is handwritten.
// Note that value '0.8' is used as an example.
bool isHandwritten = style.IsHandwritten.HasValue && style.IsHandwritten == true;
if (isHandwritten && style.Confidence > 0.8)
{
Console.WriteLine($"Handwritten content found:");
foreach (DocumentSpan span in style.Spans)
{
Console.WriteLine($" Content: {result.Content.Substring(span.Index, span.Length)}");
}
}
}
Console.WriteLine("The following tables were extracted:");
for (int i = 0; i < result.Tables.Count; i++)
{
DocumentTable table = result.Tables[i];
Console.WriteLine($" Table {i} has {table.RowCount} rows and {table.ColumnCount} columns.");
foreach (DocumentTableCell cell in table.Cells)
{
Console.WriteLine($" Cell ({cell.RowIndex}, {cell.ColumnIndex}) has kind '{cell.Kind}' and content: '{cell.Content}'.");
}
}
运行应用程序
将代码示例添加到应用程序后,选择 formRecognizer_quickstart 旁边的绿色“开始”按钮以生成和运行程序,或按 F5。
布局模型输出
下面是预期输出的代码段:
Document Page 1 has 69 line(s), 425 word(s), and 15 selection mark(s).
Line 0 has content: 'UNITED STATES'.
Its bounding box is:
Upper left => X: 3.4915, Y= 0.6828
Upper right => X: 5.0116, Y= 0.6828
Lower right => X: 5.0116, Y= 0.8265
Lower left => X: 3.4915, Y= 0.8265
Line 1 has content: 'SECURITIES AND EXCHANGE COMMISSION'.
Its bounding box is:
Upper left => X: 2.1937, Y= 0.9061
Upper right => X: 6.297, Y= 0.9061
Lower right => X: 6.297, Y= 1.0498
Lower left => X: 2.1937, Y= 1.0498
若要查看整个输出,请访问 GitHub 上的 Azure 示例存储库,以查看布局模型输出。
将以下代码示例添加到 Program.cs 文件。 请确保使用 Azure 门户中表单识别器实例中的值更新密钥和终结点变量:
using Azure;
using Azure.AI.FormRecognizer.DocumentAnalysis;
//set `<your-endpoint>` and `<your-key>` variables with the values from the Azure portal to create your `AzureKeyCredential` and `DocumentAnalysisClient` instance
string endpoint = "<your-endpoint>";
string key = "<your-key>";
AzureKeyCredential credential = new AzureKeyCredential(key);
DocumentAnalysisClient client = new DocumentAnalysisClient(new Uri(endpoint), credential);
//sample document
Uri fileUri = new Uri ("https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/sample-layout.pdf");
AnalyzeDocumentOperation operation = await client.AnalyzeDocumentFromUriAsync(WaitUntil.Completed, "prebuilt-layout", fileUri);
AnalyzeResult result = operation.Value;
foreach (DocumentPage page in result.Pages)
{
Console.WriteLine($"Document Page {page.PageNumber} has {page.Lines.Count} line(s), {page.Words.Count} word(s),");
Console.WriteLine($"and {page.SelectionMarks.Count} selection mark(s).");
for (int i = 0; i < page.Lines.Count; i++)
{
DocumentLine line = page.Lines[i];
Console.WriteLine($" Line {i} has content: '{line.Content}'.");
Console.WriteLine($" Its bounding polygon (points ordered clockwise):");
for (int j = 0; j < line.BoundingPolygon.Count; j++)
{
Console.WriteLine($" Point {j} => X: {line.BoundingPolygon[j].X}, Y: {line.BoundingPolygon[j].Y}");
}
}
for (int i = 0; i < page.SelectionMarks.Count; i++)
{
DocumentSelectionMark selectionMark = page.SelectionMarks[i];
Console.WriteLine($" Selection Mark {i} is {selectionMark.State}.");
Console.WriteLine($" Its bounding polygon (points ordered clockwise):");
for (int j = 0; j < selectionMark.BoundingPolygon.Count; j++)
{
Console.WriteLine($" Point {j} => X: {selectionMark.BoundingPolygon[j].X}, Y: {selectionMark.BoundingPolygon[j].Y}");
}
}
}
Console.WriteLine("Paragraphs:");
foreach (DocumentParagraph paragraph in result.Paragraphs)
{
Console.WriteLine($" Paragraph content: {paragraph.Content}");
if (paragraph.Role != null)
{
Console.WriteLine($" Role: {paragraph.Role}");
}
}
foreach (DocumentStyle style in result.Styles)
{
// Check the style and style confidence to see if text is handwritten.
// Note that value '0.8' is used as an example.
bool isHandwritten = style.IsHandwritten.HasValue && style.IsHandwritten == true;
if (isHandwritten && style.Confidence > 0.8)
{
Console.WriteLine($"Handwritten content found:");
foreach (DocumentSpan span in style.Spans)
{
Console.WriteLine($" Content: {result.Content.Substring(span.Index, span.Length)}");
}
}
}
Console.WriteLine("The following tables were extracted:");
for (int i = 0; i < result.Tables.Count; i++)
{
DocumentTable table = result.Tables[i];
Console.WriteLine($" Table {i} has {table.RowCount} rows and {table.ColumnCount} columns.");
foreach (DocumentTableCell cell in table.Cells)
{
Console.WriteLine($" Cell ({cell.RowIndex}, {cell.ColumnIndex}) has kind '{cell.Kind}' and content: '{cell.Content}'.");
}
}
Extract the layout of a document from a file stream
To extract the layout from a given file at a file stream, use the AnalyzeDocument method and pass prebuilt-layout as the model ID. The returned value is an AnalyzeResult object containing data about the submitted document.
string filePath = "<filePath>";
using var stream = new FileStream(filePath, FileMode.Open);
AnalyzeDocumentOperation operation = await client.AnalyzeDocumentAsync(WaitUntil.Completed, "prebuilt-layout", stream);
AnalyzeResult result = operation.Value;
foreach (DocumentPage page in result.Pages)
{
Console.WriteLine($"Document Page {page.PageNumber} has {page.Lines.Count} line(s), {page.Words.Count} word(s),");
Console.WriteLine($"and {page.SelectionMarks.Count} selection mark(s).");
for (int i = 0; i < page.Lines.Count; i++)
{
DocumentLine line = page.Lines[i];
Console.WriteLine($" Line {i} has content: '{line.Content}'.");
Console.WriteLine($" Its bounding polygon (points ordered clockwise):");
for (int j = 0; j < line.BoundingPolygon.Count; j++)
{
Console.WriteLine($" Point {j} => X: {line.BoundingPolygon[j].X}, Y: {line.BoundingPolygon[j].Y}");
}
}
for (int i = 0; i < page.SelectionMarks.Count; i++)
{
DocumentSelectionMark selectionMark = page.SelectionMarks[i];
Console.WriteLine($" Selection Mark {i} is {selectionMark.State}.");
Console.WriteLine($" Its bounding polygon (points ordered clockwise):");
for (int j = 0; j < selectionMark.BoundingPolygon.Count; j++)
{
Console.WriteLine($" Point {j} => X: {selectionMark.BoundingPolygon[j].X}, Y: {selectionMark.BoundingPolygon[j].Y}");
}
}
}
Console.WriteLine("Paragraphs:");
foreach (DocumentParagraph paragraph in result.Paragraphs)
{
Console.WriteLine($" Paragraph content: {paragraph.Content}");
if (paragraph.Role != null)
{
Console.WriteLine($" Role: {paragraph.Role}");
}
}
foreach (DocumentStyle style in result.Styles)
{
// Check the style and style confidence to see if text is handwritten.
// Note that value '0.8' is used as an example.
bool isHandwritten = style.IsHandwritten.HasValue && style.IsHandwritten == true;
if (isHandwritten && style.Confidence > 0.8)
{
Console.WriteLine($"Handwritten content found:");
foreach (DocumentSpan span in style.Spans)
{
Console.WriteLine($" Content: {result.Content.Substring(span.Index, span.Length)}");
}
}
}
Console.WriteLine("The following tables were extracted:");
for (int i = 0; i < result.Tables.Count; i++)
{
DocumentTable table = result.Tables[i];
Console.WriteLine($" Table {i} has {table.RowCount} rows and {table.ColumnCount} columns.");
foreach (DocumentTableCell cell in table.Cells)
{
Console.WriteLine($" Cell ({cell.RowIndex}, {cell.ColumnIndex}) has kind '{cell.Kind}' and content: '{cell.Content}'.");
}
}
运行应用程序
将代码示例添加到应用程序后,选择 formRecognizer_quickstart 旁边的绿色“开始”按钮以生成和运行程序,或按 F5。
预生成模型
使用预生成模型分析和提取特定文档类型的公共字段。 在本示例中,我们将使用预生成发票模型分析发票。
提示
不止发票,还有几个预生成模型可供选择,每个模型都有自己的一组受支持的字段。 用于 analyze
操作的模型由要分析的文档类型确定。 请参阅模型数据提取。
将以下代码示例添加到 Program.cs 文件。 请确保使用 Azure 门户中表单识别器实例中的值更新密钥和终结点变量:
using Azure;
using Azure.AI.FormRecognizer.DocumentAnalysis;
//set `<your-endpoint>` and `<your-key>` variables with the values from the Azure portal to create your `AzureKeyCredential` and `FormRecognizerClient` instance
string endpoint = "<your-endpoint>";
string key = "<your-key>";
AzureKeyCredential credential = new AzureKeyCredential(key);
DocumentAnalysisClient client = new DocumentAnalysisClient(new Uri(endpoint), credential);
//sample invoice document
Uri invoiceUri = new Uri ("https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/sample-invoice.pdf");
Operation operation = await client.AnalyzeDocumentAsync(WaitUntil.Completed, "prebuilt-invoice", invoiceUri);
AnalyzeResult result = operation.Value;
for (int i = 0; i < result.Documents.Count; i++)
{
Console.WriteLine($"Document {i}:");
AnalyzedDocument document = result.Documents[i];
if (document.Fields.TryGetValue("VendorName", out DocumentField vendorNameField))
{
if (vendorNameField.FieldType == DocumentFieldType.String)
{
string vendorName = vendorNameField.Value.AsString();
Console.WriteLine($"Vendor Name: '{vendorName}', with confidence {vendorNameField.Confidence}");
}
}
if (document.Fields.TryGetValue("CustomerName", out DocumentField customerNameField))
{
if (customerNameField.FieldType == DocumentFieldType.String)
{
string customerName = customerNameField.Value.AsString();
Console.WriteLine($"Customer Name: '{customerName}', with confidence {customerNameField.Confidence}");
}
}
if (document.Fields.TryGetValue("Items", out DocumentField itemsField))
{
if (itemsField.FieldType == DocumentFieldType.List)
{
foreach (DocumentField itemField in itemsField.Value.AsList())
{
Console.WriteLine("Item:");
if (itemField.FieldType == DocumentFieldType.Dictionary)
{
IReadOnlyDictionary<string, DocumentField> itemFields = itemField.Value.AsDictionary();
if (itemFields.TryGetValue("Description", out DocumentField itemDescriptionField))
{
if (itemDescriptionField.FieldType == DocumentFieldType.String)
{
string itemDescription = itemDescriptionField.Value.AsString();
Console.WriteLine($" Description: '{itemDescription}', with confidence {itemDescriptionField.Confidence}");
}
}
if (itemFields.TryGetValue("Amount", out DocumentField itemAmountField))
{
if (itemAmountField.FieldType == DocumentFieldType.Currency)
{
CurrencyValue itemAmount = itemAmountField.Value.AsCurrency();
Console.WriteLine($" Amount: '{itemAmount.Symbol}{itemAmount.Amount}', with confidence {itemAmountField.Confidence}");
}
}
}
}
}
}
if (document.Fields.TryGetValue("SubTotal", out DocumentField subTotalField))
{
if (subTotalField.FieldType == DocumentFieldType.Currency)
{
CurrencyValue subTotal = subTotalField.Value.AsCurrency();
Console.WriteLine($"Sub Total: '{subTotal.Symbol}{subTotal.Amount}', with confidence {subTotalField.Confidence}");
}
}
if (document.Fields.TryGetValue("TotalTax", out DocumentField totalTaxField))
{
if (totalTaxField.FieldType == DocumentFieldType.Currency)
{
CurrencyValue totalTax = totalTaxField.Value.AsCurrency();
Console.WriteLine($"Total Tax: '{totalTax.Symbol}{totalTax.Amount}', with confidence {totalTaxField.Confidence}");
}
}
if (document.Fields.TryGetValue("InvoiceTotal", out DocumentField invoiceTotalField))
{
if (invoiceTotalField.FieldType == DocumentFieldType.Currency)
{
CurrencyValue invoiceTotal = invoiceTotalField.Value.AsCurrency();
Console.WriteLine($"Invoice Total: '{invoiceTotal.Symbol}{invoiceTotal.Amount}', with confidence {invoiceTotalField.Confidence}");
}
}
}
运行应用程序
将代码示例添加到应用程序后,选择 formRecognizer_quickstart 旁边的绿色“开始”按钮以生成和运行程序,或按 F5。
预生成模型输出
下面是预期输出的代码段:
Document 0:
Vendor Name: 'CONTOSO LTD.', with confidence 0.962
Customer Name: 'MICROSOFT CORPORATION', with confidence 0.951
Item:
Description: 'Test for 23 fields', with confidence 0.899
Amount: '100', with confidence 0.902
Sub Total: '100', with confidence 0.979
若要查看整个输出,请访问 GitHub 上的 Azure 示例存储库,以查看预生成模型输出。
将以下代码示例添加到 Program.cs 文件。 请确保使用 Azure 门户中表单识别器实例中的值更新密钥和终结点变量:
using Azure;
using Azure.AI.FormRecognizer.DocumentAnalysis;
//set `<your-endpoint>` and `<your-key>` variables with the values from the Azure portal to create your `AzureKeyCredential` and `FormRecognizerClient` instance
string endpoint = "<your-endpoint>";
string key = "<your-key>";
AzureKeyCredential credential = new AzureKeyCredential(key);
DocumentAnalysisClient client = new DocumentAnalysisClient(new Uri(endpoint), credential);
//sample invoice document
Uri invoiceUri = new Uri ("https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/sample-invoice.pdf");
AnalyzeDocumentOperation operation = await client.AnalyzeDocumentFromUriAsync(WaitUntil.Completed, "prebuilt-invoice", invoiceUri);
AnalyzeResult result = operation.Value;
for (int i = 0; i < result.Documents.Count; i++)
{
Console.WriteLine($"Document {i}:");
AnalyzedDocument document = result.Documents[i];
if (document.Fields.TryGetValue("VendorName", out DocumentField vendorNameField))
{
if (vendorNameField.FieldType == DocumentFieldType.String)
{
string vendorName = vendorNameField.Value.AsString();
Console.WriteLine($"Vendor Name: '{vendorName}', with confidence {vendorNameField.Confidence}");
}
}
if (document.Fields.TryGetValue("CustomerName", out DocumentField customerNameField))
{
if (customerNameField.FieldType == DocumentFieldType.String)
{
string customerName = customerNameField.Value.AsString();
Console.WriteLine($"Customer Name: '{customerName}', with confidence {customerNameField.Confidence}");
}
}
if (document.Fields.TryGetValue("Items", out DocumentField itemsField))
{
if (itemsField.FieldType == DocumentFieldType.List)
{
foreach (DocumentField itemField in itemsField.Value.AsList())
{
Console.WriteLine("Item:");
if (itemField.FieldType == DocumentFieldType.Dictionary)
{
IReadOnlyDictionary<string, DocumentField> itemFields = itemField.Value.AsDictionary();
if (itemFields.TryGetValue("Description", out DocumentField itemDescriptionField))
{
if (itemDescriptionField.FieldType == DocumentFieldType.String)
{
string itemDescription = itemDescriptionField.Value.AsString();
Console.WriteLine($" Description: '{itemDescription}', with confidence {itemDescriptionField.Confidence}");
}
}
if (itemFields.TryGetValue("Amount", out DocumentField itemAmountField))
{
if (itemAmountField.FieldType == DocumentFieldType.Currency)
{
CurrencyValue itemAmount = itemAmountField.Value.AsCurrency();
Console.WriteLine($" Amount: '{itemAmount.Symbol}{itemAmount.Amount}', with confidence {itemAmountField.Confidence}");
}
}
}
}
}
}
if (document.Fields.TryGetValue("SubTotal", out DocumentField subTotalField))
{
if (subTotalField.FieldType == DocumentFieldType.Currency)
{
CurrencyValue subTotal = subTotalField.Value.AsCurrency();
Console.WriteLine($"Sub Total: '{subTotal.Symbol}{subTotal.Amount}', with confidence {subTotalField.Confidence}");
}
}
if (document.Fields.TryGetValue("TotalTax", out DocumentField totalTaxField))
{
if (totalTaxField.FieldType == DocumentFieldType.Currency)
{
CurrencyValue totalTax = totalTaxField.Value.AsCurrency();
Console.WriteLine($"Total Tax: '{totalTax.Symbol}{totalTax.Amount}', with confidence {totalTaxField.Confidence}");
}
}
if (document.Fields.TryGetValue("InvoiceTotal", out DocumentField invoiceTotalField))
{
if (invoiceTotalField.FieldType == DocumentFieldType.Currency)
{
CurrencyValue invoiceTotal = invoiceTotalField.Value.AsCurrency();
Console.WriteLine($"Invoice Total: '{invoiceTotal.Symbol}{invoiceTotal.Amount}', with confidence {invoiceTotalField.Confidence}");
}
}
}
运行应用程序
将代码示例添加到应用程序后,选择 formRecognizer_quickstart 旁边的绿色“开始”按钮以生成和运行程序,或按 F5。
客户端库 | SDK 参考 | REST API 参考 | 包 (Maven) | 示例| 支持的 REST API 版本
客户端库 | SDK 参考 | REST API 参考 | 包 (Maven) | 示例|支持的 REST API 版本
在本快速入门中,使用以下功能来分析和提取表单和文档中的数据和值:
先决条件
Azure 订阅 - 创建试用订阅。
最新版本的 Visual Studio Code 或者你首选的 IDE。 请参阅 Visual Studio Code 中的 Java。
提示
- Visual Studio Code 提供适用于 Windows 和 macOS 的 "Java 编码包"。该编码包是 VS Code、Java 开发工具包 (JDK) 和 Microsoft 建议扩展的集合。 编码包还可用于修复现有开发环境。
- 如果使用 VS Code 和适用于 Java 的编码包,请安装 Gradle for Java 扩展。
如果不使用 Visual Studio Code,请确保在开发环境中安装了以下内容:
Java 开发工具包 (JDK) 版本 8 或更高版本。 有关详细信息,请参阅 OpenJDK 的 Microsoft 版本。
Gradle 版本 6.8 或更高版本。
Azure AI 服务或文档智能资源。 获得 Azure 订阅后,在 Azure 门户中创建单服务或多服务文档智能资源以获取密钥和终结点。 可以使用免费定价层 (
F0
) 试用该服务,然后再升级到付费层进行生产。提示
如果计划通过一个终结点/密钥访问多个 Azure AI 服务,请创建 Azure AI 服务资源。 请创建仅供文档智能访问的文档智能资源。 请注意,如果你打算使用 Microsoft Entra 身份验证,则需要单一服务资源。
部署资源后,选择“转到资源”。 需要从创建的资源获取密钥和终结点,以便将应用程序连接到文档智能 API。 稍后将密钥和终结点粘贴到代码中:
设置
创建新的 Gradle 项目
在控制台窗口(例如 cmd、PowerShell 或 Bash)中,为应用创建名为 form-recognize-app 的新目录,并导航到该目录。
mkdir form-recognize-app && form-recognize-app
mkdir form-recognize-app; cd form-recognize-app
从工作目录运行
gradle init
命令。 此命令将创建 Gradle 的基本生成文件,其中包括 build.gradle.kts - 在运行时将使用该文件创建并配置应用程序。gradle init --type basic
当提示你选择一个 DSL 时,选择 Kotlin。
通过选择 Return 或 Enter 接受默认项目名称 (form-recognize-app)。
安装客户端库
本快速入门使用 Gradle 依赖项管理器。 可以在 Maven 中央存储库中找到客户端库以及其他依赖项管理器的信息。
在 IDE 中打开项目的 build.gradle.kts 文件。 复制粘贴以下代码,将客户端库作为 implementation
语句与所需的插件和设置一起包括在内。
plugins {
java
application
}
application {
mainClass.set("FormRecognizer")
}
repositories {
mavenCentral()
}
dependencies {
implementation group: 'com.azure', name: 'azure-ai-formrecognizer', version: '4.1.0'
}
本快速入门使用 Gradle 依赖项管理器。 可以在 Maven 中央存储库中找到客户端库以及其他依赖项管理器的信息。
在 IDE 中打开项目的 build.gradle.kts 文件。 复制粘贴以下代码,将客户端库作为 implementation
语句与所需的插件和设置一起包括在内。
plugins {
java
application
}
application {
mainClass.set("FormRecognizer")
}
repositories {
mavenCentral()
}
dependencies {
implementation group: 'com.azure', name: 'azure-ai-formrecognizer', version: '4.0.0'
}
创建 Java 应用程序
若要与文档智能服务交互,需要创建 DocumentAnalysisClient
类的实例。 为此,你要通过 Azure 门户使用 key
创建一个 AzureKeyCredential
,并使用 AzureKeyCredential
和文档智能 endpoint
创建一个 DocumentAnalysisClient
实例。
从 doc-intel-app 目录运行以下命令:
mkdir -p src/main/java
创建以下目录结构:
重要
完成后,请记住将密钥从代码中删除,并且永远不要公开发布该密钥。 对于生产来说,请使用安全的方式存储和访问凭据,例如 Azure Key Vault。 有关详细信息,请参阅 Azure AI 服务安全性。
布局模型
从文档中提取文本、选择标记、文本样式、表结构和边界区域坐标。
- 对于此示例,需要 URI 中的一个文档文件。 在本快速入门中,可使用示例文档。
- 若要分析 URI 上的给定文件,请使用
beginAnalyzeDocumentFromUrl
方法并传递prebuilt-layout
作为模型 ID。返回的值是一个AnalyzeResult
对象,其中包含有关提交的文档的数据。 - 我们已将文件 URI 值添加到 main 方法的
documentUrl
变量中。
将以下代码示例添加到 FormRecognizer.java
文件中。 请确保使用 Azure 门户中文档智能实例中的值更新密钥和终结点变量:
import com.azure.ai.formrecognizer.documentanalysis.models.*;
import com.azure.ai.formrecognizer.documentanalysis.DocumentAnalysisClient;
import com.azure.ai.formrecognizer.documentanalysis.DocumentAnalysisClientBuilder;
import com.azure.core.credential.AzureKeyCredential;
import com.azure.core.util.polling.SyncPoller;
import java.io.IOException;
import java.util.List;
import java.util.Arrays;
import java.time.LocalDate;
import java.util.Map;
import java.util.stream.Collectors;
public class FormRecognizer {
// set `<your-endpoint>` and `<your-key>` variables with the values from the Azure portal
private static final String endpoint = "<your-endpoint>";
private static final String key = "<your-key>";
public static void main(String[] args) {
// create your `DocumentAnalysisClient` instance and `AzureKeyCredential` variable
DocumentAnalysisClient client = new DocumentAnalysisClientBuilder()
.credential(new AzureKeyCredential(key))
.endpoint(endpoint)
.buildClient();
// sample document
String documentUrl = "https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/sample-layout.pdf";
String modelId = "prebuilt-layout";
SyncPoller < OperationResult, AnalyzeResult > analyzeLayoutResultPoller =
client.beginAnalyzeDocumentFromUrl(modelId, documentUrl);
AnalyzeResult analyzeLayoutResult = analyzeLayoutResultPoller.getFinalResult();
// pages
analyzeLayoutResult.getPages().forEach(documentPage -> {
System.out.printf("Page has width: %.2f and height: %.2f, measured with unit: %s%n",
documentPage.getWidth(),
documentPage.getHeight(),
documentPage.getUnit());
// lines
documentPage.getLines().forEach(documentLine ->
System.out.printf("Line %s is within a bounding polygon %s.%n",
documentLine.getContent(),
documentLine.getBoundingPolygon().toString()));
// words
documentPage.getWords().forEach(documentWord ->
System.out.printf("Word '%s' has a confidence score of %.2f%n",
documentWord.getContent(),
documentWord.getConfidence()));
// selection marks
documentPage.getSelectionMarks().forEach(documentSelectionMark ->
System.out.printf("Selection mark is %s and is within a bounding polygon %s with confidence %.2f.%n",
documentSelectionMark.getState().toString(),
documentSelectionMark.getBoundingPolygon().toString(),
documentSelectionMark.getConfidence()));
});
// tables
List < DocumentTable > tables = analyzeLayoutResult.getTables();
for (int i = 0; i < tables.size(); i++) {
DocumentTable documentTable = tables.get(i);
System.out.printf("Table %d has %d rows and %d columns.%n", i, documentTable.getRowCount(),
documentTable.getColumnCount());
documentTable.getCells().forEach(documentTableCell -> {
System.out.printf("Cell '%s', has row index %d and column index %d.%n", documentTableCell.getContent(),
documentTableCell.getRowIndex(), documentTableCell.getColumnIndex());
});
System.out.println();
}
}
// Utility function to get the bounding polygon coordinates
private static String getBoundingCoordinates(List < Point > boundingPolygon) {
return boundingPolygon.stream().map(point -> String.format("[%.2f, %.2f]", point.getX(),
point.getY())).collect(Collectors.joining(", "));
}
}
生成并运行应用程序
将代码示例添加到应用程序后,导航回主项目目录 -“form-recognize-app”。
使用
build
命令生成应用程序:gradle build
使用
run
命令运行应用程序:gradle run
布局模型输出
下面是预期输出的代码段:
Table 0 has 5 rows and 3 columns.
Cell 'Title of each class', has row index 0 and column index 0.
Cell 'Trading Symbol', has row index 0 and column index 1.
Cell 'Name of exchange on which registered', has row index 0 and column index 2.
Cell 'Common stock, $0.00000625 par value per share', has row index 1 and column index 0.
Cell 'MSFT', has row index 1 and column index 1.
Cell 'NASDAQ', has row index 1 and column index 2.
Cell '2.125% Notes due 2021', has row index 2 and column index 0.
Cell 'MSFT', has row index 2 and column index 1.
Cell 'NASDAQ', has row index 2 and column index 2.
Cell '3.125% Notes due 2028', has row index 3 and column index 0.
Cell 'MSFT', has row index 3 and column index 1.
Cell 'NASDAQ', has row index 3 and column index 2.
Cell '2.625% Notes due 2033', has row index 4 and column index 0.
Cell 'MSFT', has row index 4 and column index 1.
Cell 'NASDAQ', has row index 4 and column index 2.
若要查看整个输出,请访问 GitHub 上的 Azure 示例存储库,以查看布局模型输出。
将以下代码示例添加到 FormRecognizer.java
文件中。 请确保使用 Azure 门户中文档智能实例中的值更新密钥和终结点变量:
import com.azure.ai.formrecognizer.documentanalysis.DocumentAnalysisClient;
import com.azure.ai.formrecognizer.documentanalysis.DocumentAnalysisClientBuilder;
import com.azure.ai.formrecognizer.documentanalysis.models.AnalyzeResult;
import com.azure.ai.formrecognizer.documentanalysis.models.OperationResult;
import com.azure.ai.formrecognizer.documentanalysis.models.DocumentTable;
import com.azure.ai.formrecognizer.documentanalysis.models.Point;
import com.azure.core.credential.AzureKeyCredential;
import com.azure.core.util.polling.SyncPoller;
import java.util.List;
import java.util.stream.Collectors;
public class FormRecognizer {
// set `<your-endpoint>` and `<your-key>` variables with the values from the Azure portal
private static final String endpoint = "<your-endpoint>";
private static final String key = "<your-key>";
public static void main(String[] args) {
// create your `DocumentAnalysisClient` instance and `AzureKeyCredential` variable
DocumentAnalysisClient client = new DocumentAnalysisClientBuilder()
.credential(new AzureKeyCredential(key))
.endpoint(endpoint)
.buildClient();
// sample document
String documentUrl = "https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/sample-layout.pdf";
String modelId = "prebuilt-layout";
SyncPoller < OperationResult, AnalyzeResult > analyzeLayoutPoller =
client.beginAnalyzeDocumentFromUrl(modelId, documentUrl);
AnalyzeResult analyzeLayoutResult = analyzeLayoutPoller.getFinalResult();
// pages
analyzeLayoutResult.getPages().forEach(documentPage -> {
System.out.printf("Page has width: %.2f and height: %.2f, measured with unit: %s%n",
documentPage.getWidth(),
documentPage.getHeight(),
documentPage.getUnit());
// lines
documentPage.getLines().forEach(documentLine ->
System.out.printf("Line '%s' is within a bounding polygon %s.%n",
documentLine.getContent(),
getBoundingCoordinates(documentLine.getBoundingPolygon())));
// words
documentPage.getWords().forEach(documentWord ->
System.out.printf("Word '%s' has a confidence score of %.2f.%n",
documentWord.getContent(),
documentWord.getConfidence()));
// selection marks
documentPage.getSelectionMarks().forEach(documentSelectionMark ->
System.out.printf("Selection mark is '%s' and is within a bounding polygon %s with confidence %.2f.%n",
documentSelectionMark.getSelectionMarkState().toString(),
getBoundingCoordinates(documentSelectionMark.getBoundingPolygon()),
documentSelectionMark.getConfidence()));
});
// tables
List < DocumentTable > tables = analyzeLayoutResult.getTables();
for (int i = 0; i < tables.size(); i++) {
DocumentTable documentTable = tables.get(i);
System.out.printf("Table %d has %d rows and %d columns.%n", i, documentTable.getRowCount(),
documentTable.getColumnCount());
documentTable.getCells().forEach(documentTableCell -> {
System.out.printf("Cell '%s', has row index %d and column index %d.%n", documentTableCell.getContent(),
documentTableCell.getRowIndex(), documentTableCell.getColumnIndex());
});
System.out.println();
}
// styles
analyzeLayoutResult.getStyles().forEach(documentStyle -
> System.out.printf("Document is handwritten %s.%n", documentStyle.isHandwritten()));
}
/**
* Utility function to get the bounding polygon coordinates.
*/
private static String getBoundingCoordinates(List < Point > boundingPolygon) {
return boundingPolygon.stream().map(point -> String.format("[%.2f, %.2f]", point.getX(),
point.getY())).collect(Collectors.joining(", "));
}
}
生成并运行应用程序
将代码示例添加到应用程序后,导航回主项目目录 -“form-recognize-app”。
使用
build
命令生成应用程序:gradle build
使用
run
命令运行应用程序:gradle run
预生成模型
使用预生成模型分析和提取特定文档类型的公共字段。 在本示例中,我们将使用预生成发票模型分析发票。
提示
不止发票,还有几个预生成模型可供选择,每个模型都有自己的一组受支持的字段。 用于 analyze
操作的模型由要分析的文档类型确定。 请参阅模型数据提取。
将以下代码示例添加到 FormRecognizer.java
文件中。 请确保使用 Azure 门户中文档智能实例中的值更新密钥和终结点变量:
import com.azure.ai.formrecognizer.documentanalysis.models.*;
import com.azure.ai.formrecognizer.documentanalysis.DocumentAnalysisClient;
import com.azure.ai.formrecognizer.documentanalysis.DocumentAnalysisClientBuilder;
import com.azure.core.credential.AzureKeyCredential;
import com.azure.core.util.polling.SyncPoller;
import java.io.IOException;
import java.util.List;
import java.util.Arrays;
import java.time.LocalDate;
import java.util.Map;
import java.util.stream.Collectors;
public class FormRecognizer {
// set `<your-endpoint>` and `<your-key>` variables with the values from the Azure portal
private static final String endpoint = "<your-endpoint>";
private static final String key = "<your-key>";
public static void main(final String[] args) throws IOException {
// create your `DocumentAnalysisClient` instance and `AzureKeyCredential` variable
DocumentAnalysisClient client = new DocumentAnalysisClientBuilder()
.credential(new AzureKeyCredential(key))
.endpoint(endpoint)
.buildClient();
// sample document
String modelId = "prebuilt-invoice";
String invoiceUrl = "https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/sample-invoice.pdf";
SyncPoller < OperationResult, AnalyzeResult > analyzeInvoicePoller = client.beginAnalyzeDocumentFromUrl(modelId, invoiceUrl);
AnalyzeResult analyzeInvoiceResult = analyzeInvoicePoller.getFinalResult();
for (int i = 0; i < analyzeInvoiceResult.getDocuments().size(); i++) {
AnalyzedDocument analyzedInvoice = analyzeInvoiceResult.getDocuments().get(i);
Map < String, DocumentField > invoiceFields = analyzedInvoice.getFields();
System.out.printf("----------- Analyzing invoice %d -----------%n", i);
DocumentField vendorNameField = invoiceFields.get("VendorName");
if (vendorNameField != null) {
if (DocumentFieldType.STRING == vendorNameField.getType()) {
String merchantName = vendorNameField.getValueAsString();
System.out.printf("Vendor Name: %s, confidence: %.2f%n",
merchantName, vendorNameField.getConfidence());
}
}
DocumentField vendorAddressField = invoiceFields.get("VendorAddress");
if (vendorAddressField != null) {
if (DocumentFieldType.STRING == vendorAddressField.getType()) {
String merchantAddress = vendorAddressField.getValueAsString();
System.out.printf("Vendor address: %s, confidence: %.2f%n",
merchantAddress, vendorAddressField.getConfidence());
}
}
DocumentField customerNameField = invoiceFields.get("CustomerName");
if (customerNameField != null) {
if (DocumentFieldType.STRING == customerNameField.getType()) {
String merchantAddress = customerNameField.getValueAsString();
System.out.printf("Customer Name: %s, confidence: %.2f%n",
merchantAddress, customerNameField.getConfidence());
}
}
DocumentField customerAddressRecipientField = invoiceFields.get("CustomerAddressRecipient");
if (customerAddressRecipientField != null) {
if (DocumentFieldType.STRING == customerAddressRecipientField.getType()) {
String customerAddr = customerAddressRecipientField.getValueAsString();
System.out.printf("Customer Address Recipient: %s, confidence: %.2f%n",
customerAddr, customerAddressRecipientField.getConfidence());
}
}
DocumentField invoiceIdField = invoiceFields.get("InvoiceId");
if (invoiceIdField != null) {
if (DocumentFieldType.STRING == invoiceIdField.getType()) {
String invoiceId = invoiceIdField.getValueAsString();
System.out.printf("Invoice ID: %s, confidence: %.2f%n",
invoiceId, invoiceIdField.getConfidence());
}
}
DocumentField invoiceDateField = invoiceFields.get("InvoiceDate");
if (customerNameField != null) {
if (DocumentFieldType.DATE == invoiceDateField.getType()) {
LocalDate invoiceDate = invoiceDateField.getValueAsDate();
System.out.printf("Invoice Date: %s, confidence: %.2f%n",
invoiceDate, invoiceDateField.getConfidence());
}
}
DocumentField invoiceTotalField = invoiceFields.get("InvoiceTotal");
if (customerAddressRecipientField != null) {
if (DocumentFieldType.DOUBLE == invoiceTotalField.getType()) {
Double invoiceTotal = invoiceTotalField.getValueAsDouble();
System.out.printf("Invoice Total: %.2f, confidence: %.2f%n",
invoiceTotal, invoiceTotalField.getConfidence());
}
}
DocumentField invoiceItemsField = invoiceFields.get("Items");
if (invoiceItemsField != null) {
System.out.printf("Invoice Items: %n");
if (DocumentFieldType.LIST == invoiceItemsField.getType()) {
List < DocumentField > invoiceItems = invoiceItemsField.getValueAsList();
invoiceItems.stream()
.filter(invoiceItem -> DocumentFieldType.MAP == invoiceItem.getType())
.map(documentField -> documentField.getValueAsMap())
.forEach(documentFieldMap -> documentFieldMap.forEach((key, documentField) -> {
// See a full list of fields found on an invoice here:
// https://aka.ms/formrecognizer/invoicefields
if ("Description".equals(key)) {
if (DocumentFieldType.STRING == documentField.getType()) {
String name = documentField.getValueAsString();
System.out.printf("Description: %s, confidence: %.2fs%n",
name, documentField.getConfidence());
}
}
if ("Quantity".equals(key)) {
if (DocumentFieldType.DOUBLE == documentField.getType()) {
Double quantity = documentField.getValueAsDouble();
System.out.printf("Quantity: %f, confidence: %.2f%n",
quantity, documentField.getConfidence());
}
}
if ("UnitPrice".equals(key)) {
if (DocumentFieldType.DOUBLE == documentField.getType()) {
Double unitPrice = documentField.getValueAsDouble();
System.out.printf("Unit Price: %f, confidence: %.2f%n",
unitPrice, documentField.getConfidence());
}
}
if ("ProductCode".equals(key)) {
if (DocumentFieldType.DOUBLE == documentField.getType()) {
Double productCode = documentField.getValueAsDouble();
System.out.printf("Product Code: %f, confidence: %.2f%n",
productCode, documentField.getConfidence());
}
}
}));
}
}
}
}
}
生成并运行应用程序
将代码示例添加到应用程序后,导航回主项目目录 -“doc-intel-app”。
使用
build
命令生成应用程序:gradle build
使用
run
命令运行应用程序:gradle run
预生成模型输出
下面是预期输出的代码段:
----------- Analyzing invoice 0 -----------
Analyzed document has doc type invoice with confidence : 1.00
Vendor Name: CONTOSO LTD., confidence: 0.92
Vendor address: 123 456th St New York, NY, 10001, confidence: 0.91
Customer Name: MICROSOFT CORPORATION, confidence: 0.84
Customer Address Recipient: Microsoft Corp, confidence: 0.92
Invoice ID: INV-100, confidence: 0.97
Invoice Date: 2019-11-15, confidence: 0.97
若要查看整个输出,请访问 GitHub 上的 Azure 示例存储库,以查看预生成模型输出。
将以下代码示例添加到 FormRecognizer.java
文件中。 请确保使用 Azure 门户中文档智能实例中的值更新密钥和终结点变量:
import com.azure.ai.formrecognizer.documentanalysis.DocumentAnalysisClient;
import com.azure.ai.formrecognizer.documentanalysis.DocumentAnalysisClientBuilder;
import com.azure.ai.formrecognizer.documentanalysis.models.AnalyzeResult;
import com.azure.ai.formrecognizer.documentanalysis.models.AnalyzedDocument;
import com.azure.ai.formrecognizer.documentanalysis.models.DocumentField;
import com.azure.ai.formrecognizer.documentanalysis.models.DocumentFieldType;
import com.azure.ai.formrecognizer.documentanalysis.models.OperationResult;
import com.azure.core.credential.AzureKeyCredential;
import com.azure.core.util.polling.SyncPoller;
import java.io.IOException;
import java.time.LocalDate;
import java.util.List;
import java.util.Map;
public class FormRecognizer {
// set `<your-endpoint>` and `<your-key>` variables with the values from the Azure portal
private static final String endpoint = "<your-endpoint>";
private static final String key = "<your-key>";
public static void main(String[] args) {
// create your `DocumentAnalysisClient` instance and `AzureKeyCredential` variable
DocumentAnalysisClient client = new DocumentAnalysisClientBuilder()
.credential(new AzureKeyCredential(key))
.endpoint(endpoint)
.buildClient();
// sample document
String modelId = "prebuilt-invoice";
String invoiceUrl = "https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/sample-invoice.pdf";
SyncPoller < OperationResult, AnalyzeResult > analyzeInvoicePoller = client.beginAnalyzeDocumentFromUrl(modelId, invoiceUrl);
AnalyzeResult analyzeInvoiceResult = analyzeInvoicePoller.getFinalResult();
for (int i = 0; i < analyzeInvoiceResult.getDocuments().size(); i++) {
AnalyzedDocument analyzedInvoice = analyzeInvoiceResult.getDocuments().get(i);
Map < String, DocumentField > invoiceFields = analyzedInvoice.getFields();
System.out.printf("----------- Analyzing invoice %d -----------%n", i);
DocumentField vendorNameField = invoiceFields.get("VendorName");
if (vendorNameField != null) {
if (DocumentFieldType.STRING == vendorNameField.getType()) {
String merchantName = vendorNameField.getValueAsString();
System.out.printf("Vendor Name: %s, confidence: %.2f%n",
merchantName, vendorNameField.getConfidence());
}
}
DocumentField vendorAddressField = invoiceFields.get("VendorAddress");
if (vendorAddressField != null) {
if (DocumentFieldType.STRING == vendorAddressField.getType()) {
String merchantAddress = vendorAddressField.getValueAsString();
System.out.printf("Vendor address: %s, confidence: %.2f%n",
merchantAddress, vendorAddressField.getConfidence());
}
}
DocumentField customerNameField = invoiceFields.get("CustomerName");
if (customerNameField != null) {
if (DocumentFieldType.STRING == customerNameField.getType()) {
String merchantAddress = customerNameField.getValueAsString();
System.out.printf("Customer Name: %s, confidence: %.2f%n",
merchantAddress, customerNameField.getConfidence());
}
}
DocumentField customerAddressRecipientField = invoiceFields.get("CustomerAddressRecipient");
if (customerAddressRecipientField != null) {
if (DocumentFieldType.STRING == customerAddressRecipientField.getType()) {
String customerAddr = customerAddressRecipientField.getValueAsString();
System.out.printf("Customer Address Recipient: %s, confidence: %.2f%n",
customerAddr, customerAddressRecipientField.getConfidence());
}
}
DocumentField invoiceIdField = invoiceFields.get("InvoiceId");
if (invoiceIdField != null) {
if (DocumentFieldType.STRING == invoiceIdField.getType()) {
String invoiceId = invoiceIdField.getValueAsString();
System.out.printf("Invoice ID: %s, confidence: %.2f%n",
invoiceId, invoiceIdField.getConfidence());
}
}
DocumentField invoiceDateField = invoiceFields.get("InvoiceDate");
if (customerNameField != null) {
if (DocumentFieldType.DATE == invoiceDateField.getType()) {
LocalDate invoiceDate = invoiceDateField.getValueAsDate();
System.out.printf("Invoice Date: %s, confidence: %.2f%n",
invoiceDate, invoiceDateField.getConfidence());
}
}
DocumentField invoiceTotalField = invoiceFields.get("InvoiceTotal");
if (customerAddressRecipientField != null) {
if (DocumentFieldType.DOUBLE == invoiceTotalField.getType()) {
Double invoiceTotal = invoiceTotalField.getValueAsDouble();
System.out.printf("Invoice Total: %.2f, confidence: %.2f%n",
invoiceTotal, invoiceTotalField.getConfidence());
}
}
DocumentField invoiceItemsField = invoiceFields.get("Items");
if (invoiceItemsField != null) {
System.out.printf("Invoice Items: %n");
if (DocumentFieldType.LIST == invoiceItemsField.getType()) {
List < DocumentField > invoiceItems = invoiceItemsField.getValueAsList();
invoiceItems.stream()
.filter(invoiceItem -> DocumentFieldType.MAP == invoiceItem.getType())
.map(documentField -> documentField.getValueAsMap())
.forEach(documentFieldMap -> documentFieldMap.forEach((key, documentField) -> {
// See a full list of fields found on an invoice here:
// https://aka.ms/formrecognizer/invoicefields
if ("Description".equals(key)) {
if (DocumentFieldType.STRING == documentField.getType()) {
String name = documentField.getValueAsString();
System.out.printf("Description: %s, confidence: %.2fs%n",
name, documentField.getConfidence());
}
}
if ("Quantity".equals(key)) {
if (DocumentFieldType.DOUBLE == documentField.getType()) {
Double quantity = documentField.getValueAsDouble();
System.out.printf("Quantity: %f, confidence: %.2f%n",
quantity, documentField.getConfidence());
}
}
if ("UnitPrice".equals(key)) {
if (DocumentFieldType.DOUBLE == documentField.getType()) {
Double unitPrice = documentField.getValueAsDouble();
System.out.printf("Unit Price: %f, confidence: %.2f%n",
unitPrice, documentField.getConfidence());
}
}
if ("ProductCode".equals(key)) {
if (DocumentFieldType.DOUBLE == documentField.getType()) {
Double productCode = documentField.getValueAsDouble();
System.out.printf("Product Code: %f, confidence: %.2f%n",
productCode, documentField.getConfidence());
}
}
}));
}
}
}
}
}
生成并运行应用程序
将代码示例添加到应用程序后,导航回主项目目录 -“doc-intel-app”。
使用
build
命令生成应用程序:gradle build
使用
run
命令运行应用程序:gradle run
客户端库 | SDK 参考 | REST API 参考 | 包 (npm) | 示例 |支持的 REST API 版本
客户端库 | SDK 参考 | REST API 参考 | 包 (npm) | 示例 |支持的 REST API 版本
在本快速入门中,使用以下功能来分析和提取表单和文档中的数据和值:
先决条件
Azure 订阅 - 创建试用订阅。
最新版本的 Visual Studio Code 或者你首选的 IDE。 有关详细信息,请参阅 Visual Studio Code 中的 Node.js。
Node.js 的最新
LTS
版本。Azure AI 服务或文档智能资源。 获得 Azure 订阅后,在 Azure 门户中创建单服务或多服务文档智能资源以获取密钥和终结点。 可以使用免费定价层 (
F0
) 试用该服务,然后再升级到付费层进行生产。提示
如果计划通过一个终结点/密钥访问多个 Azure AI 服务,请创建 Azure AI 服务资源。 请创建仅供文档智能访问的文档智能资源。 请注意,如果你打算使用 Microsoft Entra 身份验证,则需要单一服务资源。
部署资源后,选择“转到资源”。 需要从创建的资源获取密钥和终结点,以便将应用程序连接到文档智能 API。 稍后需要在本快速入门中将密钥和终结点粘贴到代码中:
设置
创建新的 Node.js Express 应用程序:在控制台窗口(例如 cmd、PowerShell 或 Bash)中,为应用创建名为
doc-intel-app
的新目录并导航到该目录。mkdir doc-intel-app && cd doc-intel-app
运行
npm init
命令以初始化应用程序并为项目构建基架。npm init
使用终端中提供的提示指定项目的属性。
- 最重要的属性包括名称、版本号和入口点。
- 建议保留
index.js
作为入口点名称。 描述、测试命令、GitHub 存储库、关键字、作者和许可证信息均为可选属性 —— 在此项目中可跳过。 - 通过选择“返回”或“Enter”,接受括号中的建议。
- 完成提示后,doc-intel-app 目录中会创建一个
package.json
文件。
安装
ai-form-recognizer
客户端库和azure/identity
npm 包:npm i @azure/ai-form-recognizer@5.0.0 @azure/identity
- 应用的
package.json
文件将使用依赖项进行更新。
- 应用的
安装
ai-form-recognizer
客户端库和azure/identity
npm 包:npm i @azure/ai-form-recognizer@4.0.0 @azure/identity
在应用程序目录中创建一个名为
index.js
的文件。提示
- 可以使用 Powershell 创建新文件。
- 按住 Shift 键并右键单击该文件夹,在项目目录中打开 Powershell 窗口。
- 键入以下命令:New-Item index.js。
生成应用程序
若要与文档智能服务交互,需要创建 DocumentAnalysisClient
类的实例。 为此,你将使用 Azure 门户的 key
创建一个 AzureKeyCredential
,使用 AzureKeyCredential
和表单识别器 endpoint
创建一个 DocumentAnalysisClient
实例。
重要
完成后,请记住将密钥从代码中删除,并且永远不要公开发布该密钥。 对于生产来说,请使用安全的方式存储和访问凭据,例如 Azure Key Vault。 有关详细信息,请参阅 Azure AI 服务安全性。
布局模型
从文档中提取文本、选择标记、文本样式、表结构和边界区域坐标。
- 对于此示例,需要 URI 中的一个文档文件。 在本快速入门中,可使用示例文档。
- 我们已将文件 URL 值添加到文件顶部附近的
formUrl
变量中。- 若要分析 URL 中的特定文件,请使用
beginAnalyzeDocuments
方法,并传入prebuilt-layout
作为模型 ID。
将以下代码示例添加到 index.js
文件中。 请确保使用 Azure 门户中文档智能实例中的值更新密钥和终结点变量:
const { AzureKeyCredential, DocumentAnalysisClient } = require("@azure/ai-form-recognizer");
// set `<your-key>` and `<your-endpoint>` variables with the values from the Azure portal.
const key = "<your-key>";
const endpoint = "<your-endpoint>";
// sample document
const formUrl = "https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/sample-layout.pdf"
async function main() {
const client = new DocumentAnalysisClient(endpoint, new AzureKeyCredential(key));
const poller = await client.beginAnalyzeDocumentFromUrl("prebuilt-layout", formUrl);
const {
pages,
tables
} = await poller.pollUntilDone();
if (pages.length <= 0) {
console.log("No pages were extracted from the document.");
} else {
console.log("Pages:");
for (const page of pages) {
console.log("- Page", page.pageNumber, `(unit: ${page.unit})`);
console.log(` ${page.width}x${page.height}, angle: ${page.angle}`);
console.log(` ${page.lines.length} lines, ${page.words.length} words`);
}
}
if (tables.length <= 0) {
console.log("No tables were extracted from the document.");
} else {
console.log("Tables:");
for (const table of tables) {
console.log(
`- Extracted table: ${table.columnCount} columns, ${table.rowCount} rows (${table.cells.length} cells)`
);
}
}
}
main().catch((error) => {
console.error("An error occurred:", error);
process.exit(1);
});
运行应用程序
将代码示例添加到应用程序后,运行你的程序:
导航到你的文档智能应用程序所在的文件夹 (doc-intel-app)。
在终端中键入以下命令:
node index.js
布局模型输出
下面是预期输出的代码段:
Pages:
- Page 1 (unit: inch)
8.5x11, angle: 0
69 lines, 425 words
Tables:
- Extracted table: 3 columns, 5 rows (15 cells)
若要查看整个输出,请访问 GitHub 上的 Azure 示例存储库,以查看布局模型输出。
预生成模型
在本示例中,我们将使用预生成发票模型分析发票。
提示
不止发票,还有几个预生成模型可供选择,每个模型都有自己的一组受支持的字段。 用于 analyze
操作的模型由要分析的文档类型确定。 请参阅模型数据提取。
const {
AzureKeyCredential,
DocumentAnalysisClient
} = require("@azure/ai-form-recognizer");
// set `<your-key>` and `<your-endpoint>` variables with the values from the Azure portal.
const key = "<your-key>";
const endpoint = "<your-endpoint>";
// sample document
invoiceUrl = "https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/sample-invoice.pdf"
async function main() {
const client = new DocumentAnalysisClient(endpoint, new AzureKeyCredential(key));
const poller = await client.beginAnalyzeDocumentFromUrl("prebuilt-invoice", invoiceUrl);
const {
pages,
tables
} = await poller.pollUntilDone();
if (pages.length <= 0) {
console.log("No pages were extracted from the document.");
} else {
console.log("Pages:");
for (const page of pages) {
console.log("- Page", page.pageNumber, `(unit: ${page.unit})`);
console.log(` ${page.width}x${page.height}, angle: ${page.angle}`);
console.log(` ${page.lines.length} lines, ${page.words.length} words`);
if (page.lines && page.lines.length > 0) {
console.log(" Lines:");
for (const line of page.lines) {
console.log(` - "${line.content}"`);
// The words of the line can also be iterated independently. The words are computed based on their
// corresponding spans.
for (const word of line.words()) {
console.log(` - "${word.content}"`);
}
}
}
}
}
if (tables.length <= 0) {
console.log("No tables were extracted from the document.");
} else {
console.log("Tables:");
for (const table of tables) {
console.log(
`- Extracted table: ${table.columnCount} columns, ${table.rowCount} rows (${table.cells.length} cells)`
);
}
}
}
main().catch((error) => {
console.error("An error occurred:", error);
process.exit(1);
});
运行应用程序
将代码示例添加到应用程序后,运行你的程序:
导航到你的文档智能应用程序所在的文件夹 (doc-intel-app)。
在终端中键入以下命令:
node index.js
预生成模型输出
下面是预期输出的代码段:
Vendor Name: CONTOSO LTD.
Customer Name: MICROSOFT CORPORATION
Invoice Date: 2019-11-15T00:00:00.000Z
Due Date: 2019-12-15T00:00:00.000Z
Items:
- <no product code>
Description: Test for 23 fields
Quantity: 1
Date: undefined
Unit: undefined
Unit Price: 1
Tax: undefined
Amount: 100
若要查看整个输出,请访问 GitHub 上的 Azure 示例存储库,以查看预生成模型输出。
const { AzureKeyCredential, DocumentAnalysisClient } = require("@azure/ai-form-recognizer");
// set `<your-key>` and `<your-endpoint>` variables with the values from the Azure portal.
const key = "<your-key>";
const endpoint = "<your-endpoint>";
// sample document
invoiceUrl = "https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/sample-invoice.pdf"
async function main() {
const client = new DocumentAnalysisClient(endpoint, new AzureKeyCredential(key));
const poller = await client.beginAnalyzeDocument("prebuilt-invoice", invoiceUrl);
const {
documents: [document],
} = await poller.pollUntilDone();
if (document) {
const {
vendorName,
customerName,
invoiceDate,
dueDate,
items,
subTotal,
previousUnpaidBalance,
totalTax,
amountDue,
} = document.fields;
// The invoice model has many fields. For details, *see* [Invoice model field extraction](../../concept-invoice.md#field-extraction)
console.log("Vendor Name:", vendorName && vendorName.value);
console.log("Customer Name:", customerName && customerName.value);
console.log("Invoice Date:", invoiceDate && invoiceDate.value);
console.log("Due Date:", dueDate && dueDate.value);
console.log("Items:");
for (const item of (items && items.values) || []) {
const { productCode, description, quantity, date, unit, unitPrice, tax, amount } =
item.properties;
console.log("-", (productCode && productCode.value) || "<no product code>");
console.log(" Description:", description && description.value);
console.log(" Quantity:", quantity && quantity.value);
console.log(" Date:", date && date.value);
console.log(" Unit:", unit && unit.value);
console.log(" Unit Price:", unitPrice && unitPrice.value);
console.log(" Tax:", tax && tax.value);
console.log(" Amount:", amount && amount.value);
}
console.log("Subtotal:", subTotal && subTotal.value);
console.log("Previous Unpaid Balance:", previousUnpaidBalance && previousUnpaidBalance.value);
console.log("Tax:", totalTax && totalTax.value);
console.log("Amount Due:", amountDue && amountDue.value);
} else {
throw new Error("Expected at least one receipt in the result.");
}
}
main().catch((error) => {
console.error("An error occurred:", error);
process.exit(1);
});
运行应用程序
将代码示例添加到应用程序后,运行你的程序:
导航到你的文档智能应用程序所在的文件夹 (doc-intel-app)。
在终端中键入以下命令:
node index.js
在本快速入门中,使用以下功能来分析和提取表单和文档中的数据:
先决条件
Azure 订阅 - 创建试用订阅
-
- 你的 Python 安装应包含 pip。 可以通过在命令行上运行
pip --version
来检查是否安装了 pip。 通过安装最新版本的 Python 获取 pip。
- 你的 Python 安装应包含 pip。 可以通过在命令行上运行
最新版本的 Visual Studio Code 或者你首选的 IDE。 有关详细信息,请参阅 Visual Studio Code 中的 Python 入门。
Azure AI 服务或文档智能资源。 获得 Azure 订阅后,在 Azure 门户中创建单服务或多服务文档智能资源以获取密钥和终结点。 可以使用免费定价层 (
F0
) 试用该服务,然后再升级到付费层进行生产。
提示
如果计划通过一个终结点/密钥访问多个 Azure AI 服务,请创建 Azure AI 服务资源。 请创建仅供文档智能访问的文档智能资源。 请注意,如果你打算使用 Microsoft Entra 身份验证,则需要单一服务资源。
部署资源后,选择“转到资源”。 需要从创建的资源获取密钥和终结点,以便将应用程序连接到文档智能 API。 稍后需要在本快速入门中将密钥和终结点粘贴到代码中:
设置
在本地环境中打开一个终端窗口,并使用 pip 安装适用于 Python 的 Azure 文档智能客户端库:
pip install azure-ai-formrecognizer==3.3.0
pip install azure-ai-formrecognizer==3.2.0b6
创建 Python 应用程序
若要与文档智能服务交互,需要创建 DocumentAnalysisClient
类的实例。 为此,你要通过 Azure 门户使用 key
创建一个 AzureKeyCredential
,并使用 AzureKeyCredential
和文档智能 endpoint
创建一个 DocumentAnalysisClient
实例。
重要
完成后,请记住将密钥从代码中删除,并且永远不要公开发布该密钥。 对于生产来说,请使用安全的方式存储和访问凭据,例如 Azure Key Vault。 有关详细信息,请参阅 Azure AI 服务安全性。
布局模型
从文档中提取文本、选择标记、文本样式、表结构和边界区域坐标。
- 对于此示例,需要 URI 中的一个文档文件。 在本快速入门中,可使用示例文档。
- 我们已将文件 URL 值添加到
analyze_layout
函数的formUrl
变量中。
若要分析 URL 中的给定文件,请使用 begin_analyze_document_from_url
方法并传递 prebuilt-layout
作为模型 ID。返回的值是一个 result
对象,其中包含有关提交的文档的数据。
将以下代码示例添加到 form_recognizer_quickstart.py 应用程序。 请确保使用 Azure 门户中表单识别器实例中的值更新密钥和终结点变量:
# import libraries
import os
from azure.ai.formrecognizer import DocumentAnalysisClient
from azure.core.credentials import AzureKeyCredential
# set `<your-endpoint>` and `<your-key>` variables with the values from the Azure portal
endpoint = "<your-endpoint>"
key = "<your-key>"
def format_polygon(polygon):
if not polygon:
return "N/A"
return ", ".join(["[{}, {}]".format(p.x, p.y) for p in polygon])
def analyze_layout():
# sample document
formUrl = "https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/sample-layout.pdf"
document_analysis_client = DocumentAnalysisClient(
endpoint=endpoint, credential=AzureKeyCredential(key)
)
poller = document_analysis_client.begin_analyze_document_from_url(
"prebuilt-layout", formUrl)
result = poller.result()
for idx, style in enumerate(result.styles):
print(
"Document contains {} content".format(
"handwritten" if style.is_handwritten else "no handwritten"
)
)
for page in result.pages:
print("----Analyzing layout from page #{}----".format(page.page_number))
print(
"Page has width: {} and height: {}, measured with unit: {}".format(
page.width, page.height, page.unit
)
)
for line_idx, line in enumerate(page.lines):
words = line.get_words()
print(
"...Line # {} has word count {} and text '{}' within bounding box '{}'".format(
line_idx,
len(words),
line.content,
format_polygon(line.polygon),
)
)
for word in words:
print(
"......Word '{}' has a confidence of {}".format(
word.content, word.confidence
)
)
for selection_mark in page.selection_marks:
print(
"...Selection mark is '{}' within bounding box '{}' and has a confidence of {}".format(
selection_mark.state,
format_polygon(selection_mark.polygon),
selection_mark.confidence,
)
)
for table_idx, table in enumerate(result.tables):
print(
"Table # {} has {} rows and {} columns".format(
table_idx, table.row_count, table.column_count
)
)
for region in table.bounding_regions:
print(
"Table # {} location on page: {} is {}".format(
table_idx,
region.page_number,
format_polygon(region.polygon),
)
)
for cell in table.cells:
print(
"...Cell[{}][{}] has content '{}'".format(
cell.row_index,
cell.column_index,
cell.content,
)
)
for region in cell.bounding_regions:
print(
"...content on page {} is within bounding box '{}'".format(
region.page_number,
format_polygon(region.polygon),
)
)
print("----------------------------------------")
if __name__ == "__main__":
analyze_layout()
运行应用程序
将代码示例添加到应用程序后,构建并运行程序:
导航到 form_recognizer_quickstart.py 文件所在的文件夹。
在终端中键入以下命令:
python form_recognizer_quickstart.py
布局模型输出
下面是预期输出的代码段:
----Analyzing layout from page #1----
Page has width: 8.5 and height: 11.0, measured with unit: inch
...Line # 0 has word count 2 and text 'UNITED STATES' within bounding box '[3.4915, 0.6828], [5.0116, 0.6828], [5.0116, 0.8265], [3.4915, 0.8265]'
......Word 'UNITED' has a confidence of 1.0
......Word 'STATES' has a confidence of 1.0
...Line # 1 has word count 4 and text 'SECURITIES AND EXCHANGE COMMISSION' within bounding box '[2.1937, 0.9061], [6.297, 0.9061], [6.297, 1.0498], [2.1937, 1.0498]'
......Word 'SECURITIES' has a confidence of 1.0
......Word 'AND' has a confidence of 1.0
......Word 'EXCHANGE' has a confidence of 1.0
......Word 'COMMISSION' has a confidence of 1.0
...Line # 2 has word count 3 and text 'Washington, D.C. 20549' within bounding box '[3.4629, 1.1179], [5.031, 1.1179], [5.031, 1.2483], [3.4629, 1.2483]'
......Word 'Washington,' has a confidence of 1.0
......Word 'D.C.' has a confidence of 1.0
若要查看整个输出,请访问 GitHub 上的 Azure 示例存储库,以查看布局模型输出。
将以下代码示例添加到 form_recognizer_quickstart.py 应用程序。 请确保使用 Azure 门户中表单识别器实例中的值更新密钥和终结点变量:
# import libraries
import os
from azure.ai.formrecognizer import DocumentAnalysisClient
from azure.core.credentials import AzureKeyCredential
# set `<your-endpoint>` and `<your-key>` variables with the values from the Azure portal
endpoint = "<your-endpoint>"
key = "<your-key>"
def analyze_layout():
# sample document
formUrl = "https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/sample-layout.pdf"
document_analysis_client = DocumentAnalysisClient(
endpoint=endpoint, credential=AzureKeyCredential(key)
)
poller = document_analysis_client.begin_analyze_document_from_url(
"prebuilt-layout", formUrl
)
result = poller.result()
for idx, style in enumerate(result.styles):
print(
"Document contains {} content".format(
"handwritten" if style.is_handwritten else "no handwritten"
)
)
for page in result.pages:
print("----Analyzing layout from page #{}----".format(page.page_number))
print(
"Page has width: {} and height: {}, measured with unit: {}".format(
page.width, page.height, page.unit
)
)
for line_idx, line in enumerate(page.lines):
words = line.get_words()
print(
"...Line # {} has word count {} and text '{}' within bounding polygon '{}'".format(
line_idx,
len(words),
line.content,
format_polygon(line.polygon),
)
)
for word in words:
print(
"......Word '{}' has a confidence of {}".format(
word.content, word.confidence
)
)
for selection_mark in page.selection_marks:
print(
"...Selection mark is '{}' within bounding polygon '{}' and has a confidence of {}".format(
selection_mark.state,
format_polygon(selection_mark.polygon),
selection_mark.confidence,
)
)
for table_idx, table in enumerate(result.tables):
print(
"Table # {} has {} rows and {} columns".format(
table_idx, table.row_count, table.column_count
)
)
for region in table.bounding_regions:
print(
"Table # {} location on page: {} is {}".format(
table_idx,
region.page_number,
format_polygon(region.polygon),
)
)
for cell in table.cells:
print(
"...Cell[{}][{}] has content '{}'".format(
cell.row_index,
cell.column_index,
cell.content,
)
)
for region in cell.bounding_regions:
print(
"...content on page {} is within bounding polygon '{}'".format(
region.page_number,
format_polygon(region.polygon),
)
)
print("----------------------------------------")
if __name__ == "__main__":
analyze_layout()
运行应用程序
将代码示例添加到应用程序后,构建并运行程序:
导航到 form_recognizer_quickstart.py 文件所在的文件夹。
在终端中键入以下命令:
python form_recognizer_quickstart.py
预生成模型
使用预生成模型分析和提取特定文档类型的公共字段。 在本示例中,我们将使用预生成发票模型分析发票。
提示
不止发票,还有几个预生成模型可供选择,每个模型都有自己的一组受支持的字段。 用于 analyze
操作的模型由要分析的文档类型确定。 请参阅模型数据提取。
若要分析 URI 中的给定文件,请使用 begin_analyze_document_from_url
方法并传递 prebuilt-invoice
作为模型 ID。返回的值是一个 result
对象,其中包含有关提交的文档的数据。
将以下代码示例添加到 form_recognizer_quickstart.py 应用程序。 请确保使用 Azure 门户中表单识别器实例中的值更新密钥和终结点变量:
# import libraries
import os
from azure.ai.formrecognizer import DocumentAnalysisClient
from azure.core.credentials import AzureKeyCredential
# set `<your-endpoint>` and `<your-key>` variables with the values from the Azure portal
endpoint = "<your-endpoint>"
key = "<your-key>"
def format_bounding_region(bounding_regions):
if not bounding_regions:
return "N/A"
return ", ".join(
"Page #{}: {}".format(region.page_number, format_polygon(region.polygon))
for region in bounding_regions
)
def format_polygon(polygon):
if not polygon:
return "N/A"
return ", ".join(["[{}, {}]".format(p.x, p.y) for p in polygon])
def analyze_invoice():
invoiceUrl = "https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/sample-invoice.pdf"
document_analysis_client = DocumentAnalysisClient(
endpoint=endpoint, credential=AzureKeyCredential(key)
)
poller = document_analysis_client.begin_analyze_document_from_url(
"prebuilt-invoice", invoiceUrl
)
invoices = poller.result()
for idx, invoice in enumerate(invoices.documents):
print("--------Recognizing invoice #{}--------".format(idx + 1))
vendor_name = invoice.fields.get("VendorName")
if vendor_name:
print(
"Vendor Name: {} has confidence: {}".format(
vendor_name.value, vendor_name.confidence
)
)
vendor_address = invoice.fields.get("VendorAddress")
if vendor_address:
print(
"Vendor Address: {} has confidence: {}".format(
vendor_address.value, vendor_address.confidence
)
)
vendor_address_recipient = invoice.fields.get("VendorAddressRecipient")
if vendor_address_recipient:
print(
"Vendor Address Recipient: {} has confidence: {}".format(
vendor_address_recipient.value, vendor_address_recipient.confidence
)
)
customer_name = invoice.fields.get("CustomerName")
if customer_name:
print(
"Customer Name: {} has confidence: {}".format(
customer_name.value, customer_name.confidence
)
)
customer_id = invoice.fields.get("CustomerId")
if customer_id:
print(
"Customer Id: {} has confidence: {}".format(
customer_id.value, customer_id.confidence
)
)
customer_address = invoice.fields.get("CustomerAddress")
if customer_address:
print(
"Customer Address: {} has confidence: {}".format(
customer_address.value, customer_address.confidence
)
)
customer_address_recipient = invoice.fields.get("CustomerAddressRecipient")
if customer_address_recipient:
print(
"Customer Address Recipient: {} has confidence: {}".format(
customer_address_recipient.value,
customer_address_recipient.confidence,
)
)
invoice_id = invoice.fields.get("InvoiceId")
if invoice_id:
print(
"Invoice Id: {} has confidence: {}".format(
invoice_id.value, invoice_id.confidence
)
)
invoice_date = invoice.fields.get("InvoiceDate")
if invoice_date:
print(
"Invoice Date: {} has confidence: {}".format(
invoice_date.value, invoice_date.confidence
)
)
invoice_total = invoice.fields.get("InvoiceTotal")
if invoice_total:
print(
"Invoice Total: {} has confidence: {}".format(
invoice_total.value, invoice_total.confidence
)
)
due_date = invoice.fields.get("DueDate")
if due_date:
print(
"Due Date: {} has confidence: {}".format(
due_date.value, due_date.confidence
)
)
purchase_order = invoice.fields.get("PurchaseOrder")
if purchase_order:
print(
"Purchase Order: {} has confidence: {}".format(
purchase_order.value, purchase_order.confidence
)
)
billing_address = invoice.fields.get("BillingAddress")
if billing_address:
print(
"Billing Address: {} has confidence: {}".format(
billing_address.value, billing_address.confidence
)
)
billing_address_recipient = invoice.fields.get("BillingAddressRecipient")
if billing_address_recipient:
print(
"Billing Address Recipient: {} has confidence: {}".format(
billing_address_recipient.value,
billing_address_recipient.confidence,
)
)
shipping_address = invoice.fields.get("ShippingAddress")
if shipping_address:
print(
"Shipping Address: {} has confidence: {}".format(
shipping_address.value, shipping_address.confidence
)
)
shipping_address_recipient = invoice.fields.get("ShippingAddressRecipient")
if shipping_address_recipient:
print(
"Shipping Address Recipient: {} has confidence: {}".format(
shipping_address_recipient.value,
shipping_address_recipient.confidence,
)
)
print("Invoice items:")
for idx, item in enumerate(invoice.fields.get("Items").value):
print("...Item #{}".format(idx + 1))
item_description = item.value.get("Description")
if item_description:
print(
"......Description: {} has confidence: {}".format(
item_description.value, item_description.confidence
)
)
item_quantity = item.value.get("Quantity")
if item_quantity:
print(
"......Quantity: {} has confidence: {}".format(
item_quantity.value, item_quantity.confidence
)
)
unit = item.value.get("Unit")
if unit:
print(
"......Unit: {} has confidence: {}".format(
unit.value, unit.confidence
)
)
unit_price = item.value.get("UnitPrice")
if unit_price:
print(
"......Unit Price: {} has confidence: {}".format(
unit_price.value, unit_price.confidence
)
)
product_code = item.value.get("ProductCode")
if product_code:
print(
"......Product Code: {} has confidence: {}".format(
product_code.value, product_code.confidence
)
)
item_date = item.value.get("Date")
if item_date:
print(
"......Date: {} has confidence: {}".format(
item_date.value, item_date.confidence
)
)
tax = item.value.get("Tax")
if tax:
print(
"......Tax: {} has confidence: {}".format(tax.value, tax.confidence)
)
amount = item.value.get("Amount")
if amount:
print(
"......Amount: {} has confidence: {}".format(
amount.value, amount.confidence
)
)
subtotal = invoice.fields.get("SubTotal")
if subtotal:
print(
"Subtotal: {} has confidence: {}".format(
subtotal.value, subtotal.confidence
)
)
total_tax = invoice.fields.get("TotalTax")
if total_tax:
print(
"Total Tax: {} has confidence: {}".format(
total_tax.value, total_tax.confidence
)
)
previous_unpaid_balance = invoice.fields.get("PreviousUnpaidBalance")
if previous_unpaid_balance:
print(
"Previous Unpaid Balance: {} has confidence: {}".format(
previous_unpaid_balance.value, previous_unpaid_balance.confidence
)
)
amount_due = invoice.fields.get("AmountDue")
if amount_due:
print(
"Amount Due: {} has confidence: {}".format(
amount_due.value, amount_due.confidence
)
)
service_start_date = invoice.fields.get("ServiceStartDate")
if service_start_date:
print(
"Service Start Date: {} has confidence: {}".format(
service_start_date.value, service_start_date.confidence
)
)
service_end_date = invoice.fields.get("ServiceEndDate")
if service_end_date:
print(
"Service End Date: {} has confidence: {}".format(
service_end_date.value, service_end_date.confidence
)
)
service_address = invoice.fields.get("ServiceAddress")
if service_address:
print(
"Service Address: {} has confidence: {}".format(
service_address.value, service_address.confidence
)
)
service_address_recipient = invoice.fields.get("ServiceAddressRecipient")
if service_address_recipient:
print(
"Service Address Recipient: {} has confidence: {}".format(
service_address_recipient.value,
service_address_recipient.confidence,
)
)
remittance_address = invoice.fields.get("RemittanceAddress")
if remittance_address:
print(
"Remittance Address: {} has confidence: {}".format(
remittance_address.value, remittance_address.confidence
)
)
remittance_address_recipient = invoice.fields.get("RemittanceAddressRecipient")
if remittance_address_recipient:
print(
"Remittance Address Recipient: {} has confidence: {}".format(
remittance_address_recipient.value,
remittance_address_recipient.confidence,
)
)
print("----------------------------------------")
if __name__ == "__main__":
analyze_invoice()
运行应用程序
将代码示例添加到应用程序后,构建并运行程序:
导航到 form_recognizer_quickstart.py 文件所在的文件夹。
在终端中键入以下命令:
python form_recognizer_quickstart.py
预生成模型输出
下面是预期输出的代码段:
--------Recognizing invoice #1--------
Vendor Name: CONTOSO LTD. has confidence: 0.919
Vendor Address: 123 456th St New York, NY, 10001 has confidence: 0.907
Vendor Address Recipient: Contoso Headquarters has confidence: 0.919
Customer Name: MICROSOFT CORPORATION has confidence: 0.84
Customer Id: CID-12345 has confidence: 0.956
Customer Address: 123 Other St, Redmond WA, 98052 has confidence: 0.909
Customer Address Recipient: Microsoft Corp has confidence: 0.917
Invoice Id: INV-100 has confidence: 0.972
Invoice Date: 2019-11-15 has confidence: 0.971
Invoice Total: CurrencyValue(amount=110.0, symbol=$) has confidence: 0.97
Due Date: 2019-12-15 has confidence: 0.973
若要查看整个输出,请访问 GitHub 上的 Azure 示例存储库,以查看预生成模型输出。
将以下代码示例添加到 form_recognizer_quickstart.py 应用程序。 请确保使用 Azure 门户中表单识别器实例中的值更新密钥和终结点变量:
# import libraries
import os
from azure.ai.formrecognizer import DocumentAnalysisClient
from azure.core.credentials import AzureKeyCredential
# set `<your-endpoint>` and `<your-key>` variables with the values from the Azure portal
endpoint = "<your-endpoint>"
key = "<your-key>"
def format_polygon(polygon):
if not polygon:
return "N/A"
return ", ".join(["[{}, {}]".format(p.x, p.y) for p in polygon])
def analyze_layout():
# sample document
formUrl = "https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/sample-layout.pdf"
document_analysis_client = DocumentAnalysisClient(
endpoint=endpoint, credential=AzureKeyCredential(key)
)
poller = document_analysis_client.begin_analyze_document_from_url(
"prebuilt-layout", formUrl
)
result = poller.result()
for idx, style in enumerate(result.styles):
print(
"Document contains {} content".format(
"handwritten" if style.is_handwritten else "no handwritten"
)
)
for page in result.pages:
print("----Analyzing layout from page #{}----".format(page.page_number))
print(
"Page has width: {} and height: {}, measured with unit: {}".format(
page.width, page.height, page.unit
)
)
for line_idx, line in enumerate(page.lines):
words = line.get_words()
print(
"...Line # {} has word count {} and text '{}' within bounding polygon '{}'".format(
line_idx,
len(words),
line.content,
format_polygon(line.polygon),
)
)
for word in words:
print(
"......Word '{}' has a confidence of {}".format(
word.content, word.confidence
)
)
for selection_mark in page.selection_marks:
print(
"...Selection mark is '{}' within bounding polygon '{}' and has a confidence of {}".format(
selection_mark.state,
format_polygon(selection_mark.polygon),
selection_mark.confidence,
)
)
for table_idx, table in enumerate(result.tables):
print(
"Table # {} has {} rows and {} columns".format(
table_idx, table.row_count, table.column_count
)
)
for region in table.bounding_regions:
print(
"Table # {} location on page: {} is {}".format(
table_idx,
region.page_number,
format_polygon(region.polygon),
)
)
for cell in table.cells:
print(
"...Cell[{}][{}] has content '{}'".format(
cell.row_index,
cell.column_index,
cell.content,
)
)
for region in cell.bounding_regions:
print(
"...content on page {} is within bounding polygon '{}'".format(
region.page_number,
format_polygon(region.polygon),
)
)
print("----------------------------------------")
if __name__ == "__main__":
analyze_layout()
运行应用程序
将代码示例添加到应用程序后,构建并运行程序:
导航到 form_recognizer_quickstart.py 文件所在的文件夹。
在终端中键入以下命令:
python form_recognizer_quickstart.py
| 文档智能 REST API | 支持的 Azure SDK |
| 文档智能 REST API | 支持的 Azure SDK |
在本快速入门中,了解如何使用文档智能 REST API 分析和提取文档中的数据和值:
先决条件
Azure 订阅 - 创建试用订阅
已安装 curl 命令行工具。
PowerShell 7.* 版本(或类似的命令行应用程序。):
若要检查 PowerShell 版本,请键入与操作系统相关的以下命令:
- Windows:
Get-Host | Select-Object Version
- macOS 或 Linux:
$PSVersionTable
- Windows:
文档智能(单服务)或 Azure AI 服务(多服务)资源。 获得 Azure 订阅后,在 Azure 门户中创建单服务或多服务文档智能资源以获取密钥和终结点。 可以使用免费定价层 (
F0
) 试用该服务,然后再升级到付费层进行生产。
提示
如果计划通过一个终结点/密钥访问多个 Azure AI 服务,请创建 Azure AI 服务资源。 请创建仅供文档智能访问的文档智能资源。 请注意,如果你打算使用 Microsoft Entra 身份验证,则需要单一服务资源。
部署资源后,选择“转到资源”。 需要从创建的资源获取密钥和终结点,以便将应用程序连接到文档智能 API。 稍后需要在本快速入门中将密钥和终结点粘贴到代码中:
分析文档并获取结果
POST 请求用于通过预生成或自定义模型分析文档。 GET 请求用于检索文档分析调用的结果。 modelId
与 POST 一起使用,resultId
与 GET 操作一起使用。
分析文档(POST 请求)
在运行 cURL 命令之前,请对 POST 请求进行以下更改:
将
{endpoint}
替换为你的 Azure 门户文档智能实例中的终结点值。将
{key}
替换为你的 Azure 门户文档智能实例中的密钥值。使用下表作为参考,将
{modelID}
和{your-document-url}
替换为所需的值。你需要在 URL 中提供文档文件。 对于本快速入门,你可以使用下表中为每项功能提供的示例表单:
示例文档
功能 | {modelID} | {your-document-url} |
---|---|---|
常规文档 | 预生成文档 | https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/sample-layout.pdf |
读取 | prebuilt-read | https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/rest-api/read.png |
布局 | 预生成布局 | https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/rest-api/layout.png |
医疗保险卡 | prebuilt-healthInsuranceCard.us | https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/rest-api/insurance-card.png |
W-2 | prebuilt-tax.us.w2 | https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/rest-api/w2.png |
发票 | 预生成的发票 | https://github.com/Azure-Samples/cognitive-services-REST-api-samples/raw/master/curl/form-recognizer/rest-api/invoice.pdf |
回执 | prebuilt-receipt | https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/rest-api/receipt.png |
身份文档 | prebuilt-idDocument | https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/rest-api/identity_documents.png |
名片 | prebuilt-businessCard | https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/de5e0d8982ab754823c54de47a47e8e499351523/curl/form-recognizer/rest-api/business_card.jpg |
重要
完成后,请记住将密钥从代码中删除,并且永远不要公开发布该密钥。 对于生产来说,请使用安全的方式存储和访问凭据,例如 Azure Key Vault。 有关详细信息,请参阅 Azure AI 服务安全性。
POST 请求
curl -v -i POST "{endpoint}/formrecognizer/documentModels/{modelID}:analyze?api-version=2023-07-31" -H "Content-Type: application/json" -H "Ocp-Apim-Subscription-Key: {key}" --data-ascii "{'urlSource': '{your-document-url}'}"
curl -v -i POST "{endpoint}/formrecognizer/documentModels/{modelId}:analyze?api-version=2022-08-31" -H "Content-Type: application/json" -H "Ocp-Apim-Subscription-Key: {key}" --data-ascii "{'urlSource': '{your-document-url}'}"
POST 响应 (resultID)
你将收到 202 (Success)
响应,其中包含只读“Operation-Location”标头。 此标头的值包含 resultID
,可以查询该 ID,从而使用包含你同一资源订阅密钥的 GET 请求来获取异步操作的状态并检索结果:
获取分析结果(GET 请求)
调用 Analyze document
API 后,调用获取分析结果 API 以获取操作的状态和提取的数据。 运行该命令之前,请进行以下更改:
调用 Analyze document
API 后,调用获取分析结果 API 以获取操作的状态和提取的数据。 运行该命令之前,请进行以下更改:
替换 POST 响应中的
{resultID}
Operation-location 标头。将
{key}
替换为 Azure 门户中你的文档智能实例中的密钥值。
GET 请求
curl -v -X GET "{endpoint}/formrecognizer/documentModels/{modelId}/analyzeResults/{resultId}?api-version=2023-07-31" -H "Ocp-Apim-Subscription-Key: {key}"
curl -v -X GET "{endpoint}/formrecognizer/documentModels/{modelId}/analyzeResults/{resultId}?api-version=2022-08-31" -H "Ocp-Apim-Subscription-Key: {key}"
检查响应
你将收到包含 JSON 输出的 200 (Success)
响应。 第一个字段 "status"
指示操作的状态。 如果操作未完成,"status"
的值为 "running"
或 "notStarted"
,你应当采用手动方式或通过脚本再次调用该 API。 我们建议两次调用间隔一秒或更长时间。
针对预生成发票的示例响应
{
"status": "succeeded",
"createdDateTime": "2023-08-25T19:31:37Z",
"lastUpdatedDateTime": "2023-08-25T19:31:43Z",
"analyzeResult": {
"apiVersion": "2023-07-31",
"modelId": "prebuilt-invoice",
"stringIndexType": "textElements"...
..."pages": [
{
"pageNumber": 1,
"angle": 0,
"width": 8.5,
"height": 11,
"unit": "inch",
"words": [
{
"content": "CONTOSO",
"boundingBox": [
0.5911,
0.6857,
1.7451,
0.6857,
1.7451,
0.8664,
0.5911,
0.8664
],
"confidence": 1,
"span": {
"offset": 0,
"length": 7
}
}],
}]
}
}
{
"status": "succeeded",
"createdDateTime": "2022-09-25T19:31:37Z",
"lastUpdatedDateTime": "2022-09-25T19:31:43Z",
"analyzeResult": {
"apiVersion": "2022-08-31",
"modelId": "prebuilt-invoice",
"stringIndexType": "textElements"...
..."pages": [
{
"pageNumber": 1,
"angle": 0,
"width": 8.5,
"height": 11,
"unit": "inch",
"words": [
{
"content": "CONTOSO",
"boundingBox": [
0.5911,
0.6857,
1.7451,
0.6857,
1.7451,
0.8664,
0.5911,
0.8664
],
"confidence": 1,
"span": {
"offset": 0,
"length": 7
}
}],
}]
}
}
支持的文档字段
预生成模型提取预定义的文档字段集。 请参阅模型数据提取来了解提取的字段名称、类型、说明和示例。
完成了,恭喜!
在本快速入门中,你使用了文档智能模型来分析各种表单和文档。 接下来,浏览文档智能工作室和参考文档,深入了解文档智能 API。
后续步骤
- 若要获得增强的体验和高级模型质量,请尝试文档智能工作室
此内容适用于: v2.1
开始通过你选择的编程语言或 REST API 来使用 Azure AI 文档智能。 文档智能是一款基于云的 Azure AI 服务,它使用机器学习从文档中提取键值对、文本和表。 我们建议你在学习该技术时使用免费服务。 请记住,每月的免费页数限于 500。
若要详细了解文档智能功能和开发选项,请访问我们的概述页。
在本快速入门中,你将使用以下 API 提取表单和文档中的结构化数据:
先决条件
Azure 订阅 - 创建试用订阅。
Visual Studio IDE 的当前版本。
Azure AI 服务或文档智能资源。 具有 Azure 订阅后,在 Azure 门户中创建单服务或多服务文档智能资源以获取密钥和终结点。 可以使用免费定价层 (
F0
) 试用该服务,然后再升级到付费层进行生产。提示
如果计划通过一个终结点/密钥访问多个 Azure AI 服务,请创建 Azure AI 服务资源。 请创建仅供文档智能访问的文档智能资源。 请注意,如果你打算使用 Microsoft Entra 身份验证,则需要单一服务资源。
部署资源后,选择“转到资源”。 需要从创建的资源获取密钥和终结点,以便将应用程序连接到文档智能 API。 稍后需要在本快速入门中将密钥和终结点粘贴到代码中:
设置
启动 Visual Studio 2019。
在“开始”页上,选择“创建新项目”。
在“创建新项目”页面上,在搜索框中输入“控制台”。 选择“控制台应用程序”模板,然后选择“下一步”。
在“配置新项目”对话框中,在项目名称框中输入
formRecognizer_quickstart
。 然后选择“下一步”。在“附加信息”对话框窗口中,选择“.NET 5.0 (当前版)”,然后选择“创建”。
使用 NuGet 安装客户端库
右键单击 formRecognizer_quickstart 项目,然后选择“管理 NuGet 包...”。
选择“浏览”选项卡,并键入 Azure.AI.FormRecognizer。
从下拉菜单中选择版本 3.1.1,然后选择安装。
生成应用程序
若要与文档智能服务交互,需要创建 FormRecognizerClient
类的实例。 为此,你要使用密钥创建 AzureKeyCredential
,并使用 AzureKeyCredential
和文档智能 endpoint
创建 FormRecognizerClient
实例。
注意
- 从 .NET 6 开始,使用
console
模板的新项目将生成与以前版本不同的新程序样式。 - 新的输出使用最新的 C# 功能,这些功能简化了你需要编写的代码。
- 使用较新版本时,只需编写
Main
方法的主体。 无需包括顶级语句、全局 using 指令或隐式 using 指令。 - 有关详细信息,请参阅新的 C# 模板生成顶级语句。
打开 Program.cs 文件。
包括以下 using 指令:
using Azure;
using Azure.AI.FormRecognizer;
using Azure.AI.FormRecognizer.Models;
using System.Threading.Tasks;
- 设置
endpoint
和key
环境变量,并创建AzureKeyCredential
和FormRecognizerClient
实例:
private static readonly string endpoint = "your-form-recognizer-endpoint";
private static readonly string key = "your-api-key";
private static readonly AzureKeyCredential credential = new AzureKeyCredential(key);
删除
Console.Writeline("Hello World!");
行,并向 Program.cs 文件中添加下列“试用”代码示例之一:选择要复制并粘贴到应用程序的 Main 方法中的代码示例:
重要
完成后,请记住将密钥从代码中删除,并且永远不要公开发布该密钥。 对于生产来说,请使用安全的方式存储和访问凭据,例如 Azure Key Vault。 有关详细信息,请参阅 Azure AI 服务安全性一文。
试用:布局模型
从文档中提取文本、选择标记、文本类型和表结构及其边界区域坐标。
- 对于此示例,需要 URI 中的一个文档文件。 在本快速入门中,可使用示例文档。
- 我们已将文件 URI 值添加到
formUri
变量中。 - 若要从 URI 中的特定文件中提取布局,请使用
StartRecognizeContentFromUriAsync
方法。
将以下代码添加到布局应用程序 Program.cs 文件:
FormRecognizerClient recognizerClient = AuthenticateClient();
Task recognizeContent = RecognizeContent(recognizerClient);
Task.WaitAll(recognizeContent);
private static FormRecognizerClient AuthenticateClient()
{
var credential = new AzureKeyCredential(key);
var client = new FormRecognizerClient(new Uri(endpoint), credential);
return client;
}
private static async Task RecognizeContent(FormRecognizerClient recognizerClient)
{
string formUrl = "https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/sample-layout.pdf";
FormPageCollection formPages = await recognizerClient
.StartRecognizeContentFromUri(new Uri(formUrl))
.WaitForCompletionAsync();
foreach (FormPage page in formPages)
{
Console.WriteLine($"Form Page {page.PageNumber} has {page.Lines.Count} lines.");
for (int i = 0; i < page.Lines.Count; i++)
{
FormLine line = page.Lines[i];
Console.WriteLine($" Line {i} has {line.Words.Count} word{(line.Words.Count > 1 ? "s" : "")}, and text: '{line.Text}'.");
}
for (int i = 0; i < page.Tables.Count; i++)
{
FormTable table = page.Tables[i];
Console.WriteLine($"Table {i} has {table.RowCount} rows and {table.ColumnCount} columns.");
foreach (FormTableCell cell in table.Cells)
{
Console.WriteLine($" Cell ({cell.RowIndex}, {cell.ColumnIndex}) contains text: '{cell.Text}'.");
}
}
}
}
}
}
试用:预生成模型
此示例演示了如何以发票为例,使用预先训练的模型来分析某些类型常见文档中的数据。
选择一个预生成模型
不止发票,还有几个预生成模型可供选择,每个模型都有自己的一组受支持的字段。 用于分析操作的模型取决于要分析的文档类型。 下面是文档智能服务目前支持的预生成模型:
将以下代码添加到预生成发票应用程序的 Program.cs 文件方法
FormRecognizerClient recognizerClient = AuthenticateClient();
Task analyzeinvoice = AnalyzeInvoice(recognizerClient, invoiceUrl);
Task.WaitAll(analyzeinvoice);
private static FormRecognizerClient AuthenticateClient() {
var credential = new AzureKeyCredential(key);
var client = new FormRecognizerClient(new Uri(endpoint), credential);
return client;
}
static string invoiceUrl = "https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/sample-invoice.pdf";
private static async Task AnalyzeInvoice(FormRecognizerClient recognizerClient, string invoiceUrl) {
var options = new RecognizeInvoicesOptions() {
Locale = "en-US"
};
RecognizedFormCollection invoices = await recognizerClient.StartRecognizeInvoicesFromUriAsync(new Uri(invoiceUrl), options).WaitForCompletionAsync();
RecognizedForm invoice = invoices[0];
FormField invoiceIdField;
if (invoice.Fields.TryGetValue("InvoiceId", out invoiceIdField)) {
if (invoiceIdField.Value.ValueType == FieldValueType.String) {
string invoiceId = invoiceIdField.Value.AsString();
Console.WriteLine($" Invoice Id: '{invoiceId}', with confidence {invoiceIdField.Confidence}");
}
}
FormField invoiceDateField;
if (invoice.Fields.TryGetValue("InvoiceDate", out invoiceDateField)) {
if (invoiceDateField.Value.ValueType == FieldValueType.Date) {
DateTime invoiceDate = invoiceDateField.Value.AsDate();
Console.WriteLine($" Invoice Date: '{invoiceDate}', with confidence {invoiceDateField.Confidence}");
}
}
FormField dueDateField;
if (invoice.Fields.TryGetValue("DueDate", out dueDateField)) {
if (dueDateField.Value.ValueType == FieldValueType.Date) {
DateTime dueDate = dueDateField.Value.AsDate();
Console.WriteLine($" Due Date: '{dueDate}', with confidence {dueDateField.Confidence}");
}
}
FormField vendorNameField;
if (invoice.Fields.TryGetValue("VendorName", out vendorNameField)) {
if (vendorNameField.Value.ValueType == FieldValueType.String) {
string vendorName = vendorNameField.Value.AsString();
Console.WriteLine($" Vendor Name: '{vendorName}', with confidence {vendorNameField.Confidence}");
}
}
FormField vendorAddressField;
if (invoice.Fields.TryGetValue("VendorAddress", out vendorAddressField)) {
if (vendorAddressField.Value.ValueType == FieldValueType.String) {
string vendorAddress = vendorAddressField.Value.AsString();
Console.WriteLine($" Vendor Address: '{vendorAddress}', with confidence {vendorAddressField.Confidence}");
}
}
FormField customerNameField;
if (invoice.Fields.TryGetValue("CustomerName", out customerNameField)) {
if (customerNameField.Value.ValueType == FieldValueType.String) {
string customerName = customerNameField.Value.AsString();
Console.WriteLine($" Customer Name: '{customerName}', with confidence {customerNameField.Confidence}");
}
}
FormField customerAddressField;
if (invoice.Fields.TryGetValue("CustomerAddress", out customerAddressField)) {
if (customerAddressField.Value.ValueType == FieldValueType.String) {
string customerAddress = customerAddressField.Value.AsString();
Console.WriteLine($" Customer Address: '{customerAddress}', with confidence {customerAddressField.Confidence}");
}
}
FormField customerAddressRecipientField;
if (invoice.Fields.TryGetValue("CustomerAddressRecipient", out customerAddressRecipientField)) {
if (customerAddressRecipientField.Value.ValueType == FieldValueType.String) {
string customerAddressRecipient = customerAddressRecipientField.Value.AsString();
Console.WriteLine($" Customer address recipient: '{customerAddressRecipient}', with confidence {customerAddressRecipientField.Confidence}");
}
}
FormField invoiceTotalField;
if (invoice.Fields.TryGetValue("InvoiceTotal", out invoiceTotalField)) {
if (invoiceTotalField.Value.ValueType == FieldValueType.Float) {
float invoiceTotal = invoiceTotalField.Value.AsFloat();
Console.WriteLine($" Invoice Total: '{invoiceTotal}', with confidence {invoiceTotalField.Confidence}");
}
}
}
}
}
运行应用程序
选择 formRecognizer_quickstart 旁的绿色“启动”按钮,来生成并运行程序,或者按 F5。
在本快速入门中,你将使用以下 API 提取表单和文档中的结构化数据:
先决条件
Azure 订阅 - 创建试用订阅。
Java 开发工具包 (JDK) 版本 8 或更高版本。 有关详细信息,请参阅支持的 Java 版本和更新计划。
Azure AI 服务或文档智能资源。 具有 Azure 订阅后,在 Azure 门户中创建单服务或多服务文档智能资源以获取密钥和终结点。 可以使用免费定价层 (
F0
) 试用该服务,然后再升级到付费层进行生产。部署资源后,选择“转到资源”。 需要从创建的资源获取密钥和终结点,以便将应用程序连接到文档智能 API。 稍后需要在本快速入门中将密钥和终结点粘贴到代码中:
设置
创建新的 Gradle 项目
在控制台窗口(例如 cmd、PowerShell 或 Bash)中,为应用创建名为 form-recognizer-app 的新目录,并导航到该目录。
mkdir form-recognizer-app && form-recognizer-app
从工作目录运行
gradle init
命令。 此命令将创建 Gradle 的基本生成文件,其中包括 build.gradle.kts - 在运行时将使用该文件创建并配置应用程序。gradle init --type basic
当提示你选择一个 DSL 时,选择 Kotlin。
接受默认项目名称 (form-recognizer-app)
安装客户端库
本快速入门使用 Gradle 依赖项管理器。 可以在 Maven 中央存储库中找到客户端库以及其他依赖项管理器的信息。
在项目的 build.gradle.kts 文件中,以 implementation
语句的形式包含客户端库及所需的插件和设置。
plugins {
java
application
}
application {
mainClass.set("FormRecognizer")
}
repositories {
mavenCentral()
}
dependencies {
implementation(group = "com.azure", name = "azure-ai-formrecognizer", version = "3.1.1")
}
创建 Java 文件
在工作目录中运行以下命令:
mkdir -p src/main/java
创建以下目录结构:
导航到 Java 目录,创建一个名为 FormRecognizer.java 的文件。 在你偏好的编辑器或 IDE 中打开该文件,并添加以下包声明和 import
语句:
import com.azure.ai.formrecognizer.*;
import com.azure.ai.formrecognizer.models.*;
import java.util.concurrent.atomic.AtomicReference;
import java.util.List;
import java.util.Map;
import java.time.LocalDate;
import com.azure.core.credential.AzureKeyCredential;
import com.azure.core.http.rest.PagedIterable;
import com.azure.core.util.Context;
import com.azure.core.util.polling.SyncPoller;
选择要复制并粘贴到应用程序的 main 方法中的代码示例:
重要
完成后,请记住将密钥从代码中删除,并且永远不要公开发布该密钥。 对于生产来说,请使用安全的方式存储和访问凭据,例如 Azure Key Vault。 有关详细信息,请参阅 Azure AI 服务安全性。
试用:布局模型
从文档中提取文本、选择标记、文本类型和表结构及其边界区域坐标。
- 对于此示例,需要 URI 中的一个文档文件。 在本快速入门中,可使用示例文档。
- 若要分析 URI 中的特定文件,将使用
beginRecognizeContentFromUrl
方法。 - 我们已将文件 URI 值添加到 main 方法的
formUrl
变量中。
使用以下代码更新应用程序的 FormRecognizer 类(请务必使用 Azure 门户文档智能实例中的值更新密钥和终结点变量):
static final String key = "PASTE_YOUR_FORM_RECOGNIZER_KEY_HERE";
static final String endpoint = "PASTE_YOUR_FORM_RECOGNIZER_ENDPOINT_HERE";
public static void main(String[] args) {FormRecognizerClient recognizerClient = new FormRecognizerClientBuilder()
.credential(new AzureKeyCredential(key)).endpoint(endpoint).buildClient();
String formUrl = "https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/sample-layout.pdf";
System.out.println("Get form content...");
GetContent(recognizerClient, formUrl);
}
private static void GetContent(FormRecognizerClient recognizerClient, String invoiceUri) {
String analyzeFilePath = invoiceUri;
SyncPoller<FormRecognizerOperationResult, List<FormPage>> recognizeContentPoller = recognizerClient
.beginRecognizeContentFromUrl(analyzeFilePath);
List<FormPage> contentResult = recognizeContentPoller.getFinalResult();
// </snippet_getcontent_call>
// <snippet_getcontent_print>
contentResult.forEach(formPage -> {
// Table information
System.out.println("----Recognizing content ----");
System.out.printf("Has width: %f and height: %f, measured with unit: %s.%n", formPage.getWidth(),
formPage.getHeight(), formPage.getUnit());
formPage.getTables().forEach(formTable -> {
System.out.printf("Table has %d rows and %d columns.%n", formTable.getRowCount(),
formTable.getColumnCount());
formTable.getCells().forEach(formTableCell -> {
System.out.printf("Cell has text %s.%n", formTableCell.getText());
});
System.out.println();
});
});
}
试用:预生成模型
此示例演示了如何以发票为例,使用预先训练的模型来分析某些类型常见文档中的数据。
选择一个预生成模型
不止发票,还有几个预生成模型可供选择,每个模型都有自己的一组受支持的字段。 用于分析操作的模型取决于要分析的文档类型。 下面是文档智能服务目前支持的预生成模型:
使用以下代码更新应用程序的 FormRecognizer 类(请务必使用 Azure 门户文档智能实例中的值更新密钥和终结点变量):
static final String key = "PASTE_YOUR_FORM_RECOGNIZER_KEY_HERE";
static final String endpoint = "PASTE_YOUR_FORM_RECOGNIZER_ENDPOINT_HERE";
public static void main(String[] args) {
FormRecognizerClient recognizerClient = new FormRecognizerClientBuilder().credential(new AzureKeyCredential(key)).endpoint(endpoint).buildClient();
String invoiceUrl = "https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/sample-invoice.pdf";
System.out.println("Analyze invoice...");
AnalyzeInvoice(recognizerClient, invoiceUrl);
}
private static void AnalyzeInvoice(FormRecognizerClient recognizerClient, String invoiceUrl) {
SyncPoller < FormRecognizerOperationResult,
List < RecognizedForm >> recognizeInvoicesPoller = recognizerClient.beginRecognizeInvoicesFromUrl(invoiceUrl);
List < RecognizedForm > recognizedInvoices = recognizeInvoicesPoller.getFinalResult();
for (int i = 0; i < recognizedInvoices.size(); i++) {
RecognizedForm recognizedInvoice = recognizedInvoices.get(i);
Map < String,
FormField > recognizedFields = recognizedInvoice.getFields();
System.out.printf("----------- Recognized invoice info for page %d -----------%n", i);
FormField vendorNameField = recognizedFields.get("VendorName");
if (vendorNameField != null) {
if (FieldValueType.STRING == vendorNameField.getValue().getValueType()) {
String merchantName = vendorNameField.getValue().asString();
System.out.printf("Vendor Name: %s, confidence: %.2f%n", merchantName, vendorNameField.getConfidence());
}
}
FormField vendorAddressField = recognizedFields.get("VendorAddress");
if (vendorAddressField != null) {
if (FieldValueType.STRING == vendorAddressField.getValue().getValueType()) {
String merchantAddress = vendorAddressField.getValue().asString();
System.out.printf("Vendor address: %s, confidence: %.2f%n", merchantAddress, vendorAddressField.getConfidence());
}
}
FormField customerNameField = recognizedFields.get("CustomerName");
if (customerNameField != null) {
if (FieldValueType.STRING == customerNameField.getValue().getValueType()) {
String merchantAddress = customerNameField.getValue().asString();
System.out.printf("Customer Name: %s, confidence: %.2f%n", merchantAddress, customerNameField.getConfidence());
}
}
FormField customerAddressRecipientField = recognizedFields.get("CustomerAddressRecipient");
if (customerAddressRecipientField != null) {
if (FieldValueType.STRING == customerAddressRecipientField.getValue().getValueType()) {
String customerAddr = customerAddressRecipientField.getValue().asString();
System.out.printf("Customer Address Recipient: %s, confidence: %.2f%n", customerAddr, customerAddressRecipientField.getConfidence());
}
}
FormField invoiceIdField = recognizedFields.get("InvoiceId");
if (invoiceIdField != null) {
if (FieldValueType.STRING == invoiceIdField.getValue().getValueType()) {
String invoiceId = invoiceIdField.getValue().asString();
System.out.printf("Invoice Id: %s, confidence: %.2f%n", invoiceId, invoiceIdField.getConfidence());
}
}
FormField invoiceDateField = recognizedFields.get("InvoiceDate");
if (customerNameField != null) {
if (FieldValueType.DATE == invoiceDateField.getValue().getValueType()) {
LocalDate invoiceDate = invoiceDateField.getValue().asDate();
System.out.printf("Invoice Date: %s, confidence: %.2f%n", invoiceDate, invoiceDateField.getConfidence());
}
}
FormField invoiceTotalField = recognizedFields.get("InvoiceTotal");
if (customerAddressRecipientField != null) {
if (FieldValueType.FLOAT == invoiceTotalField.getValue().getValueType()) {
Float invoiceTotal = invoiceTotalField.getValue().asFloat();
System.out.printf("Invoice Total: %.2f, confidence: %.2f%n", invoiceTotal, invoiceTotalField.getConfidence());
}
}
}
}
生成并运行应用程序
导航回到主项目目录 form-recognizer-app。
- 使用
build
命令生成应用程序:
gradle build
- 使用
run
命令运行应用程序:
gradle run
在本快速入门中,你将使用以下 API 提取表单和文档中的结构化数据:
先决条件
Azure 订阅 - 创建试用订阅。
最新版本的 Visual Studio Code 或者你首选的 IDE。
最新 LTS 版本的 Node.js
Azure AI 服务或文档智能资源。 具有 Azure 订阅后,在 Azure 门户中创建单服务或多服务文档智能资源以获取密钥和终结点。 可以使用免费定价层 (
F0
) 试用该服务,然后再升级到付费层进行生产。提示
如果计划通过一个终结点/密钥访问多个 Azure AI 服务,请创建 Azure AI 服务资源。 请创建仅供文档智能访问的文档智能资源。 请注意,如果你打算使用 Microsoft Entra 身份验证,则需要单一服务资源。
部署资源后,选择“转到资源”。 需要从创建的资源获取密钥和终结点,以便将应用程序连接到文档智能 API。 稍后需要在本快速入门中将密钥和终结点粘贴到代码中:
设置
创建新的 Node.js 应用程序。 在控制台窗口(例如 cmd、PowerShell 或 Bash)中,为应用创建一个新目录并导航到该目录。
mkdir form-recognizer-app && cd form-recognizer-app
运行
npm init
命令以使用package.json
文件创建一个 node 应用程序。npm init
安装
ai-form-recognizer
客户端库 npm 包:npm install @azure/ai-form-recognizer
应用的
package.json
文件将使用依赖项进行更新。创建一个名为
index.js
的文件,将其打开,并导入以下库:const { FormRecognizerClient, AzureKeyCredential } = require("@azure/ai-form-recognizer");
为资源的 Azure 终结点和密钥创建变量:
const key = "PASTE_YOUR_FORM_RECOGNIZER_KEY_HERE"; const endpoint = "PASTE_YOUR_FORM_RECOGNIZER_ENDPOINT_HERE";
此时,JavaScript 应用程序应包含以下代码行:
const { FormRecognizerClient, AzureKeyCredential } = require("@azure/ai-form-recognizer"); const endpoint = "PASTE_YOUR_FORM_RECOGNIZER_ENDPOINT_HERE"; const key = "PASTE_YOUR_FORM_RECOGNIZER_KEY_HERE";
选择要复制并粘贴到应用程序中的代码示例:
重要
完成后,请记住将密钥从代码中删除,并且永远不要公开发布该密钥。 对于生产来说,请使用安全的方式存储和访问凭据,例如 Azure Key Vault。 有关详细信息,请参阅 Azure AI 服务安全性。
试用:布局模型
- 对于此示例,需要 URI 中的一个文档文件。 在本快速入门中,可使用示例文档。
- 我们已将文件 URI 值添加到文件顶部附近的
formUrl
变量中。 - 若要分析 URI 中的特定文件,将使用
beginRecognizeContent
方法。
将以下代码添加到布局应用程序的 key
变量下方的行中
const formUrl = "https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/sample-invoice.pdf";
const formUrl = "https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/sample-invoice.pdf";
async function recognizeContent() {
const client = new FormRecognizerClient(endpoint, new AzureKeyCredential(key));
const poller = await client.beginRecognizeContentFromUrl(formUrl);
const pages = await poller.pollUntilDone();
if (!pages || pages.length === 0) {
throw new Error("Expecting non-empty list of pages!");
}
for (const page of pages) {
console.log(
`Page ${page.pageNumber}: width ${page.width} and height ${page.height} with unit ${page.unit}`
);
for (const table of page.tables) {
for (const cell of table.cells) {
console.log(`cell [${cell.rowIndex},${cell.columnIndex}] has text ${cell.text}`);
}
}
}
}
recognizeContent().catch((err) => {
console.error("The sample encountered an error:", err);
});
试用:预生成模型
此示例演示了如何以发票为例,使用预先训练的模型来分析某些类型常见文档中的数据。 请参阅预先构建的概念页面,获取发票字段的完整列表
选择一个预生成模型
不止发票,还有几个预生成模型可供选择,每个模型都有自己的一组受支持的字段。 用于分析操作的模型取决于要分析的文档类型。 下面是文档智能服务目前支持的预生成模型:
将以下代码添加到预生成发票应用程序的 key
变量下方
const invoiceUrl = "https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/sample-invoice.pdf";
async function recognizeInvoices() {
const client = new FormRecognizerClient(endpoint, new AzureKeyCredential(key));
const poller = await client.beginRecognizeInvoicesFromUrl(invoiceUrl);
const [invoice] = await poller.pollUntilDone();
if (invoice === undefined) {
throw new Error("Failed to extract data from at least one invoice.");
}
/**
* This is a helper function for printing a simple field with an elemental type.
*/
function fieldToString(field) {
const {
name,
valueType,
value,
confidence
} = field;
return `${name} (${valueType}): '${value}' with confidence ${confidence}'`;
}
console.log("Invoice fields:");
/**
* Invoices contain a lot of optional fields, but they are all of elemental types
* such as strings, numbers, and dates, so we will just enumerate them all.
*/
for (const [name, field] of Object.entries(invoice.fields)) {
if (field.valueType !== "array" && field.valueType !== "object") {
console.log(`- ${name} ${fieldToString(field)}`);
}
}
// Invoices also support nested line items, so we can iterate over them.
let idx = 0;
console.log("- Items:");
const items = invoice.fields["Items"]?.value;
for (const item of items ?? []) {
const value = item.value;
// Each item has several subfields that are nested within the item. We'll
// map over this list of the subfields and filter out any fields that
// weren't found. Not all fields will be returned every time, only those
// that the service identified for the particular document in question.
const subFields = [
"Description",
"Quantity",
"Unit",
"UnitPrice",
"ProductCode",
"Date",
"Tax",
"Amount"
]
.map((fieldName) => value[fieldName])
.filter((field) => field !== undefined);
console.log(
[
` - Item #${idx}`,
// Now we will convert those fields into strings to display
...subFields.map((field) => ` - ${fieldToString(field)}`)
].join("\n")
);
}
}
recognizeInvoices().catch((err) => {
console.error("The sample encountered an error:", err);
});
在本快速入门中,你将使用以下 API 提取表单和文档中的结构化数据:
先决条件
Azure 订阅 - 创建试用订阅
-
- 你的 Python 安装应包含 pip。 可以通过在命令行上运行
pip --version
来检查是否安装了 pip。 通过安装最新版本的 Python 获取 pip。
- 你的 Python 安装应包含 pip。 可以通过在命令行上运行
Azure AI 服务或文档智能资源。 具有 Azure 订阅后,在 Azure 门户中创建单服务或多服务文档智能资源以获取密钥和终结点。 可以使用免费定价层 (
F0
) 试用该服务,然后再升级到付费层进行生产。提示
如果计划通过一个终结点/密钥访问多个 Azure AI 服务,请创建 Azure AI 服务资源。 请创建仅供文档智能访问的文档智能资源。 请注意,如果你打算使用 Microsoft Entra 身份验证,则需要单一服务资源。
部署资源后,选择“转到资源”。 需要从创建的资源获取密钥和终结点,以便将应用程序连接到文档智能 API。 稍后需要在本快速入门中将密钥和终结点粘贴到代码中:
设置
在本地环境中打开一个终端窗口,并使用 pip 安装适用于 Python 的 Azure 文档智能客户端库:
pip install azure-ai-formrecognizer
创建新的 Python 应用程序
在首选编辑器或 IDE 中创建一个名为 form_recognizer_quickstart.py 的新 Python 应用程序。 然后导入以下库:
import os
from azure.ai.formrecognizer import FormRecognizerClient
from azure.core.credentials import AzureKeyCredential
为 Azure 资源终结点和密钥创建变量
endpoint = "YOUR_FORM_RECOGNIZER_ENDPOINT"
key = "YOUR_FORM_RECOGNIZER_KEY"
此时,Python 应用程序应包含以下代码行:
import os
from azure.core.exceptions import ResourceNotFoundError
from azure.ai.formrecognizer import DocumentAnalysisClient
from azure.core.credentials import AzureKeyCredential
endpoint = "YOUR_FORM_RECOGNIZER_ENDPOINT"
key = "YOUR_FORM_RECOGNIZER_KEY"
选择要复制并粘贴到应用程序中的代码示例:
重要
完成后,请记住将密钥从代码中删除,并且永远不要公开发布该密钥。 对于生产来说,请使用安全的方式存储和访问凭据,例如 Azure Key Vault。 有关详细信息,请参阅 Azure AI 服务安全性。
试用:布局模型
- 对于此示例,需要 URI 中的一个文档文件。 在本快速入门中,可使用示例文档。
- 我们已将文件 URI 值添加到文件顶部附近的
formUrl
变量中。 - 若要分析 URI 中的特定文件,将使用
begin_recognize_content_from_url
方法。
将以下代码添加到布局应用程序的 key
变量下方的行中
def format_bounding_box(bounding_box):
if not bounding_box:
return "N/A"
return ", ".join(["[{}, {}]".format(p.x, p.y) for p in bounding_box])
def recognize_content():
# sample document
formUrl = "https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/sample-layout.pdf"
form_recognizer_client = FormRecognizerClient(
endpoint=endpoint, credential=AzureKeyCredential(key)
)
poller = form_recognizer_client.begin_recognize_content_from_url(formUrl)
form_pages = poller.result()
for idx, content in enumerate(form_pages):
print(
"Page has width: {} and height: {}, measured with unit: {}".format(
content.width, content.height, content.unit
)
)
for table_idx, table in enumerate(content.tables):
print(
"Table # {} has {} rows and {} columns".format(
table_idx, table.row_count, table.column_count
)
)
print(
"Table # {} location on page: {}".format(
table_idx, format_bounding_box(table.bounding_box)
)
)
for cell in table.cells:
print(
"...Cell[{}][{}] has text '{}' within bounding box '{}'".format(
cell.row_index,
cell.column_index,
cell.text,
format_bounding_box(cell.bounding_box),
)
)
for line_idx, line in enumerate(content.lines):
print(
"Line # {} has word count '{}' and text '{}' within bounding box '{}'".format(
line_idx,
len(line.words),
line.text,
format_bounding_box(line.bounding_box),
)
)
if line.appearance:
if (
line.appearance.style_name == "handwriting"
and line.appearance.style_confidence > 0.8
):
print(
"Text line '{}' is handwritten and might be a signature.".format(
line.text
)
)
for word in line.words:
print(
"...Word '{}' has a confidence of {}".format(
word.text, word.confidence
)
)
for selection_mark in content.selection_marks:
print(
"Selection mark is '{}' within bounding box '{}' and has a confidence of {}".format(
selection_mark.state,
format_bounding_box(selection_mark.bounding_box),
selection_mark.confidence,
)
)
print("----------------------------------------")
if __name__ == "__main__":
recognize_content()
试用:预生成模型
此示例演示了如何以发票为例,使用预先训练的模型来分析某些类型常见文档中的数据。 请参阅预先构建的概念页面,获取发票字段的完整列表
选择一个预生成模型
不止发票,还有几个预生成模型可供选择,每个模型都有自己的一组受支持的字段。 用于分析操作的模型取决于要分析的文档类型。 下面是文档智能服务目前支持的预生成模型:
将以下代码添加到预生成发票应用程序的 key
变量下方
def recognize_invoice():
invoiceUrl = "https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/sample-invoice.pdf"
form_recognizer_client = FormRecognizerClient(
endpoint=endpoint, credential=AzureKeyCredential(key)
)
poller = form_recognizer_client.begin_recognize_invoices_from_url(
invoiceUrl, locale="en-US"
)
invoices = poller.result()
for idx, invoice in enumerate(invoices):
vendor_name = invoice.fields.get("VendorName")
if vendor_name:
print(
"Vendor Name: {} has confidence: {}".format(
vendor_name.value, vendor_name.confidence
)
)
vendor_address = invoice.fields.get("VendorAddress")
if vendor_address:
print(
"Vendor Address: {} has confidence: {}".format(
vendor_address.value, vendor_address.confidence
)
)
vendor_address_recipient = invoice.fields.get("VendorAddressRecipient")
if vendor_address_recipient:
print(
"Vendor Address Recipient: {} has confidence: {}".format(
vendor_address_recipient.value, vendor_address_recipient.confidence
)
)
customer_name = invoice.fields.get("CustomerName")
if customer_name:
print(
"Customer Name: {} has confidence: {}".format(
customer_name.value, customer_name.confidence
)
)
customer_id = invoice.fields.get("CustomerId")
if customer_id:
print(
"Customer Id: {} has confidence: {}".format(
customer_id.value, customer_id.confidence
)
)
customer_address = invoice.fields.get("CustomerAddress")
if customer_address:
print(
"Customer Address: {} has confidence: {}".format(
customer_address.value, customer_address.confidence
)
)
customer_address_recipient = invoice.fields.get("CustomerAddressRecipient")
if customer_address_recipient:
print(
"Customer Address Recipient: {} has confidence: {}".format(
customer_address_recipient.value,
customer_address_recipient.confidence,
)
)
invoice_id = invoice.fields.get("InvoiceId")
if invoice_id:
print(
"Invoice Id: {} has confidence: {}".format(
invoice_id.value, invoice_id.confidence
)
)
invoice_date = invoice.fields.get("InvoiceDate")
if invoice_date:
print(
"Invoice Date: {} has confidence: {}".format(
invoice_date.value, invoice_date.confidence
)
)
invoice_total = invoice.fields.get("InvoiceTotal")
if invoice_total:
print(
"Invoice Total: {} has confidence: {}".format(
invoice_total.value, invoice_total.confidence
)
)
due_date = invoice.fields.get("DueDate")
if due_date:
print(
"Due Date: {} has confidence: {}".format(
due_date.value, due_date.confidence
)
)
purchase_order = invoice.fields.get("PurchaseOrder")
if purchase_order:
print(
"Purchase Order: {} has confidence: {}".format(
purchase_order.value, purchase_order.confidence
)
)
billing_address = invoice.fields.get("BillingAddress")
if billing_address:
print(
"Billing Address: {} has confidence: {}".format(
billing_address.value, billing_address.confidence
)
)
billing_address_recipient = invoice.fields.get("BillingAddressRecipient")
if billing_address_recipient:
print(
"Billing Address Recipient: {} has confidence: {}".format(
billing_address_recipient.value,
billing_address_recipient.confidence,
)
)
shipping_address = invoice.fields.get("ShippingAddress")
if shipping_address:
print(
"Shipping Address: {} has confidence: {}".format(
shipping_address.value, shipping_address.confidence
)
)
shipping_address_recipient = invoice.fields.get("ShippingAddressRecipient")
if shipping_address_recipient:
print(
"Shipping Address Recipient: {} has confidence: {}".format(
shipping_address_recipient.value,
shipping_address_recipient.confidence,
)
)
print("Invoice items:")
for idx, item in enumerate(invoice.fields.get("Items").value):
item_description = item.value.get("Description")
if item_description:
print(
"......Description: {} has confidence: {}".format(
item_description.value, item_description.confidence
)
)
item_quantity = item.value.get("Quantity")
if item_quantity:
print(
"......Quantity: {} has confidence: {}".format(
item_quantity.value, item_quantity.confidence
)
)
unit = item.value.get("Unit")
if unit:
print(
"......Unit: {} has confidence: {}".format(
unit.value, unit.confidence
)
)
unit_price = item.value.get("UnitPrice")
if unit_price:
print(
"......Unit Price: {} has confidence: {}".format(
unit_price.value, unit_price.confidence
)
)
product_code = item.value.get("ProductCode")
if product_code:
print(
"......Product Code: {} has confidence: {}".format(
product_code.value, product_code.confidence
)
)
item_date = item.value.get("Date")
if item_date:
print(
"......Date: {} has confidence: {}".format(
item_date.value, item_date.confidence
)
)
tax = item.value.get("Tax")
if tax:
print(
"......Tax: {} has confidence: {}".format(tax.value, tax.confidence)
)
amount = item.value.get("Amount")
if amount:
print(
"......Amount: {} has confidence: {}".format(
amount.value, amount.confidence
)
)
subtotal = invoice.fields.get("SubTotal")
if subtotal:
print(
"Subtotal: {} has confidence: {}".format(
subtotal.value, subtotal.confidence
)
)
total_tax = invoice.fields.get("TotalTax")
if total_tax:
print(
"Total Tax: {} has confidence: {}".format(
total_tax.value, total_tax.confidence
)
)
previous_unpaid_balance = invoice.fields.get("PreviousUnpaidBalance")
if previous_unpaid_balance:
print(
"Previous Unpaid Balance: {} has confidence: {}".format(
previous_unpaid_balance.value, previous_unpaid_balance.confidence
)
)
amount_due = invoice.fields.get("AmountDue")
if amount_due:
print(
"Amount Due: {} has confidence: {}".format(
amount_due.value, amount_due.confidence
)
)
service_start_date = invoice.fields.get("ServiceStartDate")
if service_start_date:
print(
"Service Start Date: {} has confidence: {}".format(
service_start_date.value, service_start_date.confidence
)
)
service_end_date = invoice.fields.get("ServiceEndDate")
if service_end_date:
print(
"Service End Date: {} has confidence: {}".format(
service_end_date.value, service_end_date.confidence
)
)
service_address = invoice.fields.get("ServiceAddress")
if service_address:
print(
"Service Address: {} has confidence: {}".format(
service_address.value, service_address.confidence
)
)
service_address_recipient = invoice.fields.get("ServiceAddressRecipient")
if service_address_recipient:
print(
"Service Address Recipient: {} has confidence: {}".format(
service_address_recipient.value,
service_address_recipient.confidence,
)
)
remittance_address = invoice.fields.get("RemittanceAddress")
if remittance_address:
print(
"Remittance Address: {} has confidence: {}".format(
remittance_address.value, remittance_address.confidence
)
)
remittance_address_recipient = invoice.fields.get("RemittanceAddressRecipient")
if remittance_address_recipient:
print(
"Remittance Address Recipient: {} has confidence: {}".format(
remittance_address_recipient.value,
remittance_address_recipient.confidence,
)
)
if __name__ == "__main__":
recognize_invoice()
运行应用程序
导航到 form_recognizer_quickstart.py 文件所在的文件夹。
在终端中键入以下命令:
python form_recognizer_quickstart.py
| 文档智能 REST API | Azure REST API 参考 |
在本快速入门中,你将使用以下 API 提取表单和文档中的结构化数据:
先决条件
Azure 订阅 - 创建试用订阅
已安装 cURL。
PowerShell 6.0 及以上版本,或类似的命令行应用程序。
Azure AI 服务或文档智能资源。 具有 Azure 订阅后,在 Azure 门户中创建单服务或多服务文档智能资源以获取密钥和终结点。 可以使用免费定价层 (
F0
) 试用该服务,然后再升级到付费层进行生产。提示
如果计划通过一个终结点/密钥访问多个 Azure AI 服务,请创建 Azure AI 服务资源。 请创建仅供文档智能访问的文档智能资源。 请注意,如果你打算使用 Microsoft Entra 身份验证,则需要单一服务资源。
部署资源后,选择“转到资源”。 需要从创建的资源获取密钥和终结点,以便将应用程序连接到文档智能 API。 稍后需要在本快速入门中将密钥和终结点粘贴到代码中:
选择要复制并粘贴到应用程序中的代码示例:
重要
完成后,请记住将密钥从代码中删除,并且永远不要公开发布该密钥。 对于生产来说,请使用安全的方式存储和访问凭据,例如 Azure Key Vault。 有关详细信息,请参阅 Azure AI 服务安全性。
试用:布局模型
- 对于此示例,需要 URI 中的一个文档文件。 在本快速入门中,可使用示例文档。
- 将
{endpoint}
替换为你通过文档智能订阅获取的终结点。 - 将
{key}
替换为从上一步复制的密钥。 - 将
\"{your-document-url}
替换为示例文档 URL:
https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/sample-layout.pdf
请求
curl -v -i POST "https://{endpoint}/formrecognizer/v2.1/layout/analyze" -H "Content-Type: application/json" -H "Ocp-Apim-Subscription-Key: {key}" --data-ascii "{'urlSource': '{your-document-url}'}"
Operation-Location
你将收到 202 (Success)
响应,其中包含“Operation-Location”标头。 此头的值包含一个可用于查询异步操作状态和获取结果的结果 ID:
https://cognitiveservice/formrecognizer/v2.1/layout/analyzeResults/{resultId}。
在以下示例中,analyzeResults/
后面作为 URL 一部分的字符串就是结果 ID。
https://cognitiveservice/formrecognizer/v2/layout/analyzeResults/54f0b076-4e38-43e5-81bd-b85b8835fdfb
获取布局结果
调用分析布局 API 后,可调用获取分析布局结果 API,来获取操作状态和已提取的数据 。 运行该命令之前,请进行以下更改:
- 将
{endpoint}
替换为你通过文档智能订阅获取的终结点。 - 将
{key}
替换为从上一步复制的密钥。 - 将
{resultId}
替换为上一步中的结果 ID。
请求
curl -v -X GET "https://{endpoint}/formrecognizer/v2.1/layout/analyzeResults/{resultId}" -H "Ocp-Apim-Subscription-Key: {key}"
检查结果
你将收到包含 JSON 内容的 200 (success)
响应。
请查看以下发票图像及其相应的 JSON 输出。
"readResults"
节点包含每一行文本,以及其各自在页面上的边界框位置。selectionMarks
节点显示每个选择标记(复选框、单选按钮)以及其状态是selected
还是unselected
。"pageResults"
部分包含提取的表。 对于每个表,将会提取文本、行和列索引、行和列跨距、边界框等。
响应正文
试用:预生成模型
- 对于此示例,我们将使用预生成模型来分析一个发票文档。 在本快速入门中,可使用示例发票文档。
选择一个预生成模型
不止发票,还有几个预生成模型可供选择,每个模型都有自己的一组受支持的字段。 用于分析操作的模型取决于要分析的文档类型。 下面是文档智能服务目前支持的预生成模型:
运行该命令之前,请进行以下更改:
将
{endpoint}
替换为你通过文档智能订阅获取的终结点。将
{key}
替换为从上一步复制的密钥。将
\"{your-document-url}
替换为示例发票 URL:https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/sample-invoice.pdf
请求
curl -v -i POST https://{endpoint}/formrecognizer/v2.1/prebuilt/invoice/analyze" -H "Content-Type: application/json" -H "Ocp-Apim-Subscription-Key: {key}" --data-ascii "{'urlSource': '{your invoice URL}'}"
Operation-Location
你将收到 202 (Success)
响应,其中包含“Operation-Location”标头。 此头的值包含一个可用于查询异步操作状态和获取结果的结果 ID:
https://cognitiveservice/formrecognizer/v2.1/prebuilt/receipt/analyzeResults/{resultId}
在以下示例中,analyzeResults/
后面作为 URL 一部分的字符串就是结果 ID:
https://cognitiveservice/formrecognizer/v2.1/prebuilt/invoice/analyzeResults/54f0b076-4e38-43e5-81bd-b85b8835fdfb
获取账单结果
调用分析账单 API 后,可调用获取分析账单结果 API,来获取操作状态和已提取的数据 。 运行该命令之前,请进行以下更改:
- 将
{endpoint}
替换为你通过文档智能密钥获取的终结点。 可以在文档智能资源“概述”选项卡中找到它。 - 将
{resultId}
替换为上一步中的结果 ID。 - 将
{key}
替换为你的密钥。
请求
curl -v -X GET "https://{endpoint}/formrecognizer/v2.1/prebuilt/invoice/analyzeResults/{resultId}" -H "Ocp-Apim-Subscription-Key: {key}"
检查响应
你将收到包含 JSON 输出的 200 (Success)
响应。
"readResults"
字段包含从发票中提取的每行文本。"pageResults"
包含从发票中提取的表和选择标记。"documentResults"
字段包含发票中最相关部分的键/值信息。
请参阅示例发票文档。
响应正文
请参阅 GitHub 上的完整示例输出。
就是这样,做得很棒!
后续步骤
若要获得增强的体验和高级模型质量,请尝试文档智能工作室。
工作室支持使用 v2.1 标记的数据训练的任何模型。
更改日志提供有关从 v3.1 迁移到 v4.0 的详细信息。