Document Intelligence v3.1 migration

Article
01/03/2025

This content applies to: v3.1 (GA) v3.0 (GA) v2.1 (GA)

Important

Document Intelligence REST API v3.1 introduces breaking changes in the REST API request and in the analyze response JSON.

Migrating from a v3.1 preview API version

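Preview APIs are periodically deprecated. If you're using a preview API version, update your application to target the GA API version. To migrate from a preview API version to the 2023-07-31 (GA) API version using the SDK, update to the current version of the language-specific SDK.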

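Analysis features

Model ID	Text extraction	Paragraphs	Paragraph roles	Selection marks	Tables	Key-value pairs	Languages	Barcodes	Document analysis	Formulas*	StyleFont*	High-resolution OCR*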
prebuilt-read	✓	✓					O	O		O	O	O
prebuilt-layout	✓	✓	✓	✓	✓		O	O		O	O	O
prebuilt-document	✓	✓	✓	✓	✓	✓	O	O		O	O	O
prebuilt-businessCard	✓								✓
prebuilt-idDocument	✓						O	O	✓	O	O	O
prebuilt-invoice	✓			✓	✓	O	O	O	✓	O	O	O
prebuilt-receipt	✓						O	O	✓	O	O	O
prebuilt-healthInsuranceCard.us	✓						O	O	✓	O	O	O
prebuilt-tax.us.w2	✓			✓			O	O	✓	O	O	O
prebuilt-tax.us.1098	✓			✓			O	O	✓	O	O	O
prebuilt-tax.us.1098E	✓			✓			O	O	✓	O	O	O
prebuilt-tax.us.1098T	✓			✓			O	O	✓	O	O	O
prebuilt-contract	✓	✓	✓	✓			O	O	✓	O	O	O
{ customModelName }	✓	✓	✓	✓	✓		O	O	✓	O	O	O

✓ - Enabled
O - Optional
Formulas/StyleFont/High-resolution OCR* - Premium features incur additional costs

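Migrating from v3.0

Compared with v3.0, Document Intelligence v3.1 introduces several new features and capabilities:

Barcode extraction.
Add-on capabilities, including high-resolution, formula, and font property extraction.
Custom classification model for document splitting and classification.
Language expansion and new field support in the invoice and receipt models.
New document type support in the ID document model.
New prebuilt health insurance card model.
Support for Office/HTML files in the prebuilt-read model, extracting words and paragraphs without bounding boxes. Embedded images are no longer supported. If add-on features are requested for Office/HTML files, an empty array is returned without errors.
Model expiration for custom extraction and classification models - Our new custom models are built on top of a large base model that we update periodically to improve quality. An expiration date is introduced to all custom models so that the corresponding base models can be retired. After a custom model expires, you need to retrain the model using the latest API version (base model).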

GET /documentModels/{customModelId}?api-version={apiVersion}
{
  "modelId": "{customModelId}",
  "description": "{customModelDescription}",
  "createdDateTime": "2023-09-24T12:54:35Z",
  "expirationDateTime": "2025-01-01T00:00:00Z",
  "apiVersion": "2023-07-31",
  "docTypes": { ... }
}
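
As a rough illustration of how a client might act on expirationDateTime, the following Python sketch computes the days left before a custom model expires. It assumes the timestamp has exactly the UTC format shown in the sample response; the function name days_until_expiration is hypothetical and not part of any SDK.

```python
from datetime import datetime, timezone
from typing import Optional


def days_until_expiration(model_info: dict, now: Optional[datetime] = None) -> Optional[float]:
    """Return the days left before the model expires, or None if no expiration is set."""
    raw = model_info.get("expirationDateTime")
    if raw is None:
        return None
    # Assumption: timestamps look like "2025-01-01T00:00:00Z" (UTC, no fractional seconds).
    expires = datetime.strptime(raw, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)
    current = now or datetime.now(timezone.utc)
    return (expires - current).total_seconds() / 86400


# Model description shaped like the GET /documentModels/{customModelId} response.
sample_model = {
    "modelId": "my-model",
    "expirationDateTime": "2025-01-01T00:00:00Z",
    "apiVersion": "2023-07-31",
}
days_left = days_until_expiration(sample_model, now=datetime(2024, 12, 22, tzinfo=timezone.utc))
```

A negative result means the model has already expired and needs retraining.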

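Custom neural model build quota - The number of neural models that each subscription can build per region every month is limited. The result JSON now includes the quota and used values to help you understand your current usage. These values are part of the resource information returned by GET /info.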

{
  "customDocumentModels": { ... },
  "customNeuralDocumentModelBuilds": {
    "used": 1,
    "quota": 10,
    "quotaResetDateTime": "2023-03-01T00:00:00Z"
  }
}
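
One possible way to use these values is to check the remaining build quota before starting a neural build. This is a minimal sketch that assumes the response has the shape shown above; remaining_neural_builds is a hypothetical helper name.

```python
def remaining_neural_builds(info: dict) -> int:
    """Return how many neural model builds are left in the current quota period."""
    builds = info["customNeuralDocumentModelBuilds"]
    # Clamp at zero in case the service ever reports used > quota.
    return max(builds["quota"] - builds["used"], 0)


# Resource information shaped like the GET /info response above.
sample_info = {
    "customNeuralDocumentModelBuilds": {
        "used": 1,
        "quota": 10,
        "quotaResetDateTime": "2023-03-01T00:00:00Z",
    }
}
left = remaining_neural_builds(sample_info)
```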

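An optional features query parameter on analyze operations enables specific features. Some premium features can incur added billing. For details, see the analysis features list.
Extended the extracted currency field object to output a normalized currency code field when possible. Currency fields already return the amount (for example, 123.45) and the currency symbol (for example, $). This feature also maps the currency symbol to a canonical ISO 4217 code (for example, USD). The model can optionally use the global document content to disambiguate or infer the currency code.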

{
  "fields": {
    "Total": {
      "type": "currency",
      "content": "$123.45",
      "valueCurrency": {
        "amount": 123.45,
        "currencySymbol": "$",
        "currencyCode": "USD"
      },
      ...
    }
  }
}
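
Because currencyCode is output only when possible, client code probably should not assume it is present. The sketch below shows one way to fall back to the symbol; format_currency is a hypothetical helper, and the two-decimal formatting is an assumption made for illustration.

```python
def format_currency(field: dict) -> str:
    """Format a currency field, preferring the ISO 4217 code over the symbol."""
    value = field["valueCurrency"]
    amount = value["amount"]
    code = value.get("currencyCode")
    if code:
        return f"{amount:.2f} {code}"
    # Fall back to the raw symbol when no normalized code was returned.
    return f"{value.get('currencySymbol', '')}{amount:.2f}"


with_code = {
    "type": "currency",
    "valueCurrency": {"amount": 123.45, "currencySymbol": "$", "currencyCode": "USD"},
}
without_code = {
    "type": "currency",
    "valueCurrency": {"amount": 123.45, "currencySymbol": "$"},
}
```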

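Besides the model quality improvements, we strongly recommend that you update your application to use v3.1 to benefit from these new capabilities.

Migrating from v2.1 or v2.0

Document Intelligence v3.1 is the latest GA version. It has the richest feature set, covers the most languages and document types, and improves model quality. For the features and capabilities available in v3.1, see the model overview.

Starting with v3.0, the Document Intelligence REST API is redesigned for better usability. This section describes the differences between Document Intelligence v2.0, v2.1, and v3.1, and how to move to the newer version of the API.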

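Note

The REST API 2023-07-31 release includes a breaking change in the REST API analyze response JSON.
The boundingBox property is renamed to polygon in each instance.

Changes to the REST API endpoints

The v3.1 REST API combines the analysis operations for layout analysis, prebuilt models, and custom models into a single pair of operations by assigning documentModels and modelId to layout analysis (prebuilt-layout) and prebuilt models.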

POST request

https://{your-form-recognizer-endpoint}/formrecognizer/documentModels/{modelId}:analyze?api-version=2023-07-31

GET request

https://{your-form-recognizer-endpoint}/formrecognizer/documentModels/{modelId}/analyzeResults/{resultId}?api-version=2023-07-31

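Analyze operation

The request payload and call pattern remain unchanged.
The analyze operation specifies the input document and content-specific configurations. It returns the analyze result URL via the Operation-Location header in the response.
Poll this analyze result URL via a GET request to check the status of the analyze operation (the minimum recommended interval between requests is 1 second).
Upon success, status is set to succeeded and analyzeResult is returned in the response body. If errors are encountered, status is set to failed, and an error is returned.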
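
The polling pattern above can be sketched roughly as follows. To keep the example independent of any HTTP library, the GET call is passed in as a callable; poll_analyze_result, get_result, and max_attempts are hypothetical names, and treating any status other than succeeded or failed as still in progress is an assumption.

```python
import time
from typing import Callable


def poll_analyze_result(
    get_result: Callable[[], dict],
    interval_seconds: float = 1.0,
    max_attempts: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll the analyze result URL until the operation succeeds or fails.

    get_result performs the GET request on the Operation-Location URL
    and returns the parsed JSON response body.
    """
    for _ in range(max_attempts):
        body = get_result()
        status = body.get("status")
        if status == "succeeded":
            return body["analyzeResult"]
        if status == "failed":
            raise RuntimeError(f"Analyze operation failed: {body.get('error')}")
        # Any other status is treated as "still in progress".
        sleep(interval_seconds)
    raise TimeoutError("Analyze operation did not finish in time")
```

In a real client, get_result would wrap an authenticated GET request; injecting it here also makes the loop easy to test without network access.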

Model	v2.0	v2.1	v3.1
Request URL prefix	https://{your-form-recognizer-endpoint}/formrecognizer/v2.0	https://{your-form-recognizer-endpoint}/formrecognizer/v2.1	https://{your-form-recognizer-endpoint}/formrecognizer
General document	N/A	N/A	`/documentModels/prebuilt-document:analyze`
Layout	/layout/analyze	/layout/analyze	`/documentModels/prebuilt-layout:analyze`
Custom	/custom/models/{modelId}/analyze	/custom/models/{modelId}/analyze	`/documentModels/{modelId}:analyze`
Invoice	N/A	/prebuilt/invoice/analyze	`/documentModels/prebuilt-invoice:analyze`
Receipt	/prebuilt/receipt/analyze	/prebuilt/receipt/analyze	`/documentModels/prebuilt-receipt:analyze`
ID document	N/A	/prebuilt/idDocument/analyze	`/documentModels/prebuilt-idDocument:analyze`
Business card	N/A	/prebuilt/businessCard/analyze	`/documentModels/prebuilt-businessCard:analyze`
W-2	N/A	N/A	`/documentModels/prebuilt-tax.us.w2:analyze`
Health insurance card	N/A	N/A	`/documentModels/prebuilt-healthInsuranceCard.us:analyze`
Contract	N/A	N/A	`/documentModels/prebuilt-contract:analyze`
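
When migrating client code, the table can be turned into a small lookup from the old v2.1 layout and prebuilt paths to v3.1 model IDs. The following sketch assumes the paths listed above; the names V21_PATH_TO_MODEL_ID and v31_analyze_url, and the example endpoint, are hypothetical.

```python
# v2.1 layout and prebuilt analyze paths mapped to the v3.1 model ID that replaces them.
V21_PATH_TO_MODEL_ID = {
    "/layout/analyze": "prebuilt-layout",
    "/prebuilt/invoice/analyze": "prebuilt-invoice",
    "/prebuilt/receipt/analyze": "prebuilt-receipt",
    "/prebuilt/idDocument/analyze": "prebuilt-idDocument",
    "/prebuilt/businessCard/analyze": "prebuilt-businessCard",
}


def v31_analyze_url(endpoint: str, model_id: str, api_version: str = "2023-07-31") -> str:
    """Build the v3.1 analyze URL for a prebuilt or custom model ID."""
    base = endpoint.rstrip("/")
    return f"{base}/formrecognizer/documentModels/{model_id}:analyze?api-version={api_version}"


# Example: migrate a v2.1 invoice call (the endpoint is a placeholder).
model_id = V21_PATH_TO_MODEL_ID["/prebuilt/invoice/analyze"]
url = v31_analyze_url("https://contoso.example/", model_id)
```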

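Analyze request body

The content to be analyzed is provided via the request body. Either a URL or base64-encoded data can be used to construct the request.

To specify a publicly accessible web URL, set Content-Type to application/json and send the following JSON body: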

{
  "urlSource": "{urlPath}"
}

Document Intelligence v3.0 and later also support base64-encoded content:

{
  "base64Source": "{base64EncodedContent}"
}
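
A minimal sketch of constructing either body in Python follows. The function name build_analyze_body is hypothetical; only the urlSource and base64Source property names come from the samples above.

```python
import base64
import json
from typing import Optional


def build_analyze_body(url: Optional[str] = None, data: Optional[bytes] = None) -> str:
    """Return the JSON request body for a URL source or for raw document bytes."""
    if (url is None) == (data is None):
        raise ValueError("Provide exactly one of url or data")
    if url is not None:
        return json.dumps({"urlSource": url})
    # Document bytes are sent as a base64 string inside the JSON body.
    encoded = base64.b64encode(data).decode("ascii")
    return json.dumps({"base64Source": encoded})


body_from_url = build_analyze_body(url="https://contoso.example/invoice.pdf")
body_from_bytes = build_analyze_body(data=b"hello")
```

Both bodies are sent with Content-Type set to application/json.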

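Additionally supported parameters

Parameters that continue to be supported:

pages: Analyze only a specific subset of pages in the document. A list of page numbers to analyze, indexed from 1. Example: "1-3,5,7-9"
locale: Locale hint for text recognition and document analysis. The value can contain only the language code (for example, en or fr) or a BCP 47 language tag (for example, "en-US").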
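
To illustrate the pages syntax, the sketch below expands a page-range string into individual page numbers and builds a query string. Both function names are hypothetical, and validating the range on the client before sending it is a design choice, not a service requirement.

```python
from typing import List, Optional
from urllib.parse import urlencode


def expand_pages(pages: str) -> List[int]:
    """Expand a page-range string such as "1-3,5,7-9" into 1-indexed page numbers."""
    result: List[int] = []
    for part in pages.split(","):
        start, _, end = part.strip().partition("-")
        first = int(start)
        last = int(end) if end else first
        if first < 1 or last < first:
            raise ValueError(f"Invalid page range: {part!r}")
        result.extend(range(first, last + 1))
    return result


def analyze_query(api_version: str, pages: Optional[str] = None, locale: Optional[str] = None) -> str:
    """Build the query string for an analyze request."""
    params = {"api-version": api_version}
    if pages:
        expand_pages(pages)  # Raises ValueError if the range is malformed.
        params["pages"] = pages
    if locale:
        params["locale"] = locale
    return urlencode(params)


page_numbers = expand_pages("1-3,5,7-9")
query = analyze_query("2023-07-31", pages="1-3,5,7-9", locale="en-US")
```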

Parameters no longer supported:

includeTextDetails

The new response format is more compact, and the full output is always returned.

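Changes to the analyze result

The analyze response is refactored into the following top-level results to support multi-page elements.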

pages
tables
keyValuePairs
entities
styles
documents

Note

The analyzeResult response includes several changes, such as elements moving up from a property of pages to a top-level property within analyzeResult.


{
  // Basic analyze result metadata
  "apiVersion": "2022-08-31", // REST API version used
  "modelId": "prebuilt-invoice", // ModelId used
  "stringIndexType": "textElements", // Character unit used for string offsets and lengths:
                                     // textElements, unicodeCodePoint, utf16CodeUnit

  // Concatenated content in global reading order across pages.
  // Words are generally delimited by space, except CJK (Chinese, Japanese, Korean) characters.
  // Lines and selection marks are generally delimited by newline character.
  // Selection marks are represented in Markdown emoji syntax (:selected:, :unselected:).
  "content": "CONTOSO LTD.\nINVOICE\nContoso Headquarters...",

  // List of pages analyzed
  "pages": [
    {
      // Basic page metadata
      "pageNumber": 1, // 1-indexed page number
      "angle": 0, // Orientation of content in clockwise direction (degree)
      "width": 0, // Page width
      "height": 0, // Page height
      "unit": "pixel", // Unit for width, height, and polygon coordinates
      "spans": [ // Parts of top-level content covered by page
        {
          "offset": 0, // Offset in content
          "length": 7 // Length in content
        }
      ],

      // List of words in page
      "words": [
        {
          "content": "CONTOSO", // Equivalent to $.content.Substring(span.offset, span.length)
          "polygon": [ ... ], // Position in page
          "confidence": 0.99, // Extraction confidence
          "span": { ... } // Part of top-level content covered by word
        }, ...
      ],

      // List of selectionMarks in page
      "selectionMarks": [
        {
          "state": "selected", // Selection state: selected, unselected
          "polygon": [ ... ], // Position in page
          "confidence": 0.95, // Extraction confidence
          "span": { ... } // Part of top-level content covered by selection mark
        }, ...
      ],

      // List of lines in page
      "lines": [
        {
          "content": "CONTOSO LTD.", // Concatenated content of line (may contain both words and selectionMarks)
          "polygon": [ ... ], // Position in page
          "spans": [ ... ] // Parts of top-level content covered by line
        }, ...
      ]
    }, ...
  ],

  // List of extracted tables
  "tables": [
    {
      "rowCount": 1, // Number of rows in table
      "columnCount": 1, // Number of columns in table
      "boundingRegions": [ // Polygons or Bounding boxes potentially across pages covered by table
        {
          "pageNumber": 1, // 1-indexed page number
          "polygon": [ ... ] // Previously Bounding box, renamed to polygon in the 2022-08-31 API
        }
      ],
      "spans": [ ... ], // Parts of top-level content covered by table

      // List of cells in table
      "cells": [
        {
          "kind": "stub", // Cell kind: content (default), rowHeader, columnHeader, stub, description
          "rowIndex": 0, // 0-indexed row position of cell
          "columnIndex": 0, // 0-indexed column position of cell
          "rowSpan": 1, // Number of rows spanned by cell (default=1)
          "columnSpan": 1, // Number of columns spanned by cell (default=1)
          "content": "SALESPERSON", // Concatenated content of cell
          "boundingRegions": [ ... ], // Bounding regions covered by cell
          "spans": [ ... ] // Parts of top-level content covered by cell
        }, ...
      ]
    }, ...
  ],

  // List of extracted key-value pairs
  "keyValuePairs": [
    {
      "key": { // Extracted key
        "content": "INVOICE:", // Key content
        "boundingRegions": [ ... ], // Key bounding regions
        "spans": [ ... ] // Key spans
      },
      "value": { // Extracted value corresponding to key, if any
        "content": "INV-100", // Value content
        "boundingRegions": [ ... ], // Value bounding regions
        "spans": [ ... ] // Value spans
      },
      "confidence": 0.95 // Extraction confidence
    }, ...
  ],

  "styles": [
    {
      "isHandwritten": true, // Is content in this style handwritten?
      "spans": [ ... ], // Spans covered by this style
      "confidence": 0.95 // Detection confidence
    }, ...
  ],

  // List of extracted documents
  "documents": [
    {
      "docType": "prebuilt-invoice", // Classified document type (model dependent)
      "boundingRegions": [ ... ], // Document bounding regions
      "spans": [ ... ], // Document spans
      "confidence": 0.99, // Document splitting/classification confidence

      // List of extracted fields
      "fields": {
        "VendorName": { // Field name (docType dependent)
          "type": "string", // Field value type: string, number, array, object, ...
          "valueString": "CONTOSO LTD.", // Normalized field value
          "content": "CONTOSO LTD.", // Raw extracted field content
          "boundingRegions": [ ... ], // Field bounding regions
          "spans": [ ... ], // Field spans
          "confidence": 0.99 // Extraction confidence
        }, ...
      }
    }, ...
  ]
}
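
Because elements refer back to the top-level content through spans, a client can recover the text of any element by slicing content. The sketch below is only an approximation: Python strings are indexed by Unicode code point, so it assumes offsets that match code points (for example, stringIndexType set to unicodeCodePoint, or plain ASCII content). The helper name text_for_spans is hypothetical.

```python
from typing import List


def text_for_spans(content: str, spans: List[dict]) -> str:
    """Concatenate the parts of the top-level content covered by the given spans."""
    parts = []
    for span in spans:
        start = span["offset"]
        parts.append(content[start:start + span["length"]])
    return "".join(parts)


# Values taken from the sample analyze result above.
content = "CONTOSO LTD.\nINVOICE\nContoso Headquarters..."
first_word = text_for_spans(content, [{"offset": 0, "length": 7}])
first_line = text_for_spans(content, [{"offset": 0, "length": 12}])
```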

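Build or train a model

The model object has three updates in the new API:

modelId is now a property that can be set on a model to give it a human-readable name.
modelName is renamed to description.
buildMode is a new property with a value of template for custom form models or neural for custom neural models.

Call the build operation to train a model. The request payload and call pattern remain unchanged. The build operation specifies the model and the training dataset, and it returns the result via the Operation-Location header in the response. Poll this model operation URL via a GET request to check the status of the build operation (the minimum recommended interval between requests is 1 second). Unlike v2.1, this URL isn't the resource location of the model. Instead, the model URL can be constructed from the given modelId, or retrieved from the resourceLocation property in the response. Upon success, status is set to succeeded and the result contains the custom model information. If errors are encountered, status is set to failed, and the error is returned.

The following code is a sample build request using a SAS token. Note the trailing slash when setting the prefix or folder path.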

POST https://{your-form-recognizer-endpoint}/formrecognizer/documentModels:build?api-version=2022-08-31

{
  "modelId": "{modelId}",
  "description": "Sample model",
  "buildMode": "template",
  "azureBlobSource": {
    "containerUrl": "https://{storageAccount}.blob.core.chinacloudapi.cn/{containerName}?{sasToken}",
    "prefix": "{folderName/}"
  }
}
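
One way to avoid the trailing-slash mistake is to normalize the folder name when constructing the payload. This sketch assumes, based on the sample above, that a folder prefix should end with a slash; build_request_body and its parameters are hypothetical names.

```python
def build_request_body(model_id: str, container_url: str, folder: str = "",
                       build_mode: str = "template", description: str = "") -> dict:
    """Construct the JSON payload for a documentModels:build request."""
    if build_mode not in ("template", "neural"):
        raise ValueError("buildMode must be 'template' or 'neural'")
    source = {"containerUrl": container_url}
    if folder:
        # Assumption: folder prefixes end with "/", as in the sample request.
        source["prefix"] = folder if folder.endswith("/") else folder + "/"
    body = {"modelId": model_id, "buildMode": build_mode, "azureBlobSource": source}
    if description:
        body["description"] = description
    return body


# The container URL is a placeholder; a real one includes the SAS token.
payload = build_request_body("my-model", "https://contoso.example/container?sas", folder="train")
```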

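Changes to compose model

Model compose is now limited to a single level of nesting. Composed models are now consistent with custom models, with the addition of the modelId and description properties.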

POST https://{your-form-recognizer-endpoint}/formrecognizer/documentModels:compose?api-version=2022-08-31
{
  "modelId": "{composedModelId}",
  "description": "{composedModelDescription}",
  "componentModels": [
    { "modelId": "{modelId1}" },
    { "modelId": "{modelId2}" }
  ]
}

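Changes to copy model

The call pattern for copy model remains unchanged:

Authorize the copy operation with the target resource by calling authorizeCopy. This is now a POST request.
Submit the authorization to the source resource to copy the model by calling copyTo.
Poll the returned operation to validate that the operation completed successfully.

The only changes to the copy model function are:

The HTTP action on authorizeCopy is now a POST request.
The authorization payload contains all the information needed to submit the copy request.

Authorize the copy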

POST https://{targetHost}/formrecognizer/documentModels:authorizeCopy?api-version=2022-08-31
{
  "modelId": "{targetModelId}",
  "description": "{targetModelDescription}"
}

Use the response body from the authorize action to construct the copy request.

POST https://{sourceHost}/formrecognizer/documentModels/{sourceModelId}:copyTo?api-version=2022-08-31
{
  "targetResourceId": "{targetResourceId}",
  "targetResourceRegion": "{targetResourceRegion}",
  "targetModelId": "{targetModelId}",
  "targetModelLocation": "https://{targetHost}/formrecognizer/documentModels/{targetModelId}",
  "accessToken": "{accessToken}",
  "expirationDateTime": "2021-08-02T03:56:11Z"
}
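
Since the authorization response is passed on as the copy request body, the second step can be sketched as below. The helper name build_copy_request and the required-key check are illustrative assumptions; the property names come from the sample above.

```python
from typing import Tuple

# Properties expected in the authorizeCopy response, per the sample above.
REQUIRED_KEYS = (
    "targetResourceId",
    "targetResourceRegion",
    "targetModelId",
    "targetModelLocation",
    "accessToken",
    "expirationDateTime",
)


def build_copy_request(source_host: str, source_model_id: str, authorization: dict,
                       api_version: str = "2022-08-31") -> Tuple[str, dict]:
    """Return the copyTo URL and request body built from an authorizeCopy response."""
    missing = [key for key in REQUIRED_KEYS if key not in authorization]
    if missing:
        raise ValueError(f"Authorization is missing: {', '.join(missing)}")
    url = (f"https://{source_host}/formrecognizer/documentModels/"
           f"{source_model_id}:copyTo?api-version={api_version}")
    # The authorization payload is forwarded unchanged as the request body.
    return url, dict(authorization)
```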

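Changes to list models

List models is extended to return both prebuilt and custom models. All prebuilt model names start with prebuilt-. Only models with a status of succeeded are returned. To list models that failed or are still in progress, see List Operations.

Sample list models request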

GET https://{your-form-recognizer-endpoint}/formrecognizer/documentModels?api-version=2022-08-31
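
Because the list now mixes prebuilt and custom models, client code that previously assumed custom models only may need to filter by the prebuilt- prefix. A minimal sketch, assuming the model IDs have already been extracted from the response (split_models is a hypothetical helper):

```python
from typing import Iterable, List, Tuple


def split_models(model_ids: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Separate prebuilt model IDs from custom model IDs by the "prebuilt-" prefix."""
    prebuilt: List[str] = []
    custom: List[str] = []
    for model_id in model_ids:
        (prebuilt if model_id.startswith("prebuilt-") else custom).append(model_id)
    return prebuilt, custom


prebuilt_ids, custom_ids = split_models(
    ["prebuilt-invoice", "my-custom-model", "prebuilt-read", "contracts-v2"]
)
```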

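Changes to get model

Because get model now includes prebuilt models, the get operation returns a docTypes dictionary. Each document type description includes a name, an optional description, a field schema, and an optional field confidence. The field schema describes the list of fields that can be returned with the document type.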

GET https://{your-form-recognizer-endpoint}/formrecognizer/documentModels/{modelId}?api-version=2022-08-31

New get info operation

The info operation on the service returns the custom model count and the custom model limit.

GET https://{your-form-recognizer-endpoint}/formrecognizer/info?api-version=2022-08-31

Sample response

{
  "customDocumentModels": {
    "count": 5,
    "limit": 100
  }
}
