Document Information Extraction is a service available on BTP. It uses machine learning to extract information from business documents that you upload, such as invoices and purchase orders.

The purpose of this blog post is to demonstrate how to integrate Document Information Extraction with a UI5 application. We will upload an invoice and display the extracted information in the app.

The code is available on GitHub, as always.

Application behavior

When you upload an invoice PDF, the app posts the file to Document Information Extraction. Next, press the “Refresh” button until the extraction job finishes. Finally, the extracted data is displayed on the screen.

You can download sample invoices from the following tutorial page.
Use Machine Learning to Extract Information from Documents with Document Information Extraction Trial UI

 

Prerequisites for running the application

  • An instance of Document Information Extraction and its service key (you can run the booster to create them automatically)
  • A destination pointing to the Document Information Extraction API

Destination


Property            Value
Name                doc-info-extraction
Type                HTTP
URL                 “url” in the service key + “/v1”
Proxy Type          Internet
Authentication      OAuth2ClientCredentials
Client ID           “uaa.clientid” in the service key
Client Secret       “uaa.clientsecret” in the service key
Token Service URL   “uaa.url” in the service key + “/oauth/token”
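
The mapping from the service key to the destination properties can be sketched in TypeScript as follows. This is only an illustration of the table above, not something the app itself contains: the ServiceKey and DestinationProperties types and the toDestination helper are hypothetical names, and only the service key fields mentioned in the table are assumed to exist.

```typescript
// Hypothetical shape: only the service key fields referenced in the table.
interface ServiceKey {
  url: string;
  uaa: { clientid: string; clientsecret: string; url: string };
}

// Hypothetical type: destination properties keyed by the labels from the table.
type DestinationProperties = Record<string, string>;

// Derives the destination values from a service key, as described in the table.
function toDestination(key: ServiceKey): DestinationProperties {
  return {
    "Name": "doc-info-extraction",
    "Type": "HTTP",
    "URL": key.url + "/v1",
    "Proxy Type": "Internet",
    "Authentication": "OAuth2ClientCredentials",
    "Client ID": key.uaa.clientid,
    "Client Secret": key.uaa.clientsecret,
    "Token Service URL": key.uaa.url + "/oauth/token",
  };
}
```

The helper only concatenates strings, so it is mainly useful as a checklist when you copy the values into the destination by hand.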

API used for the application

The application uses two API endpoints of Document Information Extraction. You can find the API documentation here.

POST /document/jobs

This endpoint uploads a file along with options that tell the service what type of document you are uploading and which fields you want back. Instead of listing the exact fields, you can also specify a template (which you need to define beforehand). For more information, please refer to the documentation.

The following screenshot shows a request executed from Postman. The Content-Type header is set to multipart/form-data.

GET /document/jobs/{id}

As you can see in the picture above, the POST request returns an id, which you can use to retrieve the extraction results. At first, the status may be “RUNNING”.

After some time (say, 10 seconds), the status will become “DONE” and you will get extraction results.
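
The wait for “DONE” can also be automated with a small polling loop. The following is a minimal sketch, not part of the sample app: waitForJob, JobResult, intervalMs and maxAttempts are hypothetical names, and the status function is passed in so that the loop does not depend on how the request is made. Only the “DONE” status is treated specially; error statuses are not handled here.

```typescript
// Assumed minimal shape of the GET /document/jobs/{id} response.
interface JobResult {
  status: string;
  extraction?: unknown;
}

// Hypothetical helper: calls getStatus repeatedly until the job reports "DONE".
async function waitForJob(
  getStatus: () => Promise<JobResult>,
  intervalMs = 2000,
  maxAttempts = 15
): Promise<JobResult> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const result = await getStatus();
    if (result.status === "DONE") {
      return result;
    }
    // Not finished yet: pause before the next attempt, unless this was the last one.
    if (attempt < maxAttempts) {
      await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
    }
  }
  throw new Error(`Job did not finish after ${maxAttempts} attempts`);
}
```

With the defaults, the loop gives up after about 30 seconds. The attempt limit keeps a job that never finishes from polling forever.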

 

 

UI5 code

The key parts are as follows.

  1. Uploading a file to Document Information Extraction
  2. Retrieving extraction results

I used the ts-app (TypeScript) template of generator-ui5.
Please note that the app needs to be deployed to BTP to function.

1. Uploading a file to Document Information Extraction

When you press the “Upload” button, the app gets the selected file and posts it to the /document/jobs endpoint. After a successful upload, you get an id, which you’ll use later to fetch the extraction results.

	public async handleUploadPress(): Promise<void> {
		if(this._jobId) {
			MessageBox.confirm((this.getResourceBundle() as ResourceBundle).getText("confirmText"), {
				onClose: async (oAction: string) => {
					if (oAction === "OK") {
						this._resetData()
						await this._uploadImage()
					}
				}
			})			
		} else {
			await this._uploadImage()
		}
	}

	private async _uploadImage(): Promise<void> {
		//prepare form data
		const oFileUploader = this.byId("fileUploader") as FileUploader
		const oUploadedFile = oFileUploader.oFileUpload.files[0] as File
		const blob = new Blob([oUploadedFile], { type: oUploadedFile.type })

		const formData = new FormData()
		formData.append("file", blob, oUploadedFile.name)

		const options = (this.getOwnerComponent().getModel("options") as JSONModel).getData() as Options
		formData.append('options', JSON.stringify(options))

		// call Document Information Extraction (DIE)
		const response = await this._postToDie(formData)
		this._jobId = response.id;

		// enable refresh button
		(this.getView().getModel("viewModel") as JSONModel).setProperty("/refreshEnabled", true)		
	}

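	private async _postToDie(formData: FormData): Promise<{ id: string }> {
		const dieUrl = this._getbaseUrl() + "/document/jobs"
		// fetch sets the multipart/form-data Content-Type (including the boundary) for a FormData body
		const response = await fetch(dieUrl, {
			method: 'POST',
			body: formData
		})
		if (!response.ok) {
			throw new Error(`Upload failed with HTTP status ${response.status}`)
		}
		// the parsed response body contains the job id
		return response.json()
	}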

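	private _getbaseUrl(): string {
		// build the base URL relative to the app's module path; the last segment matches the destination name
		const appId = this.getOwnerComponent().getManifestEntry("/sap.app/id") as string
		const appPath = appId.replaceAll(".", "/")
		// sap.ui.require.toUrl replaces the deprecated jQuery.sap.getModulePath
		const appModulePath = sap.ui.require.toUrl(appPath)
		return appModulePath + "/doc-info-extraction"
	}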

To post a job to Document Information Extraction, an “options” object is required, as described in the “API used for the application” section. For this sample app, the options are configured as below.

{
    "clientId": "default",
    "extraction": {
        "headerFields": [
            "documentNumber",
            "purchaseOrderNumber",
            "documentDate",
            "dueDate",
            "grossAmount",
            "currencyCode"
        ],
        "lineItemFields": [
            "description",
            "quantity",
            "unitOfMeasure",
            "unitPrice",
            "netAmount"
        ]
    },
    "documentType": "invoice"    
}
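
The controller code casts the model data to an Options type whose definition is not shown in this post. One possible definition, matching the JSON above, is sketched below. Treat the interface, the invoiceOptions constant and the serializeOptions helper as assumptions for illustration, since the actual type in the repository may differ.

```typescript
// Assumed typing for the options object; field names are taken from the JSON above.
interface Options {
  clientId: string;
  documentType: string;
  extraction: {
    headerFields: string[];
    lineItemFields: string[];
  };
}

// The same values as in the JSON above, expressed as a typed constant.
const invoiceOptions: Options = {
  clientId: "default",
  documentType: "invoice",
  extraction: {
    headerFields: [
      "documentNumber",
      "purchaseOrderNumber",
      "documentDate",
      "dueDate",
      "grossAmount",
      "currencyCode",
    ],
    lineItemFields: ["description", "quantity", "unitOfMeasure", "unitPrice", "netAmount"],
  },
};

// The options travel as a JSON string in the "options" form field of the upload request.
function serializeOptions(options: Options): string {
  return JSON.stringify(options);
}
```

Typing the options this way lets the compiler catch a misspelled property such as headerField before the request is sent.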

 

2. Retrieving extraction results

When you press the “Refresh” button on the screen, the app fetches the extraction status from the /document/jobs/{id} endpoint. If the job is done, the extracted fields are stored in the “invoice” model and displayed on the UI.

* In a real-world scenario, it would be preferable to retrieve the results automatically, rather than having the user refresh the page.

	public async onRefresh(): Promise<void> {
		const response = await this._getStatus()
		if (response.status === "DONE") {
			// extraction finished: show the results and make the fields editable
			this._setInvoiceData(response.extraction)
			const viewModel = this.getView().getModel("viewModel") as JSONModel
			viewModel.setProperty("/refreshEnabled", false)
			viewModel.setProperty("/editable", true)
		} else if (response.status === "PENDING" || response.status === "RUNNING") {
			// job not finished yet: show the pending message
			MessageToast.show((this.getResourceBundle() as ResourceBundle).getText("pendingText"))
		}
	}

	private async _getStatus(): Promise<any> {
		const dieUrl = this._getbaseUrl() + "/document/jobs" + "/" + this._jobId
		const response  = await fetch(dieUrl, {
			method: 'GET'
		})
		return response.json()
	}

	private _setInvoiceData(extractedData: any): void {
		// turn a list of { name, value } fields into a plain object keyed by field name
		const toObject = (fields: Item[]): Record<string, unknown> =>
			fields.reduce((acc, curr) => {
				acc[curr.name] = curr.value
				return acc
			}, {} as Record<string, unknown>)

		const invoice = {
			// set header
			header: toObject(extractedData.headerFields as Item[]),
			// set items: each line item is itself a list of fields
			items: (extractedData.lineItems as Item[][]).map((item) => toObject(item))
		};

		(this.getView().getModel("invoice") as JSONModel).setData(invoice)
	}
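
To make the transformation in _setInvoiceData easier to follow, here is a standalone sketch that applies the same flattening to made-up sample data. The Item interface is an assumption based on the two properties the controller reads (name and value), and the sample values are invented for illustration; they are not real service output.

```typescript
// Assumed shape of one extracted field; only name and value are used.
interface Item {
  name: string;
  value: unknown;
}

// Turns a list of { name, value } fields into an object keyed by field name.
function toObject(fields: Item[]): Record<string, unknown> {
  const result: Record<string, unknown> = {};
  for (const field of fields) {
    result[field.name] = field.value;
  }
  return result;
}

// Made-up sample in the shape that _setInvoiceData reads.
const sampleExtraction = {
  headerFields: [
    { name: "documentNumber", value: "INV-001" },
    { name: "currencyCode", value: "USD" },
  ],
  lineItems: [
    [
      { name: "description", value: "Widget" },
      { name: "quantity", value: 2 },
    ],
    [
      { name: "description", value: "Gadget" },
      { name: "quantity", value: 1 },
    ],
  ],
};

const invoice = {
  header: toObject(sampleExtraction.headerFields),
  items: sampleExtraction.lineItems.map((fields) => toObject(fields)),
};

console.log(JSON.stringify(invoice));
// prints {"header":{"documentNumber":"INV-001","currencyCode":"USD"},"items":[{"description":"Widget","quantity":2},{"description":"Gadget","quantity":1}]}
```

The flat objects are convenient for UI5 data binding, because a control can bind directly to a path such as /header/documentNumber.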

 

Closing

In this blog post, I have demonstrated how to upload a document to the Document Information Extraction service using UI5. I hope this post will help you implement your own scenarios, such as using custom document types.

Sara Sampaio

Author Since: March 10, 2022
