Google Vision PDF/TIFF Text Extraction: User Guide

Google Vision
PDF/TIFF Text Extraction

USER GUIDE
Version: 6.4
Document Revision: 1.0
For more information please contact:

[email protected] | UK: +44 (0) 870 879 3000 | US: +1 888 757 7476
www.blueprism.com
Contents
1. Introduction
2. Solution Overview and Configuration
2.1. Limitations
3. Pre-Requisites and Environment Configuration
3.1. Google Cloud Services Prerequisites
3.2. Blue Prism Configuration
4. Using the Skill
4.1. Common Parameters
4.2. PDF Document Text Detection
4.3. TIFF Text Detection
4.4. Operation Status
5. Support
6. Functional Tests
7. Troubleshooting Guidelines
8. Frequently Asked Questions
The information contained in this document is the proprietary and confidential information of Blue Prism Limited and should not be
disclosed to a third party without the written consent of an authorised Blue Prism representative. No part of this document may be
reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying without the written
permission of Blue Prism Limited.
© Blue Prism Limited, 2001 – 2018
®Blue Prism is a registered trademark of Blue Prism Limited
All trademarks are hereby acknowledged and are used to the benefit of their respective owners.
Blue Prism is not responsible for the content of external websites referenced by this document.
Blue Prism Limited, Centrix House, Crow Lane East, Newton-le-Willows, WA12 9UY, United Kingdom
Registered in England: Reg. No. 4260035. Tel: +44 870 879 3000. Web: www.blueprism.com
Commercial in Confidence Page 2 of 8

1. Introduction
As the market for RPA grows, so does interest in what RPA can do and how easily it can integrate with other
ecosystems. With the arrival of Artificial Intelligence in the marketplace, interest has grown in capabilities
that provide integrations with different pre-trained AI services in the cloud.
This document focuses on the design of the integration between Blue Prism and Google’s Vision Cognitive Service.
Google provides these services in the form of web services, which are consumed via RESTful APIs. The pros/cons of
this package are outside the scope of this document.
2. Solution Overview and Configuration

The basic design of the Google Vision Skill is to encapsulate the different AI Cognitive services offered by Google.
These integrations can be used as an easy bridge to connect the client’s processes to the different AI services
developed by Google.
Blue Prism’s Google Vision Skill interacts with the Google Cognitive Services by using Blue Prism to construct a
REST call. The response is then handled by Blue Prism and converted into easy-to-use outputs, such as Text,
Numbers, or Collections.
All of Google’s services require a service account, which is given to each party as part of their contract with
Google’s authentication server. When registering with Google’s Cloud Platform, you can create service accounts which
are restricted based upon API services. These service accounts are part of the OAuth 2.0 authentication layer, which
allows you to call the APIs seamlessly; Blue Prism handles the whole flow for you. Once a service account has been
saved inside Blue Prism, the basic data flow of an API call is as follows:
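The authentication step of this flow can be sketched as follows. This is a minimal illustration of the generic OAuth 2.0 JWT bearer exchange for a Google service account, not the Skill's internal implementation. The function names are hypothetical, the scope shown is an assumption, and the RS256 signature itself is omitted because it needs a cryptography library.

```python
import base64
import json

# Google's OAuth 2.0 token endpoint, also used as the JWT audience.
TOKEN_URL = "https://oauth2.googleapis.com/token"
# Assumption: a broad Google Cloud scope; a narrower one may be sufficient.
SCOPE = "https://www.googleapis.com/auth/cloud-platform"
GRANT_TYPE = "urn:ietf:params:oauth:grant-type:jwt-bearer"


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as the JWT format requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def build_signing_input(issuer: str, now: int) -> str:
    """Return the 'header.claims' string that must then be RS256-signed
    with the service account's private key (signing is not shown here)."""
    header = {"alg": "RS256", "typ": "JWT"}
    claims = {
        "iss": issuer,      # the service account email (the credential's issuer)
        "scope": SCOPE,
        "aud": TOKEN_URL,
        "iat": now,         # issued-at, seconds since the epoch
        "exp": now + 3600,  # tokens may be requested for at most one hour
    }
    return ".".join(
        b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )


def build_token_request(signed_jwt: str) -> dict:
    """Form fields to POST to TOKEN_URL; the reply carries the bearer token."""
    return {"grant_type": GRANT_TYPE, "assertion": signed_jwt}
```

The bearer token returned by the token endpoint is then sent in the Authorization header of each Vision API call.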
2.1. Limitations
The following limitations should be understood before attempting to use these integrations:
• The customer or partner is responsible for the configuration and maintenance of the relevant cloud
subscriptions and services. Blue Prism cannot provide any support on the configuration of the cloud
environment itself.
• Use of the APIs may incur additional costs, depending on usage.
• There is always a possibility with external services that the APIs will change. This Skill is provided as-is
without warranties, and support is provided by Blue Prism on a best endeavours basis and is not subject to
formal SLAs.
• The Vision API accepts PDF/TIFF files up to 2000 pages. Larger files will return an error.
• API keys are not supported for asyncBatchAnnotate requests.

• The account used for authentication must have access to the Cloud Storage bucket that you specify for the
output (roles/editor or roles/storage.objectCreator or above).
3. Pre-Requisites and Environment Configuration

This section outlines the pre-requisites that are required to use the integrations. Note that Blue Prism is not able to
provide any support in configuring the Google Cloud Services themselves.
3.1. Google Cloud Services Prerequisites

To implement the Google Cognitive Services integration, the following are required:
• A subscription to Google Cloud Platform
• The Vision API enabled
• A service account with access to the Vision API
• The ability to make a POST request, which is how PDF/TIFF document text detection is performed
3.2. Blue Prism Configuration

Before importing the Skill, which is downloaded from the Digital Exchange, the following information must be
obtained:
1. Service Account with access to the Vision API
This requirement is explained in the next subsection.
If any conflict or overwrite messages appear during import, please refer to the Release Manager section in the
Product Help.
3.2.1. Credentials
An individual credential, defined and stored in Credential Manager, holds the Service Account information
needed to form an OAuth 2.0 request, which responds with a bearer token. Each action has a common parameter
named “OAuth 2 (JWT Bearer Token) Authentication Credential Name”, which is required to authenticate against the
Google Vision API.
This section describes how to create one of these credentials, which is then used to authenticate with Google on
each call. This example sets up the Google Vision Skill. To start, navigate to the Security – Credentials tab
within the System menu of Blue Prism. Then click the “New” button to the right of the view. A new window will
open (see figure 6.2.1.A below). Make sure you select the “Type” as “OAuth 2.0 (JWT Bearer Token)”.

The name of the credential can be anything you wish, but labelling it with respect to the API is recommended, for
example “Google Vision API Credential”. You could also restrict the credential by robot, but that side of the
configuration is down to you. The issuer is the email address listed in the IAM section of Google Cloud Platform,
under Service Accounts. It is also listed in the .json file which was downloaded to your local machine when you
originally created the Service Account. Finally, the private key is the private key listed in that same .json
file. When copying and pasting the private key, you must include the following information:
• -----BEGIN PRIVATE KEY-----\n
• \n-----END PRIVATE KEY-----\n
Save the credential and make a note of its name, as it is needed as a parameter for each action listed in this
document. The Google Vision Skill has now been correctly configured.
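The two values needed for the credential can be read from the downloaded key file as sketched below. This is an illustration only: `credential_fields` is a hypothetical helper, and it assumes the standard service account key layout with `client_email` and `private_key` fields. The check runs on the decoded JSON value, where `\n` has become a real newline; it does not show which form the Blue Prism credential screen expects.

```python
import json

# Marker lines that must surround the key material.
BEGIN = "-----BEGIN PRIVATE KEY-----\n"
END = "\n-----END PRIVATE KEY-----\n"


def credential_fields(key_file_text: str) -> dict:
    """Extract the issuer and private key from the text of a key file."""
    key = json.loads(key_file_text)
    private_key = key["private_key"]
    # Reject a key that was copied without its BEGIN/END marker lines.
    if not (private_key.startswith(BEGIN) and private_key.endswith(END)):
        raise ValueError("private key is missing its BEGIN/END marker lines")
    return {"issuer": key["client_email"], "private_key": private_key}
```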

4. Using the Skill
The following section outlines the configuration and usage of the action in the Google Vision Skill:
1. PDF/TIFF Document Text Detection
4.1. Common Authentication Parameter

4.1.1. Inputs
Parameter | Type | Description
OAuth 2 (JWT Bearer Token) Authentication Credential Name | Text | The name of the credential which has the OAuth 2.0 information used for authentication with Google
4.2. PDF Document Text Detection

4.2.1. Request
Parameter | Direction | Type | Description
GCS PDF File Path | In | Text | File Name and Location (Bucket Name) in Google Cloud Storage
GCS Output Folder | In | Text | Output folder in Google Cloud Storage
Batch Size | In | Number | The Batch Size parameter specifies how many pages of output should be included in each output JSON file
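For orientation, the sketch below shows how these three inputs could map onto the body of a Vision API `files:asyncBatchAnnotate` request. It is an assumption about the mapping, not a description of how the Skill builds its call: `build_request_body` is a hypothetical helper, and the paths are assumed to be `gs://` URIs.

```python
import posixpath

# Vision API endpoint that receives the POST request.
ENDPOINT = "https://vision.googleapis.com/v1/files:asyncBatchAnnotate"

# File extensions mapped to the MIME types for PDF and TIFF input.
MIME_TYPES = {".pdf": "application/pdf", ".tif": "image/tiff", ".tiff": "image/tiff"}


def build_request_body(gcs_file_path: str, gcs_output_folder: str, batch_size: int) -> dict:
    """Build the JSON body for one document text detection request."""
    suffix = posixpath.splitext(gcs_file_path)[1].lower()
    if suffix not in MIME_TYPES:
        raise ValueError(f"unsupported file type: {suffix!r}")
    return {
        "requests": [
            {
                "inputConfig": {
                    "gcsSource": {"uri": gcs_file_path},
                    "mimeType": MIME_TYPES[suffix],
                },
                "features": [{"type": "DOCUMENT_TEXT_DETECTION"}],
                "outputConfig": {
                    "gcsDestination": {"uri": gcs_output_folder},
                    # Pages of output written to each JSON file.
                    "batchSize": batch_size,
                },
            }
        ]
    }
```

The same body shape serves the TIFF action; only the MIME type differs.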
4.2.2. Response
Parameter | Direction | Type | Description
Response Content | Out | Text | This output parameter contains the operation code; can be used to query the status of the operation
HTTP Code Status | Out | Text | This output is not expected to be required except for debugging
Response Headers | Out | Collection | This output is not expected to be required except for debugging
Operation Code | Out | Text | Operation Code retrieved from the response of the PDF document text extraction request. This value is used to call GCS API to retrieve status of the operation
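As an illustration of what the Response Content may hold, the sketch below pulls the operation identifier out of a long-running-operation reply. It assumes the reply is the JSON object documented by Google, with the identifier in a `name` field and failures reported under an `error` object. `operation_name` is a hypothetical helper, and whether the Skill's Operation Code is the full name or only part of it is not stated in this guide.

```python
import json


def operation_name(response_content: str) -> str:
    """Return the operation name from an asyncBatchAnnotate reply."""
    body = json.loads(response_content)
    # Google APIs report failures as {"error": {"code", "message", "status"}}.
    if "error" in body:
        raise RuntimeError(body["error"].get("message", "request failed"))
    return body["name"]
```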

4.3. TIFF Text Detection
4.3.1. Request
Parameter | Direction | Type | Description
GCS TIFF File Path | In | Text | File Name and Location (Bucket Name) in Google Cloud Storage
GCS Output Folder | In | Text | Output folder in Google Cloud Storage
Batch Size | In | Number | The Batch Size parameter specifies how many pages of output should be included in each output JSON file
4.3.2. Response
Parameter | Direction | Type | Description
HTTP Code Status | Out | Text | This output is not expected to be required except for debugging
Operation Code | Out | Text | Operation Code retrieved from the response of the TIFF document text extraction request. This value is used to call GCS API to retrieve status of the operation
4.4. Operation Status

4.4.1. Request
Parameter | Direction | Type | Description
Operation Code | In | Text | Operation Code retrieved from the PDF/TIFF document text extraction request
4.4.2. Response
Parameter | Direction | Type | Description
Response Content | Out | Text | This JSON output contains the state of the API Operation. State includes text like “RUNNING”, “DONE” etc. Query this JSON output to retrieve the state value.
HTTP Code Status | Out | Text | This output is not expected to be required except for debugging
State | Out | Text | Returns operation status (“RUNNING”, “DONE”) of the document submitted for processing
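Querying the JSON output for the state value can be sketched as below. This assumes the layout shown in Google's documentation for the operation resource, where the state sits under `metadata.state` and a finished operation also carries `"done": true`. `operation_state` is a hypothetical helper and the `"UNKNOWN"` fallback is an illustrative choice.

```python
import json


def operation_state(response_content: str) -> tuple:
    """Return (state, done) read from an operation status reply."""
    body = json.loads(response_content)
    # Fall back to "UNKNOWN" when the reply has no metadata.state field.
    state = body.get("metadata", {}).get("state", "UNKNOWN")
    done = body.get("done", False) is True
    return state, done
```

A process would typically call the Operation Status action again after a delay while the state is still “RUNNING”, and read the output files from the Cloud Storage output folder once it is “DONE”.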
5. Support
Support for these skills is provided by Blue Prism on a best endeavours basis and is not subject to formal SLAs. Full
details of how to obtain support are provided at:
https://portal.blueprism.com/customer-support/how-get-help-customer-support
The preferred channel of support is to create a support ticket on the Customer Portal. If this is not suitable,
Blue Prism can alternatively be contacted through the following channels:
• E-mail: [email protected]
• Phone: +44(0)330 321 0055 (UK, Europe, Middle East and Africa)
+1 844 321 0055 (North America)
+61 (2) 807 42915 (Asia Pacific)
6. Functional Tests
A test process is available from the Blue Prism Digital Exchange. However, a valid subscription is required to
run it, as no universal test account is available.
7. Troubleshooting Guidelines
In the event that unexpected behaviour is observed when using this skill, it is recommended that you investigate
the potential cause/solution using the error message/response content. Details of error messages and resolutions
are available from the Google Cloud Portal (https://cloud.google.com/vision/docs/pdf).
8. Frequently Asked Questions

There are no frequently asked questions at this stage.
