Google Vision PDF/TIFF Text Extraction: User Guide
The information contained in this document is the proprietary and confidential information of Blue Prism Limited and should not be
disclosed to a third party without the written consent of an authorised Blue Prism representative. No part of this document may be
reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying without the written
permission of Blue Prism Limited.
© Blue Prism Limited, 2001 – 2018
®Blue Prism is a registered trademark of Blue Prism Limited
All trademarks are hereby acknowledged and are used to the benefit of their respective owners.
Blue Prism is not responsible for the content of external websites referenced by this document.
Blue Prism Limited, Centrix House, Crow Lane East, Newton-le-Willows, WA12 9UY, United Kingdom
Registered in England: Reg. No. 4260035. Tel: +44 870 879 3000. Web: www.blueprism.com
This document focuses on the design of the integration between Blue Prism and Google’s Vision Cognitive Service.
Google provides its cognitive services in the form of web services, which are consumed via RESTful APIs. The
pros and cons of this package are outside the scope of this document.
Blue Prism’s Google Vision Skill interacts with the Google Cognitive Services by using Blue Prism to construct a
REST call. Blue Prism then handles the response and converts it into easy-to-use outputs,
such as Text, Numbers, or Collections.
All of Google’s services require a service account, which is given to each party as part of their contract with Google’s
authentication server. When registering with Google’s Cloud Platform, you can create service accounts which are
restricted based upon API services. These service accounts are part of the OAuth 2.0 authentication layer, which
allows you to call the APIs seamlessly. Blue Prism handles the whole flow for you. Once a service account
has been saved inside Blue Prism, the basic data flow of an API call is as follows:
2.1. Limitations
The following limitations should be understood before attempting to use these integrations:
• The customer or partner is responsible for the configuration and maintenance of the relevant cloud
subscriptions and services. Blue Prism cannot provide any support on the configuration of the cloud
environment itself.
• Use of the APIs may incur additional costs, depending on usage.
• There is always a possibility with external services that the APIs will change. This Skill is provided as-is
without warranties, and support is provided by Blue Prism on a best endeavours basis and is not subject to
formal SLAs.
• The Vision API accepts PDF/TIFF files up to 2000 pages. Larger files will return an error.
• API keys are not supported for asyncBatchAnnotate requests.
If any conflict or overwrite messages appear during import, then please refer to the Release Manager section in the
Product Help.
3.2.1. Credentials
An individual credential, defined and stored in Credential Manager, holds the Service Account information
needed to form an OAuth 2.0 request, which returns a bearer token. Each action has a common parameter
named “OAuth 2 (JWT Bearer Token) Authentication Credential Name”, which is required to authenticate against the
Google Vision API.
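For readers who want to see what the OAuth 2.0 (JWT Bearer Token) exchange involves, the following is a minimal sketch of the two pieces of data in that flow: the claim set that is signed with the service account key, and the form posted to the token endpoint. It is not Blue Prism's implementation. The token URL and scope are Google's public values as assumed here, the service account e-mail is a placeholder, and the RS256 signing step is deliberately left out because it needs a cryptography library.

```python
import json
import time

# Assumptions for illustration: Google's public token endpoint and the broad
# cloud-platform scope. Neither value is taken from the Skill itself.
TOKEN_URL = "https://oauth2.googleapis.com/token"
SCOPE = "https://www.googleapis.com/auth/cloud-platform"


def build_jwt_claims(service_account_email, now=None, lifetime_seconds=3600):
    """Return the claim set that would be signed with the service account key."""
    issued_at = int(time.time()) if now is None else int(now)
    return {
        "iss": service_account_email,          # who is asking for the token
        "scope": SCOPE,                        # which APIs the token may call
        "aud": TOKEN_URL,                      # the intended recipient of the JWT
        "iat": issued_at,                      # issued-at time (Unix seconds)
        "exp": issued_at + lifetime_seconds,   # expiry time (Unix seconds)
    }


def build_token_request_form(signed_jwt):
    """Return the form fields posted to the token endpoint."""
    return {
        "grant_type": "urn:ietf:params:oauth:grant-type:jwt-bearer",
        "assertion": signed_jwt,
    }


# Placeholder e-mail address; a real one comes from the service account key file.
claims = build_jwt_claims("my-robot@my-project.iam.gserviceaccount.com",
                          now=1_500_000_000)
print(json.dumps(claims, indent=2))
```

The bearer token returned by the token endpoint is then sent in the Authorization header of each Vision API call, which is the part the credential lets Blue Prism do on your behalf.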
This section describes how to create one of these credentials, which is then used to authenticate
with Google on each call. The example sets up the Google Vision Skill. To start, navigate to the
Security – Credentials tab within the System menu of Blue Prism. Then click the “New” button to the right of the
view. A new window opens; see figure 6.2.1.A below. Make sure you set the “Type” to “OAuth 2.0 (JWT
Bearer Token)”.
Save the credential and make note of the name, as this will need to be used as a parameter for each action listed in
this document. The Google Vision Skill has now been correctly configured.
Parameter                        Type        Description
OAuth 2 (JWT Bearer Token)       Text        The name of the credential which has the OAuth 2.0 information used
Authentication Credential Name               for authentication with Google
Parameter                        Direction   Type        Description
GCS PDF File Path                In          Text        File name and location (bucket name) in Google Cloud Storage
Batch Size                       In          Number      Specifies how many pages of output should be included in each
                                                         output JSON file
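To show how the two request parameters map onto a REST call, here is a minimal sketch of a request body for the Vision API's files:asyncBatchAnnotate method. It is an illustration rather than the exact payload the Skill sends. The function name, the example bucket paths, and the output destination argument are assumptions, since the parameter table above lists only the input file path and the batch size.

```python
import json

# Public REST endpoint for asynchronous file annotation.
ANNOTATE_URL = "https://vision.googleapis.com/v1/files:asyncBatchAnnotate"


def build_async_request(gcs_source_uri, gcs_output_prefix, batch_size,
                        mime_type="application/pdf"):
    """Build the JSON body for one asynchronous document text extraction."""
    return {
        "requests": [
            {
                "inputConfig": {
                    "gcsSource": {"uri": gcs_source_uri},
                    "mimeType": mime_type,  # "image/tiff" for TIFF input
                },
                "features": [{"type": "DOCUMENT_TEXT_DETECTION"}],
                "outputConfig": {
                    # Assumption: results are written under this prefix.
                    "gcsDestination": {"uri": gcs_output_prefix},
                    "batchSize": batch_size,  # pages per output JSON file
                },
            }
        ]
    }


# Hypothetical bucket and object names, for illustration only.
body = build_async_request("gs://example-bucket/invoice.pdf",
                           "gs://example-bucket/output/", batch_size=2)
print(json.dumps(body, indent=2))
```

The body would be posted to ANNOTATE_URL with the bearer token in the Authorization header.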
4.2.2. Response
Parameter           Direction   Type         Description
Response Content    Out         Text         This output parameter contains operation code; can be used to query
                                             the status of the operation
HTTP Code Status    Out         Text         This output is not expected to be required except for debugging
Response Headers    Out         Collection   This output is not expected to be required except for debugging
Operation Code      Out         Text         Operation Code retrieved from the response of PDF document text
                                             extraction request. This value is used to call GCS API to retrieve
                                             status of the operation
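As an illustration of how an operation code can be read out of the response content, the sketch below parses a sample response of the shape the Vision API returns for an asynchronous request (a JSON object with a "name" field). The sample value is made up. Whether the Skill's Operation Code output holds the full name or only its last segment is not stated in this guide, so both are shown.

```python
import json

# Made-up sample; a real response carries the name of the started operation.
SAMPLE_RESPONSE = '{"name": "projects/my-project/operations/1234567890abcdef"}'


def extract_operation_name(response_content):
    """Return the full operation name from the response JSON."""
    return json.loads(response_content)["name"]


def extract_operation_id(response_content):
    """Return only the last path segment of the operation name."""
    return extract_operation_name(response_content).rsplit("/", 1)[-1]


def build_status_url(operation_name):
    """Build the URL that reports the status of the operation."""
    return "https://vision.googleapis.com/v1/" + operation_name


name = extract_operation_name(SAMPLE_RESPONSE)
print(name)
print(extract_operation_id(SAMPLE_RESPONSE))
print(build_status_url(name))
```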
Parameter                        Direction   Type        Description
GCS TIFF File Path               In          Text        File name and location (bucket name) in Google Cloud Storage
Batch Size                       In          Number      Specifies how many pages of output should be included in each
                                                         output JSON file
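Because Batch Size sets how many pages go into each output JSON file, the number of output files to expect follows from simple arithmetic. The helper below is a hypothetical illustration of that calculation, not part of the Skill. It assumes every file except possibly the last is filled to the batch size.

```python
import math


def expected_output_files(page_count, batch_size):
    """Return how many output JSON files a document of page_count pages yields."""
    if page_count < 0 or batch_size < 1:
        raise ValueError("page_count must be >= 0 and batch_size must be >= 1")
    return math.ceil(page_count / batch_size)


# A 5-page TIFF with a batch size of 2 gives files holding 2, 2 and 1 pages.
print(expected_output_files(5, 2))
```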
4.3.2. Response
Parameter           Direction   Type         Description
Response Content    Out         Text         This output parameter contains operation code; can be used to query
                                             the status of the operation
HTTP Code Status    Out         Text         This output is not expected to be required except for debugging
Response Headers    Out         Collection   This output is not expected to be required except for debugging
Operation Code      Out         Text         Operation Code retrieved from the response of TIFF document text
                                             extraction request. This value is used to call GCS API to retrieve
                                             status of the operation
Parameter           Direction   Type         Description
Operation Code      In          Text         Operation Code retrieved from the PDF/TIFF document text extraction
                                             request
4.4.2. Response
Parameter           Direction   Type         Description
Response Content    Out         Text         This output parameter contains operation code; can be used to query
                                             the status of the operation
HTTP Code Status    Out         Text         This JSON output contains the state of the API Operation. State
                                             includes text like “RUNNING”, “DONE” etc. Query this JSON output to
                                             retrieve state value.
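Querying the JSON output for the state value can be sketched as follows. The example assumes the state sits under a "metadata" object, as it does in the Vision API's operation responses. The sample JSON strings are made up, and fetch_status stands in for whatever performs the status call (in Blue Prism, the status action).

```python
import json
import time


def get_operation_state(response_content):
    """Return the state string (for example "RUNNING" or "DONE"), or None."""
    data = json.loads(response_content)
    return data.get("metadata", {}).get("state")


def wait_until_done(fetch_status, max_attempts=10, delay_seconds=5.0):
    """Call fetch_status repeatedly until the state is "DONE".

    fetch_status is a hypothetical callable returning the status JSON as text.
    Returns True if "DONE" was seen within max_attempts, otherwise False.
    """
    for _ in range(max_attempts):
        if get_operation_state(fetch_status()) == "DONE":
            return True
        time.sleep(delay_seconds)
    return False


# Simulated status responses: running twice, then done.
_responses = iter([
    '{"name": "operations/abc", "metadata": {"state": "RUNNING"}}',
    '{"name": "operations/abc", "metadata": {"state": "RUNNING"}}',
    '{"name": "operations/abc", "metadata": {"state": "DONE"}, "done": true}',
])
finished = wait_until_done(lambda: next(_responses), delay_seconds=0)
print(finished)
```

Capping the number of attempts keeps a process from waiting indefinitely if an operation never reaches "DONE".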
5. Support
Support for these skills is provided by Blue Prism on a best endeavours basis and is not subject to formal SLAs. Full
details of how to obtain support are provided at:
https://portal.blueprism.com/customer-support/how-get-help-customer-support
The preferred channel of support is to create a support ticket on the Customer Portal. If this is not suitable,
Blue Prism can alternatively be contacted through the following channels:
• E-mail: [email protected]
• Phone: +44(0)330 321 0055 (UK, Europe, Middle East and Africa)
+1 844 321 0055 (North America)
+61 (2) 807 42915 (Asia Pacific)
6. Functional Tests
A test process is available from the Blue Prism Digital Exchange. However, a valid subscription is required for it to
run; no universal test account is available.
7. Troubleshooting Guidelines
In the event that unexpected behaviour is observed when using this skill, it is recommended that you investigate the
potential cause and solution using the error message and response content. Details of error messages and resolutions
are available from the Google Cloud Portal (https://cloud.google.com/vision/docs/pdf).
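When investigating a failure, it can help to reduce the response content to a one-line summary. The sketch below assumes the error body follows the usual Google API layout, a JSON object with an "error" member holding "code", "message" and "status". The function name and sample values are hypothetical. If the content is not in that shape, the raw text is returned instead.

```python
import json


def summarise_error(http_status, response_content):
    """Return a short, readable summary of a failed API response."""
    try:
        error = json.loads(response_content).get("error")
    except (ValueError, AttributeError):
        # Not JSON, or JSON that is not an object.
        error = None
    if not isinstance(error, dict):
        return f"HTTP {http_status}: {response_content[:200]}"
    status = error.get("status", "UNKNOWN")
    message = error.get("message", "")
    return f"HTTP {http_status} {status}: {message}"


# Made-up sample error body for illustration.
sample = '{"error": {"code": 400, "message": "Bad input", "status": "INVALID_ARGUMENT"}}'
print(summarise_error(400, sample))
```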