
From https://cloud.google.com/document-ai/docs/process-forms, I can see some examples of processing single files. But in most cases, companies have buckets of documents. How do you scale Document AI processing in that case? Do you use Document AI in conjunction with Spark, or is there another way?

2 Answers


I could only find the following: batch_process_documents processes many documents asynchronously and saves the results to Cloud Storage.

From there, I think we can parametrise the job by passing a bucket prefix as the input path, and distribute the work over several machines.

All of that could be orchestrated via Airflow for example.
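The fan-out described above could be sketched like this. Note that `submit_batch_job` is a hypothetical stand-in for a real `batch_process_documents` call (plus waiting on the returned operation), so the sketch stays self-contained; the bucket paths are placeholders:

```python
# Sketch: one batch job per bucket prefix, run several at a time.
# An orchestrator (e.g. one Airflow task per prefix) would do the same fan-out.
from concurrent.futures import ThreadPoolExecutor

def submit_batch_job(input_prefix: str, output_root: str) -> str:
    # Placeholder for client.batch_process_documents(...) followed by
    # operation.result(); returns where that job's results were written.
    subdir = input_prefix.rstrip("/").rsplit("/", 1)[-1]
    return f"{output_root}{subdir}/"

def process_bucket(prefixes, output_root, max_workers=4):
    """Run one async batch job per prefix, max_workers at a time."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda p: submit_batch_job(p, output_root), prefixes))

results = process_bucket(
    ["gs://my-bucket/invoices/", "gs://my-bucket/receipts/"],
    "gs://my-bucket/output/",
)
```

Each prefix becomes an independent unit of work, so retries and parallelism can be managed per prefix by the orchestrator rather than inside the job.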

0
Answer recommended by Google Cloud Collective

You will need to use Batch Processing to handle multiple documents at once with Document AI.

This page in the Cloud Documentation shows how to make Batch Processing requests with REST and the Client Libraries.

https://cloud.google.com/document-ai/docs/send-request#batch-process
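For reference, the batchProcess request body has roughly this shape. Here it is built as a plain Python dict so the structure is visible; the field names follow the v1 REST reference linked above, and the bucket URIs are placeholders:

```python
# Sketch of the v1 REST request body for `processors/{id}:batchProcess`.
import json

def batch_process_body(input_prefix: str, output_uri: str) -> str:
    body = {
        "inputDocuments": {
            # Every supported file under this prefix is processed.
            "gcsPrefix": {"gcsUriPrefix": input_prefix}
        },
        "documentOutputConfig": {
            # The async results land here as JSON files.
            "gcsOutputConfig": {"gcsUri": output_uri}
        },
    }
    return json.dumps(body)

print(batch_process_body("gs://my-bucket/input/", "gs://my-bucket/output/"))
```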

This codelab also illustrates how to do this in Python with the OCR processor: https://codelabs.developers.google.com/codelabs/docai-ocr-python
