This page describes how to import your catalog information and keep it up to date.
The import procedures on this page apply to both recommendations and search. After you import data, both services are able to use that data, so you don't need to import the same data twice if you use both services.
Import catalog data from BigQuery
This tutorial shows you how to use a BigQuery table to import large amounts of catalog data with no limits.
Import catalog data from Cloud Storage
This tutorial shows you how to import a large number of items to a catalog.
Import catalog data inline
This tutorial shows you how to import products into a catalog inline.
Before you begin
Before you can import your catalog information, you must have completed the instructions in Before you begin, specifically setting up your project, creating a service account, and adding the service account to your local environment.
You must have the Retail Admin IAM role to perform the import.
Catalog import best practices
High-quality data is needed to generate high-quality results. If your data is missing fields or has placeholder values instead of actual values, the quality of your predictions and search results suffers.
When you import catalog data, ensure that you implement the following best practices:
Make sure to think carefully when determining which products or groups of products are primary and which are variants. Before you upload any data, see Product levels.
Changing the product level configuration after you have imported any data requires significant effort.
Primary items are returned as search results or recommendations. Variant items are not.
For example, if the primary SKU group is "V-neck shirt", then the recommendation model returns one V-neck shirt item, and, perhaps, a crew-neck shirt and a scoop-neck shirt. However, if variants are not used and each SKU is a primary, then every color/size combination of V-neck shirt is returned as a distinct item on the recommendation panel: "Brown V-neck shirt, size XL", "Brown V-neck shirt, size L", through to "White V-neck shirt, size M", "White V-neck shirt, size S".
Observe the product item import limits.
For bulk import from Cloud Storage, the size of each file must be 2 GB or smaller. You can include up to 100 files at a time in a single bulk import request.
For inline import, import no more than 5,000 product items at a time.
Make sure that all required catalog information is included and correct.
Do not use placeholder values.
Include as much optional catalog information as possible.
Make sure your events all use a single currency, especially if you plan to use Google Cloud console to get revenue metrics. The Vertex AI Search for retail API does not support using multiple currencies per catalog.
Keep your catalog up to date.
Ideally, you should update your catalog daily. Scheduling periodic catalog imports prevents model quality from going down over time. You can schedule automatic, recurring imports when you import your catalog using the Search for Retail console. Alternatively, you can use Google Cloud Scheduler to automate imports.
Do not record user events for product items that have not been imported yet.
After importing catalog information, review the error reporting and logging information for your project.
A few errors are expected, but if you have a large number of errors, you should review them and fix any process issues that led to the errors.
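The checks for required fields and placeholder values can be sketched as a small pre-import filter. This is only an illustration: the list of required fields (`id`, `title`, `categories`) and the set of placeholder strings are assumptions made for the example, so adjust them to your own schema and data.

```python
# Hypothetical pre-import check. REQUIRED_FIELDS and PLACEHOLDERS are
# assumptions for illustration, not values taken from this page.
REQUIRED_FIELDS = ("id", "title", "categories")
PLACEHOLDERS = {"", "n/a", "na", "none", "null", "tbd", "unknown"}


def find_problems(product: dict) -> list:
    """Return a list of human-readable problems found in one product item."""
    problems = []
    # A required field is treated as missing if it is absent or an empty list.
    for field in REQUIRED_FIELDS:
        value = product.get(field)
        if value is None or value == []:
            problems.append(f"missing required field: {field}")
    # Any string field holding a known placeholder is flagged.
    for field, value in product.items():
        if isinstance(value, str) and value.strip().lower() in PLACEHOLDERS:
            problems.append(f"placeholder value in field: {field}")
    return problems
```

Running a check like this before each import makes it easier to keep the error count low, because problem items can be fixed or skipped before they reach the API.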
About importing catalog data
You can import your product data from Merchant Center, Cloud Storage, or BigQuery, or specify the data inline in the request. Each of these procedures is a one-time import, with the exception of linking Merchant Center. Schedule regular catalog imports (ideally, daily) to ensure that your catalog is current. See Keep your catalog up to date.
You can also import individual product items. For more information, see Upload a product.
Catalog import considerations
This section describes the methods that can be used for batch importing of your catalog data, when you might use each method, and some of their limitations.
| Import method | Aspect | Details |
|---|---|---|
| Merchant Center Syncing | Description | Imports catalog data through Merchant Center by linking the account with Vertex AI Search for retail. After linking, updates to catalog data in Merchant Center are synced in real time to Vertex AI Search for retail. |
| | When to use | If you have an existing integration with Merchant Center. |
| | Limitations | Limited schema support. For example, product collections are not supported by Merchant Center. Merchant Center becomes the source of truth for data until it is unlinked, so any custom attributes needed must be added to Merchant Center data. Limited control. You cannot specify certain fields or sets of items to import from Merchant Center; all items and fields existing in Merchant Center are imported. |
| BigQuery | Description | Import data from a previously loaded BigQuery table that uses the Vertex AI Search for retail schema or the Merchant Center schema. Can be performed using the Google Cloud console or curl. |
| | When to use | If you have product catalogs with many attributes. BigQuery import uses the Vertex AI Search for retail schema, which has more product attributes than other import options, including key/value custom attributes. If you have large volumes of data. BigQuery import does not have a data limit. If you already use BigQuery. |
| | Limitations | Requires the extra step of creating a BigQuery table that maps to the Vertex AI Search for retail schema. |
| Cloud Storage | Description | Import data in a JSON format from files loaded in a Cloud Storage bucket. Each file must be 2 GB or smaller and up to 100 files at a time can be imported. The import can be done using the Google Cloud console or curl. Uses the Product JSON data format, which allows custom attributes. |
| | When to use | If you need to load a large amount of data in a single step. |
| | Limitations | Not ideal for catalogs with frequent inventory and pricing updates because changes are not reflected immediately. |
| Inline import | Description | Import using a call to the Product.import method. Uses the ProductInlineSource object, which has fewer product catalog attributes than the Vertex AI Search for retail schema, but supports custom attributes. |
| | When to use | If you have flat, non-relational catalog data or a high frequency of quantity or price updates. |
| | Limitations | No more than 100 catalog items can be imported at a time. However, many load steps can be performed; there is no item limit. |
Purge catalog branches
If you are importing new catalog data to an existing branch, it is important that the catalog branch is empty. This ensures the integrity of data imported to the branch. When the branch is empty, you can import new catalog data and then link the branch to a merchant account.
If you are serving live predict or search traffic and plan to purge your default branch, consider first specifying another branch as the default before purging. Because the default branch will serve empty results after being purged, purging a live default branch can cause an outage.
To purge data from a catalog branch, complete the following steps:
Go to the Data page in the Search for Retail console.
Select a catalog branch from the Branch name field.
From the three-dot menu beside the Branch name field, choose Purge branch.
A message is displayed warning you that you are about to delete all data in the branch as well as any attributes created for the branch.
Enter the branch and click Confirm to purge the catalog data from the branch.
A long-running operation is started to purge data from the catalog branch. When the purge operation is complete, the status of the purge is displayed in the Product catalog list in the Activity status window.
Sync Merchant Center to Vertex AI Search for retail
For continuous synchronization between Merchant Center and Vertex AI Search for retail, you can link your Merchant Center account to Vertex AI Search for retail. After linking, the catalog information in your Merchant Center account is immediately imported to Vertex AI Search for retail.
When you set up a Merchant Center sync for Vertex AI Search for retail, you must have the Admin role assigned in Merchant Center. Although a Standard access role will permit you to read the Merchant Center feeds in the UI, when you attempt to sync Merchant Center to Vertex AI Search for retail, you will receive an error message. Therefore, before you can successfully sync your Merchant Center to Vertex AI Search for retail, you will need to upgrade your role accordingly.
While Vertex AI Search for retail is linked to the Merchant Center account, changes to your product data in the Merchant Center account are automatically updated within minutes in Vertex AI Search for retail. If you want to prevent Merchant Center changes from being synced to Vertex AI Search for retail, you can unlink your Merchant Center account.
Unlinking your Merchant Center account does not delete any products in Vertex AI Search for retail. To delete imported products, see Delete product information.
Sync your Merchant Center account
To sync your Merchant Center account, complete the following steps.
Console
- Go to the Data page in the Search for Retail console.
- Click Import to open the Import Data panel.
- Choose Product catalog.
- Select Merchant Center Sync as your data source.
- Select your Merchant Center account. Check User Access if you don't see your account.
- Optional: Select Merchant Center feeds filter to import only offers from selected feeds. If not specified, offers from all feeds are imported (including future feeds).
- Optional: To import only offers targeted to certain countries or languages, expand Show Advanced Options and select Merchant Center countries of sale and languages to filter for.
- Select the branch you will upload your catalog to.
- Click Import.
curl
Check that the service account in your local environment has access to both the Merchant Center account and Vertex AI Search for retail. To check which accounts have access to your Merchant Center account, see User access for Merchant Center.
Use the MerchantCenterAccountLink.create method to establish the link.

curl -X POST \
  -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
  -H "Content-Type: application/json; charset=utf-8" \
  --data '{
    "merchantCenterAccountId": MERCHANT_CENTER_ID,
    "branchId": "BRANCH_ID",
    "feedFilters": [
      {"primaryFeedId": PRIMARY_FEED_ID_1},
      {"primaryFeedId": PRIMARY_FEED_ID_2}
    ],
    "languageCode": "LANGUAGE_CODE",
    "feedLabel": "FEED_LABEL"
  }' \
  "https://retail.googleapis.com/v2alpha/projects/PROJECT_ID/locations/global/catalogs/default_catalog/merchantCenterAccountLinks"

- MERCHANT_CENTER_ID: The ID of the Merchant Center account.
- BRANCH_ID: The ID of the branch to establish the link with. Accepts values '0', '1', or '2'.
- LANGUAGE_CODE: (OPTIONAL) The two-letter language code of the products you want to import, as shown in Merchant Center in the product's Language column. If not set, all languages are imported.
- FEED_LABEL: (OPTIONAL) The feed label of the products you want to import. You can see the feed label in Merchant Center in the product's Feed Label column. If not set, all feed labels are imported.
- FEED_FILTERS: (OPTIONAL) List of primary feeds from which products will be imported. Not selecting feeds means that all Merchant Center account feeds are shared. The IDs can be found in the Content API datafeeds resource or by visiting Merchant Center, selecting a feed, and getting the feed ID from the dataSourceId parameter in the site URL. For example, mc/products/sources/detail?a=MERCHANT_CENTER_ID&dataSourceId=PRIMARY_FEED_ID.
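If you generate the link request from a script, the optional fields are easy to get wrong by hand. The following minimal sketch builds the request body described above; the function name `build_link_request` and its parameters are hypothetical helpers for illustration, and the sketch only constructs the JSON, it does not call the API.

```python
import json


def build_link_request(merchant_center_id, branch_id, feed_ids=(),
                       language_code=None, feed_label=None):
    """Build the JSON body for a Merchant Center account link request.

    Optional fields are omitted when not set, which per the parameter
    descriptions means all feeds, languages, or feed labels are imported.
    """
    if branch_id not in ("0", "1", "2"):
        raise ValueError("branch_id must be '0', '1', or '2'")
    body = {"merchantCenterAccountId": merchant_center_id, "branchId": branch_id}
    if feed_ids:
        body["feedFilters"] = [{"primaryFeedId": feed_id} for feed_id in feed_ids]
    if language_code:
        body["languageCode"] = language_code
    if feed_label:
        body["feedLabel"] = feed_label
    return body


if __name__ == "__main__":
    # Example values are made up; print the body to pass to curl --data.
    print(json.dumps(build_link_request(12345, "0", feed_ids=[111, 222]), indent=2))
```

Serializing with `json.dumps` avoids hand-written JSON mistakes such as missing or trailing commas.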
To view your linked Merchant Center, go to the Search for Retail console Data page and click the Merchant Center button on the top right of the page. This opens the Linked Merchant Center Accounts panel. You can also add additional Merchant Center accounts from this panel.
See View aggregated information about your catalog for instructions on how to view the products that have been imported.
List your Merchant Center account links
List your Merchant Center account links.
Console
Go to the Data page in the Search for Retail console.
Click the Merchant Center button on the top right of the page to open a list of your linked Merchant Center accounts.
curl
Use the MerchantCenterAccountLink.list method to list the links resource.

curl -X GET \
  -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
  -H "Content-Type: application/json; charset=utf-8" \
  "https://retail.googleapis.com/v2alpha/projects/PROJECT_NUMBER/locations/global/catalogs/default_catalog/merchantCenterAccountLinks"
Unlink your Merchant Center account
Unlinking your Merchant Center account stops that account from syncing catalog data to Vertex AI Search for retail. This procedure does not delete any products in Vertex AI Search for retail that have already been uploaded.
Console
Go to the Data page in the Search for Retail console.
Click the Merchant Center button on the top right of the page to open a list of your linked Merchant Center accounts.
Click Unlink next to the Merchant Center account you're unlinking, and confirm your choice in the dialog that appears.
curl
Use the MerchantCenterAccountLink.delete method to remove the MerchantCenterAccountLink resource.

curl -X DELETE \
  -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
  -H "Content-Type: application/json; charset=utf-8" \
  "https://retail.googleapis.com/v2alpha/projects/PROJECT_NUMBER/locations/global/catalogs/default_catalog/merchantCenterAccountLinks/BRANCH_ID_MERCHANT_CENTER_ID"
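The list and delete calls differ only in the final path segment. The sketch below builds both URLs, assuming the link ID is the branch ID and the Merchant Center ID joined by an underscore, as the BRANCH_ID_MERCHANT_CENTER_ID placeholder suggests. The helper name `account_link_url` is hypothetical; confirm the real link ID from the list response before deleting anything.

```python
BASE_URL = "https://retail.googleapis.com/v2alpha"


def account_link_url(project_number, branch_id=None, merchant_center_id=None):
    """Return the collection URL, or a single link's URL when both IDs are given."""
    collection = (
        f"{BASE_URL}/projects/{project_number}/locations/global/"
        "catalogs/default_catalog/merchantCenterAccountLinks"
    )
    if branch_id is None and merchant_center_id is None:
        return collection
    if branch_id is None or merchant_center_id is None:
        raise ValueError("provide both branch_id and merchant_center_id, or neither")
    # Assumption: link ID format is BRANCH_ID_MERCHANT_CENTER_ID.
    return f"{collection}/{branch_id}_{merchant_center_id}"
```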
Limitations on linking to Merchant Center
A Merchant Center account can be linked to any number of catalog branches, but a single catalog branch can only be linked to one Merchant Center account.
A Merchant Center account cannot be a multi-client account (MCA). However, you can link individual sub-accounts.
The first import after linking your Merchant Center account may take hours to finish. The amount of time depends on the number of offers in the Merchant Center account.
Any product modifications using API methods are disabled for branches linked to a Merchant Center account. Any changes to the product catalog data in those branches have to be made using Merchant Center. Those changes are then automatically synced to Vertex AI Search for retail.
The collection product type isn't supported for branches that use Merchant Center linking.
Your Merchant Center account can only be linked to empty catalog branches to ensure data correctness. In order to delete products from a catalog branch, see Delete product information.
Import catalog data from Merchant Center
Merchant Center is a tool you can use to make your store and product data available for Shopping ads and other Google services.
You can bulk import catalog data from Merchant Center as a one-time procedure from BigQuery using the Merchant Center schema (recommendations only).
Bulk import from Merchant Center
You can import catalog data from Merchant Center using the Search for Retail console or the products.import method. Bulk importing is a one-time procedure, and is only supported for recommendations.
To import your catalog from Merchant Center, complete the following steps:
Using the instructions in Merchant Center transfers, set up a transfer from Merchant Center into BigQuery.
You'll use the Google Merchant Center products table schema. Configure your transfer to repeat daily, but set your dataset expiration time to 2 days.
If your BigQuery dataset is in another project, configure the required permissions so that Vertex AI Search for retail can access the BigQuery dataset. Learn more.
Import your catalog data from BigQuery into Vertex AI Search for retail.
Console
Go to the Data page in the Search for Retail console.
Click Import to open the Import panel.
Choose Product catalog.
Select BigQuery as your data source.
Select the branch you will upload your catalog to.
Select Merchant Center as the data schema.
Enter the BigQuery table where your data is located.
Optional: Enter the location of a Cloud Storage bucket in your project as a temporary location for your data.
If not specified, a default location is used. If specified, the BigQuery and Cloud Storage bucket have to be in the same region.
Choose whether to schedule a recurring upload of your catalog data.
If this is the first time you are importing your catalog, or you are re-importing the catalog after purging it, select the product levels. Learn more about product levels.
Changing the product level configuration after you have imported any data requires significant effort.
Click Import.
curl
If this is the first time you are uploading your catalog, or you are re-importing the catalog after purging it, set your product levels by using the Catalog.patch method. This operation requires the Retail Admin role. Learn more about product levels.
- ingestionProductType: Supports the values primary (default) and variant.
- merchantCenterProductIdField: Supports the values offerId (default) and itemGroupId. If you don't use Merchant Center, set to the default value offerId.

curl -X PATCH \
  -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
  -H "Content-Type: application/json; charset=utf-8" \
  --data '{
    "productLevelConfig": {
      "ingestionProductType": "PRODUCT_TYPE",
      "merchantCenterProductIdField": "PRODUCT_ID_FIELD"
    }
  }' \
  "https://retail.googleapis.com/v2/projects/PROJECT_ID/locations/global/catalogs/default_catalog"
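Because changing product levels after an import is costly, it can help to validate the two values before sending the PATCH request. The following is a minimal sketch that builds the productLevelConfig body and rejects values other than those listed above; the function name `build_product_level_config` is a hypothetical helper, not part of any client library.

```python
ALLOWED_PRODUCT_TYPES = ("primary", "variant")
ALLOWED_ID_FIELDS = ("offerId", "itemGroupId")


def build_product_level_config(ingestion_product_type="primary",
                               merchant_center_product_id_field="offerId"):
    """Build the Catalog.patch request body, rejecting unsupported values."""
    if ingestion_product_type not in ALLOWED_PRODUCT_TYPES:
        raise ValueError(f"ingestionProductType must be one of {ALLOWED_PRODUCT_TYPES}")
    if merchant_center_product_id_field not in ALLOWED_ID_FIELDS:
        raise ValueError(f"merchantCenterProductIdField must be one of {ALLOWED_ID_FIELDS}")
    return {
        "productLevelConfig": {
            "ingestionProductType": ingestion_product_type,
            "merchantCenterProductIdField": merchant_center_product_id_field,
        }
    }
```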
Import your catalog using the Products.import method.
- DATASET_ID: The ID of the BigQuery dataset.
- TABLE_ID: The ID of the BigQuery table holding your data.
- STAGING_DIRECTORY: Optional. A Cloud Storage directory that is used as an interim location for your data before it is imported into BigQuery. Leave this field empty to automatically create a temporary directory (recommended).
- ERROR_DIRECTORY: Optional. A Cloud Storage directory for error information about the import. Leave this field empty to automatically create a temporary directory (recommended).
- dataSchema: For the dataSchema property, use value product_merchant_center. See the Merchant Center products table schema.
We recommend that you don't specify staging or error directories. That way, a Cloud Storage bucket with new staging and error directories can be automatically created. These directories are created in the same region as the BigQuery dataset, and are unique to each import (which prevents multiple import jobs from staging data to the same directory, and potentially re-importing the same data). After three days, the bucket and directories are automatically deleted to reduce storage costs.
An automatically created bucket name includes the project ID, bucket region, and data schema name, separated by underscores (for example, 4321_us_catalog_retail). The automatically created directories are called staging or errors, followed by a number (for example, staging2345 or errors5678).
If you specify directories, the Cloud Storage bucket must be in the same region as the BigQuery dataset, or the import will fail. Provide the staging and error directories in the format gs://<bucket>/<folder>/; they should be different.

curl -X POST \
  -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
  -H "Content-Type: application/json; charset=utf-8" \
  --data '{
    "inputConfig": {
      "bigQuerySource": {
        "datasetId": "DATASET_ID",
        "tableId": "TABLE_ID",
        "dataSchema": "product_merchant_center"
      }
    }
  }' \
  "https://retail.googleapis.com/v2/projects/PROJECT_NUMBER/locations/global/catalogs/default_catalog/branches/0/products:import"
Import catalog data from BigQuery
To import catalog data in the correct format from BigQuery, use the Vertex AI Search for retail schema to create a BigQuery table with the correct format and load the empty table with your catalog data. Then, upload your data to Vertex AI Search for retail.
For more help with BigQuery tables, see Introduction to tables. For help with BigQuery queries, see Overview of querying BigQuery data.
To import your catalog:
If your BigQuery dataset is in another project, configure the required permissions so that Vertex AI Search for retail can access the BigQuery dataset. Learn more.
Import your catalog data to Vertex AI Search for retail.
Console
- Go to the Data page in the Search for Retail console.
- Click Import to open the Import Data panel.
- Choose Product catalog.
- Select BigQuery as your data source.
- Select the branch you will upload your catalog to.
- Choose Retail Product Catalogs Schema. This is the Product schema for Vertex AI Search for retail.
- Enter the BigQuery table where your data is located.
- Optional: Under Show advanced options, enter the location of a Cloud Storage bucket in your project as a temporary location for your data. If not specified, a default location is used. If specified, the BigQuery and Cloud Storage bucket have to be in the same region.
- If you do not have search enabled and you are using the Merchant Center schema, select the product level. You must select the product level if this is the first time you are importing your catalog or you are re-importing the catalog after purging it. Learn more about product levels. Changing product levels after you have imported any data requires a significant effort.
  Important: You can't turn on search for projects with a product catalog that has been ingested as variants.
- Click Import.
curl
If this is the first time you are uploading your catalog, or you are re-importing the catalog after purging it, set your product levels by using the Catalog.patch method. This operation requires the Retail Admin role.
- ingestionProductType: Supports the values primary (default) and variant.
- merchantCenterProductIdField: Supports the values offerId and itemGroupId. If you don't use Merchant Center, you don't need to set this field.

curl -X PATCH \
  -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
  -H "Content-Type: application/json; charset=utf-8" \
  --data '{
    "productLevelConfig": {
      "ingestionProductType": "PRODUCT_TYPE",
      "merchantCenterProductIdField": "PRODUCT_ID_FIELD"
    }
  }' \
  "https://retail.googleapis.com/v2/projects/PROJECT_ID/locations/global/catalogs/default_catalog"
Create a data file for the input parameters for the import.
Use the BigQuerySource object to point to your BigQuery dataset.
- DATASET_ID: The ID of the BigQuery dataset.
- TABLE_ID: The ID of the BigQuery table holding your data.
- PROJECT_ID: The project ID that the BigQuery source is in. If not specified, the project ID is inherited from the parent request.
- STAGING_DIRECTORY: Optional. A Cloud Storage directory that is used as an interim location for your data before it is imported into BigQuery. Leave this field empty to automatically create a temporary directory (recommended).
- ERROR_DIRECTORY: Optional. A Cloud Storage directory for error information about the import. Leave this field empty to automatically create a temporary directory (recommended).
- dataSchema: For the dataSchema property, use value product (default). You'll use the Vertex AI Search for retail schema.
We recommend that you don't specify staging or error directories. That way, a Cloud Storage bucket with new staging and error directories can be automatically created. These directories are created in the same region as the BigQuery dataset, and are unique to each import (which prevents multiple import jobs from staging data to the same directory, and potentially re-importing the same data). After three days, the bucket and directories are automatically deleted to reduce storage costs.
An automatically created bucket name includes the project ID, bucket region, and data schema name, separated by underscores (for example, 4321_us_catalog_retail). The automatically created directories are called staging or errors, followed by a number (for example, staging2345 or errors5678).
If you specify directories, the Cloud Storage bucket must be in the same region as the BigQuery dataset, or the import will fail. Provide the staging and error directories in the format gs://<bucket>/<folder>/; they should be different.

{
  "inputConfig": {
    "bigQuerySource": {
      "projectId": "PROJECT_ID",
      "datasetId": "DATASET_ID",
      "tableId": "TABLE_ID",
      "dataSchema": "product"
    }
  }
}
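The rules for the data file (supported schema values, optional directories in gs://<bucket>/<folder>/ form, staging and error directories that differ) can be checked when the file is generated. The sketch below is one way to do that. The helper name `build_bigquery_import` is hypothetical, and the `gcsStagingDir` field name for the staging directory is an assumption that is not shown on this page, so verify it against the BigQuerySource reference before relying on it.

```python
import json

SUPPORTED_SCHEMAS = ("product", "product_merchant_center")


def _check_dir(path):
    """Require the gs://<bucket>/<folder>/ shape described on this page."""
    if not (path.startswith("gs://") and path.endswith("/")):
        raise ValueError(f"directory must look like gs://<bucket>/<folder>/: {path}")


def build_bigquery_import(dataset_id, table_id, data_schema="product",
                          project_id=None, staging_dir=None, error_dir=None):
    """Build the products:import request body for a BigQuery source."""
    if data_schema not in SUPPORTED_SCHEMAS:
        raise ValueError(f"dataSchema must be one of {SUPPORTED_SCHEMAS}")
    for path in (staging_dir, error_dir):
        if path is not None:
            _check_dir(path)
    if staging_dir is not None and staging_dir == error_dir:
        raise ValueError("staging and error directories should be different")

    source = {"datasetId": dataset_id, "tableId": table_id, "dataSchema": data_schema}
    if project_id:
        source["projectId"] = project_id
    if staging_dir:
        source["gcsStagingDir"] = staging_dir  # assumed field name, see lead-in
    body = {"inputConfig": {"bigQuerySource": source}}
    if error_dir:
        body["errorsConfig"] = {"gcsPrefix": error_dir}
    return body


if __name__ == "__main__":
    # Write the data file that the curl command reads with -d @./input.json.
    with open("input.json", "w") as handle:
        json.dump(build_bigquery_import("DATASET_ID", "TABLE_ID"), handle, indent=2)
```

Leaving both directories unset matches the recommendation above, so the defaults produce the smallest valid body.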
Import your catalog information by making a POST request to the Products:import REST method, providing the name of the data file (here, shown as input.json).

curl -X POST \
  -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
  -H "Content-Type: application/json; charset=utf-8" -d @./input.json \
  "https://retail.googleapis.com/v2/projects/PROJECT_NUMBER/locations/global/catalogs/default_catalog/branches/0/products:import"
You can check the status programmatically using the API. You should receive a response object that looks something like this:
{ "name": "projects/PROJECT_ID/locations/global/catalogs/default_catalog/operations/import-products-123456", "done": false }
The name field is the ID of the operation object. To check the status of the operation, request this object, replacing the operation name in the URL with the value returned by the import method, and repeat the request until the done field returns true:

curl -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
  "https://retail.googleapis.com/v2/projects/PROJECT_ID/locations/global/catalogs/default_catalog/operations/import-products-123456"
When the operation completes, the returned object has a done value of true, and includes a Status object similar to the following example:

{
  "name": "projects/PROJECT_ID/locations/global/catalogs/default_catalog/operations/import-products-123456",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.retail.v2.ImportMetadata",
    "createTime": "2020-01-01T03:33:33.000001Z",
    "updateTime": "2020-01-01T03:34:33.000001Z",
    "successCount": "2",
    "failureCount": "1"
  },
  "done": true,
  "response": {
    "@type": "type.googleapis.com/google.cloud.retail.v2.ImportProductsResponse"
  },
  "errorsConfig": {
    "gcsPrefix": "gs://error-bucket/error-directory"
  }
}
You can inspect the files in the error directory in Cloud Storage to see if errors occurred during the import.
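The repeated status request can be sketched as a small polling loop. To keep the example runnable without credentials, the HTTP call is injected as a `fetch` function that takes the operation name and returns the parsed JSON; that function, the helper name `wait_for_operation`, and the interval and retry values are all assumptions for illustration.

```python
import time


def wait_for_operation(fetch, name, interval_s=5.0, max_polls=60, sleep=time.sleep):
    """Poll an operation until its done field is true, then return it.

    fetch: callable taking the operation name and returning the operation
           as a dict (for example, a wrapper around an authenticated GET).
    """
    for _ in range(max_polls):
        operation = fetch(name)
        if operation.get("done"):
            return operation
        sleep(interval_s)  # wait before asking again
    raise TimeoutError(f"operation {name} not done after {max_polls} polls")
```

Capping the number of polls keeps a stuck import from blocking a scheduled job indefinitely.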
Set up access to your BigQuery dataset
To set up access when your BigQuery dataset is in a different project than your Vertex AI Search for retail service, complete the following steps.
Open the IAM page in the Google Cloud console.
Select your Vertex AI Search for retail project.
Find the service account with the name Retail Service Account.
If you have not previously initiated an import operation, this service account might not be listed. If you do not see this service account, return to the import task and initiate the import. When it fails due to permission errors, return here and complete this task.
Copy the identifier for the service account, which looks like an email address (for example, [email protected]).
Switch to your BigQuery project (on the same IAM & Admin page) and click Grant Access.
For New principals, enter the identifier for the Vertex AI Search for retail service account and select the BigQuery > BigQuery User role.
Click Add another role and select BigQuery > BigQuery Data Editor.
If you do not want to provide the Data Editor role to the entire project, you can add this role directly to the dataset. Learn more.
Click Save.
Import catalog data from Cloud Storage
To import catalog data in JSON format, you create one or more JSON files that contain the catalog data you want to import, and upload them to Cloud Storage. From there, you can import the data to Vertex AI Search for retail.
For an example of the JSON product item format, see Product item JSON data format.
For help with uploading files to Cloud Storage, see Upload objects.
Make sure the Vertex AI Search for retail service account has permission to read and write to the bucket.
The Vertex AI Search for retail service account is listed on the IAM page in the Google Cloud console with the name Retail Service Account. Use the service account's identifier, which looks like an email address (for example, [email protected]), when adding the account to your bucket permissions.
Import your catalog data.
Console
- Go to the Data page in the Search for Retail console.
- Click Import to open the Import Data panel.
- Choose Product catalog as your data source.
- Select the branch you will upload your catalog to.
- Choose Retail Product Catalogs Schema as the schema.
- Enter the Cloud Storage location of your data.
- If you do not have search enabled, select the product levels. You must select the product levels if this is the first time you are importing your catalog or you are re-importing the catalog after purging it. Learn more about product levels. Changing product levels after you have imported any data requires a significant effort.
  Important: You can't turn on search for projects with a product catalog that has been ingested as variants.
- Click Import.
curl
If this is the first time you are uploading your catalog, or you are re-importing the catalog after purging it, set your product levels by using the Catalog.patch method. Learn more about product levels.
- ingestionProductType: Supports the values primary (default) and variant.
- merchantCenterProductIdField: Supports the values offerId and itemGroupId. If you don't use Merchant Center, you don't need to set this field.

curl -X PATCH \
  -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
  -H "Content-Type: application/json; charset=utf-8" \
  --data '{
    "productLevelConfig": {
      "ingestionProductType": "PRODUCT_TYPE",
      "merchantCenterProductIdField": "PRODUCT_ID_FIELD"
    }
  }' \
  "https://retail.googleapis.com/v2/projects/PROJECT_ID/locations/global/catalogs/default_catalog"
Create a data file for the input parameters for the import. Use the GcsSource object to point to your Cloud Storage bucket.
You can provide multiple files, or just one; this example uses two files.
- INPUT_FILE: A file or files in Cloud Storage containing your catalog data.
- ERROR_DIRECTORY: A Cloud Storage directory for error information about the import.
The input file fields must be in the format gs://<bucket>/<path-to-file>/. The error directory must be in the format gs://<bucket>/<folder>/. If the error directory does not exist, it gets created. The bucket must already exist.

{
  "inputConfig": {
    "gcsSource": {
      "inputUris": ["INPUT_FILE_1", "INPUT_FILE_2"]
    }
  },
  "errorsConfig": {"gcsPrefix": "ERROR_DIRECTORY"}
}
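When the file list is produced by a script, the 100-files-per-request limit mentioned earlier on this page is worth enforcing before the request is sent. The following minimal sketch builds the same data file structure; the helper name `build_gcs_import` is hypothetical, and the sketch does not check file sizes or whether the bucket exists.

```python
MAX_FILES_PER_REQUEST = 100  # bulk import limit stated on this page


def build_gcs_import(input_uris, error_dir=None):
    """Build the products:import request body for a Cloud Storage source."""
    uris = list(input_uris)
    if not uris:
        raise ValueError("at least one input file is required")
    if len(uris) > MAX_FILES_PER_REQUEST:
        raise ValueError(f"at most {MAX_FILES_PER_REQUEST} files per import request")
    for uri in uris:
        if not uri.startswith("gs://"):
            raise ValueError(f"input file must be a gs:// URI: {uri}")
    body = {"inputConfig": {"gcsSource": {"inputUris": uris}}}
    if error_dir is not None:
        body["errorsConfig"] = {"gcsPrefix": error_dir}
    return body
```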
Import your catalog information by making a POST request to the Products:import REST method, providing the name of the data file (here, shown as input.json).

curl -X POST \
  -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
  -H "Content-Type: application/json; charset=utf-8" -d @./input.json \
  "https://retail.googleapis.com/v2/projects/PROJECT_NUMBER/locations/global/catalogs/default_catalog/branches/0/products:import"
The easiest way to check the status of your import operation is to use the Google Cloud console. For more information, see See status for a specific integration operation.
You can also check the status programmatically using the API. You should receive a response object that looks something like this:
{
  "name": "projects/PROJECT_ID/locations/global/catalogs/default_catalog/operations/import-products-123456",
  "done": false
}
The name field is the ID of the operation object. Request the status of this object, replacing OPERATION_NAME with the final segment of the name value returned by the import method (import-products-123456 in the preceding example), until the done field returns true:
curl -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
  "https://retail.googleapis.com/v2/projects/PROJECT_ID/locations/global/catalogs/default_catalog/operations/OPERATION_NAME"
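Repeating the status request by hand gets tedious for long imports. The following Python sketch shows one way to poll until done is true. It is an illustration under stated assumptions, not an official client: wait_for_operation and make_fetcher are hypothetical names, the access token is supplied by the caller, and the polling interval and limit are arbitrary defaults.

```python
import json
import time
import urllib.request


def make_fetcher(operation_url, access_token):
    """Return a callable that GETs the operation and decodes the JSON reply."""
    def fetch():
        request = urllib.request.Request(
            operation_url, headers={"Authorization": f"Bearer {access_token}"}
        )
        with urllib.request.urlopen(request) as response:
            return json.load(response)
    return fetch


def wait_for_operation(fetch_operation, interval_seconds=10.0, max_polls=60):
    """Call fetch_operation until the result reports done, then return it."""
    for attempt in range(max_polls):
        operation = fetch_operation()
        # Treat a missing done field the same as done: false.
        if operation.get("done", False):
            return operation
        # Sleep between polls, but not after the final attempt.
        if attempt < max_polls - 1:
            time.sleep(interval_seconds)
    raise TimeoutError(f"operation not done after {max_polls} polls")
```

Because the fetch step is passed in as a callable, the loop can be exercised with canned responses before it is pointed at the real operation URL.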
When the operation completes, the returned object has a done value of true, and includes a Status object similar to the following example:
{
  "name": "projects/PROJECT_ID/locations/global/catalogs/default_catalog/operations/import-products-123456",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.retail.v2.ImportMetadata",
    "createTime": "2020-01-01T03:33:33.000001Z",
    "updateTime": "2020-01-01T03:34:33.000001Z",
    "successCount": "2",
    "failureCount": "1"
  },
  "done": true,
  "response": {
    "@type": "type.googleapis.com/google.cloud.retail.v2.ImportProductsResponse"
  },
  "errorsConfig": {
    "gcsPrefix": "gs://error-bucket/error-directory"
  }
}
You can inspect the files in the error directory in Cloud Storage to see what kind of errors occurred during the import.
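To decide quickly whether the error directory needs a closer look, you can summarize the completed operation. The following Python sketch reads the counts from the metadata shown in the example response and computes a failure rate. The function name and the returned keys are hypothetical, and looking for errorsConfig inside response as a fallback is an assumption.

```python
def summarize_import(operation):
    """Return counts, failure rate, and error location for a finished import."""
    if not operation.get("done", False):
        raise ValueError("operation is still running")
    metadata = operation.get("metadata", {})
    # The counts arrive as strings, so convert them before doing arithmetic.
    succeeded = int(metadata.get("successCount", 0))
    failed = int(metadata.get("failureCount", 0))
    total = succeeded + failed
    # The example response shows errorsConfig at the top level; fall back to
    # the nested response object in case it appears there (an assumption).
    errors_config = operation.get("errorsConfig") or operation.get(
        "response", {}
    ).get("errorsConfig", {})
    return {
        "succeeded": succeeded,
        "failed": failed,
        "failure_rate": failed / total if total else 0.0,
        "error_prefix": errors_config.get("gcsPrefix"),
    }
```

For the example response, this yields two successes, one failure, and the gs://error-bucket/error-directory prefix to inspect.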
- Go to the Data page in the Search for Retail console.
Import catalog data inline
curl
You import your catalog information inline by making a POST request to the Products:import REST method, using the productInlineSource object to specify your catalog data.
Provide an entire product on a single line. Each product should be on its own line.
For an example of the JSON product item format, see Product item JSON data format.
Create the JSON file for your product and call it ./data.json:
{
  "inputConfig": {
    "productInlineSource": {
      "products": [
        { PRODUCT_1 },
        { PRODUCT_2 }
      ]
    }
  }
}
Call the POST method:
curl -X POST \
  -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
  -H "Content-Type: application/json; charset=utf-8" \
  --data @./data.json \
  "https://retail.googleapis.com/v2/projects/PROJECT_NUMBER/locations/global/catalogs/default_catalog/branches/0/products:import"
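When products come from another system, the inline request body can be generated instead of written by hand. The following Python sketch wraps a list of product dictionaries in the productInlineSource structure and checks for the minimum required fields listed under Product item JSON data format. The helper name is hypothetical, and the check covers field presence only, not the full schema.

```python
import json

# Minimum required fields, as listed under "Product item JSON data format".
REQUIRED_FIELDS = ("id", "categories", "title")


def build_inline_import_request(products):
    """Wrap product dicts in an inline import body (hypothetical helper)."""
    products = list(products)
    for index, product in enumerate(products):
        # Flag fields that are absent or empty before sending anything.
        missing = [field for field in REQUIRED_FIELDS if not product.get(field)]
        if missing:
            raise ValueError(f"product at index {index} is missing {missing}")
    return {"inputConfig": {"productInlineSource": {"products": products}}}


if __name__ == "__main__":
    body = build_inline_import_request(
        [
            {
                "id": "1234",
                "categories": "Apparel & Accessories > Shoes",
                "title": "ABC sneakers",
            }
        ]
    )
    print(json.dumps(body, indent=2))
```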
Product item JSON data format
The Product entries in your JSON file should look like the following examples.
Provide an entire product on a single line. Each product should be on its own line.
Minimum required fields:
{
  "id": "1234",
  "categories": "Apparel & Accessories > Shoes",
  "title": "ABC sneakers"
}
{
  "id": "5839",
  "categories": "casual attire > t-shirts",
  "title": "Crew t-shirt"
}
Complete object:
{
  "name": "projects/PROJECT_NUMBER/locations/global/catalogs/default_catalog/branches/0/products/1234",
  "id": "1234",
  "categories": "Apparel & Accessories > Shoes",
  "title": "ABC sneakers",
  "description": "Sneakers for the rest of us",
  "attributes": { "vendor": {"text": ["vendor123", "vendor456"]} },
  "language_code": "en",
  "tags": [ "black-friday" ],
  "priceInfo": {
    "currencyCode": "USD", "price": 100, "originalPrice": 200, "cost": 50
  },
  "availableTime": "2020-01-01T03:33:33.000001Z",
  "availableQuantity": "1",
  "uri": "http://example.com",
  "images": [
    {"uri": "http://example.com/img1", "height": 320, "width": 320 }
  ]
}
{
  "name": "projects/PROJECT_NUMBER/locations/global/catalogs/default_catalog/branches/0/products/4567",
  "id": "4567",
  "categories": "casual attire > t-shirts",
  "title": "Crew t-shirt",
  "description": "A casual shirt for a casual day",
  "attributes": { "vendor": {"text": ["vendor789", "vendor321"]} },
  "language_code": "en",
  "tags": [ "black-friday" ],
  "priceInfo": {
    "currencyCode": "USD", "price": 50, "originalPrice": 60, "cost": 40
  },
  "availableTime": "2020-02-01T04:44:44.000001Z",
  "availableQuantity": "2",
  "uri": "http://example.com",
  "images": [
    {"uri": "http://example.com/img2", "height": 320, "width": 320 }
  ]
}
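The examples above are spread over several lines for readability, whereas the guidance is to provide each product on a single line. The following Python sketch shows one way to serialize product dictionaries so that each one occupies exactly one line. The function name to_json_lines is hypothetical.

```python
import json


def to_json_lines(products):
    """Serialize each product as compact JSON on its own line."""
    # json.dumps without an indent argument never emits a raw line break;
    # line breaks inside string values are written as the escape \n.
    return "".join(json.dumps(product) + "\n" for product in products)


if __name__ == "__main__":
    sample = [
        {"id": "1234", "categories": "Apparel & Accessories > Shoes", "title": "ABC sneakers"},
        {"id": "5839", "categories": "casual attire > t-shirts", "title": "Crew t-shirt"},
    ]
    print(to_json_lines(sample), end="")
```

Each output line parses back into the original product, so the file can be verified line by line before it is uploaded.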
Historical catalog data
Vertex AI Search for retail supports importing and managing historical catalog data. Historical catalog data can be helpful when you use historical user events for model training. Past product information can be used to enrich historical user event data and improve model accuracy.
Historical products are stored as expired products. They are not returned in search responses, but are visible to the Update, List, and Delete API calls.
Import historical catalog data
When a product's expireTime field is set to a past timestamp, the product is considered a historical product. Set the product availability to OUT_OF_STOCK to avoid impacting recommendations.
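The two settings described here, a past expireTime and an availability of OUT_OF_STOCK, can be applied to existing product dictionaries before import. The following Python sketch is one way to do that. The helper names are hypothetical, and the timestamp parser is deliberately small: it assumes UTC timestamps ending in Z with either no fractional seconds or six fractional digits, as in the examples on this page.

```python
from datetime import datetime, timezone


def parse_timestamp(value):
    """Parse a UTC timestamp such as 2021-10-02T15:01:23Z (limited sketch)."""
    # Older Python versions do not accept a trailing Z in fromisoformat.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def is_historical(product, now=None):
    """Return True when the product has an expireTime in the past."""
    expire_time = product.get("expireTime")
    if expire_time is None:
        return False
    now = now or datetime.now(timezone.utc)
    return parse_timestamp(expire_time) < now


def as_historical(product, expire_time):
    """Return a copy of product marked as expired and out of stock."""
    if parse_timestamp(expire_time) >= datetime.now(timezone.utc):
        raise ValueError("expire_time must be a past timestamp")
    historical = dict(product)  # shallow copy; the input is left unchanged
    historical["expireTime"] = expire_time
    historical["availability"] = "OUT_OF_STOCK"
    return historical
```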
We recommend using the following methods for importing historical catalog data:
- Calling the Product.Create method.
- Inline importing expired products.
- Importing expired products from BigQuery or Cloud Storage.
Call the Product.Create method
Use the Product.Create method to create a Product entry with the expireTime field set to a past timestamp.
Inline import expired products
The steps are identical to inline import, except that the products should have the expireTime field set to a past timestamp.
Provide an entire product on a single line. Each product should be on its own line.
An example of the ./data.json used in the inline import request, with each expireTime set to a past timestamp:
{
  "inputConfig": {
    "productInlineSource": {
      "products": [
        {
          "id": "historical_product_001",
          "categories": "Apparel & Accessories > Shoes",
          "title": "ABC sneakers",
          "expireTime": "2021-10-02T15:01:23Z"
        },
        {
          "id": "historical_product_002",
          "categories": "casual attire > t-shirts",
          "title": "Crew t-shirt",
          "expireTime": "2021-10-02T15:01:24Z"
        }
      ]
    }
  }
}
Import expired products from BigQuery or Cloud Storage
Use the same procedures documented for importing catalog data from BigQuery or importing catalog data from Cloud Storage. However, make sure to set the expireTime field to a past timestamp.
Keep your catalog up to date
For best results, your catalog must contain current information. We recommend that you import your catalog on a daily basis to ensure that your catalog is current. You can use Google Cloud Scheduler to schedule imports, or choose an automatic scheduling option when you import data using the Google Cloud console.
You can update only new or changed product items, or you can import the entire catalog. If you import products that are already in your catalog, they are not added again. Any item that has changed is updated.
To update a single item, see Update product information.
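If you keep the product list from your previous import, you can send only the items that are new or changed. The following Python sketch compares two snapshots by product id. The function name is hypothetical, the comparison is a plain dictionary equality check, and products that were removed from the catalog are not handled.

```python
def products_to_import(previous, current):
    """Return products in current that are new or differ from previous."""
    # Index the earlier snapshot by id for constant-time lookups.
    previous_by_id = {product["id"]: product for product in previous}
    return [
        product
        for product in current
        # A missing id yields None, which never equals a product dict.
        if previous_by_id.get(product["id"]) != product
    ]
```

Unchanged items are skipped, which keeps the import request small; importing them anyway is harmless because existing products are not added again.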
Batch update
You can use the import method to batch update your catalog. You do this the same way you do the initial import; follow the steps in Import catalog data.
Monitor import health
To monitor catalog ingestion and health:
View aggregated information about your catalog and preview uploaded products on the Catalog tab of the Search for Retail Data page.
Assess if you need to update catalog data to improve the quality of search results and unlock search performance tiers on the Data quality page.
For more about how to check search data quality and view search performance tiers, see Unlock search performance tiers. For a summary of available catalog metrics on this page, see Catalog quality metrics.
To create alerts that let you know if something goes wrong with your data uploads, follow the procedures in Set up Cloud Monitoring alerts.
Keeping your catalog up to date is important for getting high-quality results. Use alerts to monitor the import error rates and take action if needed.
What's next
- Start recording user events.
- View aggregated information about your catalog.
- Set up data upload alerts.