8,792 questions
0
votes
0
answers
13
views
boto3 redshift client .describe_statement provides no error message
I've run an SQL query through the boto3 Redshift client's execute_statement function, and subsequently checked the status of the query with describe_statement.
result = client_redshift.execute_statement(...
0
votes
0
answers
40
views
Redshift COPY query hangs. (Running but not submitted) (Table size just 30 rows)
I have been successfully running COPY commands on Redshift in the query editor and programmatically through boto3.
However, after yesterday, neither method works.
Although, cluster console shows that query is ...
0
votes
0
answers
23
views
Is there any way to get more detailed syntax error messages in Redshift SQL?
I am re-writing some 1000+ line queries which were written originally for a PostgreSQL DB and now have to run on near-identical data in Redshift (using Redshift Serverless and the QueryEditorV2). This ...
0
votes
0
answers
21
views
Create table from another table, but only the primary column
From a table, how can I create another table without the data and with only the primary key columns?
I am able to get the primary key columns using the following query:
SELECT column_name
FROM ...
1
vote
2
answers
39
views
fill gaps in outer join in redshift
I have a table that contains products and sales data. Unfortunately, not every month a given product was sold, and I would like unsold products in a given month to also be shown. In Oracle, the query ...
0
votes
2
answers
102
views
Unable to create AWS Glue Connection to RedShift Serverless
I am trying to create a connection between AWS Glue and the Redshift database in Glue. Currently, I am getting an error:
Connection creation is failed.
Create connection failed during validating ...
1
vote
1
answer
75
views
Is there a SQL feature to return the max value for every column?
Background: I have a unique setup with multiple tables, each with circa 1,600 columns, which I am trying to clean up (it creates benefits upstream). I know that around 80% of these columns will always ...
0
votes
1
answer
30
views
In GA4, the daily exported events are about to exceed 1 million. What should I do next?
I am using the Daily Export method to export some GA events to BigQuery, but the daily event volume is about to exceed 1 million. If I switch to the Streaming method, I might lose information such as ...
-1
votes
0
answers
48
views
Seeking Advice on Reducing Redshift Snapshot Costs [closed]
I'm decommissioning my Amazon Redshift cluster due to high costs. It was running on 4 ra3.xlplus nodes, and the snapshot size is 225TB. I've already paused the cluster to save on compute costs, but I ...
0
votes
1
answer
27
views
How to Extract Nested JSON Array in Redshift and Create a Table with Specific Columns?
I have a table in AWS Redshift with a column called json, which stores a nested JSON object as a string. I need to create a new table with three columns: sid, skill_name, and skill_vdd. The json ...
0
votes
1
answer
34
views
Amazon QuickSight | Recursive CTE in subquery are not supported
I would like to use a recursive CTE to generate a series of dates. I am using this CTE to cross join onto locations to generate a mapping table as a CTE where each location has a row for each date. ...
0
votes
0
answers
29
views
Updating two-dimensional data in Redshift SQL
Let's say I have two tables, inventory and demand.
Both contain 3 keys: location, matrial, and yr_wk.
inv contains onhand, and demand contains the demand quantity.
INV data : location , matrial , yr_wk, ...
1
vote
0
answers
43
views
Datashare writes are not authorized by producer or associated by consumer
I am trying to query a datashare from AWS Data Exchange using Redshift in Python. This datashare, to be precise.
This is how I am attempting to run my Python code:
import os
import psycopg
os.environ[...
0
votes
1
answer
49
views
How to count from 0 and onward for every new ID?
I constructed the following query and it gives me the following output...
with tb1 as
(
select id
,row_number() over(order by id)-1 as lag_0
from id_store
),
tb2 as
(
select *
...
0
votes
1
answer
50
views
Recursive SQL not returning full scope of data
I have logic to calculate the number of patients and med dosage per period; however, my Redshift recursive SQL is not returning the full scope of data:
WITH recursive build (period_start_date, peq, prev_peq, ...
0
votes
2
answers
74
views
Carrying forward SQL
I have a query that gets the opening and closing balances for each month for customers from their transactions and then gets a company total for the month. But if a customer does not have any transactions for ...
0
votes
0
answers
56
views
How best to dynamically query S3 folders
We have an extremely large S3 bucket, which is divided into folders by date. e.g.
dt=2024-11-19
dt=2024-11-20
dt=2024-11-21
I run queries through Redshift, and was instructed to always filter by the ...
-1
votes
2
answers
30
views
Not able to create a Redshift table using Glue
Getting error: exception: java.sql.SQLException: Exception thrown in awaitResult:
The table flight_details is getting created, but there is only one column, dummy, in it, and the schema defined in create ...
2
votes
1
answer
68
views
Explode values in dbt
I'm trying to replicate the repeat_by and explode functions from Polars using dbt with a Redshift database but am having difficulty finding an equivalent solution.
Here's the sample Polars DataFrame ...
0
votes
0
answers
20
views
Which type of index should I build in this situation to speed up the query on a Hudi table?
I have a Hudi table generated by Spark; the schema was like:
id: int64
content: string
create_date: timestamp[ns]
This table was super large. Most of the queries we perform on this table involve ...
0
votes
1
answer
33
views
Redshift query takes longer in Databricks than in Redshift client
update
I have optimized the query so that the performance is now much better. So, the original question is no longer important or relevant.
Original post:
I have a query that takes about 5 minutes ...
1
vote
1
answer
32
views
Getting an error when creating a Redshift table from Glue
The error output in the log is:
Failed to create table: An error occurred while calling o105.getSink.
: java.lang.RuntimeException: Temporary directory for redshift not specified. Please verify --...
0
votes
1
answer
24
views
Amazon Redshift, Datagrid find the lock and kill
I am trying to find the SQL queries that hold locks, and I used the code below.
But when I tried to cancel the listed queries, it did not work.
Could you please help?
SELECT
*
FROM
stv_locks;
-1
votes
1
answer
67
views
"SQL Error [XX000]: ERROR: Numeric column 2 precision and scales cannot be merged" when coding logic in Redshift
This code gives me the error: "SQL Error [XX000]: ERROR: Numeric column 2 precision and scales cannot be merged"
WITH RECURSIVE build (PERIOD_START_DATE,PEQ,PREV_PEQ,repeated_patient, ...
-1
votes
1
answer
31
views
Copying File to Redshift from S3 Bucket
I am trying to move some data from a CSV in S3 to a Redshift table. The date column is giving me a hard time however.
When opened in a text editor, the date is formatted like this: "YYYY-MM-DD HH:...
-1
votes
2
answers
42
views
Translate Excel formulas into Redshift SQL query
I am looking for help translating Excel formulas into Redshift SQL. With given fixed input values for rows 1 (Month ascending) and 2 (SU - some number) I have to calculate through particular cells ...
1
vote
1
answer
39
views
Handling nested fields in Amazon Redshift using COPY command when loading from DynamoDB source
I am trying to load data into Redshift from a DynamoDB table. The table includes nested fields that do not get fetched correctly. The data in DynamoDB has a structure like
"Status": ...
0
votes
0
answers
28
views
Host name for LocalStack Redshift for Amazon Redshift JDBC driver
I have set up LocalStack on my Windows machine and started it with Redshift and a few other services.
Then I created a cluster and a DB.
>awslocal redshift describe-clusters --cluster-identifier my-...
0
votes
0
answers
33
views
How to transfer data using Zero-ETL to a writable DB in Redshift
I'm using Zero-ETL to move my data from Aurora to Redshift. But this moves the data to a read-only database. How do I then move my data to a full-access database?
I have tried creating a materialized ...
0
votes
0
answers
33
views
How can I get Redshift query execution time
I want to find out the total amount of time the queries are running in Redshift.
Is there any query with which I can get this information?
I tried to get this data using stl_query but because there ...
0
votes
0
answers
20
views
Refresh slowness of materialized view that selects the latest row for each group in a large table
I have a materialized view defined like this:
CREATE materialized VIEW order
AS SELECT *
FROM ( SELECT *, pg_catalog.row_number()
OVER(
PARTITION BY flow_id, order_id
...
0
votes
2
answers
35
views
Redshift query duplicates
I'm using Python with redshift_connector and analysing the data with pandas. When querying a Redshift database and selecting n columns, I got i lines. However, when I wanted to add a new column to this ...
0
votes
1
answer
24
views
The difference between text255 and bytedict Redshift encodings
When selecting a compression encoding for a VARCHAR column in Redshift, two options that present themselves when the column contains a small set of potential string values are text255 and bytedict. ...
0
votes
1
answer
42
views
Best way to capture verbosity in Redshift error
I frequently run into issues with Redshift where I have multiple columns in the same table with the same varchar length. When I try to run an insert statement where a value is greater than the ...
0
votes
1
answer
60
views
Regex to identify all "special" characters
I have a table, sales, with a column, email_address. I want to filter to only include records in the sales table that have any characters that are neither alphanumeric nor special characters. These ...
0
votes
1
answer
91
views
Maintaining Consistency Between Python (e.g. Polars) Functions and SQL (e.g. Redshift) UDFs
I'm working on a data engineering project that processes data from multiple sources using Polars in Python and Redshift as a data warehouse. I need a robust strategy for keeping Python Polars ...
0
votes
1
answer
50
views
Error in creating external table in Amazon Redshift
I created a cluster with Amazon Redshift. Using this connection, I opened the query editor to create an external table:
CREATE EXTERNAL SCHEMA dynamodb_external_table2
FROM data catalog
database 'dev'
...
0
votes
1
answer
43
views
How to select rows that don't have overlapping dates
I have a table that contains dates for different products. Each row on this table represents a ticket/issue and the start and end date for the tickets. Each product can have multiple tickets, and ...
0
votes
1
answer
55
views
How to identify the same date and last month date from prior year based on the most recent date for current year in the table in SQL?
I am trying to summarize the overall profit and transactions results for Name and id grouped by new fields on Amazon Redshift. Data Example is provided below:
Table Data Example
After execution, I ...
0
votes
1
answer
38
views
Convert MD5 to BigInt on Amazon Redshift
I have the following C# code:
using System;
using System.Collections.Generic;
using System.Security.Cryptography;
using System.Text;
public class Program
{
public static void Main()
{
...
0
votes
0
answers
25
views
Dependency mismatch in requirements file in Amazon Managed Workflows for Apache Airflow (MWAA)
I was trying to upgrade Apache Airflow from version 2.2.2 to 2.10.2, but I got an error when installing the packages in the requirement.txt file in dbt-redshift, dbt-postgres, dbt-...
0
votes
1
answer
42
views
How to Resolve 'Invalid JSONPath Format' Error When Importing Nested JSON into Amazon Redshift?
I'm trying to insert a nested JSON list into Amazon Redshift but I'm encountering an error: Invalid JSONPath format: Member is not an object. This may be due to incorrect usage of JSONPaths for ...
0
votes
0
answers
28
views
Is it possible to disable the Query Compilation Cache in Redshift?
Redshift has two caches of interest; Query Result Cache and Query Compilation Cache. For this question, I am only interested in the Query Compilation Cache. For control and testing purposes, I need to ...
0
votes
1
answer
35
views
JSON extract from column in Amazon Redshift
My query:
SELECT
CAST(employee_id AS INT) AS employee_id,
employee_email,
gender,
custom_fields_json
FROM
xxxxxx_employee
WHERE
employee_id IS NOT NULL
ORDER BY
...
0
votes
1
answer
41
views
Redshift Copy Command & parse to JSON
I'm trying to load data from S3 to Redshift using the COPY command.
The table I'm trying to load into has multiple columns, one of which is SUPER. I want to load JSON into that column.
For this ...
0
votes
1
answer
31
views
Redshift admin cannot see svl_user_info table
Even when I connect as the admin superuser, Redshift will not let me access the table:
svl_user_info
so queries like
select * from svl_user_info
do not work. The error message is:
ERROR: permission ...
0
votes
1
answer
50
views
How to return the list and count of unique combinations based on a primary column?
I have the below table structure and want to return a count of the unique combination of relationships based on product. Relationships has 14 unique values
customer ID
relationship
product ID
123
...
-1
votes
1
answer
53
views
How can I differentiate between null value and empty string in query result?
I ran the following SQL statement in SQL Workbench/J for Redshift:
select null, ' ', len(' ')
This is what it returned. How can I differentiate between the null value and the empty string? In ...
0
votes
0
answers
75
views
Redshift error: "permission denied on relation ___". But, I'm not using the table specified in the error message. What could cause this?
Summary
I'm using the Boto3 APIs (get_jobs & get_workflow) to create an AWS Glue resource inventory for myself. The only manipulation performed includes basic data cleansing (flattening the JSON, ...
0
votes
0
answers
36
views
Running Redshift configuration settings within Excel's VBA?
In VBA, I want to run enable_case_sensitive_identifier TO true just before the actual select statement. In SQL Workbench, the script works as expected. This is running against an Amazon Redshift database...