Questions tagged [google-bigquery]
Google BigQuery is a web service that lets you do interactive analysis of massive datasets—analyzing billions of rows in seconds. Scalable and easy to use, BigQuery lets developers and businesses tap into powerful data analytics on demand.
63 questions
0
votes
0
answers
29
views
BigQuery can't connect to MySQL Cloud SQL after upgrade from 5.7 to 8.0
I have a similar problem to this one, but the suggestions there don't seem to work. When trying to execute a query I get this:
Invalid table-valued function EXTERNAL_QUERY Failed to connect to ...
0
votes
0
answers
42
views
How can a data warehouse like BigQuery parallelize queries that Postgres cannot?
In order to get more practice with database management and reporting tools, I have been working to load the recent SSN data leak into Postgres. Since it is such a large dataset, it has served as a ...
0
votes
0
answers
108
views
502 error streaming data to BigQuery
Since 28 February 2024 I have been seeing 502 errors more and more often while streaming data from the Make connector to a BQ table.
It first occurred at a specific time once per day but it became more ...
0
votes
1
answer
29
views
Google Datastream errors on larger MySQL tables
I have set up a Datastream service, in order to replicate data from Cloud SQL (MySQL) to BigQuery.
Everything is set up correctly and the connection works. But the weird thing is that only tables < 10 MB ...
0
votes
0
answers
33
views
SQL: Perform join on multiple tables on Google BigQuery
I am taking a professional course in data analytics and am currently working on my capstone project, but I am having trouble combining tables in BigQuery using SQL. The data is about bicycle trips ...
1
vote
1
answer
186
views
Aggregate SCD Type 2 data "as of" each day
Problem
When working with SCD type 2 data it's easy to see the state of a table "as of" a given point in time by using your date columns, eg: valid_from and valid_to. For example:
select * ...
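A minimal sketch of the "as of" filter described above, written in Python rather than SQL so it can be run standalone. The sample rows, the `as_of` helper, and the convention that `valid_to` is exclusive (with `None` marking the current version) are all assumptions for illustration, not taken from the question.

```python
from datetime import date

# Hypothetical SCD type 2 rows: (key, value, valid_from, valid_to).
# Assumption: valid_to is exclusive and None marks the current version.
ROWS = [
    ("cust-1", "bronze", date(2024, 1, 1), date(2024, 3, 1)),
    ("cust-1", "silver", date(2024, 3, 1), None),
    ("cust-2", "gold", date(2024, 2, 15), None),
]


def as_of(rows, day):
    """Return the row versions that were valid on the given day."""
    return [
        row for row in rows
        if row[2] <= day and (row[3] is None or day < row[3])
    ]


# State of the table as of 20 February 2024.
snapshot = as_of(ROWS, date(2024, 2, 20))
print(snapshot)
```

Aggregating "as of" each day then amounts to applying the same filter once per calendar day and summarising each snapshot.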
0
votes
1
answer
45
views
Does the BigQuery API offer a way to retrieve info on scheduled queries?
Using the BigQuery C# API, I can retrieve a list of job IDs:
BigQueryClient _client = BigQueryClient.Create(...);
...
foreach (var page in _client.ListJobs(projectId).AsRawResponses())
if (page....
1
vote
1
answer
1k
views
Using COUNT returns the same value when counting two different columns
I am new to SQL and am using it for a school project and would like some help.
I have a table with multiple columns about a fictional company that provides a bike sharing service. One of the rows I am ...
0
votes
0
answers
236
views
Optimising a recursive SQL query that processes several million records in BQ
I need help optimizing a recursive SQL query in BQ.
I have hierarchical data stored in a table as parent-child relationships, i.e.
will be stored as
parent_item_id child_item_id
1 1
1 2
1 3
3 4
...
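To make the parent-child layout concrete, here is a small Python sketch that walks the pairs listed above breadth-first. It only illustrates the data structure and is not the recursive query from the question; the `descendants` helper is a hypothetical name, and the `seen` set is there because the sample contains a self-referencing row (1, 1).

```python
from collections import defaultdict, deque

# (parent_item_id, child_item_id) pairs as listed in the excerpt.
PAIRS = [(1, 1), (1, 2), (1, 3), (3, 4)]


def descendants(pairs, root):
    """Return every item reachable from root, in breadth-first order."""
    children = defaultdict(list)
    for parent, child in pairs:
        children[parent].append(child)

    seen = {root}  # guards against self-references and cycles
    queue = deque([root])
    found = []
    while queue:
        node = queue.popleft()
        for child in children.get(node, []):
            if child not in seen:
                seen.add(child)
                found.append(child)
                queue.append(child)
    return found


print(descendants(PAIRS, 1))
```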
0
votes
1
answer
47
views
How to order strings in BigQuery to have lowercased characters ordered before uppercased characters?
I'm updating a reporting tool implementation so that it queries BigQuery instead of PostgreSQL. The PostgreSQL data is ingested in BigQuery. One requirement is for the report results to stay exactly ...
0
votes
1
answer
164
views
Does truncating table refresh materialized view on BigQuery?
I just need to make sure before truncating a table: will all the data in the materialized views that depend on that table be wiped too?
Also, if a materialized view is recreated (dropped and created again) after inserts ...
0
votes
2
answers
2k
views
How to Select The Lowest Date in one field that's higher than the date in another field
The below aim isn't actually what my query is for, but I'm using it as an analogy to explain more simply what I am trying to achieve:
I am trying to build a BigQuery Script which looks at all arcade ...
0
votes
1
answer
133
views
Airflow to BigQuery data load taking forever
I'm currently working as a junior data engineer. My main job right now is to move data from a MySQL database (which gets updated every few minutes via webhooks) and send it to BigQuery as frequently as ...
0
votes
0
answers
82
views
Generate unique ids in BigQuery data import
I am using Zapier to import WooCommerce orders into BigQuery and build a data warehouse with paid orders from multiple sites. However, each site generates its own IDs, which are not globally unique ...
0
votes
1
answer
196
views
Converting EAVT table into SCD type 2
After a lot of research and head picking, I'm still unable to find a good, clean solution to convert an entity-attribute-value-timestamp table to an SCD type 2 dimension.
Here's the issue:
I have a CRM ...
1
vote
0
answers
543
views
Convert any string to url valid percent encoding in BigQuery
I am trying to convert any string with any set of special characters into a valid url of the format below.
In BigQuery
Example:
/artwork-v2/-̴̕ι-̶͔͛n̴e̷p̸u̴̒n̵uś̵̥o̵̙̾rt̷͗um̶̹͐-20380
encodes to:
/...
0
votes
1
answer
2k
views
Calculate Year over Year and Month over Month
I have a table in BigQuery which keeps track of the spend amount for every quarter starting June 2019. I need to calculate the year-over-year and month-over-month percent change. I've mentioned the appropriate ...
0
votes
2
answers
2k
views
UNNEST operator in BigQuery: strange behaviour when aggregating results
I'm trying to understand how the UNNEST operator works on a public dataset from Google which stores CrUX data (Chrome UX Report).
There is a page where some examples are provided.
I could understand ...
-1
votes
1
answer
4k
views
How to return missing rows from LEFT JOIN in BigQuery
How can we get BigQuery to return the rows in a LEFT JOIN which exist in TABLE A but are NULL in TABLE B?
-- find missing users. return rows which exist in A but not in B
select a.user_id
...
0
votes
1
answer
804
views
How can I group multiple records as a single .csv string line?
I have a relation where a User has multiple dogs (as many as 15) but each dog is contained in a single row in the table, and they all have a userId in common.
For example, Table `dogs`:
| User | ...
1
vote
0
answers
27
views
What is the appropriate SQL statement/s to produce a stacked bar chart for multiple random variable distributions?
The Objective
Graph the difference between the probability distributions of sets of random variables that come from a number of different sources using a stacked bar chart (or another type of graph ...
0
votes
1
answer
22
views
Select N preceding rows when column match is true
I am attempting to select 10 rows preceding any row where the keyPerformanceIndicator column is TRUE (order by descending date/time). I suspect this is somehow achieved through a window function in ...
0
votes
1
answer
666
views
Monthly Growth - BigQuery
I need your help again. I need to calculate the monthly growth for the trips.
The query that I have written is below:
SELECT EXTRACT(YEAR FROM DATE (stoptime)) AS Year, EXTRACT(MONTH FROM DATE (...
0
votes
1
answer
816
views
Calculate Percentage with JOINS and original value
I need your help. I'm trying to calculate percentages by dividing the results of a JOIN by the original value. I do not know how to combine both values. I am trying to find trips with a station id which ...
1
vote
1
answer
175
views
How to get weekly unbounded retention from BigQuery?
I need to get the weekly retained users.
If a user has made a transaction in week 4, that user is counted in week 0, week 1, week 2, week 3 and week 4.
If a user made a transaction in week 0 and week 3, then ...
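One way to read the rule above is that a user counts as retained in every week up to and including the last week with a transaction. The Python sketch below works under that assumption; the `ACTIVITY` sample and the `unbounded_retention` helper are hypothetical and only illustrate the counting, not a BigQuery query.

```python
# Hypothetical input: user -> week numbers in which the user transacted.
ACTIVITY = {
    "u1": {4},
    "u2": {0, 3},
    "u3": {0},
}


def unbounded_retention(activity):
    """Count, per week, the users whose last transaction week is >= that week."""
    last_week = {user: max(weeks) for user, weeks in activity.items() if weeks}
    if not last_week:
        return {}
    horizon = max(last_week.values())
    return {
        week: sum(1 for last in last_week.values() if last >= week)
        for week in range(horizon + 1)
    }


print(unbounded_retention(ACTIVITY))
```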
1
vote
0
answers
799
views
Keeping the same order of selected records while inserting to another table
I am running the query below in BigQuery
insert into `test-project.temp_test_dataset.cluster_records`
(pm,col1,col2,col3,total)
select pm,col1,col2,col3,total from (
select 'ingest' as pm,col1,col2,...
0
votes
1
answer
22
views
Verify day falls in month
I have a table in BigQuery with Xday, Xmonth, and Xyear columns but the data was user generated and apparently done without adequate input validation so some of it is nonsense. It's easy enough to do ...
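As a rough illustration of the check being asked about, the Python sketch below tests whether a day number exists in a given month and year. The `is_valid_day` helper is a hypothetical name, the argument names simply mirror the column names in the question, and it is a standalone sketch rather than a BigQuery expression.

```python
import calendar


def is_valid_day(xyear, xmonth, xday):
    """Return True when xday is a real day of xmonth in xyear."""
    if not 1 <= xyear <= 9999:
        return False
    if not 1 <= xmonth <= 12:
        return False
    # monthrange returns (weekday of the 1st, number of days in the month).
    days_in_month = calendar.monthrange(xyear, xmonth)[1]
    return 1 <= xday <= days_in_month


print(is_valid_day(2020, 2, 29))  # leap year
print(is_valid_day(2021, 2, 29))  # not a leap year
```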
0
votes
1
answer
865
views
How to find consecutive non zero values from a column
I have a table as follows
user timestamp counts
xyz 01-01-2020 00:05:00 12
xyz 01-01-2020 00:10:00 11
xyz 01-01-2020 00:15:00 45
xyz 01-01-2020 00:20:00 0
xyz 01-01-...
0
votes
1
answer
70
views
compare to total, compare to percent of a main category, compare over periods e.g. month over month to get growth %
This is kind of a big question. I am looking for maybe a general direction, possible solutions to my issue, if a specific solution is impossible to give over the forum.
I am working in BigQuery, ...
0
votes
0
answers
367
views
Choosing the right database for storing bank transactions
I am starting a new project within GCP and I am trying to choose the right tool for storing bank transactions:
I don't need transactions, these will be basically write-only, no updates
I don't need ...
0
votes
2
answers
1k
views
Performing SQL counts with and without a WHERE clause from the same table
This is my first post so I apologize if I am not concise enough. I am trying to come up with an SQL query to identify data quality issues.
Here's the sample table:
DeviceOS Bytes
Roku 10,...
0
votes
1
answer
91
views
Calculate the total on events with two time conditions
I have a table in BigQuery that looks something like this:
schema = [
bigquery.SchemaField('timestamp', 'TIMESTAMP', mode='REQUIRED', description='Data point timestamp'),
bigquery....
0
votes
1
answer
264
views
Oracle GoldenGate to BigQuery
I'm trying to set up GoldenGate to sync data to BigQuery. When I started pushing the initial load, my extractor exported all the data, and even in the replicat stats I'm able to see the ...
-1
votes
2
answers
3k
views
Query for average trip time for trips - BigQuery [closed]
Flowlogistic moves equipment and parts for the mining company by air. In 2019 Flowlogistic managed more than 8,000 flights for this customer. The log file contains a record of each of those flights, ...
0
votes
1
answer
450
views
Count missing values across columns and join back to original table
Here is the table:
I want to count the missing values across each row for t1, t2, t3,..., and create another column in the same table with the results as shown in the picture.
I can easily do this ...
1
vote
1
answer
38
views
How can I group together rankings by number of users?
I have a table which includes the following:
ID evar event_date ranking
1 landing 2019-01-01 1
1 content 2019-01-02 2
1 homepage 2019-01-03 3
2 ...
1
vote
1
answer
48
views
How to delist subset strings if it is a substring of another string
I have a list of strings in my database, let's say in a column
understand
understan
understa
underst
unders
under
I'm trying to find out how to delist subset strings if it is a substring of another ...
0
votes
1
answer
334
views
How to count number of distinct days from one table using two dates (for range) from another table for each row?
So I have one datatable with three columns: userid, mindate, maxdate.
I have another datatable that contains all the login logs for each user.
I need to make a query so that I can sum all the ...
0
votes
0
answers
121
views
Is it possible to join on column placeholders?
I have a query below and I was wondering whether it is possible to join on my placeholders and, if so, whether the joins in my CTEs are correct.
WITH
DCM AS (SELECT
date,
placement as creative,
SUM(...
0
votes
1
answer
375
views
Why is there no performance difference if I use a WHERE clause inside an inner join?
Basically I have two different types of query in BQ.
The first one:
select q2.name, q1.* , q2.val1 from table1 as q1
inner join
(select name,val1, val2 from table2) as q2
on q1.name = q2.name
and ...
0
votes
1
answer
808
views
BigQuery - load data with changing schema [closed]
I have a situation where an external system provides data to a BigQuery instance as JSON files.
These are rows that have been updated in that external system; I need this data in BQ for later ...
0
votes
1
answer
25
views
Restructure create and delete events on rows into single table
Similar to this problem: Using start and end event logs to create a table/view containing spans between the times of each log.
I have a table that logs 'create' and 'delete' events for a resource that ...
1
vote
1
answer
5k
views
Retrieving executed query list in BigQuery via SQL
Is there a table similar to v$sql in Oracle where I can retrieve data associated with a particular query that was run in BigQuery using SQL?
1
vote
1
answer
1k
views
Merge Overlapping intervals in Bigquery
I have a BigQuery table with the following columns:
user_id, unique_id, start_timestamp(UTC), end_timestamp
unique_id is always unique, with no repeating values. Currently the data is grouped by user_id and then ...
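For reference, the usual sort-and-sweep approach to merging overlapping intervals per user can be sketched in Python as follows. The sample rows use plain integers in place of timestamps, `merge_intervals` is a hypothetical helper, and treating intervals that merely touch (start equal to the previous end) as overlapping is an assumption that the question does not settle.

```python
from collections import defaultdict

# Hypothetical rows: (user_id, start, end), integers standing in for timestamps.
ROWS = [
    ("a", 1, 5),
    ("a", 4, 8),
    ("a", 10, 12),
    ("b", 2, 3),
]


def merge_intervals(rows):
    """Merge overlapping (start, end) spans separately for each user_id."""
    by_user = defaultdict(list)
    for user_id, start, end in rows:
        by_user[user_id].append((start, end))

    merged = {}
    for user_id, spans in by_user.items():
        spans.sort()  # sort by start, then end
        out = [list(spans[0])]
        for start, end in spans[1:]:
            if start <= out[-1][1]:
                # Overlaps (or touches) the current span: extend it.
                out[-1][1] = max(out[-1][1], end)
            else:
                out.append([start, end])
        merged[user_id] = [tuple(span) for span in out]
    return merged


print(merge_intervals(ROWS))
```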
0
votes
2
answers
3k
views
Select first and last row based on each user id mysql
I want to select the first and last event of every user id; if no first event exists, then just the last event instead. I tried using PARTITION with OVER, but I am getting the first 2 events instead.
Input:
...
0
votes
1
answer
584
views
How to convert timestamp to date & time
I am currently working on BigQuery (standard SQL), where I have a timestamp field which I want to convert to date and time in 2 separate columns. I tried doing this select EXTRACT(DATE FROM timestamp) ...
0
votes
1
answer
131
views
Select only those rows based on specific condition of user id
Below is an example of a child's activity table. I only want to see all rows after the first TV activity for each user id. I tried grouping but it is not working for me
Table
ID Timestamp ...
2
votes
1
answer
1k
views
Can you use BigQuery to run on top of Bigtable
I need to run BigQuery on top of Bigtable live, not as an export. I have found the information stating it was in beta but only as an export function. I would like to run BigQuery against Bigtable data ...
0
votes
1
answer
3k
views
Counting uniques on a week by week basis
I've managed to find the unique # of session_ids within the first week of the year with this query.
How do I get each successive week's unique count up to the last week of June?
SELECT COUNT (...
-1
votes
1
answer
231
views
Which database is suitable for serving big timeseries metrics with rollups? [closed]
I have a big MySQL table containing daily metrics for a large number of subjects. Here is the hypothetical schema:
day DATE
subject_id INT
metric1
metric2
metric3
What I want is to find top X subjects (...