All Questions

0 votes
0 answers
358 views

Query Qubole data in Python

I'm trying to query Qubole data in Python, but I'm running into some issues. Below is my code: from qds_sdk.qubole import Qubole Qubole.configure(api_token="api_token", api_url="https://us....
BirdPlay6's user avatar
0 votes
1 answer
782 views

How to safely insert parameters into a SQL query and get the resulting query?

I have to use a non-DBAPI-compliant library to interact with a database (qds_sdk for Qubole). This library only allows sending raw SQL queries without parameters. Thus I would like a SQL injection-...
Roméo Després's user avatar
0 votes
2 answers
320 views

How to get Python in Qubole to save CSV and TXT files to Azure data lake?

I have Qubole connected to Azure Data Lake, and I can start a Spark cluster and run PySpark on it. However, I can't save any native Python output, like text files or CSVs. I can't save anything other ...
HT.'s user avatar
0 votes
1 answer
477 views

How to change the timeout value when running commands on QDS

I have a spark-submit command that calls my Python script. The code runs for more than 36 hours, but because QDS has a 36-hour timeout limit, my command gets killed at that point. Can someone help ...
Trupti's user avatar
0 votes
1 answer
85 views

PySpark Machine Learning on Wide Data in Qubole

I have a large dataset, with roughly 250 features, that I would like to use in a gradient-boosted trees classifier. I have millions of observations, but I'm having trouble getting the model to work ...
ErrorJordan's user avatar
0 votes
1 answer
109 views

Scale plot size of matplotlib plots in Qubole Notebook

Is it possible to increase the size of a plot rendered with z.showplot() in Qubole notebooks? import matplotlib as plt plt.figure() plt.bar(pandas_df_hr_sg[:]['hour'],pandas_df_hr_sg[:]['...
Mustajib Mohammed Khan's user avatar
0 votes
1 answer
131 views

Qubole: How can I download scheduler result in python?

As the title says: I managed to download the Qubole result using the query ID in Python. However, is there a way to download the result using the scheduler job ID instead of the query ID? Thanks.
atsang01's user avatar
0 votes
1 answer
177 views

Comparing one day's worth of data from S3 buckets faster

Consider the 2 data flows below:
1. Front End Box ----> S3 Bucket-1
2. Front End Box ----> Kafka --> Storm ---> S3 Bucket-2
The logs from the boxes are being transferred to S3 buckets. The ...
Albatross's user avatar