
I have a setup to run jobs/code on Databricks clusters from a local IDE during development using Spark Connect. For the most part, Spark Connect behaves the same as a classic SparkSession when used through the DataFrame API. However, Spark Connect still does not expose a SparkContext, so operations that require one fail. For example, we set a checkpoint directory with:

spark.sparkContext.setCheckpointDir("/FileStore/checkpoint")

The call above does not work over Spark Connect. How can we perform operations like this when testing Databricks code from a local IDE via Spark Connect?

  • Have you tried setting the checkpoint directory using spark.conf.set() instead? It is supported in Spark Connect. Commented Jun 14 at 21:53
  • Which IDE are you using? How did you make the connection from your IDE to Databricks?
    – Memristor
    Commented Jun 15 at 22:54
  • I am using VS Code, with Databricks Connect making the connection from the IDE to Databricks. Also, spark.conf.set probably cannot set the checkpoint dir: spark.apache.org/docs/latest/api/python/reference/api/…
    – Tarique
    Commented Jun 18 at 7:16
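
A sketch of two workarounds that are commonly suggested for this limitation. Both are assumptions to verify against your Spark and Databricks Runtime versions: `DataFrame.localCheckpoint()` avoids the checkpoint directory entirely, and newer Spark versions accept a `spark.checkpoint.dir` configuration (set in the cluster's Spark config, not via `spark.conf.set` at runtime) so that `sparkContext.setCheckpointDir` is never called. `DatabricksSession` is the Databricks Connect entry point.

```python
# Sketch, not a definitive fix: whether these work depends on the Spark /
# Databricks Runtime version running on the cluster.
from databricks.connect import DatabricksSession

spark = DatabricksSession.builder.getOrCreate()
df = spark.range(10)

# Option 1: localCheckpoint() truncates the lineage using executor-local
# storage, so no checkpoint directory (and no sparkContext call) is needed.
# Support over Spark Connect arrived in recent versions; verify on yours.
df_local = df.localCheckpoint()

# Option 2 (assumption, verify for your runtime): newer Spark versions read
# a spark.checkpoint.dir configuration at context startup. On Databricks,
# put it in the cluster's Spark config, e.g.:
#
#   spark.checkpoint.dir /FileStore/checkpoint
#
# With that in place, df.checkpoint() can be used from the client without
# ever touching spark.sparkContext.
df_ckpt = df.checkpoint()
```

If neither is available on your runtime, the remaining option is to keep the `sparkContext`-dependent code in a notebook or job that runs on the cluster itself, and restrict the locally developed code to the DataFrame API that Spark Connect supports.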

