I have a setup to run jobs/code on Databricks clusters from a local IDE during development, using Spark Connect. For the most part, Spark Connect behaves the same as a classic SparkSession when used with the DataFrame API, but it still does not support a SparkContext, so operations that need one fail. For example, we create a checkpoint directory with:
spark.sparkContext.setCheckpointDir("/FileStore/checkpoint")
The above does not work over Spark Connect. How can we perform operations like this when testing Databricks code from a local IDE using Spark Connect?
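One fallback I am considering (a minimal sketch; checkpoint_df is my own helper, and I am assuming DataFrame.localCheckpoint works over Spark Connect in my client version, which I have not confirmed):

from pyspark.sql import DataFrame

def checkpoint_df(df: DataFrame) -> DataFrame:
    # Over Spark Connect, accessing spark.sparkContext raises, so fall back
    # to localCheckpoint(), which stores the checkpointed blocks on the
    # executors and needs no checkpoint directory.
    spark = df.sparkSession
    try:
        spark.sparkContext.setCheckpointDir("/FileStore/checkpoint")
        return df.checkpoint()
    except Exception:
        return df.localCheckpoint()

Note that localCheckpoint is not a drop-in replacement: the checkpointed data does not survive executor loss, unlike a reliable checkpoint written to storage.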
Could you use spark.conf.set() instead? It is supported in Spark Connect.

spark.conf.set probably cannot set the checkpoint dir: spark.apache.org/docs/latest/api/python/reference/api/…
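To illustrate the distinction (a sketch; the spark.checkpoint.dir key below is an assumption I have not verified, so it is left commented out):

# Runtime SQL confs work fine over Spark Connect:
spark.conf.set("spark.sql.shuffle.partitions", "64")

# The checkpoint dir, however, is SparkContext state rather than a SQL conf.
# Even if a conf key for it exists in your Spark version, it would most
# likely need to be set at cluster startup (e.g. in the Databricks cluster's
# Spark config), not at runtime:
# spark.conf.set("spark.checkpoint.dir", "/FileStore/checkpoint")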