I use Spark Structured Streaming (PySpark) to read data from a Kafka topic. It works well, but when I open an executor's stderr, the whole log page is full of WARN messages from Kafka saying:

KafkaDataConsumer is not running in UninterruptibleThread. It may hang when KafkaDataConsumer's methods are interrupted because of KAFKA-1894

How can I disable this warning, or maybe fix the consumer?
Spark: 3.1.1 with org.apache.spark:spark-sql-kafka-0-10_2.12:3.1.2
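For context, the stream is set up roughly like this (a minimal sketch; the broker address, topic name, and console sink are placeholders, not my real job):

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("kafka-reader").getOrCreate()

# Subscribe to the topic; server and topic names are placeholders
df = (spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "broker:9092")
      .option("subscribe", "my_topic")
      .load())

# Dump values to the console just to have a running query
query = (df.selectExpr("CAST(value AS STRING) AS value")
         .writeStream
         .format("console")
         .start())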
I have tried several options, but AFAIK the Kafka consumer doesn't know that it runs inside a Spark application, so it is useless trying to set sparkContext.setLogLevel.
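For illustration, that dead end looks like this (as far as I can tell it only changes the log level in the driver JVM, so executor logs are unaffected):

# Sets the root log level, but only in the driver JVM;
# the warnings come from the executors' JVMs
spark.sparkContext.setLogLevel("ERROR")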
My most recent attempt was something like this:
# Py4J handle to the Log4j 1.x API in the driver JVM
logger = spark._jvm.org.apache.log4j
# Only mutes the driver's loggers; executors keep their own Log4j config
logger.LogManager.getLogger("org.apache.kafka").setLevel(logger.Level.ERROR)
But it doesn't work :(
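The only other route I can think of is shipping a custom Log4j config to the executors, along these lines (untested sketch; the logger names are my guess, since the warning seems to come from Spark's own org.apache.spark.sql.kafka010 classes rather than from org.apache.kafka):

log4j.properties:

log4j.rootCategory=INFO, console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.err
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n
# Raise the threshold for the noisy Kafka-related loggers
log4j.logger.org.apache.kafka=ERROR
log4j.logger.org.apache.spark.sql.kafka010=ERROR

And the submit command, so the file reaches every executor (my_job.py is a placeholder):

spark-submit \
  --files log4j.properties \
  --conf "spark.executor.extraJavaOptions=-Dlog4j.configuration=file:log4j.properties" \
  my_job.py

Is a config file like this really the only way, or can it be done from inside the application?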
P.S. Yeah, I know it is just a warning, and a warning is not an error, but one executor generates nearly 2k lines of these warnings per second, so you can't find the useful prints: you either scroll for a really long time or wait forever for the log file to load. It's kind of frustrating.