Implementing Spark-Compatible json_tuple in Apache DataFusion
PR: apache/datafusion#20412
Background
DataFusion-Comet accelerates Spark queries by offloading execution to Apache DataFusion. For this to work, DataFusion needs to support the Spark built-in functions that Comet encounters. json_tuple is one of them: it is commonly used in ETL pipelines to extract fields from JSON columns without defining a full schema.
Comet had an open issue requesting native json_tuple support. Without it, queries using json_tuple would fall back to Spark’s own execution path, defeating the purpose of using Comet.
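For readers who have not used json_tuple, here is a rough Python sketch of the behavior described above: given a JSON string and a list of field names, it returns one string per field, with null for anything it cannot find. The function name json_tuple_sketch is hypothetical and this is not the DataFusion or Spark implementation. It assumes that only top-level keys are looked up, that non-string values come back as JSON text, and that invalid JSON yields nulls for every field. Exact formatting of numbers and nested values may differ in Spark.

```python
import json
from typing import Optional, Tuple


def json_tuple_sketch(json_str: Optional[str], *fields: str) -> Tuple[Optional[str], ...]:
    """Hypothetical stand-in that approximates json_tuple semantics."""
    nulls = (None,) * len(fields)
    if json_str is None:
        return nulls
    try:
        parsed = json.loads(json_str)
    except (ValueError, TypeError):
        # Assumption: malformed input produces a null for every requested field.
        return nulls
    if not isinstance(parsed, dict):
        # Assumption: only a top-level JSON object has fields to extract.
        return nulls

    out = []
    for name in fields:
        value = parsed.get(name)
        if value is None:
            # Missing key or JSON null both map to null.
            out.append(None)
        elif isinstance(value, str):
            # Strings are returned without surrounding quotes.
            out.append(value)
        else:
            # Numbers, booleans, arrays and objects are returned as JSON text.
            out.append(json.dumps(value, separators=(",", ":")))
    return tuple(out)


row = '{"id": 7, "name": "alice", "tags": {"vip": true}}'
result = json_tuple_sketch(row, "id", "name", "tags", "missing")
print(result)
```

No schema is declared anywhere: the caller names the fields it wants, and every result is a nullable string. That is what makes the function convenient for loosely structured ETL input.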