
Cannot use V2 for streaming read #734

Open
1 task done
james-miles-ccy opened this issue Apr 22, 2023 · 2 comments
Comments

@james-miles-ccy

Is there an existing issue for this?

  • I have searched the existing issues

Current Behavior

I am trying to read via V2 in a streaming way, with no success. Is there anything I can do to get this working?

The code is below:

df = (spark.readStream.format("cloudFiles")
    .option("cloudFiles.format", "excel")
    .option("maxRowsInMemory", 20)
    .schema(schema)
    .load(file_path))

display(df)

The exception is given below:

java.lang.UnsupportedOperationException: ExcelFileFormat as fallback format for V2 supports writing only
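Since the error says the V2 ExcelFileFormat only supports writing when used as a fallback format, one possible workaround is to stream file arrivals with Auto Loader's binaryFile format and parse each micro-batch with the batch (non-streaming) Excel reader inside foreachBatch. This is a sketch under assumptions, not something the maintainers have confirmed; the helper name parse_excel_batch and the hard-coded options are illustrative only:

```python
# Hypothetical workaround sketch (an assumption, not a confirmed fix):
# let Auto Loader's "binaryFile" format detect new files, then re-read
# each micro-batch's file paths with the batch Excel reader, which
# avoids the streaming-read path that raises the exception above.

def parse_excel_batch(spark, batch_df, schema):
    """Collect the file paths from a binaryFile micro-batch and read
    them with the batch Excel reader using the caller-supplied schema."""
    paths = [row.path for row in batch_df.select("path").collect()]
    return (spark.read.format("excel")
                 .option("maxRowsInMemory", 20)
                 .schema(schema)
                 .load(paths))
```

This would be wired up with readStream on cloudFiles.format "binaryFile" followed by writeStream.foreachBatch(...), at the cost of touching each Excel file twice (once as binary content, once in the batch read).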

Expected Behavior

I was hoping it would generate a streaming DataFrame.

Steps To Reproduce

df = (spark.readStream.format("cloudFiles")
    .option("cloudFiles.format", "excel")
    .option("maxRowsInMemory", 20)
    .schema(schema)
    .load(file_path))

display(df)

Environment

- Spark version: 3.3.1
- Spark-Excel version: 2.12:3.3.1_0.18.7
- OS: Windows
- Cluster environment: Databricks

Anything else?

No response

@james-miles-ccy james-miles-ccy changed the title Cannot use V2 for streaming Cannot use V2 for streaming read Apr 22, 2023
@nightscape
Owner

The documentation reads as if this is only supported for a few specific file formats:
https://docs.databricks.com/ingestion/auto-loader/options.html#file-format-options
I'm not sure whether those formats are hard-coded somewhere, or whether one would need to implement a special API.
I don't have time to look into this, but if you're willing to give it a try yourself, I can give you some guidance.

@arcaputo3

We have gotten this to work for other custom file formats with a fixed schema. I wonder whether a similar approach could be applied here while still supporting provided or inferred schemas.
