- Spark version: 3.5.0
- Spark-Excel version: spark-excel_2.12-3.5.0_0.20.3
- OS: Windows
- Cluster environment: -
Am I using the newest version of the library?
Is there an existing issue for this?
Current Behavior
When reading an Excel file that contains dates in the format MM/DD/YYYY with the code below, the dates come back as strings with two-digit years:
data_frame = (
    spark.read.format("excel")
    .option("header", "true")
    .option("maxByteArraySize", 2147483647)
    .option("timestampFormat", "yyyy-MM-dd HH:mm:ss")
    .option("setErrorCellsToFallbackValues", "true")
    .option("maxRowsInMemory", 200)
    .load('ExcelReaderProblemExcel')
)
Data frame result:
[Row(Date MM/DD/YYYY='3/29/20'),
Row(Date MM/DD/YYYY='3/14/21'),
Row(Date MM/DD/YYYY='3/15/12'),
Row(Date MM/DD/YYYY='3/16/00'),
Row(Date MM/DD/YYYY='3/29/04'),
Row(Date MM/DD/YYYY='3/29/04'),
]
[Row(UTF-8 strings='Portégé'),
Row(UTF-8 strings='Portégé'),
Row(UTF-8 strings='Portégé'),
Row(UTF-8 strings='Portégé'),
Row(UTF-8 strings='Portégé'),
Row(UTF-8 strings='Portégé'),
]
Since the file contains dates from 2100, the two-digit years are ambiguous, so using the above values directly could give incorrect dates.
ExcelReaderProblemExcel.xlsx
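To illustrate why the two-digit years matter, here is a minimal sketch in plain Python (it does not use Spark or spark-excel). It assumes the returned strings would later be parsed with Python's `%y` convention; `parse_two_digit` is a hypothetical helper name used only for this illustration.

```python
from datetime import datetime

# Strings as they appear in the data frame result above (two-digit years).
returned = ["3/29/20", "3/14/21", "3/15/12", "3/16/00", "3/29/04"]


def parse_two_digit(value: str) -> datetime:
    """Hypothetical helper: parse M/D/YY using Python's %y rule."""
    # %y maps 00-68 to 2000-2068 and 69-99 to 1969-1999,
    # so a year such as 2100 can never be recovered from "00".
    return datetime.strptime(value, "%m/%d/%y")


parsed = [parse_two_digit(v) for v in returned]
years = [d.year for d in parsed]

# Two different four-digit dates collapse to the same two-digit string.
collide_a = datetime(2000, 3, 16).strftime("%m/%d/%y")
collide_b = datetime(2100, 3, 16).strftime("%m/%d/%y")

print(years)
print(collide_a, collide_b)
```

Under this assumption, a cell holding a date in 2100 and one holding the same day in 2000 produce identical strings, so the original year cannot be reconstructed once it has been truncated.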
Expected Behavior
The date string should come out the same as it is shown in Excel, with a four-digit year.
Instead, the date string comes out with only a two-digit year.
Steps To Reproduce
No response
Environment
Anything else?
A similar issue, #351, is still open.