Apache DataFusion 43.0.0 Changelog

This release consists of 403 commits from 96 contributors. See credits at the end of this changelog for more information.

Breaking changes:

  • Remove Arc wrapping from create_udf's return_type #12489 (findepi)
  • Make make_scalar_function() result candidate for inlining, by removing the Arc #12477 (findepi)
  • Bump MSRV to 1.78 #12398 (comphead)
  • fix: DataFusion panics with "No candidates provided" #12469 (Weijun-H)
  • Implement PartialOrd for Expr and sub fields/structs without using hash values #12481 (ngli-me)
  • Add field trait method to WindowUDFImpl, remove return_type/nullable #12374 (jcsherin)
  • parquet: Make page_index/pushdown metrics consistent with row_group metrics #12545 (progval)
  • Make SessionContext::enable_url_table consume self #12573 (alamb)
  • LexRequirement as a struct, instead of a type #12583 (ngli-me)
  • Require Debug for AnalyzerRule, FunctionRewriter, and OptimizerRule #12556 (alamb)
  • Require Debug for TableProvider, TableProviderFactory and PartitionStream #12557 (alamb)
  • Require Debug for PhysicalOptimizerRule #12624 (AnthonyZhOon)
  • Rename aggregation modules, GroupColumn #12619 (alamb)
  • Update register_table functions args to take Into<TableReference> #12630 (JasonLi-cn)
  • Derive Debug for SessionStateBuilder, adding Debug requirements to fields #12632 (AnthonyZhOon)
  • Support REPLACE INTO for INSERT statements #12516 (fmeringdal)
  • Add PartitionEvaluatorArgs to WindowUDFImpl::partition_evaluator #12804 (jcsherin)
  • Convert rank / dense_rank and percent_rank builtin functions to UDWF #12718 (jatin510)
  • Bug-fix: MemoryExec sort expressions do NOT refer to the projected schema #12876 (berkaysynnada)
  • Minor: add flags for temporary ddl #12561 (hailelagi)
  • Convert BuiltInWindowFunction::{Lead, Lag} to a user defined window function #12857 (jcsherin)
  • Improve performance for physical plan creation with many columns #12950 (askalt)
  • Improve recursive unnest options API #12836 (duongcongtoai)
  • fix(substrait): disallow union with a single input #13023 (tokoko)
  • feat: support arbitrary expressions in LIMIT plan #13028 (jonahgao)
  • Remove unused LogicalPlan::CrossJoin #13076 (buraksenn)
  • Minor: make Expr::volatile infallible #13206 (alamb)
  • Convert LexOrdering type to struct. #13146 (ngli-me)

Implemented enhancements:

  • feat(unparser): adding alias for table scan filter in sql unparser #12453 (Lordworms)
  • feat(substrait): set ProjectRel output_mapping in producer #12495 (vbarua)
  • feat: Support applying parquet bloom filters to StringView columns #12503 (my-vegetable-has-exploded)
  • feat: Support adding a single new table factory to SessionStateBuilder #12563 (Weijun-H)
  • feat(planner): Allowing setting sort order of parquet files without specifying the schema #12466 (devanbenz)
  • feat: add support for Substrait ExtendedExpression #12728 (westonpace)
  • feat(substrait): add intersect support to consumer #12830 (tokoko)
  • feat: Implement grouping function using grouping id #12704 (eejbyfeldt)
  • feat(substrait): add set operations to consumer, update substrait to 0.45.0 #12863 (tokoko)
  • feat(substrait): add wildcard handling to producer #12987 (tokoko)
  • feat: Add regexp_count function #12970 (Omega359)
  • feat: Decorrelate more predicate subqueries #12945 (eejbyfeldt)
  • feat: Run (logical) optimizers on subqueries #13066 (eejbyfeldt)
  • feat: Convert CumeDist to UDWF #13051 (jonathanc-n)
  • feat: Migrate Map Functions #13047 (jonathanc-n)
  • feat: improve type inference for WindowFrame #13059 (notfilippo)
  • feat: Move subquery check from analyzer to PullUpCorrelatedExpr (Fix TPC-DS q41) #13091 (eejbyfeldt)
  • feat: Add Date32/Date64 in aggregate fuzz testing #13041 (LeslieKid)
  • feat(substrait): support order_by in aggregate functions #13114 (bvolpato)
  • feat: Support Substrait's IntervalCompound type/literal instead of interval-month-day-nano UDT #12112 (Blizzara)
  • feat: Implement LeftMark join to fix subquery correctness issue #13134 (eejbyfeldt)
  • feat: support logical plan for EXECUTE statement #13194 (jonahgao)
  • feat(substrait): handle emit_kind when consuming Substrait plans #13127 (vbarua)
  • feat(substrait): AggregateRel grouping_expressions support #13173 (akoshchiy)

Fixed bugs:

  • fix: Panic/correctness issue in variance GroupsAccumulator #12615 (eejbyfeldt)
  • fix: coalesce schema issues #12308 (mesejo)
  • fix: Correct results for grouping sets when columns contain nulls #12571 (eejbyfeldt)
  • fix(substrait): remove optimize calls from substrait consumer #12800 (tokoko)
  • fix(substrait): consuming AggregateRel as last node #12875 (tokoko)
  • fix: Update TO_DATE, TO_TIMESTAMP scalar functions to support LargeUtf8, Utf8View #12929 (Omega359)
  • fix: Add Int32 type override for Dialects #12916 (peasee)
  • fix: using simple string match replace regex match for contains udf #12931 (zhuliquan)
  • fix: Dialect requires derived table alias #12994 (peasee)
  • fix: join swap for projected semi/anti joins #13022 (korowa)
  • fix: Verify supported type for Unary::Plus in sql planner #13019 (eejbyfeldt)
  • fix: Do NOT preserve names (aliases) of Exprs for simplification in TableScan filters #13048 (eejbyfeldt)
  • fix: planning of prepare statement with limit clause #13088 (jonahgao)
  • fix: add missing NotExpr::evaluate_bounds #13082 (crepererum)
  • fix: Order by mentioning missing column multiple times #13158 (eejbyfeldt)
  • fix: import JoinTestType without triggering unused_qualifications lint #13170 (smarticen)
  • fix: default UDWFImpl::expressions returns all expressions #13169 (Michael-J-Ward)
  • fix: date_bin() on timestamps before 1970 #13204 (mhilton)
  • fix: array_resize null fix #13209 (jonathanc-n)
  • fix: CSV Infer Schema now properly supports escaped characters. #13214 (mnorfolk03)

Documentation updates:

  • chore: Prepare 42.0.0 Release #12465 (andygrove)
  • Minor: improve ParquetOpener docs #12456 (alamb)
  • Improve doc wording around scalar authoring #12478 (findepi)
  • Minor: improve GroupsAccumulator docs #12501 (alamb)
  • Minor: improve GroupsAccumulatorAdapter docs #12502 (alamb)
  • Improve flamegraph profiling instructions #12521 (alamb)
  • docs: 📝 Add expected answers to DataFrame method examples #12564 (Eason0729)
  • parquet: Add finer metrics on operations covered by time_elapsed_opening #12585 (progval)
  • Update scalar_functions.md #12627 (Abdullahsab3)
  • Move kurtosis_pop to datafusion-functions-extra and out of core #12647 (dharanad)
  • Update introduction.md for blaze project #12577 (liyuance)
  • docs: improve the documentation for Aggregate code #12617 (alamb)
  • doc: Fix malformed hex string literal in user guide #12708 (kawadakk)
  • docs: Update DataFusion introduction to clarify that DataFusion does provide an "out of the box" query engine #12666 (andygrove)
  • Framework for generating function docs from embedded code documentation #12668 (Omega359)
  • Fix misformatted links on project index page #12750 (amoeba)
  • Add DocumentationBuilder::with_standard_argument to reduce copy/paste #12747 (alamb)
  • Minor: doc how field name is to be set for WindowUDF #12757 (jcsherin)
  • Port / Add Documentation for VarianceSample and VariancePopulation #12742 (alamb)
  • Transformed::new_transformed: Fix documentation formatting #12787 (progval)
  • Migrate documentation for all string functions from scalar_functions.md to code #12775 (Omega359)
  • Minor: add README to Catalog Folder #12797 (jonathanc-n)
  • Remove redundant aggregate/window/scalar function documentation #12745 (alamb)
  • Improve description of function migration #12743 (alamb)
  • Crypto Function Migration #12840 (jonathanc-n)
  • Minor: more doc to MemoryPool module #12849 (2010YOUY01)
  • Migrate documentation for all core functions from scalar_functions.md to code #12854 (Omega359)
  • Migrate documentation for Aggregate Functions to code #12861 (jonathanc-n)
  • Wordsmith project description #12778 (matthewmturner)
  • Migrate Regex Functions from static docs #12886 (jonathanc-n)
  • Migrate documentation for all math functions from scalar_functions.md to code #12908 (juroberttyb)
  • Combine the logic of rank, dense_rank and percent_rank udwf to reduce duplications #12893 (jatin510)
  • Migrate Array function Documentation to code #12948 (jonathanc-n)
  • Minor: fix Aggregation Docs from review #12880 (jonathanc-n)
  • Minor: expr-doc small fixes #12960 (jonathanc-n)
  • docs: Add documentation about conventional commits #12971 (andygrove)
  • Migrate datetime documentation to code #12966 (jatin510)
  • Fix CI on main (regenerate function docs) #12991 (alamb)
  • Split output batches of joins that do not respect batch size #12969 (alihan-synnada)
  • Minor: Fixed regexp_match docs #13008 (jonathanc-n)
  • Minor: Fix spelling in regexp_count docs #13014 (jonathanc-n)
  • Update version to 42.1.0, add CHANGELOG (#12986) #12989 (alamb)
  • Added expression to "with_standard_argument" #12926 (jonathanc-n)
  • Migrate documentation for regr* aggregate functions to code #12871 (alamb)
  • Minor: Add documentation for cot #13069 (alamb)
  • Documentation: Add API deprecation policy #13083 (comphead)
  • docs: Fixed generate_series docs #13097 (jonathanc-n)
  • [docs]: migrate lead/lag window function docs to new docs #13095 (buraksenn)
  • minor: Add deprecated policy to the contributor guide contents #13100 (comphead)
  • Introduce binary_as_string parquet option, upgrade to arrow/parquet 53.2.0 #12816 (goldmedal)
  • Convert ntile builtIn function to UDWF #13040 (jatin510)
  • docs: Added Special Functions Page #13102 (jonathanc-n)
  • [docs]: added alternative_syntax function for docs #13140 (jonathanc-n)
  • Minor: Delete old cume_dist and percent_rank docs #13137 (jonathanc-n)
  • docs: Add alternative syntax for extract, trim and substring. #13143 (Omega359)
  • docs: switch completely to generated docs for scalar and aggregate functions #13161 (Omega359)
  • Minor: improve testing docs, mention cargo nextest #13160 (alamb)
  • minor: Update HOWTO to help with updating new docs #13172 (jonathanc-n)
  • Add config option skip_physical_aggregate_schema_check #13176 (alamb)
  • Enable reading StringViewArray by default from Parquet (8% improvement for entire ClickBench suite) #13101 (alamb)
  • Forward port changes for 42.2.0 release (#13191) #13193 (alamb)
  • [minor] overload from_unixtime func to have optional timezone parameter #13130 (buraksenn)

Other:

  • Impl convert_to_state for GroupsAccumulatorAdapter (faster median for high cardinality aggregates) #11827 (Rachelint)
  • Upgrade sqlparser-rs to 0.51.0, support new interval logic from sqlparser-rs #12222 (samuelcolvin)
  • Do not silently ignore unsupported CREATE TABLE and CREATE VIEW syntax #12450 (alamb)
  • use FileFormat::get_ext as the default file extension filter #12417 (waruto210)
  • fix interval units parsing #12448 (samuelcolvin)
  • test(substrait): update TPCH tests #12462 (vbarua)
  • Add "Extended Clickbench" benchmark for median and approx_median for high cardinality aggregates #12438 (alamb)
  • date_trunc small update for readability #12479 (findepi)
  • cleanup array_has #12460 (samuelcolvin)
  • chore: bump chrono to 0.4.38 #12485 (my-vegetable-has-exploded)
  • Remove deprecated ScalarUDF::new #12487 (findepi)
  • Remove deprecated config setup functions #12486 (findepi)
  • Remove unnecessary shifts in gcd() #12480 (findepi)
  • Return TableProviderFilterPushDown::Exact when Parquet Pushdown Enabled #12135 (itsjunetime)
  • Update substrait requirement from 0.41 to 0.42, prost-build to 0.13.2 #12483 (dependabot[bot])
  • Faster strpos() string function for ASCII-only case #12401 (goldmedal)
  • Specialize ASCII case for substr() #12444 (2010YOUY01)
  • Improve SQLite subquery tables aliasing unparsing #12482 (sgrebnov)
  • Minor: use Option rather than Result for not found suggestion #12512 (alamb)
  • Remove deprecated datafusion_physical_expr::functions module #12505 (findepi)
  • Remove deprecated AggregateUDF::new #12508 (findepi)
  • Make required_guarantees output to be deterministic #12484 (austin362667)
  • Deprecate unused ScalarUDF::fun #12506 (findepi)
  • Remove deprecated WindowUDF::new #12507 (findepi)
  • Preserve the order of right table in NestedLoopJoinExec #12504 (alihan-synnada)
  • Improve benchmark for ltrim #12513 (Rachelint)
  • Fix: check ambiguous column reference #12467 (HuSen8891)
  • Minor: move imports to top in row_hash.rs #12530 (Rachelint)
  • tests: Fix typo in config setting name #12535 (progval)
  • Expose DataFrame select_exprs method #12520 (milenkovicm)
  • Replace some usages of Expr::to_field with Expr::qualified_name #12522 (jonahgao)
  • Bump aws-sdk-sso to 1.43.0, aws-sdk-sts to 1.43.0 and aws-sdk-ssooidc from 1.40.0 to 1.44.0 in /datafusion-cli #12409 (dependabot[bot])
  • Fix NestedLoopJoin performance regression #12531 (alihan-synnada)
  • Produce informative error message on insert plan type mismatch #12540 (findepi)
  • Fix unparse table scan with the projection pushdown #12534 (goldmedal)
  • Automate sqllogictest for String, LargeString and StringView behavior #12525 (goldmedal)
  • Fix unparsing offset #12539 (Stazer)
  • support EXTRACT on intervals and durations #12514 (nrc)
  • Support List type coercion for CASE-WHEN-THEN expression #12490 (goldmedal)
  • Sort metrics alphabetically in EXPLAIN ANALYZE output #12568 (progval)
  • Add RuntimeEnv::try_new and deprecate RuntimeEnv::new #12566 (OussamaSaoudi)
  • Reorganize the StringView tests in sqllogictests #12572 (goldmedal)
  • fix parquet infer statistics for BinaryView types #12575 (XiangpengHao)
  • Minor: add example of assert_batches_eq #12580 (alamb)
  • Use qualified aliases to simplify searching DFSchema #12546 (jonahgao)
  • return absent stats when filters are pushed down #12471 (waruto210)
  • Minor: add new() function for ParquetReadOptions #12579 (Smith-Cruise)
  • make Debug for MemoryExec prettier #12582 (samuelcolvin)
  • Add SessionStateBuilder::with_object_store method #12578 (OussamaSaoudi)
  • Fix and Improve Sort Pushdown for Nested Loop and Hash Join #12559 (berkaysynnada)
  • Add Docs and Examples and helper methods to PhysicalSortExpr #12589 (alamb)
  • Warn instead of error for unused imports #12588 (samuelcolvin)
  • Update prost-build requirement from =0.13.2 to =0.13.3 #12587 (dependabot[bot])
  • Add JOB benchmark dataset [1/N] (imdb dataset) #12497 (doupache)
  • Improve documentation and add Display impl to EquivalenceProperties #12590 (alamb)
  • physical-plan: Cast nested group values back to dictionary if necessary #12586 (brancz)
  • Support Date32 for date_trunc function #12603 (goldmedal)
  • Avoid RowConverter for multi column grouping (10% faster clickbench queries) #12269 (jayzhan211)
  • Refactor to support recursive unnest in physical plan #11577 (duongcongtoai)
  • Use original value when comparing with dictionary column in unparser #12610 (Sevenannn)
  • Fix to unparse the plan with multiple UNION statements into an SQL string #12605 (goldmedal)
  • Keep the float information in scalar_to_sql #12609 (Sevenannn)
  • Add Dictionary String (UTF8) type to String sqllogictests #12621 (goldmedal)
  • Improve SanityChecker error message #12595 (alamb)
  • Improve performance of trim for string view (10%) #12395 (Rachelint)
  • Simplify update_skip_aggregation_probe method #12332 (lewiszlw)
  • Minor: Encapsulate type check in GroupValuesColumn, avoid panic #12620 (alamb)
  • Fix sort node deserialization from proto #12626 (palaska)
  • Minor: improve documentation to StringView trim #12629 (alamb)
  • [MINOR]: Simplifications Sort Operator #12639 (akurmustafa)
  • [Minor] Remove redundant member from RepartitionExec #12638 (akurmustafa)
  • implement nested identifier access #12614 (Lordworms)
  • [MINOR]: Rename get_arrayref_at_indices to take_arrays #12654 (akurmustafa)
  • [MINOR]: Use take_arrays in repartition , fix build #12657 (doupache)
  • Add binary_view to string_view coercion #12643 (doupache)
  • [Minor] Improve error message when bitwise_* operator takes wrong unsupported type #12646 (dharanad)
  • Minor: Add github link to code that was upstreamed #12660 (alamb)
  • Minor: Improve documentation on execution error handling #12651 (alamb)
  • Adds WindowUDFImpl::reverse_expr trait method + Support for IGNORE NULLS #12662 (jcsherin)
  • Fill in missing Debug fields for SessionState #12663 (AnthonyZhOon)
  • Minor: add partial assertion for skip aggregation probe #12640 (Rachelint)
  • Add more functions for string sqllogictests #12665 (goldmedal)
  • Update rstest requirement from 0.22.0 to 0.23.0 #12678 (dependabot[bot])
  • Minor: Change LiteralGuarantee try_new to new #12669 (pgwhalen)
  • Refactor PrimitiveGroupValueBuilder to use MaybeNullBufferBuilder #12623 (alamb)
  • Add value_from_stats to AggregateUDFImpl, remove special case for min/max/count aggregate statistics #12296 (edmondop)
  • Provide field and schema metadata missing on distinct aggregations. #12691 (wiedld)
  • [MINOR]: Simplify required_input_ordering of BoundedWindowAggExec #12656 (akurmustafa)
  • handle 0 and NULL value of NTH_VALUE function #12676 (thinh2)
  • Improve documentation for AggregateUDFImpl::value_from_stats #12689 (alamb)
  • Add support for external tables with qualified names #12645 (OussamaSaoudi)
  • Fix Regex signature types #12690 (blaginin)
  • Refactor ByteGroupValueBuilder to use MaybeNullBufferBuilder #12681 (alamb)
  • Simplify match patterns in coercion rules #12711 (findepi)
  • Remove aggregate functions dependency on frontend #12715 (findepi)
  • Minor: Remove clone in transform_to_states #12707 (jayzhan211)
  • Refactor tests for union sorting properties, add tests for unions and constants #12702 (alamb)
  • Fix: support Qualified Wildcard in count aggregate function #12673 (HuSen8891)
  • Reduce code duplication in PrimitiveGroupValueBuilder with const generics #12703 (alamb)
  • Disallow duplicated qualified field names #12608 (eejbyfeldt)
  • Optimize base64/hex decoding by pre-allocating output buffers (~2x faster) #12675 (simonvandel)
  • Allow DynamicFileCatalog support to query partitioned file #12683 (goldmedal)
  • Support LIMIT Push-down logical plan optimization for Extension nodes #12685 (austin362667)
  • Fix AvroReader: Add union resolving for nested struct arrays #12686 (JonasDev1)
  • Adds macros for creating WindowUDF and WindowFunction expression #12693 (jcsherin)
  • Support unparsing plans with both Aggregation and Window functions #12705 (sgrebnov)
  • Fix strpos invocation with dictionary and null #12712 (findepi)
  • Add IMDB(JOB) Benchmark [2/N] (imdb queries) #12529 (austin362667)
  • Minor: avoid clone while calculating union equivalence properties #12722 (alamb)
  • Simplify streaming_merge function parameters #12719 (mertak-synnada)
  • Provide field and schema metadata missing on cross joins, and union with null fields. #12729 (wiedld)
  • Minor: Update string tests for strpos #12739 (alamb)
  • Apply type_union_resolution to array and values #12753 (jayzhan211)
  • fix equal_to in PrimitiveGroupValueBuilder #12758 (Rachelint)
  • Fix equal_to in ByteGroupValueBuilder #12770 (alamb)
  • Allow boolean Expr simplification even when nullable #12746 (eejbyfeldt)
  • Fix unnest conjunction with selecting wildcard expression #12760 (goldmedal)
  • Improve round scalar function unparsing for Postgres #12744 (sgrebnov)
  • Fix stack overflow calculating projected orderings #12759 (alamb)
  • Upgrade arrow/parquet to 53.1.0 / fix clippy #12724 (alamb)
  • Account for constant equivalence properties in union, tests #12562 (alamb)
  • Minor: clarify comment about empty dependencies #12786 (alamb)
  • Introduce Signature::String and return error if input of strpos is integer #12751 (jayzhan211)
  • Minor: improve docs on MovingMin/MovingMax #12790 (alamb)
  • Add union sorting equivalence end to end tests #12721 (alamb)
  • Fix bug in TopK aggregates #12766 (avantgardnerio)
  • Minor: clean up TODO comments in unnest.slt #12795 (goldmedal)
  • Refactor DependencyMap and Dependencies into structs #12761 (alamb)
  • Remove unnecessary DFSchema::check_ambiguous_name #12805 (jonahgao)
  • API from ParquetExec to ParquetExecBuilder #12799 (alamb)
  • Minor: add documentation note about NullState #12791 (alamb)
  • Chore: Move aggregate statistics optimizer test from core to optimizer crate #12783 (jayzhan211)
  • Clarify documentation on ArrowBytesMap and ArrowBytesViewMap #12789 (alamb)
  • Bump cookie and express in /datafusion/wasmtest/datafusion-wasm-app #12825 (dependabot[bot])
  • Remove unused dependencies and features #12808 (jonahgao)
  • Add Aggregation fuzzer framework #12667 (Rachelint)
  • Retry apt-get and rustup on CI #12714 (findepi)
  • Support creating tables via SQL with FixedSizeList column (e.g. a int[3]) #12810 (jandremarais)
  • Make HashJoinExec::join_schema public #12807 (progval)
  • Fix convert_to_state bug in GroupsAccumulatorAdapter #12834 (alamb)
  • Fix: approx_percentile_cont_with_weight Panic #12823 (jonathanc-n)
  • Fix clippy error on wasmtest #12844 (jonahgao)
  • Fix panic on wrong number of arguments to substr #12837 (eejbyfeldt)
  • Fix Bug in Display for ScalarValue::Struct #12856 (avantgardnerio)
  • Support DictionaryString for Regex matching operators #12768 (blaginin)
  • Minor: Small comment changes in sql folder #12838 (jonathanc-n)
  • Add DuckDB struct test and row as alias #12841 (jayzhan211)
  • Support struct coercion in type_union_resolution #12839 (jayzhan211)
  • Added check for aggregate functions in optimizer rules #12860 (jonathanc-n)
  • Optimize iszero function (3-5x faster) #12881 (simonvandel)
  • Macro for creating record batch from literal slice #12846 (timsaucer)
  • Implement special min/max accumulator for Strings and Binary (10% faster for Clickbench Q28) #12792 (alamb)
  • Make PruningPredicate's rewrite public #12850 (adriangb)
  • octet_length + string view == ❤️ #12900 (Omega359)
  • Remove Expr clones in select_to_plan #12887 (jonahgao)
  • Minor: added to docs in expr folder #12882 (jonathanc-n)
  • Print undocumented functions to console while generating docs #12874 (alamb)
  • Fix: handle NULL offset of NTH_VALUE window function #12851 (HuSen8891)
  • Optimize signum function (3-25x faster) #12890 (simonvandel)
  • re-export PartitionEvaluatorArgs from datafusion_expr::function #12878 (Michael-J-Ward)
  • Unparse Sort with pushdown limit to SQL string #12873 (goldmedal)
  • Add spilling related metrics for aggregation #12888 (2010YOUY01)
  • Move equivalence fuzz testing to fuzz test binary #12767 (alamb)
  • Remove unused math_expressions.rs #12917 (jonahgao)
  • Improve AggregationFuzzer error reporting #12832 (alamb)
  • Import Arc consistently #12899 (findepi)
  • Optimize isnan (2-5x faster) #12889 (simonvandel)
  • Minor: Move StringArrayType, StringViewArrayBuilder, etc outside of string module #12912 (Omega359)
  • Remove redundant unsafe in test #12914 (findepi)
  • Ensure that math functions fulfil the ColumnarValue contract #12922 (joroKr21)
  • Optimization: support push down limit when full join #12963 (JasonLi-cn)
  • Implement GroupColumn support for StringView / ByteView (faster grouping performance) #12809 (Rachelint)
  • Implement native support StringView for REGEXP_LIKE #12897 (tlm365)
  • Minor: Refactor benchmark imports to use util module #12885 (loloxwg)
  • Fix zero data type in expr % 1 simplification #12913 (eejbyfeldt)
  • Optimize performance of math::cot (~2x faster) #12910 (tlm365)
  • Expand wildcard expressions in distinct on #12941 (epsio-banay)
  • chores: remove redundant clone #12964 (JasonLi-cn)
  • Fix: handle NULL input in lead/lag window function #12811 (HuSen8891)
  • Fix logical vs physical schema mismatch for aliased now() #12951 (wiedld)
  • Optimize performance of math::trunc (~2.5x faster) #12909 (tlm365)
  • Minor: Add slt test for DISTINCT ON with wildcard #12968 (alamb)
  • Fix 'Too many open files' on fuzz test. #12961 (dhegberg)
  • Increase minimum supported Rust version (MSRV) to 1.79 #12962 (findepi)
  • Unparse SubqueryAlias without projections to SQL #12896 (goldmedal)
  • Fix 2 bugs related to push down partition filters #12902 (eejbyfeldt)
  • Move TableConstraint to Constraints conversion #12953 (findepi)
  • Added current_timestamp alias #12958 (jonathanc-n)
  • Improve unparsing for ORDER BY, UNION, Windows functions with Aggregation #12946 (sgrebnov)
  • Handle one-element array return value in ScalarFunctionExpr #12965 (joroKr21)
  • Add links to new_constraint_from_table_constraints doc #12995 (findepi)
  • Fix: fix HashJoin projection swap #12967 (my-vegetable-has-exploded)
  • refactor(substrait): refactor ReadRel consumer #12983 (tokoko)
  • Move SMJ join filtered part out of join_output stage. LeftOuter, LeftSemi #12764 (comphead)
  • Remove logical cross join in planning #12985 (Dandandan)
  • [MINOR]: Use arrow take_arrays, remove datafusion take_arrays #13013 (akurmustafa)
  • Don't preserve functional dependency when generating UNION logical plan #12979 (Sevenannn)
  • [Minor]: Add data based sort expression test #12992 (akurmustafa)
  • Removed last usages of scalar_inputs, scalar_input_types and inputs2 to use arrow unary/binary for performance #12972 (buraksenn)
  • Minor: Update release instructions to include new crates #13024 (alamb)
  • Extract CSE logic to datafusion_common #13002 (peter-toth)
  • Enhance table scan unparsing to avoid unnamed subqueries. #13006 (goldmedal)
  • Fix count on all null VALUES clause #13029 (findepi)
  • Support filter in cross join elimination #13025 (Dandandan)
  • [minor]: remove same util functions from the code base. #13026 (akurmustafa)
  • Improve AggregateFuzz testing: generate random queries #12847 (alamb)
  • Fix functions with Volatility::Volatile and parameters #13001 (agscpp)
  • refactor: Incorporate RewriteDisjunctivePredicate rule into SimplifyExpressions #13032 (eejbyfeldt)
  • Move filtered SMJ right join out of join_partial phase #13053 (comphead)
  • Remove functions and types deprecated since 37 #13056 (findepi)
  • Minor: Cleaned physical-plan Comments #13055 (jonathanc-n)
  • improve the condition checking for unparsing table_scan #13062 (goldmedal)
  • minor: simplify associated item bound of hash_array_primitive #13070 (jonahgao)
  • extended log.rs tests for unary/binary and f32/f64 casting #13034 (buraksenn)
  • Fix check_not_null_constraints null detection #13033 (findepi)
  • [Minor] Update info/list of TPC-DS queries #13075 (Dandandan)
  • Fix logical vs physical schema mismatch for UNION where some inputs are constants #12954 (wiedld)
  • Improve CSE stats #13080 (peter-toth)
  • Infer data type from schema for Values and add struct coercion to coalesce #12864 (jayzhan211)
  • [minor]: use arrow take_batch instead of get_record_batch_indices #13084 (akurmustafa)
  • chore: Added a number of physical planning join benchmarks #13085 (mnorfolk03)
  • Fix more instances of schema missing metadata #13068 (itsjunetime)
  • Bug-fix / Limit with_new_exprs() #13109 (berkaysynnada)
  • Minor: doc IMDB in benchmark README #13107 (2010YOUY01)
  • removed --prefer_hash_join option from parquet_filter command. #13106 (neyama)
  • Make CI error if a function has no documentation #12938 (alamb)
  • Allow using cargo nextest for running tests #13045 (alamb)
  • Add benchmark for memory-limited aggregation #13090 (2010YOUY01)
  • Add clickbench parquet based queries to sql_planner benchmark #13103 (Omega359)
  • Improve documentation and examples for SchemaAdapterFactory, make record_batch "hygienic" #13063 (alamb)
  • Move filtered SMJ Left Anti filtered join out of join_partial phase #13111 (comphead)
  • Improve TableScan with filters pushdown unparsing (multiple filters) #13131 (sgrebnov)
  • Raise a plan error on union if column count is not the same between plans #13117 (Omega359)
  • Add basic support for unnest unparsing #13129 (sgrebnov)
  • Improve TableScan with filters pushdown unparsing (joins) #13132 (sgrebnov)
  • Report offending plan node when In/Exist subquery misused #13155 (findepi)
  • Remove unused assert_analyzed_plan_ne test helper #13121 (findepi)
  • Fix Utf8View as Join Key #13115 (demetribu)
  • Add Support for modulus operation in substrait #13108 (LatrecheYasser)
  • unify cast_to function of ScalarValue #13122 (JasonLi-cn)
  • Add unused_qualifications rustc lint with deny lint level. #13086 (dhegberg)
  • [Optimization] Infer predicate under all JoinTypes #13081 (JasonLi-cn)
  • Support negate arithmetic expression in substrait #13112 (LatrecheYasser)
  • Fix to_char signature ordering #13126 (Omega359)
  • chore: re-export functions_window_common::ExpressionArgs #13149 (Michael-J-Ward)
  • minor: Fix build on main #13159 (eejbyfeldt)
  • minor: Update test case for issue #5771 showing it is resolved #13180 (eejbyfeldt)
  • Test LIKE with dynamic pattern #13141 (findepi)
  • Increase fuzz testing of streaming group by / low cardinality columns #12990 (alamb)
  • FFI initial implementation #12920 (timsaucer)
  • Report file location and offset when CSV schema mismatch #13185 (findepi)
  • Round robin polling between tied winners in sort preserving merge #13133 (jayzhan211)
  • Fix rendering of dictionary empty string values in SLT tests #13198 (findepi)
  • Improve push down filter of join #13184 (JasonLi-cn)
  • Minor: Reduce indirection for finding changelog #13199 (alamb)
  • Support DictionaryArray in OVER clause #13153 (adriangb)
  • Allow testing records with sibling whitespace in SLT tests and add more string tests #13197 (findepi)
  • Use single file write when an extension is present in the path. #13079 (dhegberg)
  • Deprecate ScalarUDF::invoke and invoke_no_args for invoke_batch #13179 (findepi)
  • consider volatile function in simply_expression #13128 (Lordworms)
  • Fix CI compile failure due to merge conflict #13219 (alamb)
  • Revert "Improve push down filter of join (#13184)" #13229 (eejbyfeldt)
  • Derive Clone for more ExecutionPlans #13203 (alamb)
  • feat(logical-types): add NativeType and LogicalType #12853 (notfilippo)
  • Apply projection to Statistics in FilterExec #13187 (alamb)
  • Minor: make LeftJoinData into a struct in CrossJoinExec #13227 (alamb)
  • Deprecate invoke and invoke_no_args in favor of invoke_batch #13174 (findepi)
  • Support timestamp(n) SQL type #13231 (findepi)
  • Remove elements deprecated since v 38. #13245 (findepi)

Credits

Thank you to everyone who contributed to this release. Here is a breakdown of commits (PRs merged) per contributor.

    68	Andrew Lamb
    34	Piotr Findeisen
    24	Jonathan Chen
    19	Emil Ejbyfeldt
    17	Jax Liu
    12	Bruce Ritchie
    11	Jonah Gao
     9	Jay Zhan
     8	Mustafa Akur
     8	kamille
     7	Sergei Grebnov
     7	Tornike Gurgenidze
     6	JasonLi
     6	Oleks V
     6	Val Lorentz
     6	jcsherin
     5	Burak Şen
     5	Samuel Colvin
     5	Yongting You
     5	dependabot[bot]
     4	HuSen
     4	Jagdish Parihar
     4	Simon Vandel Sillesen
     4	wiedld
     3	Alihan Çelikcan
     3	Andy Grove
     3	AnthonyZhOon
     3	Austin Liu
     3	Berkay Şahin
     3	Daniel Hegberg
     3	Daniël Heres
     3	Lordworms
     3	Michael J Ward
     3	OussamaSaoudi
     3	Qianqian
     3	Tai Le Manh
     3	Victor Barua
     3	doupache
     3	ngli-me
     3	yi wang
     2	Adrian Garcia Badaracco
     2	Alex Huang
     2	Brent Gardner
     2	Dharan Aditya
     2	Dmitrii Blaginin
     2	Duong Cong Toai
     2	Filippo Rossi
     2	Georgi Krastev
     2	June
     2	Max Norfolk
     2	Peter Toth
     2	Tim Saucer
     2	Yasser Latreche
     2	peasee
     2	waruto
     1	Abdullah Sabaa Allil
     1	Agaev Guseyn
     1	Albert Skalt
     1	Andrey Koshchiy
     1	Arttu
     1	Baris Palaska
     1	Bruno Volpato
     1	Bryce Mecum
     1	Daniel Mesejo
     1	Dmitry Bugakov
     1	Eason
     1	Edmondo Porcu
     1	Eduard Karacharov
     1	Frederic Branczyk
     1	Fredrik Meringdal
     1	Haile
     1	Jan
     1	JonasDev1
     1	Justus Flerlage
     1	Leslie Su
     1	Marco Neumann
     1	Marko Milenković
     1	Martin Hilton
     1	Matthew Turner
     1	Nick Cameron
     1	Paul
     1	Smith Cruise
     1	Tomoaki Kawada
     1	WeblWabl
     1	Weston Pace
     1	Xiangpeng Hao
     1	Xwg
     1	Yuance.Li
     1	epsio-banay
     1	iamthinh
     1	juroberttyb
     1	mertak-synnada
     1	neyama
     1	smarticen
     1	zhuliquan
     1	张林伟

Thank you also to everyone who contributed in other ways such as filing issues, reviewing PRs, and providing feedback on this release.