[CH-247] support async reader when reading parquet from hdfs #248

Open
wants to merge 670 commits into
base: clickhouse_backend

Conversation

binmahone

Changelog category (leave one):

  • New Feature
  • Improvement
  • Bug Fix (user-visible misbehaviour in official stable or prestable release)
  • Performance Improvement
  • Backward Incompatible Change
  • Build/Testing/Packaging Improvement
  • Documentation (changelog entry is not required)
  • Not for changelog (changelog entry is not required)

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
...

Information about CI checks: https://clickhouse.tech/docs/en/development/continuous-integration/

kitaisreal and others added 30 commits December 14, 2021 22:30
Backport ClickHouse#32117 to 21.9: Dictionaries custom query condition fix
Backport ClickHouse#32359 to 21.9: Fix usage of non-materialized skip indexes
Backport ClickHouse#31859 to 21.9: keeper session timeout doesn't work
Backport ClickHouse#32201 to 21.9: Try fix 'Directory tmp_merge_<part_name>' already exists
Backport ClickHouse#32755 to 21.9: fix crash fuzzbits with multiply same fixedstring
taiyang-li and others added 21 commits November 18, 2022 13:54
* WIP

* two ways to partition now

* WIP: using actions DAG to compare is too slow

* WIP

* fixed code style

* remove debug codes

* fixed code style

* fixed header include

* fixed a bug in tcp q20

* support range partition in shuffle splitter

* remove unused headers

* support expression calculation in range partitioning
…lickHouse#228)

* fix bug in issue 225

* fix bug of issue 225

* finish debug
…arquet files (ClickHouse#213)

* optimization for reading parquet files: reduce open operations for the same file

* update
* support function explode

* gluten_191

* finish debug

* fix code style

* improve step desc

* remove useless code

* fix build error

* update pb

* remove empty line
…ouse#231)

* add spark check_overflow function and cast toDecimal32/64/128

* fix check overflow allow null
…lickHouse#237)

* implement function ascii

* implement function ascii

* Update src/Functions/ascii.cpp

Co-authored-by: Vladimir C <[email protected]>

* Update src/Functions/ascii.cpp

Co-authored-by: Vladimir C <[email protected]>

* merge master and fix conflict

* modify as requested

* change as requested

* register function ascii

* add mapping for ascii

* enable limits for functions using FunctionTokens

* fix comment

* reset to original solution

* rename alphaTokens to SplitByAlphaImpl

* improve doc and uts

* fix bug

* add mapping for splitByRegexp

* add function positive_modulo

* add document

* fix type deduction of positive_modulo

* add function positive_modulo

* add document

* fix type deduction of positive_modulo

* add notice

* fix typo

* fix typo

* fix bug

* fix ub error

* fix ub error

* pmod: compatibility with Spark, better documentation

* register function pmod

* add mapping for pmod

* add function factorial

* add missed file

* update as requested

* extract test of function factorial

* modify return type from Int64 to UInt64

* fix doc

* fix doc

* register function factorial

* add mapping for factorial

* add function canonicalRand

* add perf test

* revert rand.xml

* register function canonicalRand

* add mapping for canonicalRand

* Add function concatWs

* fix code style

* rename function

* rename files

* add alias concat_ws

* revert CMakeLists

* register function concat_ws

* add mapping for concat_ws

Co-authored-by: Vladimir C <[email protected]>
Co-authored-by: Alexey Milovidov <[email protected]>
…D/DATE_SUB/DATEDIFF/modulo/concat_ws/collect_list/FROM_UNIXTIME (ClickHouse#230)

* support function remainder

* support collect_list

* support function from_unixtime partly

* support datediff

* support datediff

* Add support of Date32 arguments

* Add ToExtendedRelativeDayNumImpl

* Add transforms for other arguments like year, quarter etc

* Add test 02457_datediff_via_unix_epoch

* Set UTC for 02457_datediff_via_unix_epoch

* Fix message about allowed argument types

* Fix Date32 argument in dispatchConstForSecondColumn

* Add 02458_datediff_date32 test

* Update documentation

* Use {} in error message formatting

* Add UTC for toDate32 in tests

* Add toStableRelativeHourNum

* Remove UTC from 02458_datediff_date32 and 02457_datediff_via_unix_epoch tests

* Remove toExtendedReplated; Add template argument is_extended_result

* Add toStableRelativeHourNum to gtest_DateLUTImpl.cpp

* Replace is_extended_result by ResultPrecision

* finish dev

* remove duplicated log

* finish dev

* support to_unixtimestamp and unixtimestamp function

Co-authored-by: Roman Vasin <[email protected]>
@kyligence-git
Collaborator

Can one of the admins verify this patch?

@binmahone
Author

this PR finishes #247

@binmahone
Author

test this please

@baibaichen
Collaborator

With the submodule, are we going to remove ch_parquet/arrow?

@lwz9103 lwz9103 force-pushed the clickhouse_backend branch 2 times, most recently from dc60d55 to 8066113 Compare May 26, 2023 03:43