ClickHouse? #145

Open
nelsonic opened this issue Aug 20, 2024 · 2 comments
Labels
discuss Share your constructive thoughts on how to make progress with this issue help wanted If you can help make progress with this issue, please comment! question A question needs to be answered before progress can be made on this issue research Research required; be specific

Comments

@nelsonic
Member

Opening to capture some basic knowledge ... 📝

We have recently been forced to use ClickHouse
as part of deploying Plausible Analytics ("Community Edition") dwyl/learn-analytics#4 ...
We don't have anything against it. We just wonder whether the volume of data we are likely to see for a basic website justifies the expense/overhead of running two databases (Postgres and ClickHouse ...) 💭

https://clickhouse.com

https://github.com/ClickHouse/ClickHouse

@nelsonic added the help wanted, question, discuss and research labels on Aug 20, 2024
@timadevelop

as part of deploying Plausible Analytics ("Community Edition") dwyl/learn-analytics#4 ...

Did you manage to solve the issue with backups? AFAIR, people backed up the whole machine volume, not the database per se...

ClickHouse is top-notch for analytics: it was built specifically for analytics and is easy to get up and running, but it gets complex to manage once you're in production. There are some concerns about TimescaleDB performance and CE licensing in comparison to ClickHouse, but overall, for lean companies, I think Timescale is much easier to deal with.
I know people who gave up on Plausible just because of the backup strategy; maybe it's better now?

@nelsonic
Member Author

Yeah, the Plausible backup story isn't "fixed" yet.
Which is why we are trying to figure out if we can use Postgres for the Analytics data
instead of ClickHouse - which we agree is better for higher volumes of data ...

Reading: https://clickhouse.com/docs/en/faq/general/why-clickhouse-is-so-fast#performance-when-inserting-data

"We recommend inserting data in packets of at least 1000 rows, or no more than a single request per second."

This means the Application has to temporarily store the rows in memory before inserting. 😕
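
To make that concrete, here is a minimal sketch of what "store the rows in memory before inserting" could look like on the application side. Everything in it is an assumption for illustration: `RowBuffer`, its parameters and the `sink` callable are hypothetical names, not part of Plausible or of any ClickHouse client. The `sink` stands in for whatever function actually performs the bulk insert.

```python
import time
from typing import Any, Callable, List, Optional, Sequence

Row = Sequence[Any]


class RowBuffer:
    """Hypothetical helper: collects rows in memory and passes them to `sink` in batches."""

    def __init__(
        self,
        sink: Callable[[List[Row]], None],  # assumed: performs one bulk insert
        max_rows: int = 1000,               # flush once this many rows are buffered
        max_age_seconds: float = 1.0,       # ...or once the oldest row is this old
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._sink = sink
        self._max_rows = max_rows
        self._max_age = max_age_seconds
        self._clock = clock
        self._rows: List[Row] = []
        self._oldest: Optional[float] = None  # arrival time of the first buffered row

    def add(self, row: Row) -> None:
        if not self._rows:
            self._oldest = self._clock()
        self._rows.append(row)
        too_many = len(self._rows) >= self._max_rows
        too_old = self._clock() - self._oldest >= self._max_age
        if too_many or too_old:
            self.flush()

    def flush(self) -> None:
        if not self._rows:
            return
        # Call the sink first: if it raises, the rows stay buffered for a retry.
        self._sink(list(self._rows))
        self._rows.clear()
        self._oldest = None


if __name__ == "__main__":
    buffer = RowBuffer(sink=lambda batch: print(f"inserting {len(batch)} rows"), max_rows=3)
    for i in range(7):
        buffer.add((i, "pageview"))
    buffer.flush()  # push out whatever is left, e.g. on shutdown
```

Note that this sketch only checks the age limit when a new row arrives, so a real application would also need a timer (and a flush on shutdown) to avoid holding rows indefinitely during quiet periods. Rows that are buffered but not yet flushed are lost if the process crashes, which is part of the overhead being weighed here.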

Replication uses ZooKeeper (Java): https://clickhouse.com/docs/en/architecture/replication https://en.wikipedia.org/wiki/Apache_ZooKeeper
Nothing "wrong" with that. Just noting that it's not a simple setup. 💭
