-
Notifications
You must be signed in to change notification settings - Fork 140
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Updates for this project #7
Comments
Hi Marina, Thanks a lot for working on this. I was aware of many of the issues you listed and have the fix in my local but was not able to commit. Please goahead in committing your changes. I will merge the changes. Also, can you send me across the TODO list and I can create tickets/issues for it. Thanks. |
Great, I've created a Pull request - please take a look! thanks! |
Hi, Krishna, I was also thinking about further scaling of this application. The current requirement to have one instance of this consumer running as a separate JVM process per one partition, and has to have its own properties file and log file, is not scalable enough. It works great if you have 1, or a couple of partitions, but if you start scaling out to dozens and hundreds of partitions - this will not work anymore. For my use case, I would like to have at least 20 partitions - and maintaining 20 different configs/ logs/ startup scripts is not viable anymore.
It is a lot of work, but I think it would take the Kafka-ES consumer to the next scalability level :) thanks! |
sorry, I closed the issue by mistake :) - I'm re-opening it now |
Hi @mpopova-yottaa : Very glad to see your updates and improvements for this project. Yes, I agree with all of the those. I want to work hand in hand with you. As a next step, I will try to break down the above improvements into small issues and we can start working on it. Between, I am yet to merge your changes into the main branch. I reviewed it and see couple of places which needs fix and I will comment on those. I also wanted to do a quick test on the new changes. Have you been using the new changed in your fork in your consumer ? Thank You once again ! |
Great! I have tested all this using the regular JVM startup mode - not 'jsvc' one - as I figured this would be the preferred way anyway. If you think it is important to keep both modes - we could remove the Main from Manifest.MF at all and require to specify it as part of the startup command. This way you could run either the Main or the Daemon version. In addition to the list of TODO items I posted earlier, here are couple more thoughts:
So, these are just ideas , up for discussion. Thanks for the great work on this project! Marina |
Krishna, also , I think your idea to split the list of TODOs into smaller tasks is great! thanks! |
Hi @mpopova-yottaa, @ppine7 : Thanks. I started working on these items. Can we use the new branch(name is: branch2.0) to push our changes so that we can keep the master branch intact of the working code until we work on our improvements ? We can start merging our changes in milestones. I will start creating the issues for individual improvements and I will also pick up couple. Thanks. |
Sure, sounds good - lets use the new branch. I've added one minor change last night for the IndexHandler - but will modify it slightly to be a bit more generic: to use your approach of specifying the IndexHandler implementation class as a property in the .properties file. |
Sure, i will goahead and review the pull request tonight. I am able to comment quickly on the below with the initial glance.
Also, I will comment on the questions you raised about High Level Consumer. Short answer for now: Its good to have the offset management in our code for few reasons, but as you mentioned, the reintialization part is not reliable and most the time I end up restarting the broker and consumer. Apologies for the late replies, I will try to be on active on this going forward. Thanks and your effort is much appreciated. |
Thanks , Krishna, sounds great! No worry about timing - I keep pushing changes fast because I really want to use this project in production very soon (in a couple of weeks) - but have to make sure it is scalable enough to handle multiple partitions and a load of a few thousand events per second. As to the boolean flag: Yes, I agree - the flag is needed. Actually, it is still there (isStartingFirstTime) - and I use it - I just was setting it immediately to 'false' as soon as the offset is committed for the first time. I did that to fix a specific use case I ran into when it kept doing the "first time" logic multiple times if it so happened that no events were arriving from Kafka for some prolonged period of time right after the start. And since we are on the subject of discussing new tasks/improvements - here is one more list of smaller / minor improvements I suggest: — parameterize all necessary variables (like Kafka timeouts , etc.) - there are many hard-coded timeouts/sleep times in the KafkaClient thanks! |
Hi, Krishna,
I've picked up your kafka-elasticsearch-standalone-consumer project as the base for Kafka-to-ES functionality I need to implement. As part of this work, I would like to contribute the changes/updates/fixes to your original project. I have pushed my initial changes into my fork already: https://github.com/ppine7/kafka-elasticsearch-standalone-consumer,
and I envision many more changes to come, according to the TODO items I've added to the code as well.
If you are Ok with adding these changes into the main project - I can create a Pull request. If you would rather keep your project as is , I can just keep working on my Fork.
Thanks !
Marina
The text was updated successfully, but these errors were encountered: