Production readiness / desired contributions #25
Comments
Thanks for starting this discussion! We've talked about the idea of establishing an explicit roadmap, but since this is open source software, it's tough to have one, since we all contribute to scratch our own itches. Having said that, I can share what I am really interested in seeing in the immediate future:
I'd be interested, @magro, in what you think would be needed to "go to production", especially if the three things I highlighted above were done. Another interesting area is what it would take to use Chorus with some of the existing ecommerce packages out there. What would the data integration look like to make Chorus and SAP Hybris work together, for example? What about autosuggest? What should an out-of-the-box autosuggest for Chorus look like? |
Thank you for getting the discussion started. I think the current status is that Chorus is more like a template application from which you can 'copy/paste' when you build your own application. I would say that the Solr - Querqy - SMUI integration probably won't need too many adjustments (if any?) in a real-life application. I'd also say that the Solr schema is a reasonable starting point, but I fully agree with @epugh's third point: we don't cover all aspects of a typical e-commerce schema yet. I also agree that deployment via Kubernetes should be on our list.

I agree that it is difficult for us to follow a roadmap, but I also think we need to reach agreement on which solutions we pursue: do we accept any solution that solves the problem, or do we only accept what we have seen work well in e-commerce search practice? If we follow the latter approach, I wonder how many of us have working experience with https://github.com/bloomberg/solr-operator - on the other hand, we could kindly ask @tboeghk and @JohannesDaniel and try to explore their hands-on experience in this field (=Solr via Kubernetes) ;-).

Re analytics: That would be a great step forward. I think the problem is not in providing software for event collection. Most search teams just plug themselves into what their overall application uses anyway. It also seems to me that tracking is moving to the server side to a considerable degree (at least in Europe). What we could provide is a common schema for the events, as guidance on what to collect and maybe to establish a common format, and then provide the tools to extract search relevance information from the data, maybe via BigQuery etc. In my opinion, our largest gap is calculating search relevance judgments from those events, for example to use them with RRE or Quaerite.

Another big area that we do not cover is indexing. What I see in practice is that many teams are struggling with indexing times. Often it's less about the indexing speed of the search engine and more about collecting data from several sources and transforming it into the target format. A common solution is to add another data store that holds the (partially) transformed data outside the search engine (DB, Kafka), so that most data can be loaded from there during indexing without having to go back to the original data source. While it cannot be in the scope of Chorus to provide a complete ETL solution, we could look into adding this additional storage as a good practice. |
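To make the "common event schema plus judgments" idea from the comment above a bit more concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the `SearchEvent` fields, the `ctr_judgments` helper, and the use of plain click-through rate as the judgment are hypothetical and are not part of Chorus, RRE, or Quaerite. Real setups would likely also need to correct for position bias and to map the scores into whatever format the evaluation tool expects.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class SearchEvent:
    """One tracked interaction. Field names are illustrative, not a Chorus schema."""
    query: str
    product_id: str
    event_type: str  # assumed values: "impression" or "click"


def ctr_judgments(events, min_impressions=2):
    """Return {(query, product_id): click-through rate} as a naive judgment.

    Pairs with fewer than `min_impressions` impressions are dropped,
    because their rate would be too noisy to use.
    """
    impressions = defaultdict(int)
    clicks = defaultdict(int)
    for event in events:
        # Normalize the query so "Laptop" and "laptop" are counted together.
        key = (event.query.strip().lower(), event.product_id)
        if event.event_type == "impression":
            impressions[key] += 1
        elif event.event_type == "click":
            clicks[key] += 1
    return {
        key: min(clicks[key] / count, 1.0)
        for key, count in impressions.items()
        if count >= min_impressions
    }


events = [
    SearchEvent("Laptop", "p1", "impression"),
    SearchEvent("laptop", "p1", "impression"),
    SearchEvent("laptop", "p1", "click"),
    SearchEvent("laptop", "p2", "impression"),
]
judgments = ctr_judgments(events)
```

In this toy data, only the pair ("laptop", "p1") has enough impressions to receive a score; "p2" is dropped by the `min_impressions` threshold. The point of the sketch is only that an agreed event format makes this kind of aggregation straightforward, whatever the tracking backend is.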
Here is the JIRA tracking the |
Hi, I like the idea of exploring the Bloomberg Solr-K8s solution. I have no experience with it so far, as we have our own deployment definitions for Solr. We have been running Solr stably on K8s for almost two years now. In general, there are only two special things to consider in the context of Solr K8s deployments
|
Many thanks for all the insights! I think that having a template for deploying these components to k8s in production would be really awesome! A path towards this could be to migrate the docker compose based setup to e.g. minikube or kind. As you wrote there are some parts not covered, like tracking/analytics, or an indexing/preprocessing stage. Maybe there could be a diagram showing a typical setup, to show what's covered and what's not. These potential additions could also be represented by tickets of course. I have the feeling that the even more interesting or valuable things might be what you mentioned around typical ecommerce data models, like variants etc - not sure. |
You are touching on the fact that the docs need a LOT of work! Not sure if that is something you'd be interested in leading? Likewise the minikube idea. Here is a link to my "framework" for relevancy, not specific to Chorus, but how I think about what a Relevance Framework would look like, and what some FOSS tools are: https://docs.google.com/presentation/d/1EspLVQa9d2qZB55rBouSGSPtqzdsQt_waaiG3aKfxzE/edit#slide=id.g93b06afd5d_0_0 |
Ha, I shouldn't have asked ;-) A local k8s setup would be more effort, maybe @JohannesDaniel could support here? I could also see what I can come up with, but I'm not sure when I'll find time for this. And we should agree on the tooling, i.e. minikube or kind (I'd prefer kind because I've had good experiences with it, but I've also never used minikube...). Thanks for the relevancy framework slides! Could this be a basis for visualizing what Chorus covers and what it doesn't? |
Based on information from querqy#25
I just submitted a PR as a starting point. Some statements are probably wrong, but I'm sure you'll point them out :-) |
Great initiative to provide an OSS stack for e-commerce search!
I haven't taken a closer look at the components and the integration yet. But I would be interested in which ones you yourselves would describe as ready for production and which you would rather not use in production. This could also be added to the README, if possible.
I would also be interested in which components you would like to get contributions / in which parts would you say things should be improved (thinking about it, maybe this is just a different way to ask the same thing as before ;-)). Maybe you could create issues for things where you'd like to receive contributions...?