Renovate: Repurpose into "CrateDB Ecosystem Catalog" #107
(ml-tools)=
# Machine Learning with CrateDB

This documentation section lists machine learning applications and frameworks
which can be used together with CrateDB. Relevant tutorials can be found within
the [CrateDB Guide: Machine Learning Tutorials] section of the documentation.
Machine learning applications and frameworks
which can be used together with CrateDB.

::::{card} {material-outlined}`lightbulb;2em` Tutorials
:margin: 0 0 5 5
:shadow: md
:link: guide:ml
:link-type: ref

Learn how to integrate CrateDB with machine learning frameworks and tools,
for MLOps and Vector database operations.
+++
{tag}`MLOps` {tag}`Vector Store` {tag}`Embeddings`
{tag}`Hybrid Search` {tag}`LLM` {tag}`RAG`
::::


## LangChain
There was a guidance flaw on this page. It has been improved: the page now opens with a card that adds navigation gravity towards the tutorials section, which an avid reader can follow right away. That the card is a navigation element is immediately obvious, because the whole card is a link.
Otherwise, a reader looking for general information can simply continue browsing the catalog/gallery items to learn more about them.
-- https://crate-clients-tools--107.org.readthedocs.build/en/107/integrate/ml.html
Screenshot
Use dashboard and other data visualization applications and toolkits for
Dashboard and other data visualization applications and toolkits for
visualizing data stored inside CrateDB.

::::{card} {material-outlined}`lightbulb;2em` Tutorials
:margin: 0 0 5 5
:shadow: md
:link: guide:visualization
:link-type: ref

Guidelines about data analysis and visualization with CrateDB.
+++
{tag}`DataViz` {tag}`EDA` {tag}`BI`
::::


(apache-superset)=
(preset)=
Another sample of the guidance improvements, which likewise highlights and links to the corresponding tutorials.
-- https://crate-clients-tools--107.org.readthedocs.build/en/107/integrate/visualize.html
Screenshot
:::{rubric} scikit-learn
:::
_Machine Learning in Python._

- Simple and efficient tools for predictive data analysis
- Accessible to everybody, and reusable in various contexts
- Built on NumPy, SciPy, and matplotlib

:::{rubric} pandas
:::
_The open source data analysis and manipulation tool._

Pandas is a software library written for the Python programming
language for data manipulation and analysis. In particular, it offers data structures
and operations for manipulating numerical tables and time series.

:::{rubric} Project Jupyter
:::
_Interactive computing across all programming languages._

JupyterLab is the latest web-based interactive development environment for notebooks,
code, and data. Its flexible interface allows users to configure and arrange workflows
in data science, scientific computing, computational journalism, and machine learning.
A modular design invites extensions to expand and enrich functionality.
The section about scikit-learn and friends was a bit empty. Thanks for reporting, @surister.
Pandas (stylized as pandas) is a software library written for the Python programming
language for data manipulation and analysis. In particular, it offers data structures
and operations for manipulating numerical tables and time series.

:::{rubric} Data Model
:::
- Pandas is built around data structures called Series and DataFrames. Data for these
  collections can be imported from various file formats such as comma-separated values,
  JSON, Parquet, SQL database tables or queries, and Microsoft Excel.
- A Series is a 1-dimensional data structure built on top of NumPy's array.
- Pandas includes support for time series, such as the ability to interpolate values
  and filter using a range of timestamps.
- By default, a Pandas index is a series of integers ascending from 0, similar to the
  indices of Python arrays. However, indices can use any NumPy data type, including
  floating point, timestamps, or strings.
- Pandas supports hierarchical indices with multiple values per data point. An index
  with this structure, called a "MultiIndex", allows a single DataFrame to represent
  multiple dimensions, similar to a pivot table in Microsoft Excel. Each level of a
  MultiIndex can be given a unique name.
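
The data model bullets above can be illustrated with a minimal sketch. It is not part of the patch; the variable names, city names, and temperature values are made up for illustration, and it assumes only that `numpy` and `pandas` are installed.

```python
import numpy as np
import pandas as pd

# A Series is a 1-dimensional labelled structure backed by a NumPy array.
temperatures = pd.Series([20.5, np.nan, 22.5])
default_index = list(temperatures.index)  # default integer index, ascending from 0

# Indices may also hold timestamps; here: one (made-up) reading per day.
temperatures.index = pd.date_range("2024-01-01", periods=3, freq="D")

# Time series support: fill the missing value by linear interpolation ...
filled = temperatures.interpolate()

# ... and filter using a range of timestamps (both bounds are inclusive).
first_two_days = filled.loc["2024-01-01":"2024-01-02"]

# A MultiIndex lets a single DataFrame represent several dimensions.
# Each level can be given its own name.
index = pd.MultiIndex.from_product(
    [["berlin", "vienna"], [2023, 2024]],
    names=["city", "year"],
)
readings = pd.DataFrame({"avg_temp": [10.1, 10.8, 11.3, 11.9]}, index=index)

# Select all rows of one outer level, or address a single data point.
vienna = readings.loc["vienna"]
berlin_2024 = readings.loc[("berlin", 2024), "avg_temp"]
```

Selecting by the outer level (`readings.loc["vienna"]`) returns a smaller DataFrame indexed by `year` only, which is what makes the pivot-table comparison apt.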
The section about pandas was also a bit thin, so it has been expanded. /cc @surister
Server down or defunct: 500 Server Error: Internal Server Error for url.
After all the tutorials have been refactored into the CrateDB Guide, the enumeration of catalog items became a bit of a lost place. This improvement concludes the renovation on this end by effectively repurposing it into an ecosystem software catalog/gallery, similar to how others run them. The patch also adds concise navigation elements to the top of each page, in order to add gravity towards the tutorial items.
About
After all the tutorials have been refactored into the CrateDB Guide with GH-82 recently, the enumeration of catalog items here became a bit of a lost place. Thanks for reporting, @geragray.
Details
Ecosystem Catalog
The patch concludes the renovation on this end by effectively repurposing the documentation section into a (preliminary) ecosystem software catalog/gallery, in the spirit of how others run them.
Gravity to Tutorials
To remedy a guidance flaw, the patch also adds concise navigation elements to the top of each page within its "Integrations" section. They add gravity towards the corresponding tutorial items, now located within The CrateDB Guide.
Two samples of that have been outlined below. Corresponding feedback is very much welcome.
Preview
References