Generates a ranked markdown list of awesome libraries and tools.
Getting Started • Documentation • Support • Report a Bug • Contribution • Changelog
The best-of-generator is a CLI tool to generate a markdown page of ranked open-source projects based on a list of projects defined in a `yaml` file. It is integrated with different package managers - such as PyPI, NPM, Conda, and Docker Hub - to automatically collect a variety of project metadata and calculate project-quality scores. It also comes with a GitHub Action workflow for a fully automated update process.
Create your own best-of list in just 3 minutes with this guide.
- Generates a beautiful markdown page from a `yaml` list.
- Integrates various package managers (npm, pypi, conda ...).
- Calculates a project-quality score based on a variety of metrics.
- Identifies trending projects based on collected metrics.
- GitHub Action workflow for automated weekly updates.
If you want to create your own best-of list, we strongly recommend following this guide instead of setting up best-of manually. With the guide, it only takes about 3 minutes to get started. It is already set up to automatically run the best-of generator via our GitHub Action and includes other useful template files. Installing the best-of CLI tool is not required.
- Install best-of generator via pip:

  ```bash
  pip install best-of
  ```

- Create a `projects.yaml` file based on the documented structure. This file should contain at least one project. For example:

  ```yaml
  projects:
    - name: "best-of-ml-python"
      github_id: "ml-tooling/best-of-ml-python"
  ```

- Run best-of generator via command-line:

  ```bash
  best-of generate -g <GITHUB_API_TOKEN> ./projects.yaml
  ```

You can find further information on how to configure the `projects.yaml` file and additional features in the documentation section below.
This project is maintained by Benjamin Räthlein, Lukas Masuch, Jan Kalkan, and Johannes Rieke. Please understand that we won't be able to provide individual support via email. We also believe that help is much more valuable if it's shared publicly so that more people can benefit from it.
Type | Channel |
---|---|
Bug Reports | |
Feature Requests | |
Usage Questions | |
Announcements | |
Other Requests | |
YAML Structure • Projects • Categories • Labels • Configuration • Project Quality Score • Trending Projects • CLI • GitHub Action • Python API
The best-of generator is a CLI tool to generate a markdown page from a list of projects configured in a `yaml` file. The documentation sections below provide information on the `projects.yaml` structure, on its different sections (projects, labels, categories & configuration), on some of the best-of features (e.g. project-quality score & trending projects), and instructions on how to run the markdown generation via the command-line interface or via GitHub Actions.
The `projects.yaml` file has the following structure:

- `configuration` (optional): Can be used to overwrite the default configuration of the best-of list. More information in the configuration section.
- `categories` (required): All used categories should be listed here with at least a descriptive title. More information in the categories section.
- `labels` (optional): Used labels can be added here to extend the label with additional aspects (e.g. URL, image, description). More information in the labels section.
- `projects` (required): All projects that are supposed to be shown in the generated markdown page should be listed here. More information in the projects section.
The following `yaml` shows a small example:

```yaml
# Optional: change the default configuration
configuration:
  markdown_header_file: "config/header.md"
  markdown_footer_file: "config/footer.md"

# Optional: add categories
categories:
  - category: "data-engineering"
    title: "Machine Learning & Data Engineering"
    subtitle: "Best-of lists about machine learning, data engineering, data science, or other topics related to big data."

# Optional: add labels
labels:
  - label: "python"
    image: "https://www.python.org/static/favicon.ico"
    description: "Best-of list with Python projects"

# Required: list of all projects
projects:
  - name: "best-of-ml-python"
    github_id: "ml-tooling/best-of-ml-python"
    labels: ["python"]
    category: "data-engineering"
```
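The structure above can be sanity-checked before running the generator. The following is a minimal sketch, not part of best-of: `check_projects_file` is a hypothetical helper that takes the already parsed file content as a plain dictionary (for example, as returned by a YAML parser) and reports three problems suggested by this documentation: an empty `projects` section, duplicate project names, and categories that are used but not declared.

```python
from collections import Counter
from typing import Any


def check_projects_file(data: dict[str, Any]) -> list[str]:
    """Return a list of human-readable problems found in parsed projects.yaml data."""
    problems: list[str] = []

    projects = data.get("projects") or []
    if not projects:
        problems.append("'projects' must list at least one project")

    # Project names are required to be unique on the best-of list.
    name_counts = Counter(p.get("name") for p in projects)
    for name, count in name_counts.items():
        if name is None:
            problems.append(f"{count} project(s) without a 'name'")
        elif count > 1:
            problems.append(f"duplicate project name: {name!r}")

    # Every category used by a project should be declared under 'categories'.
    declared = {c.get("category") for c in data.get("categories") or []}
    for project in projects:
        category = project.get("category")
        if category is not None and category not in declared:
            problems.append(
                f"project {project.get('name')!r} uses undeclared category {category!r}"
            )
    return problems


example = {
    "categories": [
        {"category": "data-engineering", "title": "Machine Learning & Data Engineering"}
    ],
    "projects": [
        {
            "name": "best-of-ml-python",
            "github_id": "ml-tooling/best-of-ml-python",
            "category": "data-engineering",
        }
    ],
}
print(check_projects_file(example))  # prints []
```

Returning a list of messages instead of raising on the first problem makes it easy to report everything in one pass, which is convenient for larger lists.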
A project is the main component of a best-of list. In most cases, a project is hosted on GitHub and released on different package managers. Such a project should be added with the `github_id` and the IDs of all the package managers it is released to. However, it is also possible to add projects which are not hosted on GitHub or released on a package manager, as shown in the example below.
```yaml
projects:
  # Projects with different package managers:
  - name: "Tensorflow"
    github_id: "tensorflow/tensorflow"
    pypi_id: "tensorflow"
    conda_id: "conda-forge/tensorflow"
    dockerhub_id: "tensorflow/tensorflow"
  - name: "Best-of Generator"
    pypi_id: "best-of"
    github_id: "best-of-lists/best-of-generator"
  # Link to another project collection:
  - name: "Best-of Overview"
    homepage: "https://best-of.org"
    resource: True
  # Project that is not on GitHub:
  - name: "Quart"
    pypi_id: "quart"
    homepage: "https://gitlab.com/pgjones/quart"
    description: "Quart is a Python ASGI web microframework with the same API as Flask."
    license: "MIT"
    star_count: 772
    show: True
```
In the generated page, every project can also be expanded (by clicking on the project) to show additional project information.
Property | Description |
---|---|
`name` | Name of the project. This name is required to be unique on the best-of list. |
Optional Properties: | |
`github_id` | GitHub ID of the project based on user or organization and the repository name, e.g. `best-of-lists/best-of-generator`. If the project is hosted on GitLab, please use the `gitlab_id` property. |
`category` | Category that this project is most related to. You can find all available category IDs in the `projects.yaml` file. The project will be sorted into the `Others` category if no category is provided. |
`labels` | List of labels that this project is related to. You can find all available label IDs in the `projects.yaml` file. |
`license` | License of the project. If set, license information from GitHub or package managers will be overwritten. Can be a custom URL pointing to more information in case it is not a standard license. `allowed_licenses` must be set to "all" or contain the URL in order to show the project. |
`description` | Short description of the project. If set, the description from GitHub or package managers will be overwritten. |
`homepage` | Homepage URL of the project. Only use this property if the project homepage is different from the GitHub URL. |
`docs_url` | Documentation URL of the project. Only use this property if the project documentation site is different from the GitHub URL. |
`resource` | If `True`, the project will be marked as a resource. Resources are not ranked and will always be shown on top of the category. You can use this to link to another best-of list section or website that contains additional projects. |
`group` | If `True`, the project will be used as top project for grouping a set of related projects. `group_id` also needs to be set to the shared group ID. |
`group_id` | Group ID that can be used to group this project with other projects. For every group, there needs to be one project with `group` set to `True`. |
`show` | If `True`, the project will always be shown even when it would otherwise be hidden (e.g. dead project, risky license, too few stars...). Only use this property if you are sure that this project needs to be shown. |
`ignore` | If `True`, the project will be ignored. This also means that it will not be included in the hidden projects section. However, the project metadata will still be collected. |
Supported Integrations: | |
`pypi_id` | Project ID on the Python Package Index (PyPI). |
`conda_id` | Project ID on the conda package manager. If the main package is provided on a different channel, prefix the ID with the given channel, e.g. `conda-forge/tensorflow`. |
`npm_id` | Project ID on the Node package manager (npm). |
`dockerhub_id` | Project ID on the Docker Hub container registry. |
`maven_id` | Artifact ID on Maven Central, e.g. `org.apache.flink:flink-core`. |
`github_id` | GitHub ID of the project based on user or organization and the repository name, e.g. `best-of-lists/best-of-generator`. |
`gitlab_id` | GitLab ID of the project based on user or organization and the repository name, e.g. `best-of-lists/best-of-generator`. |
`gitee_id` | Gitee ID of the project based on user or organization and the repository name, e.g. `best-of-lists/best-of-generator`. You can generate an access token and set the option `--gitee-key`. |
`greasy_fork_id` | Greasy Fork ID of the project. This is the number in the script's URL, e.g. `299792458` for `https://greasyfork.org/scripts/299792458-speed-of-light`. If set, homepage and description on Greasy Fork will take precedence over those on GitHub. |
While you can theoretically overwrite all project metadata, we suggest setting only the properties that the best-of generator is not able to find on GitHub or the configured package managers. There are also other undocumented properties, but for most projects those properties should not be overwritten.
Additional undocumented project metadata:
- created_at
- update_at
- github_url
- github_release_downloads
- github_dependent_project_count
- last_commit_pushed_at
- star_count
- commit_count
- dependent_project_count
- contributor_count
- fork_count
- monthly_downloads
- open_issue_count
- closed_issue_count
- release_count
- latest_stable_release_published_at
- latest_stable_release_number
- trending
- helm_id
- brew_id
- apt_id
- yum_id
- snap_id
- maven_id
- dnf_id
- yay_id
- <PACKAGE_MANAGER>_url
- <PACKAGE_MANAGER>_latest_release_published_at
- <PACKAGE_MANAGER>_dependent_project_count
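The grouping rule from the property table (`group` and `group_id`) lends itself to a quick check. The sketch below is illustrative only: `find_invalid_groups` is a hypothetical helper, and it reads "one project with `group` set to `True`" as exactly one top project per group, which is an assumption.

```python
from collections import defaultdict
from typing import Any


def find_invalid_groups(projects: list[dict[str, Any]]) -> list[str]:
    """Return the group IDs that do not have exactly one top project (group: True)."""
    top_projects: dict[str, int] = defaultdict(int)
    for project in projects:
        group_id = project.get("group_id")
        if group_id is None:
            continue  # project does not belong to any group
        # Count only projects that are marked as the top project of the group.
        top_projects[group_id] += 1 if project.get("group") else 0
    return sorted(gid for gid, count in top_projects.items() if count != 1)


projects = [
    {"name": "Tensorflow", "group": True, "group_id": "tensorflow"},
    {"name": "Tensorflow Addons", "group_id": "tensorflow"},
    {"name": "Orphan Plugin", "group_id": "missing-top-project"},
]
print(find_invalid_groups(projects))  # prints ['missing-top-project']
```

The project names and group IDs in the example are made up for illustration.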
A category adds structure to the best-of list by grouping related projects together. Every project is grouped into exactly one category. If no category is provided with the project metadata, the project will be categorized into `Others`.
```yaml
categories:
  - category: "data-engineering"
    title: "Machine Learning & Data Engineering"
    subtitle: "Best-of lists about machine learning, data engineering, data science, or other topics related to big data."

projects:
  - name: "best-of-ml-python"
    github_id: "ml-tooling/best-of-ml-python"
    category: "data-engineering"
```
Property | Description |
---|---|
`category` | ID of the category. This ID should also be used for adding a project to this category. |
`title` | Category name used as the header of the category section. |
Optional Properties: | |
`subtitle` | Short description about the category shown under the title. |
`ignore` | If `True`, the category and all its projects will be ignored. |
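The category rules above (exactly one category per project, a fallback to `Others`, and ignored categories) can be sketched as follows. This is not the generator's implementation: `group_by_category` and the `OTHERS` key are hypothetical, and sending projects with an undeclared category to `Others` is an assumption.

```python
from typing import Any

OTHERS = "Others"  # assumption: key used in this sketch for the fallback category


def group_by_category(
    projects: list[dict[str, Any]], categories: list[dict[str, Any]]
) -> dict[str, list[str]]:
    """Map each visible category ID to the names of the projects sorted into it."""
    ignored = {c["category"] for c in categories if c.get("ignore")}
    grouped: dict[str, list[str]] = {
        c["category"]: [] for c in categories if not c.get("ignore")
    }
    grouped[OTHERS] = []

    for project in projects:
        category = project.get("category")
        if category in ignored:
            continue  # an ignored category hides all of its projects
        if category not in grouped:
            category = OTHERS  # no category given (or, by assumption, undeclared)
        grouped[category].append(project["name"])
    return grouped


categories = [
    {"category": "data-engineering", "title": "Machine Learning & Data Engineering"},
    {"category": "archive", "title": "Archive", "ignore": True},
]
projects = [
    {"name": "best-of-ml-python", "category": "data-engineering"},
    {"name": "uncategorized-project"},
    {"name": "archived-project", "category": "archive"},
]
print(group_by_category(projects, categories))
```

The `archive` category and the last two project names are invented for the example.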
A label highlights similarities or special features shared between projects. In contrast to categories, a project can have any number of labels. The labels are shown as badges attached to the project description. A label can have only an image (favicons are recommended), only a name, or both. We recommend using image labels (or only very short names) since labels shorten the visible description text of a project.
```yaml
labels:
  - label: "python"
    image: "https://www.python.org/static/favicon.ico"
    description: "Best-of list with Python projects"
  - label: "libraries"
    name: "libraries"

projects:
  - name: "best-of-ml-python"
    github_id: "ml-tooling/best-of-ml-python"
    labels: ["libraries", "python"]
    category: "data-engineering"
```
Property | Description |
---|---|
`label` | ID of the label. This ID should also be used for adding the label to a project. |
Optional Properties: | |
`image` | URL to an image. If a valid URL is provided, the image will be shown wherever the label is used. |
`name` | Name of the label. If a name is provided, the name will be shown wherever the label is used. |
`description` | Short description of the label. If the `show_labels_in_legend` configuration is `True` and an image is set, this description will also be shown in the legend (explanations). |
`ignore` | If `True`, the label will not be shown anywhere. |
`url` | If `url` is set, the label will be rendered as a link wherever it is used. |
Many aspects of the best-of list can be configured. Since most default values are selected to support the widest range of different lists, changing the default configuration is not required for most cases.
```yaml
configuration:
  min_stars: 0
  min_projectrank: 0
  allowed_licenses: ["all"]
  markdown_header_file: "config/header.md"
  markdown_footer_file: "config/footer.md"
```
The configuration example above changes the default configuration to show all projects regardless of star count (via `min_stars`), projectrank (via `min_projectrank`), or license (via `allowed_licenses`). It also configures a header (via `markdown_header_file`) and a footer (via `markdown_footer_file`) markdown file that will be attached to the generated content.
Config | Description | Default |
---|---|---|
`output_file` | The markdown output file. | `./README.md` |
`markdown_header_file` | Path to a markdown file that will be attached above the generated content. | |
`markdown_footer_file` | Path to a markdown file that will be attached below the generated content. | |
`output_generator` | Select the markdown generator to use for generating the output markdown page. Currently, only `markdown-list` is supported. | `markdown-list` |
`project_inactive_months` | Number of months without activity until a project is marked as inactive. | `6` |
`project_dead_months` | Number of months without activity until a project is marked as dead. | `12` |
`project_new_months` | Number of months since creation to mark a project as newcomer. | `6` |
`min_projectrank` | Project will be hidden if it has a smaller projectrank (quality score). | `10` |
`min_stars` | Project will be hidden if it has fewer stars on GitHub. | `100` |
`require_license` | If `True`, all projects without a detected license will be hidden. | `True` |
`require_repo` | If `True`, all projects without a source repository (configured via `github_id`, `gitlab_id`, or `gitee_id`) will be hidden. | `False` |
`min_description_length` | The minimum length of the project description. If the description is shorter, the project will not be shown. | `10` |
`max_description_length` | The maximum length of the project description. | `55` |
`ascii_description` | If `True`, all non-ASCII characters in the project description will be removed. Useful for filtering out distracting emoji, but harmful for non-English descriptions. (Note: GitHub emoji commands (e.g. `:smile:`) are always removed.) | `True` |
`projects_history_folder` | The folder used for storing history files (csv files with project metadata). If `null`, no history files will be created. | `./history` |
`generate_install_hints` | If `False`, the install hint code block for the package managers will not be shown. | `True` |
`generate_toc` | If `True`, generate a table of contents with all categories. | `True` |
`category_heading` | How category headings are generated. If `simple`, headings will be `## Category`, and IDs are set by GitHub. If `robust`, headings will be `<h2 id='category-id'>Category</h2>`. (TOC relies on these IDs.) If all of your categories' names are ASCII, use `simple`. | `simple` |
`generate_legend` | If `True`, generate a legend containing explanations for the used emojis. | `True` |
`sort_by` | The project property used to sort the projects within a category. | `projectrank` |
`max_trending_projects` | The number of trending projects to show for trending up as well as down. | `5` |
`hide_empty_categories` | If `True`, empty categories will not be shown. | `False` |
`hide_project_license` | If `True`, the project license badge will not be shown. | `False` |
`hide_license_risk` | If `True`, the risk indicator for uncommon or risky licenses will not be shown. | `False` |
`show_labels_in_legend` | If `True`, image labels will be listed in the legend (explanation) if they also have a description. | `True` |
`allowed_licenses` | List of allowed licenses (spdx format). A project with a different license will be hidden. Use `["all"]` to allow all licenses. | selection of common open-source licenses |
`extension_script` | Path to a python script which is loaded before project collection or markdown generation to allow extensibility. | |
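To illustrate how a few of these options interact with the project properties `show` and `ignore`, here is a minimal sketch. It is a simplified reading of the tables above, not the generator's actual logic: `is_shown` is a hypothetical helper, only three options are covered, and the order of the checks is an assumption. The default values are taken from the table.

```python
from typing import Any, Optional

# Defaults copied from the configuration table above (subset only).
DEFAULT_CONFIG = {"min_stars": 100, "min_projectrank": 10, "require_license": True}


def is_shown(project: dict[str, Any], user_config: Optional[dict[str, Any]] = None) -> bool:
    """Decide whether a project would be listed under the (partial) configuration."""
    # User-provided values overwrite the defaults.
    config = {**DEFAULT_CONFIG, **(user_config or {})}

    if project.get("ignore"):
        return False  # ignored projects are never listed
    if project.get("show"):
        return True  # 'show' forces visibility even if a filter would hide the project
    if project.get("star_count", 0) < config["min_stars"]:
        return False
    if project.get("projectrank", 0) < config["min_projectrank"]:
        return False
    if config["require_license"] and not project.get("license"):
        return False
    return True


quart = {"name": "Quart", "license": "MIT", "star_count": 772, "projectrank": 5}
print(is_shown(quart))                          # prints False (projectrank below 10)
print(is_shown(quart, {"min_projectrank": 0}))  # prints True
```

The `projectrank` value of 5 is made up for the example. Merging dictionaries with `{**defaults, **overrides}` keeps every default that the user did not set.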
All projects in a best-of list are ranked and sorted by a project-quality score (also called `projectrank`). The score is calculated from various metrics automatically collected from GitHub and different package managers. It is simply a sum of points that a project collects for various aspects and metrics, and it is only meaningful when compared to the project-quality score of other projects. We currently use the following aspects to calculate the score:

- Has homepage link & description: `+ 1`
- Has an existing GitHub repository: `+ 1`
- Has a license: `+ 1`
- Has a commonly used license (e.g. MIT): `+ 1`
- Has multiple releases: `+ 1`
- Has stable releases based on semantic version: `+ 1`
- Has a release that is less than 6 months old: `+ 1`
- Repo was updated in the last 3 months: `+ 1`
- Is older than 6 months: `+ 1`
- Metrics from GitHub & package managers:
  - Number of stars: `+ log(COUNT / 2)`
  - Number of contributors: `+ log(COUNT / 2) - 1`
  - Number of commits: `+ log(COUNT / 2) - 1`
  - Number of forks: `+ log(COUNT / 2)`
  - Number of monthly downloads: `+ log(COUNT / 2) - 1`
  - Number of dependent projects: `+ log(COUNT / 1.5)`
  - Number of watchers: `+ log(COUNT / 2) - 1`
  - Number of closed issues: `+ log(COUNT / 2) - 1`
  - Greasy Fork fan score: `+ log(COUNT / 2) - 1`

This calculation is just chosen by experience. There is no scientific proof that this really reflects the quality of a project.
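The scoring rules listed above can be written down as a short function. This is a sketch of the documented formula only, not the generator's code, and it rests on several assumptions: `log` is taken to be the natural logarithm, missing or zero metrics contribute nothing, and the keys `watcher_count` and `greasy_fork_fan_score` are hypothetical names (the other keys follow the project metadata listed earlier).

```python
import math

# (divisor, offset) for each metric, following the documented list.
# watcher_count and greasy_fork_fan_score are hypothetical key names.
METRIC_RULES = {
    "star_count": (2.0, 0.0),
    "contributor_count": (2.0, -1.0),
    "commit_count": (2.0, -1.0),
    "fork_count": (2.0, 0.0),
    "monthly_downloads": (2.0, -1.0),
    "dependent_project_count": (1.5, 0.0),
    "watcher_count": (2.0, -1.0),
    "closed_issue_count": (2.0, -1.0),
    "greasy_fork_fan_score": (2.0, -1.0),
}


def quality_score(aspects_met: int, metrics: dict[str, float]) -> float:
    """Sum one point per fulfilled yes/no aspect plus the log-scaled metric points."""
    score = float(aspects_met)
    for key, (divisor, offset) in METRIC_RULES.items():
        count = metrics.get(key)
        if not count or count <= 0:
            continue  # assumption: missing or zero metrics contribute nothing
        # Assumption: "log" is the natural logarithm.
        score += math.log(count / divisor) + offset
    return score


# A project that fulfils 6 of the yes/no aspects, with a few example metrics.
print(round(quality_score(6, {"star_count": 772, "fork_count": 40}), 2))
```

Taken literally, the formula yields negative points for very small counts (for example, two contributors give `log(1) - 1 = -1`). Whether the generator clamps such values is not stated here, so the sketch leaves them as they are.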
The best-of generator automatically identifies trending projects by comparing the project-quality scores of the current generation with those in the latest history file. If the history is activated (`projects_history_folder` is not set to `null`), the best-of generation automatically creates a `<YYYY-MM-dd>_changes.md` file in the configured history folder for every update and a `latest-changes.md` file in the folder of the generated markdown page. These files contain a list of projects that are trending up (higher quality score since the last update) and down (lower quality score since the last update), as well as a list of all projects added since the last update.
The GitHub Action workflow uses these markdown files to automatically create releases for every update. This preserves a useful changelog across many updates and lets readers receive email updates whenever the list is updated (by watching for release events).
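The comparison described in this section can be sketched in a few lines. The helpers below are hypothetical and only mirror the description: scores are passed in as plain name-to-score dictionaries rather than read from history csv files, and `limit` corresponds to the `max_trending_projects` option with its default of 5.

```python
from datetime import date


def changes_filename(day: date) -> str:
    """Name of the per-update changes file, following <YYYY-MM-dd>_changes.md."""
    return f"{day.isoformat()}_changes.md"


def find_trending(
    previous: dict[str, float], current: dict[str, float], limit: int = 5
) -> tuple[list[str], list[str], list[str]]:
    """Return (trending up, trending down, newly added) project names."""
    # Score difference for every project that exists in both snapshots.
    changes = {name: current[name] - previous[name] for name in current if name in previous}

    up = sorted((n for n, d in changes.items() if d > 0), key=lambda n: changes[n], reverse=True)
    down = sorted((n for n, d in changes.items() if d < 0), key=lambda n: changes[n])
    added = sorted(n for n in current if n not in previous)
    return up[:limit], down[:limit], added


previous = {"project-a": 10, "project-b": 20, "project-c": 5}
current = {"project-a": 12, "project-b": 15, "project-c": 5, "project-d": 8}
print(find_trending(previous, current))
# prints (['project-a'], ['project-b'], ['project-d'])
print(changes_filename(date(2024, 1, 5)))  # prints 2024-01-05_changes.md
```

Projects whose score did not change appear in neither list, and new projects are reported separately because they have no earlier score to compare against.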
To use the CLI, you need to have the best-of generator installed via pip:

```bash
pip install best-of
```

```bash
best-of generate [OPTIONS] PATH
```

Generates a best-of markdown page from a `yaml` file.

Arguments:

- `PATH`: Path to the `yaml` file containing the best-of metadata (e.g. `./projects.yaml`).

Options:

- `-g`, `--github-key TEXT`: GitHub API Token (from https://github.com/settings/tokens).
- `-l`, `--libraries-key TEXT`: Libraries.io API Key (from https://libraries.io/api).
- `--gitee-key TEXT`: Gitee API Key (from https://gitee.com/profile/personal_access_tokens).
- `--help`: Show this message and exit.
If you want to create your own best-of list, we strongly recommend following this guide. With the guide, it only takes about 3 minutes to get started. It already includes this GitHub Action and some other useful template files. Further manual steps for setting up the GitHub Action are not required.
The best-of-update-action makes it very easy to set up automated scheduled updates for your best-of markdown page. Please refer to the best-of-update-action documentation for more detailed information about the GitHub Action and the workflow.
Usage of the Python API is not well documented yet and currently not recommended.
The best-of generator can also be used and integrated via its Python API. The full Python API documentation can be found here.
The generated README file is not displayed completely:

GitHub only renders the first 512 KB of the main `README.md` file and cuts off the rendered version as soon as it has processed the first 512 KB of the raw markdown content. The rendering is only cut off when viewing the readme on the main repo page. If you directly select the `README.md` file, it will render in its entirety. To mitigate this issue, we optimized the markdown generation to require the minimum amount of characters. However, if you have a very large list of projects (more than 800), you might reach the 512 KB limit (check the file size of the generated `README.md` file). In this case, we suggest extracting some of the categories or projects into smaller best-of lists.
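Checking the file size, as suggested above, can be automated after each generation. This is a small sketch with hypothetical helper names. It interprets the limit as 512 * 1024 bytes, which is an assumption about how the limit is counted.

```python
from pathlib import Path

# Assumption: the 512 KB render limit is counted as 512 * 1024 bytes.
RENDER_LIMIT_BYTES = 512 * 1024


def exceeds_render_limit(size_bytes: int, limit: int = RENDER_LIMIT_BYTES) -> bool:
    """Return True if a README of this size would be cut off on the repo page."""
    return size_bytes > limit


def readme_exceeds_limit(path: str = "./README.md") -> bool:
    """Check the size of a generated README file on disk."""
    return exceeds_render_limit(Path(path).stat().st_size)


print(exceeds_render_limit(300_000))  # prints False
print(exceeds_render_limit(600_000))  # prints True
```

Running `readme_exceeds_limit()` as a last step of an update job gives an early warning before the list grows past what the main repository page can display.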
- Pull requests are encouraged and always welcome. Read our contribution guidelines and check out help-wanted issues.
- Submit GitHub issues for feature requests, enhancements, bugs, or documentation problems.
- By participating in this project, you agree to abide by its Code of Conduct.
- The development section below contains information on how to build and test the project after you have implemented some changes.
Requirements: Docker and Act must be installed on your machine to execute the containerized build process.
To simplify the process of building this project from scratch, we provide build-scripts - based on universal-build - that run all necessary steps (build, check, test, and release) within a containerized environment. To build and test your changes, execute the following command in the project root folder:
```bash
act -b -j build
```
Refer to our contribution guides for more detailed information on our build scripts and development process.
Licensed MIT. Created and maintained with ❤️ by developers from Berlin.