chore(deps): bump the minor group with 2 updates #784

Merged
FlorianSchepersAA merged 2 commits into main from dependabot/pip/minor-b48f67565f on Apr 25, 2024

dependabot[bot] commented on behalf of github on Apr 24, 2024

Bumps the minor group with 2 updates: semantic-text-splitter and mypy.

Updates semantic-text-splitter from 0.11.0 to 0.12.0

Release notes

Sourced from semantic-text-splitter's releases.

v0.12.0 - Centralized Chunk Configuration

What's New

This release is a big API change to pull all chunk configuration options into the same place, at initialization of the splitters. This was motivated by two things:

  1. These settings are all important to deciding how to split the text for a given use case, and in practice I saw them often being set together anyway.
  2. To prep the library for new features like chunk overlap, where error handling has to be introduced to make sure that invariants are kept between all of the settings. These errors should be handled as soon as possible, before chunking the text.

Overall, I think this has aligned the library with the usage I have seen in the wild, and pulls all of the settings for the "domain" of chunking into a single unit.
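
The "validate once, at initialization" idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: SketchChunkConfig, SketchSplitter, and the overlap field are hypothetical names invented for this example and are not the library's API. The toy splitter cuts fixed-size character windows purely to show where the checks live.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SketchChunkConfig:
    """Hypothetical stand-in for a chunk configuration (not the library's type)."""

    capacity: int
    overlap: int = 0  # hypothetical setting, used only to show an invariant
    trim: bool = True

    def __post_init__(self) -> None:
        # Invariants are checked once, at construction, before any text is chunked.
        if self.capacity <= 0:
            raise ValueError("capacity must be positive")
        if not 0 <= self.overlap < self.capacity:
            raise ValueError("overlap must be non-negative and smaller than capacity")


class SketchSplitter:
    """Toy splitter that cuts text into fixed-size character windows."""

    def __init__(self, config: SketchChunkConfig) -> None:
        self.config = config

    def chunks(self, text: str) -> List[str]:
        # Each window starts `capacity - overlap` characters after the previous one.
        step = self.config.capacity - self.config.overlap
        pieces = [text[i : i + self.config.capacity] for i in range(0, len(text), step)]
        if self.config.trim:
            pieces = [piece.strip() for piece in pieces]
        # Drop windows that became empty after trimming.
        return [piece for piece in pieces if piece]
```

Because every setting lives in one object, an invalid combination fails when the config is built, so chunks() never has to re-check its inputs.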

Breaking Changes

Rust

  • Trimming is now enabled by default. This brings the Rust crate into alignment with the Python package. In every use case I saw, this was already being set to true, and it makes logical sense as the default behavior.
  • TextSplitter and MarkdownSplitter now take a ChunkConfig in their ::new method.
    • This brings the ChunkSizer, ChunkCapacity, and trim settings into a single struct that can be instantiated with a builder-lite pattern.
    • The with_trim_chunks method has been removed from TextSplitter and MarkdownSplitter. You can now set trim in the ChunkConfig struct.
  • ChunkCapacity is now a struct instead of a trait. If you were using a custom ChunkCapacity, you can change your impl to a From<TYPE> for ChunkCapacity instead, and you should still be able to pass it in to all of the same methods.
    • This also means ChunkSizers take a concrete type in their method instead of an impl.
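
As a rough Python analogy for the trait-to-struct change above: callers may still hand in several input shapes, but they are converted once into a single concrete type, and everything downstream sees only that type. SketchCapacity, from_value, and fits are hypothetical names made up for this sketch. They do not mirror the crate's actual fields or methods.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class SketchCapacity:
    """Hypothetical concrete capacity: a desired size and a hard maximum."""

    desired: int
    maximum: int

    @classmethod
    def from_value(
        cls, value: Union[int, Tuple[int, int], "SketchCapacity"]
    ) -> "SketchCapacity":
        # Plays the role of a conversion impl: many input shapes, one output type.
        if isinstance(value, cls):
            return value
        if isinstance(value, int):
            return cls(desired=value, maximum=value)
        desired, maximum = value
        if desired > maximum:
            raise ValueError("desired size must not exceed the maximum")
        return cls(desired=desired, maximum=maximum)


def fits(chunk: str, capacity: SketchCapacity) -> bool:
    """Sizer-like helper that only ever receives the concrete type."""
    return len(chunk) <= capacity.maximum
```

The design point is that conversion happens at the boundary, so helpers such as fits do not need to be generic over every possible capacity representation.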

Migration Examples

Default settings:

/// Before
let splitter = TextSplitter::default().with_trim_chunks(true);
let chunks = splitter.chunks("your document text", 500);
/// After
let splitter = TextSplitter::new(500);
let chunks = splitter.chunks("your document text");

Hugging Face Tokenizers:

/// Before
let tokenizer = Tokenizer::from_pretrained("bert-base-cased", None).unwrap();
let splitter = TextSplitter::new(tokenizer).with_trim_chunks(true);
let chunks = splitter.chunks("your document text", 500);
/// After
let tokenizer = Tokenizer::from_pretrained("bert-base-cased", None).unwrap();
let splitter = TextSplitter::new(ChunkConfig::new(500).with_sizer(tokenizer));
let chunks = splitter.chunks("your document text");

Tiktoken:

... (truncated)

Commits
  • b03b1be Update changelog for 0.12
  • da6f4ee Bump the minor group with 3 updates
  • 633ffb5 Bump the minor group across 1 directory with 2 updates
  • 7ed0e60 fix: python readme examples
  • 125adeb feat!: Add error-handling for chunk capacities that are invalid
  • 86e7287 feat!: Change ChunkCapacity from a Trait to a Struct
  • 42bfcbb feat!: Update Splitters to require a ChunkConfig, and trim by default
  • 7419d89 Bump rustls from 0.22.3 to 0.22.4 in the cargo group
  • See full diff in compare view

Updates mypy from 1.9.0 to 1.10.0

Changelog

Sourced from mypy's changelog.

Mypy Release Notes

Next release

Mypy 1.10

We’ve just uploaded mypy 1.10 to the Python Package Index (PyPI). Mypy is a static type checker for Python. This release includes new features, performance improvements and bug fixes. You can install it as follows:

python3 -m pip install -U mypy

You can read the full documentation for this release on Read the Docs.

Support TypeIs (PEP 742)

Mypy now supports TypeIs (PEP 742), which allows functions to narrow the type of a value, similar to isinstance(). Unlike TypeGuard, TypeIs can narrow in both the if and else branches of an if statement:

from typing_extensions import TypeIs

def is_str(s: object) -> TypeIs[str]:
    return isinstance(s, str)

def f(o: str | int) -> None:
    if is_str(o):
        # Type of o is 'str'
        ...
    else:
        # Type of o is 'int'
        ...

TypeIs will be added to the typing module in Python 3.13, but it can be used on earlier Python versions by importing it from typing_extensions.
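
A small runnable variation on the example above, showing why narrowing in the else branch matters in practice. The names is_int and describe are invented for this sketch. TypeIs is imported only under TYPE_CHECKING and the annotation is quoted, which is an assumption made here so the snippet runs even where typing_extensions is not installed. A type checker still needs typing_extensions (or Python 3.13) to resolve the name.

```python
from typing import TYPE_CHECKING, Union

if TYPE_CHECKING:
    # Only the type checker needs this import; the quoted annotation below
    # is never evaluated at runtime.
    from typing_extensions import TypeIs


def is_int(value: object) -> "TypeIs[int]":
    """Return True when value is an int; type checkers narrow on the result."""
    return isinstance(value, int)


def describe(value: Union[int, str]) -> str:
    if is_int(value):
        # Narrowed to int here, so integer arithmetic type-checks.
        return f"int:{value + 1}"
    # With TypeIs, value is narrowed to str in this branch, so .upper() type-checks.
    # With TypeGuard, this branch would keep the declared type Union[int, str].
    return f"str:{value.upper()}"
```

At runtime is_int is an ordinary function returning a bool. The narrowing exists only for the type checker.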

This feature was contributed by Jelle Zijlstra (PR 16898).

Support TypeVar Defaults (PEP 696)

PEP 696 adds support for type parameter defaults. Example:

from typing import Generic
from typing_extensions import TypeVar

... (truncated)

Commits

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:

  • @dependabot rebase will rebase this PR
  • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
  • @dependabot merge will merge this PR after your CI passes on it
  • @dependabot squash and merge will squash and merge this PR after your CI passes on it
  • @dependabot cancel merge will cancel a previously requested merge and block automerging
  • @dependabot reopen will reopen this PR if it is closed
  • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
  • @dependabot show <dependency name> ignore conditions will show all of the ignore conditions of the specified dependency
  • @dependabot ignore <dependency name> major version will close this group update PR and stop Dependabot creating any more for the specific dependency's major version (unless you unignore this specific dependency's major version or upgrade to it yourself)
  • @dependabot ignore <dependency name> minor version will close this group update PR and stop Dependabot creating any more for the specific dependency's minor version (unless you unignore this specific dependency's minor version or upgrade to it yourself)
  • @dependabot ignore <dependency name> will close this group update PR and stop Dependabot creating any more for the specific dependency (unless you unignore this specific dependency or upgrade to it yourself)
  • @dependabot unignore <dependency name> will remove all of the ignore conditions of the specified dependency
  • @dependabot unignore <dependency name> <ignore condition> will remove the ignore condition of the specified dependency and ignore conditions

Bumps the minor group with 2 updates: [semantic-text-splitter](https://github.com/benbrandt/text-splitter) and [mypy](https://github.com/python/mypy).


Updates `semantic-text-splitter` from 0.11.0 to 0.12.0
- [Release notes](https://github.com/benbrandt/text-splitter/releases)
- [Changelog](https://github.com/benbrandt/text-splitter/blob/main/CHANGELOG.md)
- [Commits](benbrandt/text-splitter@v0.11.0...v0.12.0)

Updates `mypy` from 1.9.0 to 1.10.0
- [Changelog](https://github.com/python/mypy/blob/master/CHANGELOG.md)
- [Commits](python/mypy@1.9.0...v1.10.0)

---
updated-dependencies:
- dependency-name: semantic-text-splitter
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: minor
- dependency-name: mypy
  dependency-type: direct:development
  update-type: version-update:semver-minor
  dependency-group: minor
...

Signed-off-by: dependabot[bot] <[email protected]>
dependabot[bot] added the dependencies (Pull requests that update a dependency file) and python (Pull requests that update Python code) labels on Apr 24, 2024
FlorianSchepersAA force-pushed the dependabot/pip/minor-b48f67565f branch from 5de3331 to 7398947 on April 25, 2024 09:13
FlorianSchepersAA merged commit 05b015c into main on Apr 25, 2024
4 checks passed
FlorianSchepersAA deleted the dependabot/pip/minor-b48f67565f branch on April 25, 2024 09:25