-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(low-code cdk): add config component resolver #149
Conversation
📝 Walkthrough📝 WalkthroughWalkthroughThe pull request introduces significant updates to the Airbyte CDK's declarative components. New components Changes
Possibly related PRs
Suggested reviewers
Wydt about these suggestions? Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Outside diff range and nitpick comments (10)
airbyte_cdk/sources/declarative/resolvers/config_components_resolver.py (1)
68-91
: Consider extending support for non-string component valuesCurrently, the
__post_init__
method raises aValueError
if a component's value is neither a string nor anInterpolatedString
. Would it be beneficial to support other data types, such as numbers or booleans, to enhance flexibility in component mappings? This could make the resolver more versatile when dealing with different types of configuration values.wdyt?
airbyte_cdk/sources/declarative/models/declarative_component_schema.py (1)
1631-1645
: Review default values for optional fieldsIn the
DeclarativeStream
class, thename
field is set asOptional[str] = Field("", ...)
, which defaults to an empty string. This might blur the distinction between a missing name and an intentionally empty name. Would setting the default toNone
provide clearer semantics? Similarly, reviewing other optional fields to ensure their defaults are appropriately set could improve clarity.wdyt?
airbyte_cdk/sources/declarative/resolvers/__init__.py (2)
7-9
: Organize imports for better readabilityConsider reorganizing the imports to group related modules together and sort them alphabetically. This can enhance readability and make it easier to locate specific imports.
wdyt?
16-16
: Format__all__
for improved clarityThe
__all__
list is getting quite long. Formatting it across multiple lines or as a tuple could improve readability and maintainability, especially as more components are added in the future.wdyt?
unit_tests/sources/declarative/resolvers/test_http_components_resolver.py (1)
Line range hint
150-176
: Add a docstring to the test functionAdding a brief docstring to
test_dynamic_streams_read_with_http_components_resolver
would provide clarity on the purpose of the test and outline the scenarios being validated. This can aid future contributors in understanding and maintaining the test suite.wdyt?
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (1)
2307-2317
: The StreamConfig creation looks good, but could we enhance error handling?The implementation correctly handles the configs_pointer field, but what do you think about adding validation to ensure the pointer paths are valid? This could help catch configuration issues early. wdyt?
@staticmethod def create_stream_config( model: StreamConfigModel, config: Config, **kwargs: Any ) -> StreamConfig: model_configs_pointer: List[Union[InterpolatedString, str]] = ( [x for x in model.configs_pointer] if model.configs_pointer else [] ) + # Validate pointer paths + for pointer in model_configs_pointer: + if not isinstance(pointer, (str, InterpolatedString)): + raise ValueError(f"Invalid configs_pointer value: {pointer}") return StreamConfig( configs_pointer=model_configs_pointer, parameters=model.parameters or {}, )airbyte_cdk/sources/declarative/declarative_component_schema.yaml (4)
2923-2936
: Support for array indices in field paths looks good!The addition of integer type support in field_path enables direct array element access. The examples nicely demonstrate both string and integer usage patterns.
Quick question: Should we add a note in the description about integer indices being 0-based? This might help prevent off-by-one errors, wdyt?
2981-3006
: StreamConfig component looks well-structured!The new StreamConfig component provides a clean way to locate stream configurations within the source config. The required fields and interpolation support are well thought out.
One suggestion: The description mentions "source config file" but it's actually the runtime source config object. Should we update the description to avoid confusion? Something like "...path in source config where streams configs are located", wdyt?
3007-3026
: ConfigComponentsResolver implementation looks solid!The component provides a nice alternative to HTTP-based resolution, allowing stream templates to be populated from config values. The structure mirrors HttpComponentsResolver for consistency.
A thought: Given this is marked experimental, should we add a note about potential breaking changes in the description? Maybe something like "Breaking changes may be introduced as the API evolves", wdyt?
3041-3043
: DynamicDeclarativeStream update integrates well!The addition of ConfigComponentsResolver as an alternative to HttpComponentsResolver provides flexibility in how stream templates are resolved.
Consider adding an example in the description showing both resolver types? This could help users understand when to use each one, wdyt?
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (9)
airbyte_cdk/sources/declarative/declarative_component_schema.yaml
(3 hunks)airbyte_cdk/sources/declarative/manifest_declarative_source.py
(3 hunks)airbyte_cdk/sources/declarative/models/declarative_component_schema.py
(12 hunks)airbyte_cdk/sources/declarative/parsers/manifest_component_transformer.py
(1 hunks)airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py
(5 hunks)airbyte_cdk/sources/declarative/resolvers/__init__.py
(1 hunks)airbyte_cdk/sources/declarative/resolvers/config_components_resolver.py
(1 hunks)unit_tests/sources/declarative/resolvers/test_config_components_resolver.py
(1 hunks)unit_tests/sources/declarative/resolvers/test_http_components_resolver.py
(3 hunks)
🔇 Additional comments (7)
airbyte_cdk/sources/declarative/resolvers/config_components_resolver.py (1)
94-105
: Verify the handling of stream configuration paths
In the _stream_config
property, dpath.get
is used with the evaluated path
list to retrieve the stream configurations from self.config
. Are there scenarios where the evaluated paths might not exist in self.config
, potentially leading to unexpected results or exceptions? It might be helpful to include error handling or default values to ensure robustness when the expected configuration paths are missing.
wdyt?
unit_tests/sources/declarative/resolvers/test_config_components_resolver.py (1)
107-142
: Great job on comprehensive testing
The test_dynamic_streams_read_with_config_components_resolver
function effectively covers the main use case of the ConfigComponentsResolver
. The test is well-structured and validates the functionality as expected.
unit_tests/sources/declarative/resolvers/test_http_components_resolver.py (1)
Line range hint 182-198
: Ensure consistent use of configuration in tests
You've updated the source.discover
method to pass _CONFIG
as the configuration. Are there any potential issues if _CONFIG
changes in the future? Ensuring that tests use a consistent and controlled configuration can help prevent unexpected behaviors due to configuration changes.
wdyt?
airbyte_cdk/sources/declarative/parsers/manifest_component_transformer.py (1)
36-36
: LGTM! The mapping changes look good.
The updated mappings correctly reflect the new ConfigComponentResolver functionality and its associated components.
Also applies to: 40-42
airbyte_cdk/sources/declarative/manifest_declarative_source.py (1)
316-316
: Great addition of duplicate stream name prevention!
The implementation effectively prevents duplicate dynamic stream names by tracking them in a set. This helps avoid potential issues that could arise from having multiple streams with the same name.
Also applies to: 354-359
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (2)
200-205
: LGTM! Component registration is properly handled.
The new ConfigComponentsResolver and StreamConfig components are correctly imported and registered in the factory system.
Also applies to: 490-491
2319-2342
: LGTM! The ConfigComponentsResolver creation is well implemented.
The method properly creates both the stream_config and components_mapping, maintaining consistency with the factory pattern.
airbyte_cdk/sources/declarative/models/declarative_component_schema.py
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (2)
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (2)
1826-1826
: Consider improving method signature readability.The create_record_selector method signature is getting quite long with many parameters. What do you think about extracting some related parameters into a dataclass or using the builder pattern to improve readability? wdyt?
@dataclass class RecordSelectorConfig: name: str transformations: List[RecordTransformation] decoder: Optional[Decoder] = None client_side_incremental_sync: Optional[Dict[str, Any]] = None def create_record_selector( self, model: RecordSelectorModel, config: Config, selector_config: RecordSelectorConfig, **kwargs: Any, ) -> RecordSelector: # ... rest of the implementationAlso applies to: 2305-2305
2319-2342
: LGTM! Clear implementation of config components resolver.The create_config_components_resolver method:
- Properly creates stream_config and components_mapping
- Follows the factory method pattern consistently
- Handles type conversion correctly
One small suggestion: Would it make sense to add some validation to ensure components_mapping is not empty? wdyt?
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (1)
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py
(6 hunks)
🔇 Additional comments (3)
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (3)
131-133
: LGTM! Clean import addition.
The new import for ConfigComponentsResolver is properly grouped with related imports.
490-491
: LGTM! Proper registration of new components.
The new components are correctly registered in the PYDANTIC_MODEL_TO_CONSTRUCTOR mapping.
2306-2318
: LGTM! Well-structured stream config creation.
The create_stream_config method is clean and follows the established pattern:
- Proper type hints
- Handles optional parameters safely
- Consistent with other factory methods
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (4)
airbyte_cdk/sources/declarative/models/declarative_component_schema.py (2)
1194-1203
: Consider adding validation for configs_pointer pathsThe
StreamConfig
class looks good but could benefit from additional validation to ensure the paths exist in the source config. This would help catch configuration errors early.What if we added a validator to check that the paths exist? Something like:
@validator('configs_pointer') def validate_configs_pointer(cls, v): if not v: raise ValueError("configs_pointer cannot be empty") return vwdyt?
1169-1172
: Consider documenting array index support in field_pathThe
field_path
now supports integers for array indexing, which is a great addition. However, this change could use some documentation to help users understand this capability.What if we added a docstring example showing array indexing? Something like:
# Example showing array indexing in field_path: # ["data", 0, "name"] would access the 'name' field of the first element in the 'data' arraywdyt?
airbyte_cdk/sources/declarative/declarative_component_schema.yaml (2)
2979-3004
: Consider enhancing StreamConfig examplesThe
StreamConfig
schema looks good, but the examples could be more comprehensive to show different config scenarios.What if we added more examples to showcase different use cases? Something like:
examples: - ["data"] # Simple path - ["data", "streams"] # Nested path - ["data", "{{ parameters.name }}"] # With interpolation - ["connections", "{{ config.connection_id }}", "streams"] # Complex path with configwdyt?
2931-2934
: Consider adding a warning about array indicesThe examples for
field_path
now show array indexing, but it might be worth adding a note about potential risks.What if we added a warning in the description about array indices? Something like:
description: > A list of potentially nested fields indicating the full path where value will be added or updated. When using array indices, ensure the arrays exist and have sufficient length to avoid index errors.wdyt?
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (5)
airbyte_cdk/sources/declarative/auth/selective_authenticator.py
(1 hunks)airbyte_cdk/sources/declarative/declarative_component_schema.yaml
(3 hunks)airbyte_cdk/sources/declarative/models/declarative_component_schema.py
(5 hunks)airbyte_cdk/sources/declarative/resolvers/__init__.py
(1 hunks)airbyte_cdk/sources/declarative/resolvers/config_components_resolver.py
(1 hunks)
✅ Files skipped from review due to trivial changes (1)
- airbyte_cdk/sources/declarative/auth/selective_authenticator.py
🚧 Files skipped from review as they are similar to previous changes (2)
- airbyte_cdk/sources/declarative/resolvers/init.py
- airbyte_cdk/sources/declarative/resolvers/config_components_resolver.py
🔇 Additional comments (4)
airbyte_cdk/sources/declarative/models/declarative_component_schema.py (2)
1205-1210
: LGTM! The ConfigComponentsResolver implementation looks clean and well-structured.
The class properly encapsulates the stream configuration and components mapping, with clear required fields.
1991-1991
: LGTM! The components_resolver update provides good flexibility.
The DynamicDeclarativeStream
now supports both HTTP and config-based component resolution, which provides good extensibility.
airbyte_cdk/sources/declarative/declarative_component_schema.yaml (2)
3005-3024
: LGTM! The ConfigComponentsResolver schema is well-defined.
The schema properly defines the required fields and includes a good description of the experimental nature of the component.
3039-3041
: LGTM! The components_resolver schema update looks good.
The schema properly supports both HTTP and config-based resolvers through anyOf.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A couple of questions to make sure I have the full context but generally, it looks good to me!
airbyte_cdk/sources/declarative/declarative_component_schema.yaml
Outdated
Show resolved
Hide resolved
…airbytehq/airbyte-python-cdk into lazebnyi/add-config-components-resolver
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (2)
airbyte_cdk/sources/declarative/declarative_component_schema.yaml (2)
2931-2934
: Would it be helpful to add documentation about wildcard behavior? wdyt?The new examples show support for integer indices and wildcards in field paths, which is great! Consider adding documentation to explain:
- The difference between single
*
and double**
wildcards- Any limitations on wildcard depth or combinations
3005-3024
: Should we enhance the experimental warning? wdyt?The component looks well-structured! Consider expanding the experimental warning to be more specific about:
- What aspects might change (e.g., interface, behavior)
- Any known limitations or edge cases
- Migration path when changes occur
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
airbyte_cdk/sources/declarative/declarative_component_schema.yaml
(3 hunks)airbyte_cdk/sources/declarative/manifest_declarative_source.py
(5 hunks)airbyte_cdk/sources/declarative/models/declarative_component_schema.py
(12 hunks)unit_tests/sources/declarative/resolvers/test_config_components_resolver.py
(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- airbyte_cdk/sources/declarative/manifest_declarative_source.py
- unit_tests/sources/declarative/resolvers/test_config_components_resolver.py
🔇 Additional comments (6)
airbyte_cdk/sources/declarative/models/declarative_component_schema.py (4)
1202-1211
: LGTM! The StreamConfig class is well-structured and follows the established pattern.
The class provides a clean interface for locating stream configurations in the source config, with good examples and documentation. The separate entity approach allows for future extensions as discussed in the past review.
1213-1218
: LGTM! The ConfigComponentsResolver class is well-defined and properly marked as experimental.
The class provides a clear interface for resolving components from source config, with all necessary fields and proper documentation.
Line range hint 3039-3041
: LGTM! The components_resolver field now properly supports both resolver types.
The change enables dynamic streams to resolve components either via HTTP or from source config, aligning with the PR objectives.
1177-1180
: LGTM! The examples effectively demonstrate the new field path capabilities.
The updated examples show:
- Array access using integer indices:
["data", 1, "name"]
- Dynamic field access using interpolation:
["data", "{{ components_values.name }}"]
- Wildcard patterns:
["*", "**", "name"]
Let's verify if there are any existing usages that might be affected:
airbyte_cdk/sources/declarative/declarative_component_schema.yaml (2)
2979-3004
: LGTM! The StreamConfig component looks well-designed.
The component structure allows for future extensibility while keeping it focused on its current responsibility. The naming discussion in past reviews was resolved with good rationale.
3039-3041
: LGTM! The DynamicDeclarativeStream changes look good.
The addition of ConfigComponentsResolver as an alternative to HttpComponentsResolver is clean and maintains backward compatibility.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (2)
airbyte_cdk/sources/declarative/models/declarative_component_schema.py (2)
1194-1203
: LGTM! Consider adding path validation for configs_pointer.The new StreamConfig class looks well-designed. To make it more robust, would you consider adding validation to ensure the configs_pointer paths exist in the source config? This could help catch configuration errors early. wdyt?
def validate_configs_pointer(cls, v): if not v: raise ValueError("configs_pointer cannot be empty") return v configs_pointer: List[str] = Field( ..., description="A list of potentially nested fields indicating the full path in source config file where streams configs located.", examples=[["data"], ["data", "streams"], ["data", "{{ parameters.name }}"]], title="Configs Pointer", validators=[validate_configs_pointer] )
1205-1210
: LGTM! Consider enhancing the documentation.The ConfigComponentsResolver class looks good, but would you consider adding more detailed documentation about:
- How it interacts with StreamConfig
- Examples of typical usage patterns
- Best practices for component mapping definitions
This would help other developers understand its purpose better. wdyt?
class ConfigComponentsResolver(BaseModel): type: Literal["ConfigComponentsResolver"] + """Resolves components using stream configuration. + + This resolver allows dynamic configuration of components based on source config values. + It uses StreamConfig to locate stream configurations and ComponentMappingDefinition + to map those configurations to component fields. + + Example: + resolver = ConfigComponentsResolver( + stream_config=StreamConfig(configs_pointer=["streams"]), + components_mapping=[ + ComponentMappingDefinition( + field_path=["name"], + value="{{ components_values['stream_name'] }}" + ) + ] + ) + """ stream_config: StreamConfig components_mapping: List[ComponentMappingDefinition] parameters: Optional[Dict[str, Any]] = Field(None, alias="$parameters")
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
airbyte_cdk/sources/declarative/manifest_declarative_source.py
(5 hunks)airbyte_cdk/sources/declarative/models/declarative_component_schema.py
(5 hunks)unit_tests/sources/declarative/resolvers/test_config_components_resolver.py
(1 hunks)unit_tests/sources/declarative/resolvers/test_http_components_resolver.py
(4 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
- unit_tests/sources/declarative/resolvers/test_config_components_resolver.py
- airbyte_cdk/sources/declarative/manifest_declarative_source.py
- unit_tests/sources/declarative/resolvers/test_http_components_resolver.py
🔇 Additional comments (2)
airbyte_cdk/sources/declarative/models/declarative_component_schema.py (2)
1991-1991
: LGTM! The addition of ConfigComponentsResolver enhances flexibility.
The change allows both HTTP and config-based component resolution. Let's verify that existing HTTP-based resolvers continue to work as expected.
✅ Verification successful
The HttpComponentsResolver integration looks solid and well-maintained
The codebase shows proper integration of HttpComponentsResolver across multiple files with clear dependency chains:
- Defined as a concrete implementation of ComponentsResolver
- Properly exported in
__init__.py
- Integrated into the component transformation and factory layers
- Has established mapping definitions for retrievers and components
The addition of ConfigComponentsResolver as a Union type is a safe enhancement that maintains backward compatibility with existing HTTP-based resolvers.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check for existing uses of HttpComponentsResolver to ensure
# backward compatibility
# Search for HttpComponentsResolver usage
rg -U 'HttpComponentsResolver' -g '!*test*' -g '!*schema*'
Length of output: 1876
1169-1172
: LGTM! The addition of integer support in field paths enhances flexibility.
The change allows accessing array elements by index, which is a useful enhancement. Let's verify that the integer indices are properly handled in the implementation.
✅ Verification successful
The previous search might have failed because the regex pattern was too specific. Let's try a broader search to understand the field path implementation and integer handling.
Let's look at the implementation to verify how integer indices are handled in field paths.
The previous patterns didn't match. Let's try to find the implementation and tests with broader patterns to understand how field paths and integer indices are handled.
We found the DpathExtractor class but need to see its implementation and tests to understand how integer indices are handled. Let's look at those specifically.
LGTM! The addition of integer support in field paths looks safe and well-designed.
Looking at the test file, the DpathExtractor already handles various path expressions including array access with wildcards (*
). The addition of integer indices is a natural extension that follows the same pattern. The test cases show that the extractor correctly handles:
- Nested paths with array access
- Various data structures including nested arrays and objects
- Different types of field path expressions
The change is backward compatible since existing usages with string paths and wildcards continue to work as before. What do you think about adding a test case specifically for integer indices to make the new functionality more explicit? 🤔
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check if there are any existing usages of integer indices in field paths
# to ensure backward compatibility
# Search for field_path usage with integer indices
rg -U 'field_path.*\[.*\d+.*\]' -g '!*test*'
Length of output: 46
Script:
#!/bin/bash
# Let's check for field path related implementations to understand how integers are handled
ast-grep --pattern 'class $_ {
$$$
field_path($$$) {
$$$
}
$$$
}'
# Also search for any field path related code
rg -U "field_path.*\[" -A 5 -B 5
Length of output: 103740
Script:
#!/bin/bash
# Let's check the DpathExtractor implementation to understand how field paths are handled
ast-grep --pattern 'class DpathExtractor {
$$$
def get_records($$$) {
$$$
}
$$$
}'
# Also search for any tests that verify integer index handling
rg -U "test.*field_path.*\d+" -A 5 -B 5
Length of output: 138
Script:
#!/bin/bash
# Let's try a different pattern to find the DpathExtractor implementation
rg -U "class DpathExtractor" -A 20 -B 5
# Also look for test files related to DpathExtractor
rg -U "test.*DpathExtractor.*field_path" -A 10 -B 10
Length of output: 5355
Script:
#!/bin/bash
# Let's look at the implementation of get_records in DpathExtractor
rg -U "def get_records" -A 20 -B 5 airbyte_cdk/sources/declarative/extractors/dpath_extractor.py
# And check for any test files specifically for DpathExtractor
fd "test.*dpath.*\.py" -x cat {}
Length of output: 4549
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Thanks for addressing those concerns
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approved, changes make sense to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (2)
airbyte_cdk/sources/declarative/models/declarative_component_schema.py (2)
1220-1229
: LGTM! Consider enhancing the experimental warning with usage guidance.The StreamConfig class looks well-designed. Since it's marked experimental, would it be helpful to add a note about potential API changes and migration paths in the description? Something like:
- description: (This component is experimental. Use at your own risk.) Describes how to get streams config from the source config. + description: (This component is experimental. Use at your own risk. The API may change in future versions.) Describes how to get streams config from the source config. Consider using HttpComponentsResolver for production use cases until this component is stable.wdyt?
1231-1236
: LGTM! The integration looks clean. Consider enhancing the experimental warning.The ConfigComponentsResolver and its integration with DynamicDeclarativeStream look good. Since this is an experimental feature, would it be helpful to add more context about when to use this vs HttpComponentsResolver? Maybe something like:
- description: (This component is experimental. Use at your own risk.) Resolves and populates stream templates with components fetched from the source config. + description: (This component is experimental. Use at your own risk.) Resolves and populates stream templates with components fetched from the source config. Use this when stream configurations are defined in the source config rather than retrieved via HTTP. For HTTP-based configurations, use HttpComponentsResolver instead.wdyt?
Also applies to: 2033-2033
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
airbyte_cdk/sources/declarative/declarative_component_schema.yaml
(3 hunks)airbyte_cdk/sources/declarative/manifest_declarative_source.py
(5 hunks)airbyte_cdk/sources/declarative/models/declarative_component_schema.py
(3 hunks)airbyte_cdk/sources/declarative/parsers/manifest_component_transformer.py
(1 hunks)airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py
(6 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
- airbyte_cdk/sources/declarative/manifest_declarative_source.py
- airbyte_cdk/sources/declarative/parsers/manifest_component_transformer.py
- airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py
🔇 Additional comments (2)
airbyte_cdk/sources/declarative/models/declarative_component_schema.py (1)
1195-1198
: LGTM! The field_path examples nicely demonstrate the enhanced capabilities.
The new examples effectively show how to:
- Use integer indices for array access:
["data", 1, "name"]
- Use wildcards for flexible path matching:
["*", "**", "name"]
airbyte_cdk/sources/declarative/declarative_component_schema.yaml (1)
Line range hint 1-3500
: Skipping review of generated file.
This file is auto-generated by datamodel-codegen from the Python models, so it doesn't need direct review.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (3)
unit_tests/sources/declarative/resolvers/test_http_components_resolver.py (3)
27-27
: Consider enriching the test configThe
_CONFIG
looks good for basic testing, but should we add more fields that might be commonly used in real scenarios (like api_key that's referenced in the manifest)? wdyt? 🤔_CONFIG = { "start_date": "2024-07-01T00:00:00.000Z", + "api_key": "test_key", + "base_url": "https://api.test.com" }
114-198
: Consider reducing duplication in test manifestsI notice
_MANIFEST_WITH_DUPLICATES
is largely identical to_MANIFEST
except for the path value. Maybe we could reduce duplication by creating a helper function to generate manifests with different paths? wdyt? 🤔def create_test_manifest(path: str = "items") -> dict: """Create a test manifest with the given path.""" return { "version": "6.7.0", # ... rest of the common manifest structure "components_resolver": { "type": "HttpComponentsResolver", "retriever": { # ... common retriever config "requester": { # ... common requester config "path": path, }, }, }, } # Usage _MANIFEST = create_test_manifest() _MANIFEST_WITH_DUPLICATES = create_test_manifest("duplicates_items")
Line range hint
237-280
: Consider enhancing test assertionsThe test looks good with the improved name and proper config usage! A few suggestions to make it even more robust:
- Should we assert the actual response data structure, not just the stream names?
- Maybe add assertions for the error cases too?
wdyt about something like this? 🤔
def test_dynamic_streams_read_with_http_components_resolver(): expected_streams = [ {"name": "item_1", "id": "1"}, {"name": "item_2", "id": "2"} ] # ... existing setup code ... # Assert catalog structure assert len(actual_catalog.streams) == len(expected_streams) for stream, expected in zip(actual_catalog.streams, expected_streams): assert stream.name == expected["name"] assert stream.json_schema is not None # Verify schema exists # Assert record data assert len(records) == len(expected_streams) for record, expected in zip(records, expected_streams): assert record.stream == expected["name"] assert record.data["id"] == expected["id"]
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
unit_tests/sources/declarative/resolvers/test_http_components_resolver.py
(5 hunks)
🔇 Additional comments (1)
unit_tests/sources/declarative/resolvers/test_http_components_resolver.py (1)
24-24
: Heads up: New import suggests missing test case
I notice we're importing AirbyteTracedException
, but I don't see a test case that uses it yet. Should we add a test for the duplicate streams error case? wdyt? 🤔
Would you like me to help draft a test case for handling duplicate stream errors?
b867154
to
ba7b95e
Compare
What
This PR add config component resolver to dynamic stream
Fixed: https://github.com/airbytehq/airbyte-internal-issues/issues/10787?issue=airbytehq%7Cairbyte-internal-issues%7C10789
How
Described schema, models and data classes for
ConfigComponentsResolver
andStreamConfig
. Defined behavior for components resolving using source config. Added validation to ensure dynamic stream names are unique.Summary by CodeRabbit
Release Notes
New Features
StreamConfig
andConfigComponentsResolver
components for enhanced stream configuration management.DynamicDeclarativeStream
to support new component resolution options.Bug Fixes
Tests
ConfigComponentsResolver
to validate dynamic stream behavior.HttpComponentsResolver
with additional mock data and configuration usage.