Allow a generator to be provided instead of a List #1030

Closed
skinkie wants to merge 15 commits into main from feature_convert_generator


@skinkie skinkie commented May 7, 2024

📒 Description

When writing a very large tree, you don't want to materialise the whole tree in a list before it ends up in a file. Ideally, each subtree should only be rendered just in time.

🔗 What I've Done

I have allowed a Generator to be handled as a List.

💬 Comments

There might be more places this needs to be changed. Since List[] is used everywhere, type hinting fails.
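
To illustrate the idea, here is a minimal standard-library sketch. The `Frame` class, its `members` field, and the `numbers` helper are hypothetical stand-ins, not xsdata-generated code. A field annotated as `Iterable[...]` accepts a generator without complaint from a type checker, and nothing is produced until the field is actually iterated:

```python
from dataclasses import dataclass, field
from typing import Iterable, Iterator, List


@dataclass
class Frame:
    # Hypothetical model field: Iterable covers lists, tuples and generators.
    members: Iterable[int] = field(default_factory=list)


def numbers(n: int, produced: List[int]) -> Iterator[int]:
    # Records each value at the moment it is generated.
    for i in range(n):
        produced.append(i)
        yield i


produced: List[int] = []
lazy = Frame(members=numbers(3, produced))  # nothing generated yet
eager = Frame(members=[0, 1, 2])            # fully materialised up front
```

One consequence worth keeping in mind: a generator can be consumed only once, so any code that iterates the field twice (for example, once to validate and once to serialize) would see an empty sequence the second time.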

🛫 Checklist

import sqlite3

from xsdata.formats.dataclass.context import XmlContext
from xsdata.formats.dataclass.parsers import XmlParser
from xsdata.formats.dataclass.parsers.config import ParserConfig
from xsdata.formats.dataclass.parsers.handlers import LxmlEventHandler
from xsdata.formats.dataclass.serializers import XmlSerializer
from xsdata.formats.dataclass.serializers.config import SerializerConfig
from xsdata.models.datatype import XmlDateTime

from netex import PublicationDelivery, ParticipantRef, MultilingualString, DataObjectsRelStructure, GeneralFrame, \
    GeneralFrameMembersRelStructure, ServiceJourney

serializer_config = SerializerConfig(ignore_default_attributes=True, xml_declaration=True)
serializer_config.pretty_print = True
serializer = XmlSerializer(config=serializer_config)

context = XmlContext()
config = ParserConfig(fail_on_unknown_properties=False)
parser = XmlParser(context=context, config=config, handler=LxmlEventHandler)

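def load_generator(con, clazz, limit=None):
    # The table name is the model's Meta.name, falling back to the class name.
    table = getattr(clazz.Meta, 'name', clazz.__name__)

    cur = con.cursor()
    # Table names cannot be bound as parameters, but the limit can.
    query = f"SELECT object FROM {table}"
    if limit is None:
        cur.execute(query)
    else:
        cur.execute(query + " LIMIT ?", (limit,))

    # Iterating the cursor fetches one row at a time, so each object is
    # parsed only when the consumer asks for it.
    for (xml,) in cur:
        yield parser.from_bytes(xml, clazz)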

with sqlite3.connect("/tmp/netex.sqlite") as con:
    publication_delivery = PublicationDelivery(
                publication_timestamp=XmlDateTime.now(),
                participant_ref=ParticipantRef(value="NDOV"),
                description=MultilingualString(value="Huge XML Serializer test"),
                data_objects=DataObjectsRelStructure(choice=[GeneralFrame(members=GeneralFrameMembersRelStructure(choice=load_generator(con, ServiceJourney, 10)))]),
                version="ntx:1.1",
            )

ns_map = {'': 'http://www.netex.org.uk/netex', 'gml': 'http://www.opengis.net/gml/3.2'}
with open('netex-output/huge.xml', 'w') as out:
    serializer.write(out, publication_delivery, ns_map)
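
What "just in time" means for the writer can be sketched with the standard library alone. This is not how xsdata serializes; `journeys` and `write_members` are hypothetical names, and `XMLGenerator` from `xml.sax.saxutils` stands in for the real serializer. The log shows that producing and writing interleave, so no list of members is ever built:

```python
import io
from typing import Iterable, Iterator, List, TextIO
from xml.sax.saxutils import XMLGenerator


def journeys(n: int, log: List[str]) -> Iterator[str]:
    # Stand-in for load_generator: records when each item is produced.
    for i in range(n):
        log.append(f"produced {i}")
        yield f"journey-{i}"


def write_members(out: TextIO, members: Iterable[str], log: List[str]) -> None:
    # Emit one element per item as it arrives; nothing is collected first.
    gen = XMLGenerator(out, encoding="utf-8")
    gen.startDocument()
    gen.startElement("members", {})
    for value in members:
        gen.startElement("ServiceJourney", {"id": value})
        gen.endElement("ServiceJourney")
        log.append(f"wrote {value}")
    gen.endElement("members")
    gen.endDocument()


log: List[str] = []
buffer = io.StringIO()
write_members(buffer, journeys(2, log), log)
```

After the call, `log` alternates between "produced" and "wrote" entries, which is the behaviour the pull request is after for large subtrees.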


codecov bot commented May 7, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 100.00%. Comparing base (cce0a16) to head (d4b3e5b).

Current head d4b3e5b differs from pull request most recent head 67d847f

Please upload reports for the commit 67d847f to get more accurate results.

Additional details and impacted files
@@            Coverage Diff            @@
##              main     #1030   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files          115       115           
  Lines         9238      9265   +27     
  Branches      2179      2190   +11     
=========================================
+ Hits          9238      9265   +27     


Owner

tefra commented May 7, 2024

We need to properly support the Iterable type annotation in model fields, and of course support this for both xml/json serialization @skinkie.

Contributor Author

skinkie commented May 7, 2024

> We need to properly support the Iterable type annotation in model fields, and of course support this for both xml/json serialization @skinkie.

But would this be something you would support from an architecture point of view?

Owner

tefra commented May 7, 2024

> > We need to properly support the Iterable type annotation in model fields, and of course support this for both xml/json serialization @skinkie.

> But would this be something you would support from an architecture point of view?

Yes

Contributor Author

skinkie commented May 30, 2024

@tefra How would you like to proceed? Materialise List or Tuple in the tests?

Contributor Author

skinkie commented May 30, 2024

Testing it with my own code results in this error. So it is clearly not done yet.

xsdata.exceptions.XmlContextError: Error on DataObjectsRelStructure::choice: Xml Elements does not support typing `typing.Iterable[typing.Union[netex.general_version_frame_structure.CompositeFrame, netex.mobility_journey_frame.MobilityJourneyFrame, netex.mobility_service_frame.MobilityServiceFrame, netex.sales_transaction_frame.SalesTransactionFrame, netex.fare_frame.FareFrame, netex.driver_schedule_frame.DriverScheduleFrame, netex.vehicle_schedule_frame.VehicleScheduleFrame, netex.service_frame.ServiceFrame, netex.timetable_frame.TimetableFrame, netex.site_frame.SiteFrame, netex.infrastructure_frame.InfrastructureFrame, netex.general_version_frame_structure.GeneralFrame, netex.resource_frame.ResourceFrame, netex.service_calendar_frame.ServiceCalendarFrame]]`
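
An error of this kind typically comes from a check on the annotation's origin type. The following sketch shows how such a check can be written with `typing.get_origin` and `typing.get_args`. The `SUPPORTED_ORIGINS` whitelist and both helper functions are assumptions made for illustration; the actual xsdata validation may be structured differently:

```python
import collections.abc
from typing import Iterable, List, Union, get_args, get_origin

# Hypothetical whitelist of collection origins; not taken from xsdata.
SUPPORTED_ORIGINS = {list, tuple}


def describe(annotation):
    """Return (origin, item types) for a collection annotation."""
    origin = get_origin(annotation)
    args = get_args(annotation)
    # Unwrap a Union of item types, e.g. Iterable[Union[A, B]].
    if args and get_origin(args[0]) is Union:
        return origin, get_args(args[0])
    return origin, args


def is_supported(annotation, origins=SUPPORTED_ORIGINS):
    # List[...] resolves to list; Iterable[...] resolves to
    # collections.abc.Iterable, which the whitelist above does not contain.
    return get_origin(annotation) in origins


extended = SUPPORTED_ORIGINS | {collections.abc.Iterable}
```

Under these assumptions, `is_supported(Iterable[Union[int, str]])` is false until the whitelist is extended, which mirrors the shape of the failure reported above.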


Quality Gate failed

Failed conditions
1 Security Hotspot

See analysis details on SonarCloud

Contributor Author

skinkie commented Jun 1, 2024

Parsing breaks with Iterable.

        if tokens_factory:
            value = value if collections.is_array(value) else value.split()
            return tokens_factory(
                converter.deserialize(val, types, ns_map=ns_map, format=format)
                for val in value
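
One plausible reason a branch of this shape trips over a generator is sketched below. The `is_array` function here is a simplified stand-in that is assumed to recognise only lists and tuples (the real `collections.is_array` helper in xsdata may differ), and `to_tokens_iterable` is a hypothetical adjustment, not the fix that was eventually merged. A generator is neither an array nor a string, so the `value.split()` fallback raises `AttributeError`:

```python
import collections.abc


def is_array(value):
    # Assumed behaviour, for illustration only.
    return isinstance(value, (list, tuple))


def to_tokens(value):
    # Same shape as the quoted branch: arrays pass through, the rest is split.
    return value if is_array(value) else value.split()


def to_tokens_iterable(value):
    # Possible adjustment: split only strings, pass any other iterable through.
    if isinstance(value, str):
        return value.split()
    if isinstance(value, collections.abc.Iterable):
        return value
    raise TypeError(f"Unsupported token source: {type(value).__name__}")
```

The string check has to come first because `str` is itself iterable; without it, a token string would be passed through character by character.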

@skinkie skinkie force-pushed the feature_convert_generator branch from 1d77439 to c99fbb9 June 18, 2024 19:18
Contributor Author

skinkie commented Jun 18, 2024

Parsing solved.

@skinkie skinkie marked this pull request as ready for review June 22, 2024 20:33
Contributor Author

skinkie commented Jul 2, 2024

Although it appeared to be working again, I noticed that Iterable once more breaks the parsing.


sonarqubecloud bot commented Aug 6, 2024

@tefra tefra marked this pull request as draft August 25, 2024 05:49
skinkie added a commit to skinkie/xsdata that referenced this pull request Sep 5, 2024
Owner

tefra commented Oct 20, 2024

Closing in favor of #1082. @skinkie, give it a try!

@tefra tefra closed this Oct 20, 2024