Originally posted by rnag November 27, 2024
I want to add my thoughts on planned (breaking) changes in the next major release V1.
Planned Changes
There should be no default key transform on dump (serialization). So if dataclass fields are defined in snake_case, the keys in the JSON output will also be in snake_case.
Thus we can remove the helper class JSONPyWizard, as it was only a stop-gap solution.
Similarly, there will be no "auto" key transform on load (de-serialization) anymore. See my note/comment under "Performance Improvements" below.
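A minimal sketch of what "no key transform on dump" means in practice. It uses only the standard library (dataclasses.asdict and json.dumps) rather than Dataclass Wizard's V1 API, which is not finalized; the class and field names are made up for illustration.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class Example:
    # Illustrative fields only; defined in snake_case.
    my_str: str
    my_int: int


# With no key transform applied, field names pass through to the JSON keys unchanged.
dumped = json.dumps(asdict(Example(my_str="hello", my_int=1)))
print(dumped)  # {"my_str": "hello", "my_int": 1}
```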
JSONWizard should be the default class name. It is time to do away with the alias to JSONSerializable, which IMO doesn't make much sense to retain in a library called Dataclass Wizard.
The current __str__() will no longer be the default (or at least will not stay the same) on JSONWizard subclasses.
Pretty-printing a dataclass instance as JSON is a bit unexpected and humorous to me (maybe childish?). I'm not sure what the new default will be. To take another library as an example, pydantic does it weirdly: it prints the field names with repr'd values separated by a space, and no class name. Maybe there's a middle ground, or it could involve leveraging pprint. I'll have to think on it.
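One possible middle ground, purely as a sketch and not a decided design: keep the class name and format the field values with pprint. The ReadableStr mixin and the User class below are hypothetical names.

```python
import pprint
from dataclasses import asdict, dataclass


class ReadableStr:
    """Hypothetical mixin: class name plus pprint-formatted field values."""

    def __str__(self) -> str:
        # sort_dicts=False keeps the fields in declaration order.
        body = pprint.pformat(asdict(self), sort_dicts=False)
        return f"{type(self).__name__}({body})"


@dataclass
class User(ReadableStr):
    name: str
    tags: list


text = str(User(name="Ann", tags=["a", "b"]))
print(text)
```

Because pprint wraps long values across lines, larger instances would stay readable without falling back to JSON.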
We will no longer automatically (silently) convert a float value or a float in a str (ex. 123.4 or '12.3') to an int if the annotated type is int. There seems to be a lot of concern over this, mostly tied to unintentional data loss, and I agree: we shouldn't lose the fractional part when converting to int. Especially in Python, we should strive to be more explicit and not do "silent" conversions like these.
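A sketch of what a stricter int conversion could look like. The helper name to_int_strict is hypothetical, and accepting whole-number floats such as 12.0 is an assumption made here, not something the plan above specifies.

```python
def to_int_strict(value):
    """Convert to int, refusing any conversion that would drop a fractional part."""
    if isinstance(value, int):
        return int(value)
    if isinstance(value, str):
        try:
            return int(value)
        except ValueError:
            # Not an integer literal; try float so that '12.0' can still be
            # accepted. Non-numeric strings raise ValueError here.
            value = float(value)
    if isinstance(value, float):
        if value.is_integer():
            return int(value)
        raise ValueError(f"refusing to truncate {value!r} to int")
    raise TypeError(f"cannot convert {type(value).__name__} to int")
```

With this policy, to_int_strict('7') and to_int_strict(12.0) succeed, while 123.4 and '12.3' raise ValueError instead of silently becoming 123 and 12.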
The @dataclass decorator may no longer be required? For convenience, our library can use @dataclass_transform and apply @dataclass ourselves if a class isn't already decorated with it. This is especially true as most IDEs like PyCharm now support it. I think this would be a huge help for users, and for me personally, as I sometimes forget to apply @dataclass.
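A rough sketch of how a base class could apply @dataclass on the user's behalf while still informing type checkers via dataclass_transform. AutoDataclass is a hypothetical name, not the library's implementation, and the no-op fallback for Python versions before 3.11 is an assumption (typing_extensions provides a real backport).

```python
import sys
from dataclasses import dataclass, is_dataclass

if sys.version_info >= (3, 11):
    from typing import dataclass_transform
else:
    def dataclass_transform(**_kwargs):
        """No-op stand-in for older Pythons; only type checkers need the real one."""
        def decorator(cls):
            return cls
        return decorator


@dataclass_transform()
class AutoDataclass:
    """Hypothetical base class that turns every subclass into a dataclass."""

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # __init_subclass__ runs before any class decorator, so an explicit
        # @dataclass on the subclass would still run afterwards. This sketch
        # only covers the case where the user left the decorator off.
        dataclass(cls)


class Point(AutoDataclass):
    x: int
    y: int = 0


p = Point(x=1)
```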
All deprecated stuff can and should be removed (ex. the __pre_as_dict__() hook).
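Performance Improvements
as_str() is unnecessary; simply using the builtin str() appears to be the fastest approach. What a shocker 😮
Though for best practice, we can also support None when loading to str type. Something like '' if x is None else str(x) seems like a good middle ground to have 🤔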
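The None-handling expression quoted above, wrapped in a function for illustration. The name load_to_str is hypothetical.

```python
def load_to_str(x):
    # None becomes the empty string; everything else goes through builtin str().
    return '' if x is None else str(x)


print(load_to_str(None) == '')   # True
print(load_to_str(123))          # 123
```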
Methods under LoadMixin should now return a string of code instead of being defined as regular functions. This will boost performance, as we now exec the generated function anyway, so there's no need to nest functions when parsing individual fields.
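To illustrate the idea, here is a toy sketch in which per-type "loaders" return expression strings that are spliced into one generated function and passed to exec. The names (load_to_int, load_to_str, make_loader) are hypothetical and do not reflect the actual LoadMixin interface.

```python
from dataclasses import dataclass, fields


# Each hypothetical loader returns a Python expression *string* instead of
# being called once per value at load time.
def load_to_int(expr: str) -> str:
    return f"int({expr})"


def load_to_str(expr: str) -> str:
    return f"str({expr})"


CODE_FOR_TYPE = {int: load_to_int, str: load_to_str}


def make_loader(cls):
    """Generate and exec a from-dict function specialized for `cls`."""
    lines = ["def load(d):", "    return cls("]
    for f in fields(cls):
        # f.type is the real type object here, since annotations are not
        # stringified in this sketch.
        expr = CODE_FOR_TYPE[f.type](f"d[{f.name!r}]")
        lines.append(f"        {f.name}={expr},")
    lines.append("    )")
    namespace = {"cls": cls}
    exec("\n".join(lines), namespace)
    return namespace["load"]


@dataclass
class Item:
    name: str
    qty: int


load_item = make_loader(Item)
item = load_item({"name": "widget", "qty": "3"})
```

The generated load function contains the conversions inline (str(d['name']), int(d['qty'])), so no per-field helper is called at load time.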
If I have time, I can also do a similar thing for DumpMixin and dumpers.py. My reasoning is that, by default, we can perhaps use the type annotation on a field to determine how to dump/serialize it. For example, if the annotated type is str | None, then emit kwargs[field] = value in the string code to return the field value as-is, with no need to check the type of the value the way dataclasses does it, e.g. if type(obj) in _ATOMIC_TYPES: ... each time asdict is called. My follow-up thought, though, was that this will prove tricky for cases like Optional and Union. For Union-annotated fields, maybe it's best to check the type of the value directly after all.
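A sketch of annotation-driven dumping under some simplifying assumptions: fields annotated with an atomic type (or an Optional/Union of atomic types) are copied straight through in the generated code, and everything else goes to a runtime fallback. The names ATOMIC, fallback, and make_dumper are hypothetical.

```python
import types
from dataclasses import dataclass, fields
from datetime import date
from typing import Optional, Union, get_args, get_origin

ATOMIC = (str, int, float, bool, type(None))
# `X | Y` annotations (Python 3.10+) report types.UnionType as their origin.
UNION_ORIGINS = (Union, getattr(types, "UnionType", Union))


def is_atomic_annotation(tp) -> bool:
    if tp in ATOMIC:
        return True
    if get_origin(tp) in UNION_ORIGINS:
        return all(arg in ATOMIC for arg in get_args(tp))
    return False


def fallback(value):
    # Runtime type check, used only where the annotation alone is not enough.
    return value.isoformat() if isinstance(value, date) else value


def make_dumper(cls):
    lines = ["def dump(obj):", "    kwargs = {}"]
    for f in fields(cls):
        if is_atomic_annotation(f.type):
            # Annotation says the value is already JSON-friendly: copy it as-is.
            lines.append(f"    kwargs[{f.name!r}] = obj.{f.name}")
        else:
            lines.append(f"    kwargs[{f.name!r}] = fallback(obj.{f.name})")
    lines.append("    return kwargs")
    namespace = {"fallback": fallback}
    exec("\n".join(lines), namespace)
    return namespace["dump"]


@dataclass
class Event:
    title: Optional[str]
    day: date


dump_event = make_dumper(Event)
result = dump_event(Event(title=None, day=date(2024, 11, 27)))
```

Unions that mix atomic and non-atomic types fall into the fallback branch here, which matches the thought that checking the value's type directly may be the safer choice for Union fields.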
Coincidentally, this also means some (or all?) of the Parsers in parsers.py can be removed, as they will be unnecessary.
The default behavior should be to iterate over dataclass fields on de-serialization, instead of looping over the JSON object. This will have the minor benefit of eliminating a for loop. I am thinking of having a Meta setting such as input_letter_case or similar, so that e.g. input_letter_case='CAMEL' will automatically map the my_str dataclass field to myStr in the input JSON object. Plus, of course, another setting such as wizard_mode=True or auto_key_transform=True would effectively disable this "minor optimization" mode and loop over the input JSON object, as this library currently does, and as the example on the front page of the docs clearly illustrates to users.
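A small sketch of field-driven loading with a camelCase input mapping. The helpers to_camel and load_by_fields are hypothetical and stand in for whatever an input_letter_case='CAMEL' setting would do internally. In generated code the per-field lookups could be unrolled so no loop runs at load time; a comprehension stands in for that here.

```python
from dataclasses import dataclass, fields


def to_camel(snake: str) -> str:
    """my_str -> myStr"""
    first, *rest = snake.split("_")
    return first + "".join(word.capitalize() for word in rest)


def load_by_fields(cls, data: dict, key_for=to_camel):
    # One lookup per dataclass field; JSON keys that match no field
    # are never visited.
    return cls(**{f.name: data[key_for(f.name)] for f in fields(cls)})


@dataclass
class MyClass:
    my_str: str
    is_active: bool


obj = load_by_fields(MyClass, {"myStr": "hi", "isActive": True, "extra": 1})
```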
I had more changes planned; if I remember them, I will jot them down here. Thanks all, and kindly let me know any comments or feedback down below! 👋
Discussed in #153