Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Codegen generating deserializers and serializers for the JDWP commands #73

Open
wants to merge 13 commits into
base: main
Choose a base branch
from

Conversation

wekesa360
Copy link
Contributor

This PR addresses issue #57, creating command classes containing serializer and deserializer functions and grouped into a single file based on a command set.

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Oct 27, 2023
""" JDWP serializer classes. """


class JDWPPacketHeader:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's probably a good idea to make this a dataclasses .dataclass or typing.NamedTuple.

return length_bytes + id_bytes + flags_bytes + command_set_bytes + command_bytes


class JDWPPacket:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And this too.

@@ -0,0 +1,31 @@
""" JDWP serializer classes. """
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do you see classes in this file being used ? Those classes are not used by the codegen, it's unclear to me yet whether they should be included in this PR.

@@ -0,0 +1,21 @@
"""Command Set: ReferenceType """
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please don't check generated code in, we'll setup buck rules that generate those files on the fly when the debugger is built.

from projects.jdwp.defs.command_sets.reference_type import Signature


class SignatureCommand:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we want to generate something more like this:

@dataclasses.dataclass
class SignatureCommand(Command):
    types: ReferenceTypeId
    
    async def serialize(...):
        ...

    @staticmethod
    async def parse(...):
        ...

    async def parse_response():
       ...


def generate_field_serializer(self, field: Field) -> str:
serializer_code: str = ""
if isinstance(field, Struct):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should never be true, this should be (and I believe is) guaranteed on the type system level.

if isinstance(field, Struct):
for subfield in field.fields:
serializer_code += self.generate_field_serializer(subfield)
elif field.type == Type.INT:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What could work pretty well here is pattern matching:

match field.type:
    case Type.INT:
        return f"out.writeInt(self.{field.name})"

    case Type.STRING:
        return f"out.writeString(self.{field.name})"

    ...

    case _:
        raise Exception(f"Unrecognized type: {field.type}")

)
elif field.type == Type.STRING:
serializer_code = (
f" serialized_data += command.{field.name}.encode('utf-8')"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not the format that we want. The spec says:

A UTF-8 encoded string, not zero terminated, preceded by a four-byte integer length.
I think it will be a good approach to introduce some kind of input and output stream abstractions that knows how to read/write values of all types represented by PrimitiveType union.

)
elif field.type == Type.OBJECT_ID:
serializer_code = (
f" serialized_data += command.{field.name}.to_bytes(8, 'big')"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not that simple. The spec says:

Object ids, reference type ids, field ids, method ids, and frame ids may be sized differently in different target VM implementations. Typically, their sizes correspond to size of the native identifiers used for these items in JNI and JVMDI calls. The maximum size of any of these types is 8 bytes. The "idSizes" command in the VirtualMachine command set is used by the debugger to determine the size of each of these types.

This is one more reason to have input/output stream abstractions. If we do that then we'll be able to configure them with output of "idSizes" command.

@michalgr
Copy link
Contributor

I think I would break this task into the following PRs:

  1. Generate NewTypes corresponding to values of PrimitiveType enum. It will be a good idea to keep different kinds of ids distinct, pyre will help us not mix things up.
  2. Generate dataclasses/NamedTuples corresponding to commands and resonses (only fields for now).
  3. Propose input stream and output stream interfaces
  4. For each command/response generate parse and serialize methods.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants