Codegen generating deserializers and serializers for the JDWP commands #73

wekesa360 · 2023-10-27T17:39:38Z

This PR addresses issue #57. It adds command classes containing serializer and deserializer functions, grouped into a single file per command set.

michalgr · 2023-10-30T11:11:26Z

projects/jdwp/serializers/jdwp_packet_erializer.py

+""" JDWP serializer classes. """
+
+
+class JDWPPacketHeader:


It's probably a good idea to make this a dataclasses.dataclass or typing.NamedTuple.
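
For illustration, a minimal sketch of what the dataclass version could look like. The field names are inferred from the return line of the existing class, the byte widths follow the JDWP packet header layout (4, 4, 1, 1, 1 bytes, big-endian), and `to_bytes` is a hypothetical method name, not something taken from this PR.

```python
import dataclasses
import struct


@dataclasses.dataclass(frozen=True)
class JDWPPacketHeader:
    length: int       # total packet size in bytes, header included
    id: int           # packet id chosen by the sender
    flags: int        # 0x00 for a command packet, 0x80 for a reply
    command_set: int
    command: int

    def to_bytes(self) -> bytes:
        # Big-endian: two 4-byte unsigned ints followed by three single bytes.
        return struct.pack(
            ">IIBBB",
            self.length,
            self.id,
            self.flags,
            self.command_set,
            self.command,
        )


# A header-only command packet (11 bytes, no payload).
header = JDWPPacketHeader(length=11, id=1, flags=0, command_set=1, command=7)
encoded = header.to_bytes()
```

With a dataclass, equality, repr and the constructor come for free, so the hand-written class only needs the encoding logic.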

michalgr · 2023-10-30T11:11:39Z

projects/jdwp/serializers/jdwp_packet_erializer.py

+        return length_bytes + id_bytes + flags_bytes + command_set_bytes + command_bytes
+
+
+class JDWPPacket:


And this too.

michalgr · 2023-10-30T11:14:17Z

projects/jdwp/serializers/jdwp_packet_erializer.py

@@ -0,0 +1,31 @@
+""" JDWP serializer classes. """


How do you see the classes in this file being used? They are not used by the codegen, so it's unclear to me whether they should be included in this PR.

michalgr · 2023-10-30T11:15:56Z

projects/jdwp/serializers/reference_type_serializer.py

@@ -0,0 +1,21 @@
+"""Command Set: ReferenceType """


Please don't check generated code in. We'll set up buck rules that generate those files on the fly when the debugger is built.

michalgr · 2023-10-30T12:13:50Z

projects/jdwp/serializers/reference_type_serializer.py

+from projects.jdwp.defs.command_sets.reference_type import Signature
+
+
+class SignatureCommand:


I think we want to generate something more like this:

@dataclasses.dataclass
class SignatureCommand(Command):
    types: ReferenceTypeId

    async def serialize(...): ...

    @staticmethod
    async def parse(...): ...

    async def parse_response(): ...
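
To make the suggested shape concrete, here is a rough runnable sketch. Everything beyond the names in the comment above is an assumption: the `Command` base class, the use of a `bytearray` as the output, the method signatures, and the fixed 8-byte id width (a simplification; real id widths depend on the target VM). `parse_response` is left out because the response layout is not shown here.

```python
import abc
import asyncio
import dataclasses
from typing import NewType

# Hypothetical id type; in practice it would come from generated NewTypes.
ReferenceTypeId = NewType("ReferenceTypeId", int)

ID_SIZE = 8  # assumption for this sketch only


class Command(abc.ABC):
    """Hypothetical base class for generated command dataclasses."""

    @abc.abstractmethod
    async def serialize(self, out: bytearray) -> None: ...


@dataclasses.dataclass
class SignatureCommand(Command):
    types: ReferenceTypeId

    async def serialize(self, out: bytearray) -> None:
        # Append the id as a big-endian integer of the assumed width.
        out.extend(int(self.types).to_bytes(ID_SIZE, "big"))

    @staticmethod
    async def parse(data: bytes) -> "SignatureCommand":
        # Inverse of serialize: read the id back from the first ID_SIZE bytes.
        value = int.from_bytes(data[:ID_SIZE], "big")
        return SignatureCommand(ReferenceTypeId(value))


command = SignatureCommand(ReferenceTypeId(42))
buffer = bytearray()
asyncio.run(command.serialize(buffer))
round_tripped = asyncio.run(SignatureCommand.parse(bytes(buffer)))
```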

michalgr · 2023-10-30T12:16:52Z

projects/jdwp/serializers/serializer_codegen.py

+
+    def generate_field_serializer(self, field: Field) -> str:
+        serializer_code: str = ""
+        if isinstance(field, Struct):


This should never be true; it should be (and I believe is) guaranteed at the type-system level.

michalgr · 2023-10-30T12:19:46Z

projects/jdwp/serializers/serializer_codegen.py

+        if isinstance(field, Struct):
+            for subfield in field.fields:
+                serializer_code += self.generate_field_serializer(subfield)
+        elif field.type == Type.INT:


What could work pretty well here is pattern matching:

match field.type:
    case Type.INT:
        return f"out.writeInt(self.{field.name})"
    case Type.STRING:
        return f"out.writeString(self.{field.name})"
    ...
    case _:
        raise Exception(f"Unrecognized type: {field.type}")

michalgr · 2023-10-30T12:28:00Z

projects/jdwp/serializers/serializer_codegen.py

+            )
+        elif field.type == Type.STRING:
+            serializer_code = (
+                f"        serialized_data += command.{field.name}.encode('utf-8')"


This is not the format that we want. The spec says:

A UTF-8 encoded string, not zero terminated, preceded by a four-byte integer length.

I think a good approach would be to introduce input and output stream abstractions that know how to read/write values of all types represented by the PrimitiveType union.
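
As a sketch of the string format quoted above (length prefix followed by the UTF-8 bytes), assuming big-endian integers as used elsewhere in JDWP. The function names `write_string` and `read_string` are hypothetical; in the real code they would likely be methods on the stream abstractions.

```python
import struct


def write_string(value: str) -> bytes:
    """Encode a string as a 4-byte big-endian length plus UTF-8 bytes."""
    encoded = value.encode("utf-8")
    # The prefix counts encoded bytes, not characters.
    return struct.pack(">I", len(encoded)) + encoded


def read_string(data: bytes, offset: int = 0) -> tuple[str, int]:
    """Decode a string at offset; return it with the offset just past it."""
    (length,) = struct.unpack_from(">I", data, offset)
    start = offset + 4
    end = start + length
    return data[start:end].decode("utf-8"), end


payload = write_string("Ljava/lang/String;")
decoded, next_offset = read_string(payload)
```

Note that the length is taken after encoding, so non-ASCII strings get the correct byte count.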

michalgr · 2023-10-30T12:30:14Z

projects/jdwp/serializers/serializer_codegen.py

+            )
+        elif field.type == Type.OBJECT_ID:
+            serializer_code = (
+                f"        serialized_data += command.{field.name}.to_bytes(8, 'big')"


This is not that simple. The spec says:

Object ids, reference type ids, field ids, method ids, and frame ids may be sized differently in different target VM implementations. Typically, their sizes correspond to size of the native identifiers used for these items in JNI and JVMDI calls. The maximum size of any of these types is 8 bytes. The "idSizes" command in the VirtualMachine command set is used by the debugger to determine the size of each of these types.

This is one more reason to have input/output stream abstractions. If we do that, we'll be able to configure them with the output of the "idSizes" command.
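
One possible shape for this, as a hedged sketch: an output buffer constructed with the sizes reported by the VM, so generated code never hard-codes a width. `IdSizes`, `OutputBuffer` and their members are hypothetical names; the five size fields mirror the values in the "idSizes" reply.

```python
import dataclasses


@dataclasses.dataclass(frozen=True)
class IdSizes:
    # Sizes in bytes, as reported by the target VM.
    field_id_size: int
    method_id_size: int
    object_id_size: int
    reference_type_id_size: int
    frame_id_size: int


class OutputBuffer:
    """Hypothetical output abstraction configured with the VM's id sizes."""

    def __init__(self, id_sizes: IdSizes) -> None:
        self._id_sizes = id_sizes
        self._data = bytearray()

    def write_object_id(self, value: int) -> None:
        # Width comes from configuration rather than a hard-coded 8.
        self._data += value.to_bytes(self._id_sizes.object_id_size, "big")

    def getvalue(self) -> bytes:
        return bytes(self._data)


small_ids = OutputBuffer(IdSizes(4, 4, 4, 4, 4))
small_ids.write_object_id(1)

large_ids = OutputBuffer(IdSizes(8, 8, 8, 8, 8))
large_ids.write_object_id(1)
```

The same generated `write_object_id` call then produces 4 or 8 bytes depending on the VM the debugger is attached to.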

michalgr · 2023-10-30T12:41:37Z

I think I would break this task into the following PRs:

1. Generate NewTypes corresponding to values of the PrimitiveType enum. It will be a good idea to keep different kinds of ids distinct; pyre will help us not mix things up.
2. Generate dataclasses/NamedTuples corresponding to commands and responses (only fields for now).
3. Propose input stream and output stream interfaces.
4. For each command/response, generate parse and serialize methods.
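
The first step could be sketched as follows, using `typing.NewType`. The two id names and the helper function are illustrative assumptions; the actual set of types would be generated from the PrimitiveType definitions.

```python
from typing import NewType

# Distinct static types over the same runtime representation (int).
ObjectId = NewType("ObjectId", int)
ReferenceTypeId = NewType("ReferenceTypeId", int)


def describe_reference_type(ref_type: ReferenceTypeId) -> str:
    # Hypothetical helper that only makes sense for reference type ids.
    return f"reference type {ref_type}"


ref = ReferenceTypeId(7)
message = describe_reference_type(ref)

# Passing ObjectId(7) here would still run, since NewType adds no runtime
# check, but a static type checker is expected to report the mismatch.
```

NewType has no runtime cost beyond a function call, so the distinction between id kinds exists purely for the type checker.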

wekesa360 added 11 commits October 20, 2023 19:45

add jdwp serializer classes

6ba9d85

add type annotation

88a417b

add utility function

ac87a47

add test for serializer

073d535

Merge branch 'main' into codegen-generating-jdwp

2b61f61

Merge branch 'main' into codegen-generating-jdwp

06744e2

rename file

71b4a04

Merge branch 'main' into codegen-generating-jdwp

5e78041

Merge branch 'main' into codegen-generating-jdwp

4eab45d

add codegen for serializer

10983c2

add type annotation

430d8c4

facebook-github-bot added the CLA Signed label Oct 27, 2023

wekesa360 added 2 commits October 27, 2023 20:46

fix generated code indentation

6a4ed6a

reformat files

7efa84b

michalgr requested changes Oct 30, 2023
