Serialization

Karan Dhareshwar edited this page Sep 20, 2019 · 1 revision

What is serialization and why do I need it?

Later parts of this lab will have you implementing your own protocols. An important part of this is understanding where one packet ends and the next begins. As we discussed in class, so far we have been using the "" string to mark the end of our 'packets'. Those of you with some previous programming experience may have already worked with JSON in some capacity. Serialization is very similar (if not the same): it is like converting between a JSON string and an object or a dictionary (i.e., key-value pairs). Playground takes almost all of the serialization burden off you.
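To make the JSON comparison concrete, here is a minimal sketch that uses only Python's standard `json` module, not Playground. The `message` dictionary and its keys are made up for illustration.

```python
import json

# A hypothetical in-memory object (illustrative keys, not a Playground packet).
message = {"counter": 100, "name": "example"}

# Serialize: object -> text -> bytes that could be written to a socket.
wire_bytes = json.dumps(message).encode("utf-8")

# Deserialize: bytes -> text -> an equivalent object on the receiving side.
received = json.loads(wire_bytes.decode("utf-8"))

print(received == message)
```

The receiver ends up with an object equal to the one the sender started with, which is the property the rest of this page tests for Playground packets.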

What does a typical Playground Packet look like?

Creating a Playground packet is very similar to defining a class in Python, and this actually plays into the idea of serialization. Remember, since you are sending information over a network (overlay or otherwise), you first have to convert it into a 'stream' of data that can be sent over that network.

At this point, you can begin to define the important part of your packet: the fields that define it. Fields require a name and a type. The types are not Python types. Rather, they are types that I have created to represent data that will be sent over a network. The currently defined types are all in playground.network.packet.fieldtypes and include:

  • UINT (with UINT8, UINT16, UINT32, and UINT64 variants)
  • INT (with INT8, INT16, INT32, and INT64 variants)
  • BOOL
  • LIST
  • STRING
  • BUFFER
  • ComplexFieldType

We’ll save ComplexFieldType for another lab. For now, the other types should be sufficient. Let’s discuss each one briefly.

UINTs and INTs are integers (no decimal part), and UINTs are unsigned (>= 0). The number that follows is the width in bits. An INT8 is an 8-bit integer and can hold any value between -128 and +127.
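The ranges follow directly from the bit width. The sketch below assumes the usual two's-complement layout for signed values; `int_range` is a helper made up for this example and is not part of Playground.

```python
def int_range(bits, signed):
    """Return (min, max) for an integer of the given bit width."""
    if signed:
        # One bit is spent on the sign, so the magnitude gets bits - 1.
        return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    return 0, 2 ** bits - 1


for bits in (8, 16, 32, 64):
    print("INT{}".format(bits), int_range(bits, signed=True))
    print("UINT{}".format(bits), int_range(bits, signed=False))
```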

A BOOL is a true/false.

Strings and Buffers are for holding Python strings and bytes. You can search around on the Internet for an explanation of the difference (see, e.g., https://stackoverflow.com/questions/6224052/what-is-the-difference-between-a-string-and-a-byte-string). But for just a quick practical explanation:

s1 = "this is a string"
b1 = b"these are bytes" # note the 'b' in front of the quotes
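A slightly longer sketch in plain Python (no Playground involved) shows how the two relate: encoding turns a string into bytes, and decoding turns bytes back into a string. The variable names are illustrative.

```python
text = "caf\u00e9"              # str: a sequence of Unicode characters
raw = text.encode("utf-8")      # bytes: the encoded form that travels on a wire

print(len(text))                # number of characters
print(len(raw))                 # number of bytes (the accented letter needs two in UTF-8)
print(raw.decode("utf-8") == text)
```

The character count and the byte count differ, which is why the two are separate types.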

And finally, let’s discuss LIST. LIST allows you to send multiple items of the same type in a packet. It is always declared with a second type (e.g., LIST(UINT8), LIST(STRING), etc).

In addition to the value that each type can hold, each type can also have a “null” value that is represented by the FIELD_NOT_SET value. This value needs to be imported from playground.network.packet as well.

Let’s get back to creating our packet. First, let’s make a packet that has some UINT32’s, a STRING, and a BUFFER. So, we need to import those types accordingly:

	from playground.network.packet.fieldtypes import UINT32, STRING, BUFFER

Now let’s define a few fields:

	from playground.network.packet import PacketType
	from playground.network.packet.fieldtypes import UINT32, STRING, BUFFER

	class MyPacket(PacketType):
		DEFINITION_IDENTIFIER = "lab2b.student_x.MyPacket"
		DEFINITION_VERSION = "1.0"

		FIELDS = [
			("counter1", UINT32),
			("counter2", UINT32),
			("name", STRING),
			("data", BUFFER)
			]

That’s it! The packet is completely defined. Each entry in the “FIELDS” list identifies a field by its name and its type. These will be automatically populated when creating an instance of the packet. Let’s do that next:

	packet1 = MyPacket()
	packet1.counter1 = 100

Where did counter1 come from? The PacketType class, upon instantiation, creates variables named after the field names. In this case, it created counter1, counter2, name, and data. And, when setting the data, it will do some basic type checking. For example:

	packet1.counter2 = -100

This line above will throw an exception because it will note that counter2 is an unsigned int and cannot be negative.

Once the packet is created, it can be serialized into a stream of bytes. In the next lab, you will send the bytes over the network but, for now, we just want to test that this serialization and de-serialization back into an object works as expected. To serialize a packet, call the __serialize__() method.

	packetBytes = packet1.__serialize__()

If you are trying the example so far, this line above should throw an exception. The problem is that a packet won’t serialize unless all required values are set. Remember the FIELD_NOT_SET value? If any non-optional field is FIELD_NOT_SET, serialization will fail. We’ll deal with optional values another time. For now, let’s set all the fields of MyPacket:

	packet1.counter1 = 100
	packet1.counter2 = 200
	packet1.name = "Dr. Nielson"
	packet1.data = b"This may look like a string but it's actually a sequence of bytes."

Now we can serialize:

	packetBytes = packet1.__serialize__()

You may want to print these bytes out just to see what they look like. These bytes are appropriate for sending over a network. Once the bytes are received, they can be de-serialized back into an object. There are two ways of doing this.

The first way is to use the Deserialize class method of PacketType (or MyPacket). This method assumes you have enough bytes to completely de-serialize. Let’s try that out:

	packet2 = PacketType.Deserialize(packetBytes)
	if packet1 == packet2:
		print("These two packets are the same!")

What happened here is we took packet1, turned it into a stream of bytes, and then used Deserialize to make an equivalent object. The two objects can be compared together and, so long as their fields match, they’ll be found equivalent as shown in the example above.

Deserialize works great but in network operations, you don’t always receive all the data at once. And sometimes, you might receive the data from two packets at the same time. How do you know if you have enough to deserialize? And how do you know if you need to deserialize more than one packet?

Fortunately, the PacketType class also provides a Deserializer object that deals with all of these problems. The Deserializer object takes network bytes in chunks and returns as many packets as it can unpack. Here is how it works:

	deserializer = PacketType.Deserializer()
	deserializer.update(data)
	for packet in deserializer.nextPackets():
		# now I have a packet!
		print(packet)

Here’s an example using the MyPacket example:

	packet1 = MyPacket()
	# fill in packet1 fields

	packet2 = MyPacket()
	# fill in packet2 fields

	packet3 = MyPacket()
	# fill in packet3 fields

	pktBytes = packet1.__serialize__() + packet2.__serialize__() + packet3.__serialize__()

Ok, so far so good. We have all three packets serialized into a single stream of bytes. How can we test the Deserializer object?

Let’s create a test where Deserializer only receives 10 bytes at a time.

	deserializer = PacketType.Deserializer()
	print("Starting with {} bytes of data".format(len(pktBytes)))
	while len(pktBytes) > 0:
		# let's take off a 10 byte chunk
		chunk, pktBytes = pktBytes[:10], pktBytes[10:]
		deserializer.update(chunk)
		print("Another 10 bytes loaded into deserializer. Left={}".format(len(pktBytes)))
		for packet in deserializer.nextPackets():
			print("got a packet!")
			if packet == packet1: print("It's packet 1!")
			elif packet == packet2: print("It's packet 2!")
			elif packet == packet3: print("It's packet 3!")

Try playing with this until it makes sense and you feel comfortable.
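For intuition about what a streaming deserializer has to do internally, here is a toy sketch in plain Python. It is not Playground's implementation or wire format: it assumes a made-up framing where each message is a 4-byte length followed by the payload, and the names `LengthPrefixedDeserializer`, `next_messages`, and `frame` are hypothetical.

```python
import struct


class LengthPrefixedDeserializer:
    """Toy streaming deserializer for a hypothetical format:
    a 4-byte big-endian length, then that many payload bytes."""

    def __init__(self):
        self._buffer = b""

    def update(self, data):
        # Accumulate whatever arrived, even if it is only part of a message.
        self._buffer += data

    def next_messages(self):
        # Yield every complete message currently sitting in the buffer.
        while len(self._buffer) >= 4:
            (length,) = struct.unpack("!I", self._buffer[:4])
            if len(self._buffer) < 4 + length:
                break  # payload incomplete; wait for more data
            payload = self._buffer[4:4 + length]
            self._buffer = self._buffer[4 + length:]
            yield payload


def frame(payload):
    # Prefix the payload with its length so the receiver can find the boundary.
    return struct.pack("!I", len(payload)) + payload


stream = frame(b"first") + frame(b"second message") + frame(b"third")
deserializer = LengthPrefixedDeserializer()
received = []
while stream:
    # Feed the stream 10 bytes at a time, as in the test above.
    chunk, stream = stream[:10], stream[10:]
    deserializer.update(chunk)
    received.extend(deserializer.next_messages())

print(received)
```

The key design point is the internal buffer: partial data is kept until enough bytes arrive, and leftover bytes from one chunk become the start of the next message.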

One thing to note in the above example: every field of every packet you create must be set to some value. You can assign default values yourself, but the serializer will not do this for you and will throw an exception if any required field is left unset.

It may be the case that you don't want to set every field you define. In that case, you can mark a field as Optional, in the following manner:

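# Optional must be imported as well; the module path below is an assumption
# from playground.network.packet.fieldtypes.attributes import Optional

class OptionalPacket(PacketType):
     FIELDS = [
          ("compulsory_value", UINT32),
          ("optional_value", UINT32({Optional: True}))
     ]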

Note: In the above example I omitted the DEFINITION variables for convenience, but you must always include those.
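The required-versus-optional rule can be modeled in a few lines of plain Python. This is only a toy sketch of the idea, not how Playground implements it: `NOT_SET` is a hypothetical stand-in for FIELD_NOT_SET, and `check_ready` is a helper invented for this example.

```python
class _NotSet:
    """Sentinel type, distinct from None, 0 and "" so those stay valid values."""

    def __repr__(self):
        return "NOT_SET"


NOT_SET = _NotSet()  # hypothetical stand-in for FIELD_NOT_SET


def check_ready(values, optional=()):
    """Raise ValueError if any required field is still NOT_SET."""
    missing = [name for name, value in values.items()
               if value is NOT_SET and name not in optional]
    if missing:
        raise ValueError("unset required fields: {}".format(missing))


fields = {"compulsory_value": 7, "optional_value": NOT_SET}
check_ready(fields, optional={"optional_value"})  # passes: only the optional field is unset

fields["compulsory_value"] = NOT_SET
try:
    check_ready(fields, optional={"optional_value"})
except ValueError as error:
    print(error)
```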

You can also set up a packet with a list in it.

class ListPacket(PacketType):
      FIELDS = [
           ("list", LIST(STRING))
      ]

With this packet you can have a field that contains a list of strings.

Note: This is different from a Python list, which can hold values of different types, like ["Hello", 1]. You can think of these as arrays in C, where all the elements have the same data type.
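Python's standard library has a close analogue of a single-type container, the `array` module, which may help make the contrast concrete. This sketch is plain Python and says nothing about how Playground's LIST is implemented.

```python
from array import array

numbers = array("I", [1, 2, 3])  # "I" = unsigned int; every element must be that type
numbers.append(4)                # fine: same type

try:
    numbers.append("Hello")      # a str is not an unsigned int
except TypeError as error:
    print("rejected:", error)

mixed = ["Hello", 1]             # a plain Python list accepts mixed types
print(mixed)
```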