Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Runtime Type Testing via Tags #13

Open
DavePearce opened this issue May 5, 2017 · 2 comments
Open

Runtime Type Testing via Tags #13

DavePearce opened this issue May 5, 2017 · 2 comments

Comments

@DavePearce
Copy link
Member

DavePearce commented May 5, 2017

The current approach to runtime type testing is based on runtime interrogation. That is, by exploiting the available type information provided by JavaScript, it checks on demand whether a given value is an instance of a given type. This approach has been pretty successful and is relatively easy to implement. The current could be further optimised. Regardless, there are a few challenges for this approach which are difficult or impossible to overcome:

  1. Type testing of functions or methods. Unfortunately, JavaScript does not provide much useful type information regarding functions (see Type Testing Functions or Methods #9). The only real solution is to somehow embed information into the function reference when it's created. This, potentially, could work.
  2. Type testing cyclic reference structures. This has the potential to go into an infinite loop. In the case of recursive (non-reference) structures, termination is guaranteed because the value represents a tree of finite height. However, in the case of cyclic reference structures, the value could represent (for example) a cyclic list which would mean the type test recursed forever.

These two issues raise the obvious and immediate question as to whether an alternative approach is possible. In particular, an approach based on type tags has been long touted as the ideal solution. But, can it work?

Here is a current list of known challenges for a tag based approach:

  • Any type. Unfortunately, the any type does pose a fairly significant problem for a tag based approach. The key problem is that there is no obvious way to generate an appropriate tag. The only real solution is to embed the full type of the value in question as the tag. This is workable (I believe) and would essentially cause some minor performance problems only in the case of the any type. The other obvious option is to remove the any type altogether.
  • Tag Conversions. One obvious problem is that we would need to perform appropriate tag conversions when the type of a value changes. For example, a value of type int|null flows into a variable of type int|null|bool. The key to minimising the need for this would be to somehow divide up the space (e.g. int is always tag value 0, etc).

A possible space for tags might be:

  • null == 0
  • bool == 1
  • byte == 2
  • int == 3
  • [] == 4
  • & == 5
  • { == 6

Remember that we don't have to encode tags for union, intersection or negation types. This is because values are concrete and cannot have that type. Furthermore, nested union types will correspond to nested tags.

NOTE: Probably, it would help a lot if support for tagging was built into WyIL bytecode. For example, if the necessary tag conversion and introduction points were already identified.

@DavePearce
Copy link
Member Author

DavePearce commented Jul 3, 2017

Another interesting option is to have a file-level tag system (this is perhaps similar in some ways to the idea of "selector colouring" [1]). Roughly speaking, every file would have an array (vtable ?) which maps each index to a concrete type used in the given file. This has some nice properties:

  1. Less Retagging. Under this model, retagging only ever occurs when data flows from a type defined in one file to a type defined in another file. Observe, however, that retagging can always be implemented using a single switch statement.
  2. Efficient Runtime Type Tests. We can always implement a runtime type test using a switch statement.
  3. Accessible Type Information. Full type information is available at runtime for all known types. Hmmmm, is that really true?

To reduce the cost when transferring data between files, we can implement some standard tags for primitives (as above). But, even so, there is two major issues:

  1. Determining Foreign Type Tags is Hard. For the compiler to determine the type tag array for a foreign file seems complex. Specifically, it needs to look through the whole file and see what concrete types there are. One solution here is to expose this information explicitly in the WyIL file.

  2. Dealing with Recursive Types is Hard. The problem with recursive types is that they represent an infinite set of type tags. We can't concretise them all because ... well, that's impossible. The only real solution here is to stick with one tag per union.

An interesting thing that could be done would be to perform some kind of type tag optimisation when compiling multiple files together.

References

@DavePearce
Copy link
Member Author

At this point, its still pretty unclear to me what the best option is. I think there are at least two things to consider:

  1. Can true type tags actually work? That is, if we allowed unions to have concrete type tags (i.e. not just integers) --- is that enough?
  2. Can we simplify Whiley to avoid these problems? Well, we could move towards a more explicit tagged union representation.

Arrgggh.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant