Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

How to generate random expressions with a minimum depth requirement? #30

Open
0x0f0f0f opened this issue Mar 27, 2021 · 1 comment
Open

Comments

@0x0f0f0f
Copy link

0x0f0f0f commented Mar 27, 2021

grammar = @grammar begin
    Real = x | y | z | a | b | c
    Real = -Real
    Real = Real * Real | Real + Real | Real - Real | Real / Real
    Real = _(Base.rand(-100.0:100.0)) 
    Real = sin(Real) | cos(Real)
    Real = |(-100:100)
end

n = 100
Random.seed!(rand(UInt))
rulenode = rand(RuleNode, grammar, :Real, n)
println(get_executable(rulenode, grammar))

For benchmarking my metaprogramming/CAS package I would like to generate random expressions that are precisely n levels deep. By using rand the way above, most of the times I do get expressions that are 0 level deep (constants), I would like to plot the average performance of my package in manipulating expressions in relation to the depth of each expression. The batches of test expressions should all have the same size. Is there any way to achieve such behaviour?

@rcnlee
Copy link
Collaborator

rcnlee commented Mar 27, 2021

Hi @0x0f0f0f, there are a few approaches to getting expressions of an exact target depth:

  1. The naive solution would be rejection sampling. However, with very skewed grammars, this is not practical because the probability of choosing a terminal production rule greatly outweighs the probability of choosing a non-terminal one. BTW, is the target depth really 100?
  2. To improve the sampling efficiency of (1), you can use a probabilistic grammar and heavily bias sampling towards the non-terminal production rules.
  3. Without rejection sampling, the approach would be to restructure the grammar to declare a new type for each level, i.e., Real0, ..., Realn. Put all the terminal production rules at Realn and only non-terminal production rules at the previous levels. For a low target depth, you can do this manually, but for high target depth, you probably want to write a function to programmatically generate it. What you need to generate is a Grammar object. You can reference @grammar to get an idea of how to populate it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants