Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Mylstm elichika (WIP) #210

Merged
merged 8 commits into from
May 13, 2019
Merged

Conversation

Rishav1
Copy link
Contributor

@Rishav1 Rishav1 commented Apr 28, 2019

Implemented and tested individual units. MyLSTM ONNX graph is being generated.
However, on running it, we get the following stack trace.

*** Testing out/elichika_tmp/tmp ***
build/tools/run_onnx --test out/elichika_tmp/tmp
Initializing ChainerX...
Loading model...
Loading data...
Found 1 test cases
Constructing model...
Failed to schedule (): Loop(@F.L.47.range_L.47/Len, , @F.L.47.range_L.47, batch_size, o_st_unused, in_3_L.-1, gate_unused, in_2_L.-1, f_unused, pmask_unused, input_unused, num_hidden, in_1_L.-1, o_unused, nmask_unused, i_unused, f_st_st_unused, time_unused, in_3_st_unused, inputs_L.46, inputs_st_unused, param_l_W, param_l_b) -> (@F.L.47.range_st_L.47, batch_size_st_L.47, o_st_st_L.47, in_3_st_L.47, gate_st_L.47, in_2_st_L.47, f_st_L.47, pmask_st_L.47, input_st_L.47, num_hidden_st_L.47, h_st_L.47, o_st_L.47, nmask_st_L.47, i_st_L.47, f_st_st_st_L.47, time_st_L.47, in_3_st_st_L.47, inputs_st_L.47, inputs_st_st_L.47, CanonicalizeLoopUnusedOut@param_l_W, CanonicalizeLoopUnusedOut@param_l_b)
o_st_unused cannot be ready
gate_unused cannot be ready
f_unused cannot be ready
pmask_unused cannot be ready
input_unused cannot be ready
o_unused cannot be ready
nmask_unused cannot be ready
i_unused cannot be ready
f_st_st_unused cannot be ready
in_3_st_unused cannot be ready
inputs_st_unused cannot be ready
Failed graph is stored in /tmp/chainer_compiler_failure.onnx
Check `false' failed! in CheckSanity at /home/rchouras/workplace/chainer-compiler/compiler/scheduler.cc:233:

@Rishav1
Copy link
Contributor Author

Rishav1 commented Apr 28, 2019

Checked the following via unit testing.

  • len() function
  • x.shape, x.size attributes
  • F.tanh, F.sigmoid chainer functions
  • F.broadcast_to, F.expand_dims chainer functions

Since the graph is being generated, I suspect there is a deeper issue that needs a visual comparison between the graph generated by ch2o and the one by elichika to figure out the scheduler error.

Some insight as to help visualization would be appreciated. Gonna try this first though.

Discovered two bugs -
- In the current impelemtation, all the variables in a for loop must be
initialized before the loop. Added a test to fix this in the future.
- Strange behaviour when using a lambda function over model constrution.
The graph is generated but their is value mismatch between the
outputs. Added a test for understanding the same too.
@Rishav1
Copy link
Contributor Author

Rishav1 commented Apr 28, 2019

Copied commit 2e0765 description.

Discovered two bugs -

  • In the current implementation, all the variables in a for loop must be
    initialized before the loop. Added a test to fix this in the future.
  • Strange behavior when using a lambda function over model construction.
    The graph is generated but there is value mismatch between the
    outputs. Added a test for understanding the same too.

Copy link
Member

@shinh shinh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks good to me. Sawada-san, do you have any comments especially on how shape and size are handled?

Thanks for the quick and great PR. I really like this PR has tests for each function you added


def vcall(self, module: 'Field', graph: 'Graph', inst: 'values.ValueRef', args: 'functions.FunctionArgInput', line=-1):
node = nodes.NodeLen(
args.inputs[0].get_value(), # TODO: Check this.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you elaborate, please? It'll be great if you can give info about what should be done to remove this TODO comment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was a non-issue. I was concerned if a new node was necessary here. As @durswd pointed out, this can be done through NodeCall itself.

self.name = 'len'

def vcall(self, module: 'Field', graph: 'Graph', inst: 'values.ValueRef', args: 'functions.FunctionArgInput', line=-1):
node = nodes.NodeLen(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add assert len(args.inputs) == 1 or something just in case?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not necessary I think.

@@ -258,7 +258,8 @@ def get_outputs(self) -> 'List[FieldOutput]':
ret = []

for key, att in self.attributes.items():

if att.name in ['shape', 'size']:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sawada-san, could you double-check this is the right approach to handle these attribtues here? If this is right, I guess we need at least a comment to explain these will be handled in other code (vevaluator.py:114?)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this implementation causes error as follows (it is very rare case).

class A:
def init(self):
self.size = 'Size'

size, shape is regarded as bultin property.
I implemented shape in NDArray

https://github.com/pfnet-research/chainer-compiler/blob/master/elichika/elichika/parser/functions_ndarray.py#L126

it is redundant. But it can support many cases.
Do you change it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wasn't aware that there was an implementation to handle NDArrayFunctions. In that case, there is a bug in the implementation then. It doesn't handle the following case.

xs = np.random.rand(3,4,5)
[x.shape for x in xs]

I'm sure your implementation can be extended to fix this though.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

@@ -44,6 +44,17 @@ def forward(self, xs, h, c, mask):
#h = self.initial_h
#c = self.initial_c
inputs = F.pad_sequence(xs)
x = None
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add a TODO comment to explain this is a workaround for the bug and these assignments should be removed?

@@ -141,6 +147,14 @@ def main():
[np.random.rand(4, 3).astype(np.float32), 2, 5],
subname='double_for', backprop=True)

# Bugs.
model = lambda: B()
testtools.generate_testcase(model, args, subname='lambda_bug')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the expected behavior here? For me,

$ build/tools/dump out/elichika_syntax_For_closure_bug > a
$ build/tools/dump out/elichika_syntax_For_lambda_bug > b
$ diff -u a b

shows

--- a	2019-04-29 17:05:30.658437913 +0900
+++ b	2019-04-29 17:05:38.354337738 +0900
@@ -1,4 +1,4 @@
-=== out/elichika_syntax_For_closure_bug ===
+=== out/elichika_syntax_For_lambda_bug ===
 ir_version: 5
 producer_name: "elichika"
 producer_version: "0.1"
@@ -262,7 +262,7 @@
 opset_import {
   version: 10
 }
-=== out/elichika_syntax_For_closure_bug/test_data_set_0/input_0_0.pb ===
+=== out/elichika_syntax_For_lambda_bug/test_data_set_0/input_0_0.pb ===
 dims: 4
 dims: 5
 data_type: 1
@@ -287,7 +287,7 @@
 float_data: 0.130388752
 float_data: 0.653830528
 name: "in_0_L.-1"
-=== out/elichika_syntax_For_closure_bug/test_data_set_0/input_0_1.pb ===
+=== out/elichika_syntax_For_lambda_bug/test_data_set_0/input_0_1.pb ===
 dims: 4
 dims: 5
 data_type: 1
@@ -312,7 +312,7 @@
 float_data: 0.46937564
 float_data: 0.190825194
 name: "in_0_L.-1"
-=== out/elichika_syntax_For_closure_bug/test_data_set_0/input_0_2.pb ===
+=== out/elichika_syntax_For_lambda_bug/test_data_set_0/input_0_2.pb ===
 dims: 4
 dims: 5
 data_type: 1
@@ -337,7 +337,7 @@
 float_data: 0.648225069
 float_data: 0.292232305
 name: "in_0_L.-1"
-=== out/elichika_syntax_For_closure_bug/test_data_set_0/input_0_3.pb ===
+=== out/elichika_syntax_For_lambda_bug/test_data_set_0/input_0_3.pb ===
 dims: 4
 dims: 5
 data_type: 1
@@ -362,11 +362,11 @@
 float_data: 0.254202753
 float_data: 0.447350442
 name: "in_0_L.-1"
-=== out/elichika_syntax_For_closure_bug/test_data_set_0/input_1.pb ===
+=== out/elichika_syntax_For_lambda_bug/test_data_set_0/input_1.pb ===
 data_type: 7
 int64_data: 4
 name: "in_1_L.-1"
-=== out/elichika_syntax_For_closure_bug/test_data_set_0/output_0.pb ===
+=== out/elichika_syntax_For_lambda_bug/test_data_set_0/output_0.pb ===
 dims: 4
 dims: 5
 data_type: 1

which is reasonable. Your comment git checkout -b Rishav1-mylstm-elichika mylstm-elichika
git pull git://github.com/Rishav1/chainer-compiler.git mylstm-elichika suggests the diff is larger than mine for you?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My bad. This lambda_bug test has been moved to Linear.py. This test doesn't show any error and works as expected. Plus the B2 test was commented out by mistake. Fixed it in a consequent commit.

testtools.generate_testcase(model, args, subname='lambda_bug')

# model = B2()
# testtools.generate_testcase(model, args, subname='forloop_bug')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for identifying this issue! I'm surprised CH2O didn't have this test.

@@ -401,3 +401,26 @@ def __init__(self, classtype, value, line=-1):

def __str__(self):
return 'Convert({},{})'.format(self.classtype, self.lineprop)

class NodeLen(Node):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that it can be replaced into NodeCall.

@durswd durswd merged this pull request into pfnet-research:mylstm-elichika May 13, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants