-
Notifications
You must be signed in to change notification settings - Fork 117
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add ELMo modules #298
Add ELMo modules #298
Conversation
resolve #297 |
Codecov Report
@@ Coverage Diff @@
## master #298 +/- ##
==========================================
+ Coverage 82.63% 82.93% +0.29%
==========================================
Files 207 215 +8
Lines 16006 17283 +1277
==========================================
+ Hits 13226 14333 +1107
- Misses 2780 2950 +170
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really appreciate the hard work of getting ELMo into our codebase. But I feel that the code should not be merged into master in its current condition, the main reason being to avoid code bloat. I think we can have a temporary branch in the upstream repo (asyml/texar-pytorch) to hold the changes in this PR, so people could still use this if they want; but until we get rid of enough code bloat, I don't think we should merge this.
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
# See the License for the specific language governing permissions and | ||
# limitations under the License. | ||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess some of these modules were just copied from the AllenNLP repo? Could you check whether this would cause licensing issues?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, it seems that they've implemented almost everything from scratch... I think we have some of the functionalities in the library, for instance (correct me if I'm wrong), custom-cell LSTMs w/ highway connections, and more powerful embedding classes. Simply committing all this code into the master branch would be enormous code bloat.
For the added functions, most of them are directly copy from For the added modules, I modified or simplified most of them to get rid of some internal dependencies in |
FYI, |
Add
texar
-styled ELMo encoder adapted fromallennlp
. The corresponding tokenizer will be in another PR.