Expand aspect extraction on sentence context and further improve aspect extraction #92
Comments
Interesting way to improve the aspect extraction, I like it! Two points:
Have a nice weekend, too! |
Hello, yes, it would be better to use cam2 indeed.
… On 28 Sep 2018, at 17:45, Matthias Schildwächter ***@***.***> wrote:
Interesting way to improve the aspect extraction, I like it!
Two points:
I would prefer it if you could use /cam2 for such experimental features, as they can always break something. Please don't change the backend running on /cam-api2, because that one is used by our study system (I switched from cam-api because changes came in).
Actually, as I see it, the ML classifiers are not working anymore.
Those selections are just there for testing, right? From a user's perspective, I don't think I would want to select the context size. The feature is nice, no question about that, but maybe the context size should be selected automatically when too little context is available because too few sentences were found.
Have a nice weekend, too!
|
@MSchild As for your other question, that's true, the additional options are only visible in the front end page for testing purposes (@alexanderpanchenko wanted to be able to try different configurations with it). As soon as I (and whoever else wants to test this) have found out which combination of context size and amount of sentences to use this'll be implemented in the back end only and will not be visible for the user. I could imagine giving the user the option of using the context for aspect extraction versus not doing so but maybe that's not even necessary. As for your third point, I actually have no idea how the ML classifiers work as those weren't written by me. If I broke something there with my additions let me know where and how I could fix it because I really don't know anything about the ML code. I'll be reporting back with my testing results from this week a bit later today. |
Here's my report for this week. I've done a few smaller things to further improve on what I'm already doing:
In addition to that, my main goal was to further test the context aspect extraction. I've not tested the quality of the aspects found within the contexts; instead, I've tested how much time the extraction takes. The results are pretty underwhelming: as stated in my last report, even a pretty small number of sentences used for context aspect extraction means that the whole process takes waaay longer than without any contexts. Because of that, the default was set to 0.1% of all sentences. I've tested multiple configurations regarding context size, number of sentences to use, and objects to compare. After all this testing, my suggestions would be:
|
Alright, I didn't really know what all the different versions of CAM were all about. So if I understand you correctly, previously you would have preferred me using /cam2 for this, but now you switched the study to /cam2 and that means now you actually want me to use standard /cam for such things, right? Please let me know if I didn't get that correctly. The study used the backend on cam-api and I switched that to cam-api2. |
@ChulioZ I moved your changes to another branch to keep the master branch clean and running. (I created another branch with your changes and reset master to the commit before them.) As a next step, I will deploy your new branch on /cam3 to provide access to your features and set /cam to the old version, so that we have a running demo. If you have questions, feel free to ask :) Edit: deployed now http://ltdemos.informatik.uni-hamburg.de/cam3/#/ |
I fixed the issue with the ML approaches and redeployed it on /cam3/ so that at least BoW can also be used now. To create the demo version I created a new branch called "demo" ( https://github.com/uhh-lt/cam/tree/demo ). The basis for demo was taken from master and I merged your feature branch into it to get your changes. Furthermore, I replaced the frontend adaptations with properties in the config.json file. For further development on your aspect extraction I suggest working on the created feature branch (described above). If you have pushed changes and want to redeploy on /cam3, all you have to do is go to srv/docker/pan-cam3, run git pull and execute docker-compose down && docker-compose build && docker-compose up -d (and delete the old images -> docker rmi) |
I've now committed the latest version of context aspect extraction into the branch you've created for it. I think it should be a stable version right now. I've also deployed it to cam3. It features context aspect extraction for 10 sentences per object, using a context size of 2 (meaning 2 sentences before and 2 sentences after the actual sentence are used). If you need/want context aspect extraction for the demo/YouTube video/study/whatever, you can merge it into master. I can also do it myself if you want me to do so, @alexanderpanchenko . The front end part giving the user the option to choose which context sentence amount and context size to use is now gone because the testing phase is basically over. If you want to test different configurations for those numbers, you can do so by changing the assigned values in pos_link_extracter.py. The current configuration of 10 sentences per object and a context size of 2 seems to be the best after all my testing. If we find that it still takes too long or that we could even go a bit longer, I could change it to 5 or even 20 sentences. Just tell me when you think numbers should be changed. |
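To make the "10 sentences per object, context size 2" configuration above concrete, here is a minimal sketch of how such a context window could be computed. It is not the code from pos_link_extracter.py: the function names `context_window` and `contexts_for_object`, their parameters, and the assumption that the document is available as an ordered list of sentences are all hypothetical and only illustrate the idea.

```python
from typing import List


def context_window(sentences: List[str], index: int, context_size: int = 2) -> List[str]:
    """Return the sentence at `index` plus up to `context_size` neighbours on each side.

    Hypothetical helper: the window is clipped at the start and end of the document.
    """
    if not 0 <= index < len(sentences):
        raise IndexError("sentence index out of range")
    start = max(0, index - context_size)
    end = min(len(sentences), index + context_size + 1)
    return sentences[start:end]


def contexts_for_object(
    sentences: List[str],
    hit_indices: List[int],
    max_sentences: int = 10,
    context_size: int = 2,
) -> List[List[str]]:
    """Build context windows for at most `max_sentences` matching sentences.

    `hit_indices` is assumed to hold the positions of sentences that mention the object.
    """
    return [context_window(sentences, i, context_size) for i in hit_indices[:max_sentences]]


if __name__ == "__main__":
    doc = ["s0", "s1", "s2", "s3", "s4", "s5", "s6"]
    print(context_window(doc, 3))  # two sentences before and after s3
    print(context_window(doc, 0))  # window clipped at the start of the document
```

Capping the number of windows per object keeps the extra work bounded, which matches the runtime concern raised earlier in the thread; the actual limits would be whatever values are assigned in the real code.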
@alexanderpanchenko
Here's my report for this week:
Whenever I find a structure like that in a sentence, all nouns that appear before or after this structure (depending on the rest of the sentence) are treated as aspects.
As always, I'm open to suggestions and ideas.
Have a good weekend!