Complete mini project 3 #3
base: master
### Required Packages:
pip install nltk requests vaderSentiment
pip install matplotlib scikit-learn scip
This is really minor, but these two commands rendered on a single line, which made it hard to tell at first that there were two of them. Make sure that your markdown styling looks the way you want it to in the final product.
for text in all_texts:
    wordlist = text.split()
    for word in wordlist:
        if(word not in allwords):
No parentheses needed
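For illustration, here is roughly what the loop looks like without the parentheses. This is only a sketch: the wrapper function `collect_unique_words` is hypothetical, and I'm assuming `allwords` is a list and that the body of the `if` (not shown in the diff excerpt) appends the word.

```python
def collect_unique_words(all_texts):
    """Return the distinct words in all_texts, in first-seen order."""
    allwords = []  # assumption: a list of words seen so far
    for text in all_texts:
        wordlist = text.split()
        for word in wordlist:
            # `if` takes a bare expression, so no parentheses are required.
            if word not in allwords:
                allwords.append(word)  # assumed body; not shown in the diff
    return allwords
```

If `allwords` only needs to answer "have I seen this word?", a set would make the membership check faster, but that is a separate design choice from the styling point.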
Overall, really good work. Each function has an informative docstring. Code is concise and clean.
Re: "I found unit testing quite challenging for the data sets of this size and nature": one technique is to use a small test set for unit tests. For example, in GeneFinder, tiny sequences are used in the unit tests, in order to make them manageable and debuggable. (In GeneFinder, I would actually use smaller unit tests, as in the GeneFinder solution set.)
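As a sketch of the small-test-set idea, a doctest can exercise a text-processing function on two or three hand-written strings instead of the full data set. The function `count_words` and its inputs are hypothetical stand-ins, not code from this PR, and using `doctest` here is my assumption about the test style.

```python
import doctest


def count_words(texts):
    """Count how often each word appears across a list of strings.

    A tiny hand-written input keeps the expected result easy to verify:

    >>> count_words(["good good bad", "bad"])
    {'good': 2, 'bad': 2}
    >>> count_words([])
    {}
    """
    counts = {}
    for text in texts:
        for word in text.split():
            counts[word] = counts.get(word, 0) + 1
    return counts


if __name__ == "__main__":
    # Runs the examples in the docstrings above as tests.
    doctest.testmod()
```

Because the input is small enough to count by hand, a failing test points directly at the logic rather than at something in the data.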
There's substantial duplication between |
No description provided.