Create a helper script and a notebook (Jupyter) to explore the NIPS papers corpus #7

Open
liadmagen opened this issue Oct 13, 2018 · 0 comments
Labels: good first issue (Good for newcomers), hacktoberfest (🍁 https://hacktoberfest.digitalocean.com/)

@liadmagen
Member

After the corpus is created and saved, create a notebook (src/notebook/) where we explore the features of the corpus.

In a helper script (src/papers/data/), load the dataset and create a word-frequency matrix for both the preprocessed corpus and the original one.
Expose functions that return these matrices and their properties.
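
A minimal sketch of what such a helper could look like, using only the standard library. The function names (`tokenize`, `build_frequency_matrix`, `matrix_properties`), the regex tokenizer, and the list-of-rows matrix layout are all assumptions made for illustration; the project may well prefer a sparse matrix from a library such as scikit-learn.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its alphabetic tokens (hypothetical, simplistic tokenizer)."""
    return re.findall(r"[a-z]+", text.lower())


def build_frequency_matrix(documents):
    """Return (matrix, vocabulary); matrix[i][j] counts vocabulary[j] in documents[i]."""
    counts = [Counter(tokenize(doc)) for doc in documents]
    # Sorted vocabulary gives a stable column order across runs.
    vocabulary = sorted(set().union(*counts))
    matrix = [[doc_counts[word] for word in vocabulary] for doc_counts in counts]
    return matrix, vocabulary


def matrix_properties(matrix, vocabulary):
    """Return a few basic properties of a document-term count matrix."""
    n_documents = len(matrix)
    n_terms = len(vocabulary)
    total_tokens = sum(sum(row) for row in matrix)
    nonzero = sum(1 for row in matrix for value in row if value)
    cells = n_documents * n_terms
    return {
        "n_documents": n_documents,
        "n_terms": n_terms,
        "total_tokens": total_tokens,
        "density": nonzero / cells if cells else 0.0,
    }


# Toy stand-in for the corpus; the real helper would load the saved dataset instead.
documents = ["Neural networks learn.", "Networks of neural units."]
matrix, vocabulary = build_frequency_matrix(documents)
properties = matrix_properties(matrix, vocabulary)
```

Calling `build_frequency_matrix` once on the original texts and once on the preprocessed texts would yield the two matrices to compare.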

In the Jupyter notebook, explore the matrices. Which words are the most frequent in each one?
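
One possible way to answer the "most frequent words" question in a notebook cell is sketched below. It assumes each matrix is a list of per-document count rows with a matching vocabulary list; `most_frequent_words` and the toy data are hypothetical and not part of the project.

```python
def most_frequent_words(matrix, vocabulary, n=10):
    """Return the n (word, total_count) pairs with the highest corpus-wide counts."""
    # Sum each column to get the total count of every word across all documents.
    totals = [sum(column) for column in zip(*matrix)]
    # Sort by descending count, breaking ties alphabetically for a stable result.
    ranked = sorted(zip(vocabulary, totals), key=lambda pair: (-pair[1], pair[0]))
    return ranked[:n]


# Toy matrices standing in for the original and the preprocessed corpus.
original_vocabulary = ["learn", "networks", "neural", "of", "the"]
original_matrix = [
    [1, 1, 1, 0, 2],
    [0, 1, 1, 1, 3],
]
preprocessed_vocabulary = ["learn", "networks", "neural"]
preprocessed_matrix = [
    [1, 1, 1],
    [0, 1, 1],
]

top_original = most_frequent_words(original_matrix, original_vocabulary, n=2)
top_preprocessed = most_frequent_words(preprocessed_matrix, preprocessed_vocabulary, n=2)
```

In this toy data the original corpus is dominated by the stop word "the", while the preprocessed one surfaces content words, which is the kind of contrast the comparison is meant to expose.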

Feel free to explore and display more properties of the datasets.

@liadmagen added the good first issue and hacktoberfest labels on Oct 13, 2018