Text Chunking Option #73

Open
jarmoza opened this issue Oct 31, 2015 · 0 comments

Comments

Owner

jarmoza commented Oct 31, 2015

By default, TWiC will now chunk texts longer than 5000 words, attempting where possible to break at logical, syntactic endpoints in the text. This chunk size should be configurable via the configuration file and a command-line argument. The reasoning is twofold: chunking follows suggested methodology (see Jockers' Macroanalysis), and large texts noticeably slow TWiC down when the Text View panel is opened. The trade-off is that chunking produces more texts, which crowd the current layouts of the CorpusCluster and TextCluster panels.
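As a rough illustration of the behavior described above, here is a minimal sketch of sentence-aware chunking with a configurable size. It is not TWiC's actual implementation: the names `chunk_text`, `parse_args`, `DEFAULT_CHUNK_WORDS`, and the `--chunk-size` flag are all hypothetical, and "logical, syntactic endpoint" is approximated here by sentence-ending punctuation only.

```python
import argparse
import re

# Default taken from the issue text; the constant name is hypothetical.
DEFAULT_CHUNK_WORDS = 5000

# Assumed heuristic: a sentence ends at '.', '!' or '?' followed by whitespace.
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def chunk_text(text, max_words=DEFAULT_CHUNK_WORDS):
    """Split text into chunks of at most max_words words.

    Chunks break at sentence ends when possible. A single sentence longer
    than max_words is split on word boundaries as a fallback. Whitespace
    inside chunked output is normalized to single spaces.
    """
    if len(text.split()) <= max_words:
        return [text]  # short texts are returned unchanged

    chunks, current, count = [], [], 0
    for sentence in _SENTENCE_END.split(text.strip()):
        words = sentence.split()
        if not words:
            continue
        # Close the current chunk if this sentence would overflow it.
        if current and count + len(words) > max_words:
            chunks.append(" ".join(current))
            current, count = [], 0
        # Fallback: hard-split a sentence that is too long on its own.
        while len(words) > max_words:
            chunks.append(" ".join(words[:max_words]))
            words = words[max_words:]
        current.extend(words)
        count += len(words)
    if current:
        chunks.append(" ".join(current))
    return chunks


def parse_args(argv=None):
    """Parse a hypothetical --chunk-size command-line option."""
    parser = argparse.ArgumentParser(description="Text chunking sketch")
    parser.add_argument(
        "--chunk-size",
        type=int,
        default=DEFAULT_CHUNK_WORDS,
        help="maximum number of words per chunk",
    )
    return parser.parse_args(argv)
```

A caller could then pass the parsed value through, for example `chunk_text(text, max_words=parse_args().chunk_size)`. A configuration-file value could supply the default in the same way, with the command-line argument taking precedence.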
