First version of vocabulary program #2

Open
arwer13 opened this issue Feb 26, 2016 · 3 comments

@arwer13
Owner

arwer13 commented Feb 26, 2016

Regarding the vocabulary program, here is the starting specification as I see it.

Develop a relational SQL database schema to store:

  • words (all unique)

  • examples of word usage (whole sentences)

  • frequencies of words

    Result: a file with the SQL code that creates the schema.
    Notes:

    • At this stage it will be impossible to differentiate word forms, but that's OK.
    • Consider using SQLite3 as the database: it is the simplest option and also very popular.
    • Plain XML, JSON, or similar formats are not good choices for storing and further processing our kind of data (for many reasons). A document-oriented store (for example, one built on JSON documents) might be considered, but it is unlikely to be a good choice in our case.
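
To make the schema item concrete, here is a minimal sketch of one possible layout, written as a Python script that applies the SQL through the standard `sqlite3` module. The table names (`words`, `examples`, `word_examples`), the column names, and the choice to keep the frequency as a counter on the word row are all assumptions for illustration; the specification above does not fix any of them.

```python
import sqlite3

# Hypothetical schema: every table and column name here is an assumption.
SCHEMA = """
CREATE TABLE IF NOT EXISTS words (
    id        INTEGER PRIMARY KEY,
    word      TEXT NOT NULL UNIQUE,        -- "words (all unique)"
    frequency INTEGER NOT NULL DEFAULT 0   -- occurrences seen so far
);
CREATE TABLE IF NOT EXISTS examples (
    id       INTEGER PRIMARY KEY,
    sentence TEXT NOT NULL UNIQUE          -- whole sentences
);
-- Many-to-many link: one sentence illustrates several words.
CREATE TABLE IF NOT EXISTS word_examples (
    word_id    INTEGER NOT NULL REFERENCES words(id),
    example_id INTEGER NOT NULL REFERENCES examples(id),
    PRIMARY KEY (word_id, example_id)
);
"""


def create_schema(conn):
    """Create the (assumed) tables on an open sqlite3 connection."""
    conn.executescript(SCHEMA)
    conn.commit()


if __name__ == "__main__":
    # Demo on an in-memory database; list the tables that were created.
    demo = sqlite3.connect(":memory:")
    create_schema(demo)
    for (name,) in demo.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ):
        print(name)
```

The contents of `SCHEMA` could be saved as the `.sql` file the task asks for. The link table is one way to avoid storing the same sentence once per word it contains.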

Develop a Python program that parses a provided text and adds to / updates the database.

It should have a simple command-line interface with one argument: the path to a .txt file.
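
A rough sketch of how such a parser could look, under several assumptions that are not part of the specification: the database file name `vocabulary.db`, the table layout in `SCHEMA`, a naive regex-based sentence splitter, and lowercasing as the only normalisation (so word forms are not merged, as noted above).

```python
import re
import sqlite3

import sys

DB_PATH = "vocabulary.db"  # hypothetical database file name

# Hypothetical minimal schema (all names are assumptions).
SCHEMA = """
CREATE TABLE IF NOT EXISTS words (
    id INTEGER PRIMARY KEY,
    word TEXT NOT NULL UNIQUE,
    frequency INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE IF NOT EXISTS examples (
    id INTEGER PRIMARY KEY,
    sentence TEXT NOT NULL UNIQUE
);
CREATE TABLE IF NOT EXISTS word_examples (
    word_id INTEGER NOT NULL REFERENCES words(id),
    example_id INTEGER NOT NULL REFERENCES examples(id),
    PRIMARY KEY (word_id, example_id)
);
"""

# Naive tokenisation: split after ., ! or ?; words are runs of letters,
# optionally joined by apostrophes ("it's").
SENTENCE_RE = re.compile(r"(?<=[.!?])\s+")
WORD_RE = re.compile(r"[^\W\d_]+(?:'[^\W\d_]+)*")


def add_text(conn, text):
    """Store every sentence of `text` and bump the frequency of each word."""
    conn.executescript(SCHEMA)
    for raw in SENTENCE_RE.split(text):
        sentence = " ".join(raw.split())  # collapse newlines and extra spaces
        if not sentence:
            continue
        conn.execute(
            "INSERT OR IGNORE INTO examples(sentence) VALUES (?)", (sentence,)
        )
        example_id = conn.execute(
            "SELECT id FROM examples WHERE sentence = ?", (sentence,)
        ).fetchone()[0]
        for word in WORD_RE.findall(sentence.lower()):
            conn.execute("INSERT OR IGNORE INTO words(word) VALUES (?)", (word,))
            conn.execute(
                "UPDATE words SET frequency = frequency + 1 WHERE word = ?",
                (word,),
            )
            conn.execute(
                "INSERT OR IGNORE INTO word_examples(word_id, example_id) "
                "SELECT id, ? FROM words WHERE word = ?",
                (example_id, word),
            )
    conn.commit()


def main(argv):
    """Command-line entry point: one argument, the path to a .txt file."""
    if len(argv) != 2:
        print("usage: add_text.py PATH_TO_TXT", file=sys.stderr)
        return 2
    with open(argv[1], encoding="utf-8") as handle:
        text = handle.read()
    conn = sqlite3.connect(DB_PATH)
    try:
        add_text(conn, text)
    finally:
        conn.close()
    return 0
```

To run it as a script, call `sys.exit(main(sys.argv))` under the usual `if __name__ == "__main__":` guard. One simplification to be aware of: parsing the same file twice counts its words twice, since nothing here records which texts were already processed.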

Develop a Python program for querying information about a word:

  • all known examples of usage
  • its frequency
  • its percentage (among all known unique words)

This program should have a simple command-line interface with one argument: the word to look up.
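
A possible shape for the lookup, again only a sketch. It assumes tables named `words(id, word, frequency)`, `examples(id, sentence)` and `word_examples(word_id, example_id)`, plus a database file called `vocabulary.db`; none of these names come from the specification. It also reads "percentage" as the word's share of all counted occurrences, which is one interpretation among several.

```python
import sqlite3
import sys

DB_PATH = "vocabulary.db"  # hypothetical database file name


def lookup(conn, word):
    """Return examples, frequency and percentage for `word`, or None."""
    row = conn.execute(
        "SELECT id, frequency FROM words WHERE word = ?", (word.lower(),)
    ).fetchone()
    if row is None:
        return None
    word_id, frequency = row
    # Total occurrences over all known words (assumed meaning of "percentage").
    total = conn.execute("SELECT SUM(frequency) FROM words").fetchone()[0] or 0
    examples = [
        sentence
        for (sentence,) in conn.execute(
            "SELECT e.sentence FROM examples AS e "
            "JOIN word_examples AS we ON we.example_id = e.id "
            "WHERE we.word_id = ? ORDER BY e.id",
            (word_id,),
        )
    ]
    return {
        "examples": examples,
        "frequency": frequency,
        "percentage": 100.0 * frequency / total if total else 0.0,
    }


def main(argv):
    """Command-line entry point: one argument, the word to look up."""
    if len(argv) != 2:
        print("usage: lookup.py WORD", file=sys.stderr)
        return 2
    conn = sqlite3.connect(DB_PATH)
    try:
        info = lookup(conn, argv[1])
    finally:
        conn.close()
    if info is None:
        print("unknown word: " + argv[1])
        return 1
    print("frequency: %d" % info["frequency"])
    print("percentage: %.2f%%" % info["percentage"])
    for sentence in info["examples"]:
        print("  - " + sentence)
    return 0
```

To run it as a script, call `sys.exit(main(sys.argv))` under an `if __name__ == "__main__":` guard. If "percentage" is instead meant as a rank among unique words, only the `total` query and the final division would need to change.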

@RyanMcCarl RyanMcCarl self-assigned this Feb 29, 2016
@RyanMcCarl
Collaborator

Hi Artyom,

Sorry, I missed this email; this looks good to me. I will get started on it this week.

Ryan

@arwer13
Owner Author

arwer13 commented Mar 2, 2016 via email

@RyanMcCarl
Collaborator

Hi Artyom,

Sorry for the confusion; I don't think I have the time to work on a full program right now because of my full-time job and WordBrewery. Instead, let's continue with the plan I proposed before: short 5-15 minute coding lessons plus Q&As and code review. Does that sound good?

Thanks,
Ryan
