A little more documentation #5

Open
turian opened this issue May 26, 2011 · 3 comments

@turian

turian commented May 26, 2011

It seems like scanLinks.py is supposed to be run before scanData.py, correct?

However, the README doesn't reflect this.
Also, the header for scanLinks.py doesn't describe what it does.

What is directScan.py? Should it be used? Note that it drops tables, though.
The header in directScan.py is also incorrect.

@faraday
Owner

faraday commented May 27, 2011

There's a part about scanLinks.py under Usage in README:

This creates the pagelinks table and records incoming and outgoing link counts.
python scanLinks.py <hgw.xml file from Wikiprep dump>

directScan.py can be used when you want to test with the exact selection of articles from Gabrilovich et al. in the 2005 dump (see selected.txt). To test this, run directScan.py instead of scanData.py.

And you mean the line:

USAGE: scanData.py <hgw.xml file from Wikiprep>

from the header in directScan.py, right?

@turian
Author

turian commented May 27, 2011

Oh, my mistake, I didn't see the USAGE section. I thought the "what this contains" at the top was the usage.

I meant the line that you mentioned, "USAGE: scanData.py <hgw.xml file from Wikiprep>", in directScan.py. A handful of other Python files also say that they are scanData.py, even though they are not.
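
One possible way to keep such headers from drifting when a file is copied is to build the usage line from the script's own file name instead of hard-coding it. This is only a sketch, not what the repository does; `usage_line` and its `arg_hint` parameter are hypothetical names introduced for illustration.

```python
import os


def usage_line(script_path, arg_hint="<hgw.xml file from Wikiprep>"):
    """Build a USAGE line from the script's own file name (hypothetical helper)."""
    # basename strips any directory part, so the line names only the script itself
    return "USAGE: %s %s" % (os.path.basename(script_path), arg_hint)


# A script would normally pass sys.argv[0]; a literal path is used here for illustration.
print(usage_line("/some/dir/directScan.py"))
```

With this approach a copy of scanData.py saved as directScan.py would report its own name without any edit to the header text.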

@faraday
Owner

faraday commented Jul 18, 2011

Usage explanations have been improved. The scripts now try to guide the user when given wrong or insufficient arguments.
However, I will edit the README in light of your additions, since basic DB setup (for instance, creating a "wiki" database with a UTF8 charset) is not currently explained.
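
For readers wondering what this kind of argument guidance might look like, here is a minimal sketch. It is an assumption about the general approach, not the actual code in these scripts; `parse_dump_path` and `main` are hypothetical names, and the file-existence check is an extra added for illustration.

```python
import os
import sys


def parse_dump_path(argv):
    """Return (path, None) on success, or (None, message) describing the problem."""
    usage = "USAGE: python %s <hgw.xml file from Wikiprep dump>" % os.path.basename(argv[0])
    if len(argv) < 2:
        return None, "Missing input file.\n" + usage
    if len(argv) > 2:
        return None, "Too many arguments.\n" + usage
    if not os.path.isfile(argv[1]):
        return None, "File not found: %s\n%s" % (argv[1], usage)
    return argv[1], None


def main(argv):
    path, error = parse_dump_path(argv)
    if error:
        # Explain what went wrong and how to call the script, then signal failure.
        sys.stderr.write(error + "\n")
        return 1
    print("Would scan %s" % path)  # placeholder for the real work
    return 0
```

A script would end with `sys.exit(main(sys.argv))`, so a wrong invocation prints the guidance and exits with a non-zero status.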
