This repository has been archived by the owner on Oct 3, 2022. It is now read-only.

build_knowledge : filter out evals
tao-pr authored and starcolon committed Feb 26, 2017
1 parent 32acffb commit 1c31060
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion build_knowledge.py
@@ -74,7 +74,7 @@ def iter_topic(crawl_collection,start):
 def ensure_viable(ns,stopwords):
   def clean(a):
     # Strip non-alphanumeric symbols (unicode symbols reserved)
-    a = re.sub("[\x00-\x2F\x3A-\x40\x5B-\x60\x7B-\x7F]+", "", a)
+    a = re.sub("[\x00-\x2F\x3A-\x40\x5B-\x60\x7B-\x7F\(\)]+", "", a)
     for s in stopwords:
       a.replace(s,'')
     return a.strip()
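
Two reading notes on this hunk, offered as a sketch rather than as the author's intended fix. First, `(` (0x28) and `)` (0x29) already fall inside the existing `\x00-\x2F` range, so adding `\(\)` to the character class should not change what gets stripped. Second, Python strings are immutable, so `a.replace(s,'')` returns a new string that the loop discards, and the stopwords are never actually removed. The standalone version below assumes Python 3 and takes `stopwords` as an explicit parameter (a hypothetical signature, since the original `clean` is a closure inside `ensure_viable`); the name `NON_ALNUM` is also an assumption.

```python
import re

# Same ASCII control/punctuation ranges as the original pattern, as a raw string.
# "(" (0x28) and ")" (0x29) are already covered by \x00-\x2F, as is the space (0x20).
NON_ALNUM = re.compile(r"[\x00-\x2F\x3A-\x40\x5B-\x60\x7B-\x7F]+")

def clean(a, stopwords):
    # Strip ASCII non-alphanumeric symbols; non-ASCII (unicode) characters are kept.
    a = NON_ALNUM.sub("", a)
    for s in stopwords:
        # str.replace returns a new string, so the result must be reassigned.
        a = a.replace(s, "")
    return a.strip()
```

Because the space character is inside the stripped range, this pattern also joins words together; whether that is desired depends on how `ensure_viable` uses the result, which the hunk does not show.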

0 comments on commit 1c31060