Week4 hienhpss #51
base: master
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,5 @@ | ||
fname,lname,email | ||
Bill,Gates,[email protected] | ||
Alice,Wondergirl,[email protected] | ||
Julius,Caesar,[email protected] | ||
Bob,Dylan,[email protected] |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,5 @@ | ||
fname,lname,email | ||
Review comment: Remove this. You need to submit your code only. |
||
Mike,Tyson,[email protected] | ||
Bob,Dylan,[email protected] | ||
Neo,Anderson,[email protected] | ||
Bill,Gates,[email protected] |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,2 @@ | ||
Bill Gates <[email protected]> | ||
Review comment: Remove this. You need to submit your code only. |
||
Bob Dylan <[email protected]> |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,56 @@ | ||
import csv | ||
from csv import Dialect | ||
Review comment: I don't see you using |
||
import sys | ||
from hashlib import md5 | ||
|
||
def read_csv(filename): | ||
'''Reader csv files with header. General function that can be reused''' | ||
with open(filename, newline='') as csv_file: | ||
# Read the header from first line | ||
header = csv_file.readline().rstrip().split(',') | ||
# Read the csv using the header obtained above | ||
csv_reader = csv.DictReader(csv_file, delimiter = ',', fieldnames = header) | ||
for row in csv_reader: | ||
yield(row) | ||
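
For comparison, `csv.DictReader` takes its keys from the first row when `fieldnames` is omitted, so the manual header parsing is not required. A minimal sketch, assuming the reader is given an already open file object; the name `read_csv_rows` and the sample address are illustrative, not part of the submission:

```python
import csv
import io


def read_csv_rows(csv_file):
    """Yield one dict per data row of an open CSV file with a header line."""
    # With fieldnames omitted, DictReader uses the first row as the keys.
    reader = csv.DictReader(csv_file)
    for row in reader:
        yield row


# In-memory stand-in for a real file; the address is made up for illustration.
sample = io.StringIO("fname,lname,email\nBill,Gates,bill@example.com\n")
rows = list(read_csv_rows(sample))
```

Taking a file object rather than a file name keeps the helper easy to exercise without touching the disk; the caller would still open the file with `newline=''`.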
|
||
def generate_md5(*args): | ||
Review comment: The condition says each guest has only a single email, so you don't need md5, right? Have a cup of tea and think about it. |
||
'''Generate md5 from list of strings. General function that can be reused''' | ||
m = md5() | ||
for i in args: | ||
# Encode first | ||
i_enc = i.encode('utf-8') | ||
m.update(i_enc) | ||
return m.digest() | ||
|
||
def week4_csv_to_dict(csv_rows): | ||
Review comment: Don't call it week4, give it some meaningful name. |
||
'''Convert an iterator of rows into dictionary | ||
with key as hash of the whole row. This function is not generic | ||
and can be used for week0004 practice only''' | ||
result = dict() | ||
for row in csv_rows: | ||
md5_email = generate_md5(row['email']) | ||
#only add into dict if email is not used. Skip those duplicate emails | ||
if not md5_email in list(result.keys()): | ||
result[md5_email] = row | ||
return result | ||
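
One way to read the md5 and naming comments together: since each guest has a single email, the email string itself can serve as the dictionary key. A minimal sketch under that assumption; `index_by_email` is a hypothetical name and the rows are made-up sample data:

```python
def index_by_email(rows):
    """Map each email to the first row that uses it."""
    guests = {}
    for row in rows:
        # setdefault stores the row only if the email is not already a key,
        # so later duplicates of the same email are skipped.
        guests.setdefault(row['email'], row)
    return guests


# Made-up sample rows; the second one repeats the first email.
rows = [
    {'fname': 'Bill', 'lname': 'Gates', 'email': 'bill@example.com'},
    {'fname': 'William', 'lname': 'Gates', 'email': 'bill@example.com'},
    {'fname': 'Bob', 'lname': 'Dylan', 'email': 'bob@example.com'},
]
guests = index_by_email(rows)
```

Membership tests on a dict are already constant time, so there is no need to build `list(result.keys())` on every iteration.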
|
||
|
||
def week4_match_sources(): | ||
Review comment: Don't call it week4, give it some meaningful name. |
||
'''Match the 2 input file and return the people | ||
who subscribe to both''' | ||
file1 = sys.argv[1] | ||
Review comment: I would rather pass 2 file names to this function. Avoid accessing |
||
file2 = sys.argv[2] | ||
source1 = week4_csv_to_dict(read_csv(file1)) | ||
source2 = week4_csv_to_dict(read_csv(file2)) | ||
for key in set(source1.keys()): | ||
if key in set(source2.keys()): | ||
if source1[key]['fname'] == source2[key]['fname'] and source1[key]['lname'] == source2[key]['lname']: | ||
yield(source1[key]) | ||
Review comment: Congrats! Finally you have a valid reason to use generators ;) |
||
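
The suggestion to pass the two file names in could look roughly like the sketch below. It is an illustration rather than the expected solution: `load_guests` and `match_guests` are hypothetical names, the sample addresses are made up, and the `sorted()` call is added only to make the output order deterministic.

```python
import csv
import os
import tempfile


def load_guests(filename):
    """Return {email: row} for one CSV file with a header line."""
    guests = {}
    with open(filename, newline='') as csv_file:
        for row in csv.DictReader(csv_file):
            guests.setdefault(row['email'], row)  # keep the first row per email
    return guests


def match_guests(filename1, filename2):
    """Yield guests listed in both files with the same email and names."""
    first, second = load_guests(filename1), load_guests(filename2)
    # Dict key views support set intersection, so no explicit set() is needed.
    for email in sorted(first.keys() & second.keys()):
        a, b = first[email], second[email]
        if (a['fname'], a['lname']) == (b['fname'], b['lname']):
            yield a


# Usage with two throwaway files holding illustrative data.
with tempfile.TemporaryDirectory() as tmp:
    path1 = os.path.join(tmp, 'source1.csv')
    path2 = os.path.join(tmp, 'source2.csv')
    with open(path1, 'w', newline='') as f:
        f.write('fname,lname,email\n'
                'Bill,Gates,bill@example.com\n'
                'Alice,Wondergirl,alice@example.com\n')
    with open(path2, 'w', newline='') as f:
        f.write('fname,lname,email\n'
                'Mike,Tyson,mike@example.com\n'
                'Bill,Gates,bill@example.com\n')
    matched = list(match_guests(path1, path2))
```

Because the function no longer reads the command line itself, it can be called from tests or other modules with any pair of paths.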
|
||
def week4_output(): | ||
Review comment: Don't call it week4, give it some meaningful name. |
||
'''Outout list of duplicate guests to file''' | ||
Review comment: Spelling error |
||
for person in week4_match_sources(): | ||
print('{:s} {:s} <{:s}>'.format(str(person['fname']),str(person['lname']),str(person['email']))) | ||
|
||
if __name__ == "__main__": | ||
week4_output() | ||
Review comment: Always have a new line at the end of your code files. |
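
Pulling the review points on the entry point together, one possible arrangement reads the command line in a single place and keeps the formatting in its own function. This is a hedged sketch only: `format_guest` and `parse_args` are hypothetical names, and the usage message is an assumption about how bad input might be reported.

```python
import sys


def format_guest(person):
    """Render one guest as 'First Last <email>'."""
    return '{fname} {lname} <{email}>'.format(**person)


def parse_args(argv):
    """Return the two file names from argv, or None if the usage is wrong."""
    if len(argv) != 3:
        print('usage: {} FILE1 FILE2'.format(argv[0]), file=sys.stderr)
        return None
    return argv[1], argv[2]


# Illustrative call with a made-up guest record.
line = format_guest({'fname': 'Bill', 'lname': 'Gates',
                     'email': 'bill@example.com'})
```

Under the `if __name__ == "__main__":` guard, the script would then call `parse_args(sys.argv)` once and hand the resulting names to the matching function.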
Review comment: Remove this. You need to submit your code only.