Week4 hienhpss #51

Open
wants to merge 2 commits into
base: master
5 changes: 5 additions & 0 deletions week0004/hienhpss/python/event1.input
@@ -0,0 +1,5 @@
fname,lname,email
Member

Remove this. You need to submit your code only.

Bill,Gates,[email protected]
Alice,Wondergirl,[email protected]
Julius,Caesar,[email protected]
Bob,Dylan,[email protected]
5 changes: 5 additions & 0 deletions week0004/hienhpss/python/event2.input
@@ -0,0 +1,5 @@
fname,lname,email
Member

Remove this. You need to submit your code only.

Mike,Tyson,[email protected]
Bob,Dylan,[email protected]
Neo,Anderson,[email protected]
Bill,Gates,[email protected]
2 changes: 2 additions & 0 deletions week0004/hienhpss/python/guests.output
@@ -0,0 +1,2 @@
Bill Gates <[email protected]>
Member

Remove this. You need to submit your code only.

Bob Dylan <[email protected]>
56 changes: 56 additions & 0 deletions week0004/hienhpss/python/week0004.py
@@ -0,0 +1,56 @@
import csv
from csv import Dialect
Member

I don't see you using Dialect

import sys
from hashlib import md5

def read_csv(filename):
    '''Reader csv files with header. General function that can be reused'''
    with open(filename, newline='') as csv_file:
        # Read the header from first line
        header = csv_file.readline().rstrip().split(',')
        # Read the csv using the header obtained above
        csv_reader = csv.DictReader(csv_file, delimiter = ',', fieldnames = header)
        for row in csv_reader:
            yield(row)

def generate_md5(*args):
Member
@vietlq vietlq Oct 21, 2016

The condition says each guest has only a single email, so you don't need md5, right? Have a cup of tea and think about it.
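A minimal sketch of what this comment points at, assuming an email string is unique per guest: the email can be the dictionary key directly, with no hashing step. The helper name `index_guests_by_email` and the sample rows are hypothetical and not part of this PR.

```python
def index_guests_by_email(rows):
    """Map each email to the first row that uses it (hypothetical helper)."""
    guests = {}
    for row in rows:
        # setdefault keeps the first row seen for an email and skips duplicates
        guests.setdefault(row['email'], row)
    return guests


# Illustrative rows only; the addresses are made up for this sketch
rows = [
    {'fname': 'Ann', 'lname': 'Lee', 'email': 'ann@example.com'},
    {'fname': 'Ann', 'lname': 'Lee', 'email': 'ann@example.com'},
    {'fname': 'Raj', 'lname': 'Rao', 'email': 'raj@example.com'},
]
guests = index_guests_by_email(rows)
```

Strings are hashable, so the dict already does the lookup work that the md5 digest was standing in for.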

    '''Generate md5 from list of strings. General function that can be reused'''
    m = md5()
    for i in args:
        # Encode first
        i_enc = i.encode('utf-8')
        m.update(i_enc)
    return m.digest()

def week4_csv_to_dict(csv_rows):
Member
@vietlq vietlq Oct 21, 2016

Don't call it week4, give it some meaningful name.

    '''Convert an iterator of rows into dictionary
    with key as hash of the whole row. This function is not generic
    and can be used for week0004 practice only'''
    result = dict()
    for row in csv_rows:
        md5_email = generate_md5(row['email'])
        #only add into dict if email is not used. Skip those duplicate emails
        if not md5_email in list(result.keys()):
            result[md5_email] = row
    return result


def week4_match_sources():
Member

Don't call it week4, give it some meaningful name.

    '''Match the 2 input file and return the people
    who subscribe to both'''
    file1 = sys.argv[1]
Member

I would rather pass 2 file names to this function. Avoid accessing sys.argv to make this function reusable.
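One way to read this suggestion, as a sketch rather than the required fix: the matching function takes the two paths as parameters, and only a thin entry point reads the command line. The names `load_guests`, `find_common_guests`, and `main` are hypothetical, and the sketch assumes both files have a header row with `fname`, `lname`, and `email` columns, as in the input files of this PR.

```python
import csv


def load_guests(path):
    """Read a CSV file with a header row into a dict keyed by email."""
    guests = {}
    with open(path, newline='') as csv_file:
        # DictReader takes the field names from the first row by default
        for row in csv.DictReader(csv_file):
            # Keep the first row seen for each email
            guests.setdefault(row['email'], row)
    return guests


def find_common_guests(path1, path2):
    """Yield guests from path1 whose email and names also appear in path2."""
    first = load_guests(path1)
    second = load_guests(path2)
    for email, guest in first.items():
        other = second.get(email)
        if other is None:
            continue
        if (guest['fname'], guest['lname']) == (other['fname'], other['lname']):
            yield guest


def main(argv):
    # Only the entry point touches the command-line arguments
    for guest in find_common_guests(argv[1], argv[2]):
        print('{} {} <{}>'.format(guest['fname'], guest['lname'], guest['email']))
```

A script would call `main(sys.argv)` under its `if __name__ == "__main__":` guard, while tests or other modules can call `find_common_guests` with any two paths.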

    file2 = sys.argv[2]
    source1 = week4_csv_to_dict(read_csv(file1))
    source2 = week4_csv_to_dict(read_csv(file2))
    for key in set(source1.keys()):
        if key in set(source2.keys()):
            if source1[key]['fname'] == source2[key]['fname'] and source1[key]['lname'] == source2[key]['lname']:
                yield(source1[key])
Member

Congrats! Finally you have a valid reason to use generators ;)


def week4_output():
Member

Don't call it week4, give it some meaningful name.

    '''Outout list of duplicate guests to file'''
Member

Spelling error

    for person in week4_match_sources():
        print('{:s} {:s} <{:s}>'.format(str(person['fname']),str(person['lname']),str(person['email'])))

if __name__ == "__main__":
    week4_output()
Member

Always have a new line at the end of your code files.
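For context, POSIX defines a text line as ending in a newline character, and tools such as `diff` report "No newline at end of file" when the last line lacks one. A small sketch of a check, with the hypothetical helper name `ensure_trailing_newline`:

```python
def ensure_trailing_newline(text):
    """Return text with a final newline added if it is non-empty and lacks one."""
    if text and not text.endswith('\n'):
        return text + '\n'
    return text


# The last line of the submitted script, without and with the newline
fixed = ensure_trailing_newline('    week4_output()')
unchanged = ensure_trailing_newline('    week4_output()\n')
```

Most editors can be configured to add the final newline on save, which avoids the need for a manual check.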