ADM-HW4-G18

The group is composed of: Alessandra Anna Griesi, Hannes Engelhardt and Federica Spoto.

In this assignment we solved two different tasks: in the first one we performed a clustering analysis of house announcements, collecting the data through web scraping; in the second one we defined two hash functions to check for duplicates in a list of passwords.

Data Source

For the first task, we used data from the Immobiliare.it website, taking into account more than 10k announcements. For the second task, the passwords are given as input in the file passwords2.txt.
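As a rough illustration of the idea behind the second task, the snippet below sketches how a hash function can be used to spot duplicates in a list of strings. It is not the code from this repository: `poly_hash` and `count_duplicates` are hypothetical names, and the polynomial hash (base 31, modulus 2^61 - 1) is only an assumed example of a hash function, not necessarily one of the two defined in the notebook.

```python
from collections import defaultdict


def poly_hash(s, base=31, mod=2**61 - 1):
    """Hypothetical polynomial rolling hash of a string (order-sensitive)."""
    h = 0
    for ch in s:
        h = (h * base + ord(ch)) % mod
    return h


def count_duplicates(passwords):
    """Count how many entries repeat a password already seen earlier.

    Strings are grouped by hash value; within a bucket the strings are
    compared exactly, so a hash collision is never counted as a duplicate.
    """
    buckets = defaultdict(set)  # hash value -> distinct strings with that hash
    duplicates = 0
    for pw in passwords:
        seen = buckets[poly_hash(pw)]
        if pw in seen:
            duplicates += 1
        else:
            seen.add(pw)
    return duplicates
```

For the real input, the list could be built by reading passwords2.txt line by line and stripping the trailing newline from each entry.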

Description of the project

In this repository you will find:

  1. Homework_4.ipynb: the Jupyter notebook containing all the work done to reach the final results:
     • Implementation and comments of Task 1;
     • Implementation and comments of Task 2;
     • Bonus step: implementation of the K-means algorithm from scratch.
  2. function.py: the file containing all the functions created during the study and used in Homework_4.ipynb.
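For readers unfamiliar with the bonus step, here is a minimal sketch of what a from-scratch K-means (Lloyd's algorithm) can look like in pure Python. It is an illustration under assumptions, not the implementation in function.py: the names `squared_distance` and `kmeans`, the random initialization from the data points, and the stopping rule are all hypothetical choices.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, max_iter=100, seed=0):
    """Cluster `points` (a list of numeric tuples) into `k` groups.

    Returns (centroids, labels), where labels[i] is the index of the
    centroid assigned to points[i].
    """
    rng = random.Random(seed)
    # Assumed initialization: k distinct data points chosen at random.
    centroids = [tuple(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                dim = len(members[0])
                new_centroids.append(
                    tuple(sum(m[d] for m in members) / len(members) for d in range(dim))
                )
            else:
                # An empty cluster keeps its previous centroid.
                new_centroids.append(centroids[j])
        if new_centroids == centroids:
            break  # centroids stopped moving: converged
        centroids = new_centroids
    return centroids, labels
```

A call such as `kmeans(list_of_feature_tuples, k=3)` returns the final centroids together with one cluster label per input point. The result depends on the initialization, so different seeds can give different clusterings.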