WebIndexer

WebIndexer is a search engine utility written in C++. It builds an inverted index from a given dataset of web pages, then answers queries by returning the pages that match the search terms. Input text is cleaned and tokenized before indexing, so differently punctuated or capitalized forms of a word resolve to the same index entry.

  • Text Cleaning: Removes leading and trailing punctuation and converts tokens to lowercase for standardization.
  • Inverted Index Construction: Builds an index mapping tokens to the set of web pages they appear on, facilitating quick lookups.
  • Search Query Processing: Supports complex queries with union, intersection, and difference operations to refine search results.
  • User-Friendly Interface: Simple command-line prompts guide the user through query input and display matching results.
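
The features above can be illustrated with a minimal sketch. It is not the project's actual source: the names `cleanToken`, `addPage`, and `search` are hypothetical, and the query syntax is an assumption (a plain term is unioned into the result, a `+term` is intersected with it, and a `-term` is subtracted from it, evaluated left to right).

```cpp
#include <algorithm>
#include <cctype>
#include <iterator>
#include <map>
#include <set>
#include <sstream>
#include <string>
#include <utility>

// Inverted index: token -> set of page URLs containing that token.
using Index = std::map<std::string, std::set<std::string>>;

// Strip leading and trailing punctuation, then lowercase what remains.
// Returns an empty string if the input has no non-punctuation characters.
std::string cleanToken(const std::string& raw) {
    auto isPunct = [](unsigned char c) { return std::ispunct(c) != 0; };
    auto first = std::find_if_not(raw.begin(), raw.end(), isPunct);
    auto last = std::find_if_not(raw.rbegin(), raw.rend(), isPunct).base();
    if (first >= last) return "";
    std::string out(first, last);
    std::transform(out.begin(), out.end(), out.begin(), [](unsigned char c) {
        return static_cast<char>(std::tolower(c));
    });
    return out;
}

// Tokenize a page body on whitespace and record each cleaned token.
void addPage(Index& index, const std::string& url, const std::string& text) {
    std::istringstream in(text);
    std::string word;
    while (in >> word) {
        std::string token = cleanToken(word);
        if (!token.empty()) index[token].insert(url);
    }
}

// Evaluate a query left to right (assumed syntax, see the lead-in):
// "term" = union, "+term" = intersection, "-term" = difference.
std::set<std::string> search(const Index& index, const std::string& query) {
    static const std::set<std::string> kEmpty;
    std::set<std::string> result;
    std::istringstream in(query);
    std::string term;
    while (in >> term) {
        char op = term[0];  // cleanToken strips the modifier as punctuation
        auto it = index.find(cleanToken(term));
        const std::set<std::string>& pages =
            (it != index.end()) ? it->second : kEmpty;
        std::set<std::string> next;
        auto out = std::inserter(next, next.end());
        if (op == '+') {
            std::set_intersection(result.begin(), result.end(),
                                  pages.begin(), pages.end(), out);
        } else if (op == '-') {
            std::set_difference(result.begin(), result.end(),
                                pages.begin(), pages.end(), out);
        } else {
            std::set_union(result.begin(), result.end(),
                           pages.begin(), pages.end(), out);
        }
        result = std::move(next);
    }
    return result;
}
```

Because `std::set` keeps its elements sorted, the standard set algorithms can combine two page sets in a single linear pass, which is one reason a set of URLs is a convenient value type for the index. Under the assumed syntax, a query such as `hello +world` would return only pages containing both terms, and `hello -world` would return pages containing `hello` but not `world`.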