WebIndexer is a search engine utility written in C++. It builds an inverted index from a given dataset so that web pages can be retrieved by search term, and it cleans and tokenizes the input text so that query terms and indexed tokens match consistently.
- Text Cleaning: Removes leading and trailing punctuation and converts tokens to lowercase for standardization.
- Inverted Index Construction: Builds an index mapping tokens to the set of web pages they appear on, facilitating quick lookups.
- Search Query Processing: Supports multi-term queries that combine per-term results with union, intersection, and difference operations to refine search results.
- User-Friendly Interface: Simple command-line prompts guide the user through query input and display matching results.
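
The cleaning, indexing, and query features listed above can be illustrated with a minimal sketch. It is not WebIndexer's actual source: the names `cleanToken`, `addPage`, `search`, and `Index` are hypothetical, and the query syntax (a leading `+` for intersection, a leading `-` for difference, no prefix for union) is an assumption made for illustration only.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <iterator>
#include <map>
#include <set>
#include <sstream>
#include <string>
#include <utility>

// Hypothetical index type: token -> set of page URLs containing it.
using Index = std::map<std::string, std::set<std::string>>;

// Strip leading and trailing punctuation, then lowercase what remains.
std::string cleanToken(const std::string& token) {
    std::size_t begin = 0, end = token.size();
    while (begin < end && std::ispunct(static_cast<unsigned char>(token[begin]))) ++begin;
    while (end > begin && std::ispunct(static_cast<unsigned char>(token[end - 1]))) --end;
    std::string cleaned = token.substr(begin, end - begin);
    std::transform(cleaned.begin(), cleaned.end(), cleaned.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return cleaned;
}

// Tokenize a page's text on whitespace and record each cleaned token.
void addPage(Index& index, const std::string& url, const std::string& text) {
    std::istringstream words(text);
    std::string word;
    while (words >> word) {
        std::string token = cleanToken(word);
        if (!token.empty()) index[token].insert(url);
    }
}

// Evaluate a query left to right.
// Assumed syntax: "+term" intersects, "-term" subtracts, "term" unions.
std::set<std::string> search(const Index& index, const std::string& query) {
    static const std::set<std::string> kNoPages{};
    std::set<std::string> result;
    std::istringstream terms(query);
    std::string term;
    while (terms >> term) {
        assert(!term.empty());  // operator>> never yields an empty term
        const char op = term[0];
        auto it = index.find(cleanToken(term));  // cleaning also drops the prefix
        const std::set<std::string>& pages = (it != index.end()) ? it->second : kNoPages;
        std::set<std::string> next;
        auto out = std::inserter(next, next.end());
        if (op == '+') {
            std::set_intersection(result.begin(), result.end(), pages.begin(), pages.end(), out);
        } else if (op == '-') {
            std::set_difference(result.begin(), result.end(), pages.begin(), pages.end(), out);
        } else {
            std::set_union(result.begin(), result.end(), pages.begin(), pages.end(), out);
        }
        result = std::move(next);
    }
    return result;
}
```

With two pages indexed, one containing "Hello, world!" and one containing "hello there", the query `hello +world` in this sketch would match only the first page and `hello -world` only the second. Storing pages in a `std::set` keeps them sorted, which is what the standard `std::set_union`, `std::set_intersection`, and `std::set_difference` algorithms require of their inputs.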