NikitaKharkov/devProTestTask

Dev Pro test task

Task:

Write a parser for the site http://www.wordreference.com/synonyms/

Use OOP and PHP 7.1

Setup

Download the project and go to the project directory. Run docker-compose up, then docker exec -it dp_php bash to open a shell in the container.

Inside the container, run composer install and then php bin/console rabbitmq-supervisor:rebuild.

Run exit to leave the container.

Add dp.loc to /etc/hosts on your machine, then open dp.loc in a browser. That's it.

Results

I used proxy servers. They help, but they slow the process down considerably. I think there is a much easier solution, but I don't know it yet.

I used RabbitMQ queues to speed up the process, but then I ran into a captcha that I had not anticipated, so I decided to use proxies. After many attempts I found that even with proxies I cannot guarantee grabbing the synonyms of every word. I see three ways to solve the problem:

  • Use a large pool of proxies, and when a proxy returns a page with a captcha, remove it from the list;
  • Work out what request interval the site tolerates and set a timeout between requests, although this is very slow;
  • Use an API for this, as the site intends.
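
The first option could look roughly like the sketch below. It is not the code from this repository: the ProxyPool class, its method names, and the cURL-based fetcher are hypothetical, and it assumes that a blocked response can be recognised by the word "captcha" in the page body, which may not match what the site actually returns.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical proxy pool: hands out proxies in turn and drops any
 * proxy whose response looks like a captcha page.
 */
final class ProxyPool
{
    /** @var string[] */
    private $proxies;

    /** @var callable takes (string $url, string $proxy), returns the body */
    private $fetcher;

    public function __construct(array $proxies, callable $fetcher)
    {
        $this->proxies = array_values($proxies);
        $this->fetcher = $fetcher;
    }

    /** Returns the page body, or null once every proxy has been dropped. */
    public function fetch(string $url): ?string
    {
        while ($this->proxies !== []) {
            // Take the proxy at the front of the list.
            $proxy = array_shift($this->proxies);
            $body = ($this->fetcher)($url, $proxy);

            if ($this->looksLikeCaptcha($body)) {
                // Blocked: the proxy is not put back, so it leaves the pool.
                continue;
            }

            // Healthy: move it to the back so requests rotate over the pool.
            $this->proxies[] = $proxy;
            return $body;
        }

        return null;
    }

    /** @return string[] proxies still considered usable */
    public function remaining(): array
    {
        return $this->proxies;
    }

    // Assumption: a blocked response contains the word "captcha".
    private function looksLikeCaptcha(string $body): bool
    {
        return stripos($body, 'captcha') !== false;
    }
}

// Assumed fetcher: a plain cURL request sent through the given proxy.
$curlFetcher = function (string $url, string $proxy): string {
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_PROXY => $proxy,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_TIMEOUT => 10,
    ]);
    $body = curl_exec($ch);
    curl_close($ch);
    // curl_exec returns false on a transport error; pass that on as an empty body.
    return $body === false ? '' : $body;
};
```

A consumer would build the pool once, for example new ProxyPool($proxyList, $curlFetcher), and call fetch() for each word page. A null result means the pool is exhausted and the list needs refilling. The fetcher is injected so the pool can be exercised without touching the network.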

If you could tell me how to do it, I would be very thankful, because I very much want to solve this task regardless of your decision.
