crawler_valor #85

veniciusgrjr · 2015-07-25T14:29:50Z

I've created this crawler for Valor. Please check whether it's OK. I tried to take into account the comments on @lucasmachadorj's pull request.

flavioamieiro · 2015-07-27T15:31:22Z

capture/crawler_valor.py

+		index = requests.get(INDEX_URL).content
+		soup = BeautifulSoup(index, "lxml")
+		news_index = soup.find(id="block-valor_capa_automatica-central_automatico").find_all('h2')
+		news_urls = news_urls + ['http://www.valor.com.br' + BeautifulSoup(  art.encode('utf8') , "lxml" ).find('a').attrs['href'] for art in news_index]


It might be a good idea to break this line to make it more readable. PEP 8 limits lines to 79 characters (and recommends 72 for comments and docstrings). I like to follow that whenever possible.
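One possible way to break up the long line is to move the URL extraction into a small helper. This is only a sketch: `extract_news_urls` and `BASE_URL` are hypothetical names that do not appear in the pull request, and it assumes Python 3 (for `urllib.parse`). It also calls `find('a')` directly on each `<h2>` element instead of re-parsing it, uses `urljoin` in place of string concatenation, and skips headlines without a link, which the original line does not do.

```python
from urllib.parse import urljoin

# Hypothetical constant; the diff hard-codes this prefix inline.
BASE_URL = 'http://www.valor.com.br'


def extract_news_urls(headlines, base_url=BASE_URL):
    """Build absolute article URLs from headline elements.

    `headlines` is any iterable of objects with a `find('a')` method
    that returns a mapping-like link (or None), such as the bs4 Tag
    objects returned by `find_all('h2')`.
    """
    urls = []
    for headline in headlines:
        link = headline.find('a')
        # Assumption: headlines without a usable link are skipped.
        if link is None or not link.get('href'):
            continue
        urls.append(urljoin(base_url, link['href']))
    return urls
```

With the names from the diff, the call would then look like `news_urls = news_urls + extract_news_urls(news_index)`, which fits comfortably within 79 characters.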

flavioamieiro · 2015-07-27T15:43:00Z

I think that, apart from the really small issues I pointed out in line comments, the code looks good and we should merge it.

flavioamieiro · 2015-07-27T15:51:13Z

Also, another really important thing: please use 4 spaces instead of tabs. I have no real problem with tabs, but mixing spaces and tabs is a bad idea, and our entire code base uses spaces. If you use vim, you can switch to spaces by adding

    set tabstop=4
    set shiftwidth=4
    set expandtab

to your .vimrc.
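To find the lines that still need converting, a quick check along these lines might help. It is a minimal sketch, not part of the project: `find_tab_indented_lines` is a hypothetical helper that only reports line numbers and does not rewrite anything.

```python
def find_tab_indented_lines(source):
    """Return 1-based numbers of lines whose indentation contains a tab."""
    flagged = []
    for number, line in enumerate(source.splitlines(), start=1):
        # The indentation is whatever leading whitespace lstrip() removes.
        indent = line[:len(line) - len(line.lstrip())]
        if '\t' in indent:
            flagged.append(number)
    return flagged
```

Passing the contents of `capture/crawler_valor.py` to this function would list the lines to re-indent with spaces.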

flavioamieiro · 2015-07-27T15:51:45Z

capture/crawler_valor.py

+from bs4 import BeautifulSoup
+import requests
+import re
+import pandas as pd


You don't need to import re or pandas. It's a good idea to remove these imports.
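Unused imports like these can also be spotted mechanically. The sketch below uses the standard `ast` module; `unused_imports` is a hypothetical helper written for illustration, and it is deliberately simplistic (it ignores `__all__`, star imports, and names referenced only inside strings). In practice a linter such as pyflakes does this more thoroughly.

```python
import ast


def unused_imports(source):
    """Return the sorted names that are imported but never referenced."""
    tree = ast.parse(source)
    imported = set()
    used = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                # `import a.b` binds the top-level name `a`.
                imported.add(alias.asname or alias.name.split('.')[0])
        elif isinstance(node, ast.ImportFrom):
            for alias in node.names:
                imported.add(alias.asname or alias.name)
        elif isinstance(node, ast.Name):
            # Covers bare names and the base of attribute access,
            # e.g. the `requests` in `requests.get(...)`.
            used.add(node.id)
    return sorted(imported - used)
```

The source is only parsed, never executed, so the modules being checked do not need to be installed.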

veniciusgrjr added 2 commits July 25, 2015 01:05

crawler for valor

b270401

I've created logging_mc to configure logger. this way, we don't need …

e1b8a51

…to repeat this code in every crawler.

flavioamieiro reviewed Jul 27, 2015

Merge remote-tracking branch 'upstream/master'

a952589
