soup.find fails to find Tableau data #58

stepa8 · 2022-03-18T18:56:20Z

I ran this under WSL (Ubuntu) on Windows 10.

from tableauscraper import TableauScraper as TS

url = "https://public.tableau.com/app/profile/epidemiology.immunization.services.branch/viz/COVID-19DailyHighlights/DailyHighlights"
ts = TS()
ts.loads(url)

Then, we see this error:
python scrape_tableau.py
Traceback (most recent call last):
  File "scrape_tableau.py", line 9, in <module>
    ts.loads(url)
  File "/mnt/c/Users/stepa8/Projects/tableau-scraping/tab-env/lib/python3.8/site-packages/tableauscraper/TableauScraper.py", line 80, in loads
    soup.find("textarea", {"id": "tsConfigContainer"}).text
AttributeError: 'NoneType' object has no attribute 'text'

It appears soup.find("textarea", {"id": "tsConfigContainer"}) returns None, meaning the page fetched from that URL contains no such element.

Is there a workaround?
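
To make the failure above easier to diagnose, here is a minimal sketch of the same lookup with an explicit check for the missing element. It uses the standard library's html.parser instead of BeautifulSoup so it runs without extra installs; TextareaFinder and load_ts_config are hypothetical names for illustration, not part of tableauscraper.

```python
import json
from html.parser import HTMLParser


class TextareaFinder(HTMLParser):
    """Collects the text of the first <textarea> carrying a given id."""

    def __init__(self, target_id):
        super().__init__()
        self.target_id = target_id
        self.inside = False
        self.found = False
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "textarea" and not self.found and dict(attrs).get("id") == self.target_id:
            self.inside = True
            self.found = True

    def handle_endtag(self, tag):
        if tag == "textarea":
            self.inside = False

    def handle_data(self, data):
        # Text may arrive in several pieces, so collect and join later.
        if self.inside:
            self.chunks.append(data)


def load_ts_config(html, element_id="tsConfigContainer"):
    """Return the parsed JSON config, or raise a readable error if it is absent."""
    finder = TextareaFinder(element_id)
    finder.feed(html)
    finder.close()
    if not finder.found:
        raise ValueError(
            f"no <textarea id={element_id!r}> in the page; "
            "the URL may not be the embeddable view URL"
        )
    return json.loads("".join(finder.chunks))
```

With a check like this, a page that lacks the element fails with a message about the URL rather than an AttributeError on None.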

xplreitr · 2022-04-26T22:21:52Z

I was running into a similar problem and this issue sent me in the right direction.

#30

It seems there is a URL other than the public-facing one. You have to open Chrome DevTools and, in the Network tab, find the URL that starts with https://public.tableau.com/views....

I tried looking up the one you were interested in and couldn't find that exact Tableau worksheet. The only one published by epidemiology.immunization.services.branch was this one: https://public.tableau.com/app/profile/epidemiology.immunization.services.branch/viz/COVID-19DemographicsTEST_16498711218660/DailyCounts

If you watch the Network tab while it loads, this URL shows up:

https://public.tableau.com/views/COVID-19DemographicsTEST_16498711218660/DailyCounts

I did a quick test and this URL seems to work. Someone more knowledgeable might be able to explain the difference between the two URLs, but it would be helpful to note in the documentation that the public-facing URL is not exactly the URL needed to make this work.
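
Going only by the two URLs above, the rewrite from the public-facing form to the /views form could be sketched like this. It assumes the path always looks like /app/profile/<author>/viz/<workbook>/<sheet>, which I have only seen confirmed for this one example; to_views_url is a hypothetical helper, not something tableauscraper provides.

```python
from urllib.parse import urlsplit, urlunsplit


def to_views_url(public_url):
    """Map .../app/profile/<author>/viz/<workbook>/<sheet> to .../views/<workbook>/<sheet>.

    URLs that do not match the assumed pattern are returned unchanged.
    """
    parts = urlsplit(public_url)
    segments = [s for s in parts.path.split("/") if s]
    matches = (
        len(segments) == 6
        and segments[:2] == ["app", "profile"]
        and segments[3] == "viz"
    )
    if not matches:
        return public_url
    workbook, sheet = segments[4], segments[5]
    # Query string and fragment are dropped on purpose.
    return urlunsplit((parts.scheme, parts.netloc, f"/views/{workbook}/{sheet}", "", ""))


print(to_views_url(
    "https://public.tableau.com/app/profile/epidemiology.immunization.services.branch"
    "/viz/COVID-19DemographicsTEST_16498711218660/DailyCounts"
))
```

If the result does not load, checking the Network tab as described above is still the reliable way to find the real URL.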

martinolmos · 2023-10-25T18:53:52Z

Hello, thank you for this amazing library.

I am facing a similar issue. I found the public.tableau.com/views URL, but it returns an empty DataFrame.
Here is the URL: 'https://public.tableau.com/views/DB_FISCA_01/Fisca_DS_RankingPeliculas'

martinolmos · 2023-10-25T20:07:25Z

I went through the source code, and the problem is that data['secondaryInfo'] is empty.

Here is my code, which I took from here:

import json
import re

import requests
from bs4 import BeautifulSoup

url = "https://public.tableau.com/views/DB_FISCA_01/Fisca_DS_RankingPeliculas"

# Fetch the embedded view page, which carries the session config.
r = requests.get(
    url,
    params={
        ":display_static_image": "y",
        ":bootstrapWhenNotified": "true",
        ":embed": "y",
        ":language": "es-ES",
        ":showVizHome": "n",
        ":apiID": "host0",
    },
)

# The config is JSON stored in a <textarea id="tsConfigContainer">.
soup = BeautifulSoup(r.text, "html.parser")
tableauData = json.loads(soup.find("textarea", {"id": "tsConfigContainer"}).text)

dataUrl = f'https://public.tableau.com{tableauData["vizql_root"]}/bootstrapSession/sessions/{tableauData["sessionid"]}'

# Start the session for the sheet named in the config.
r = requests.post(dataUrl, data={
    "sheet_id": tableauData["sheetId"],
})

# The response holds two JSON blocks, each preceded by "<number>;".
dataReg = re.search(r'\d+;({.*})\d+;({.*})', r.text, re.MULTILINE)
info = json.loads(dataReg.group(1))
data = json.loads(dataReg.group(2))

print(data) then outputs {'secondaryInfo': {}}
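
For anyone trying to reproduce this without hitting the server, the parsing step can be exercised on its own. The sketch below applies the same regex to a synthetic response string (made up for illustration, not real Tableau output) and reports whether secondaryInfo came back empty; split_bootstrap_response and has_secondary_info are hypothetical helper names.

```python
import json
import re

# Same pattern as in the script above: two JSON blocks, each preceded by "<number>;".
BOOTSTRAP_PATTERN = re.compile(r"\d+;({.*})\d+;({.*})", re.DOTALL)


def split_bootstrap_response(text):
    """Return (info, data) parsed from the two JSON blocks in the response text."""
    match = BOOTSTRAP_PATTERN.search(text)
    if match is None:
        raise ValueError("response does not contain two number-prefixed JSON blocks")
    return json.loads(match.group(1)), json.loads(match.group(2))


def has_secondary_info(data):
    """True only if the 'secondaryInfo' key exists and is non-empty."""
    return bool(data.get("secondaryInfo"))


# Synthetic sample mimicking the empty result reported above.
sample = '8;{"a": 1}21;{"secondaryInfo": {}}'
info, data = split_bootstrap_response(sample)
print(has_secondary_info(data))
```

A False result here corresponds to the empty DataFrame case: the request succeeded and parsed, but the second block carried no data.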
