strange diffs getting tweeted #7
Comments
Should be interesting to compare Python readability with the JavaScript version that it is based on. |
Weird. I just ran the above, and I get:
If I view the source of https://www.thestar.com/news/world/2017/01/11/uk-teen-charged-with-murder-of-7-year-old-girl.html I see that text in a large chunk of JavaScript here ...or just ctrl+F "Good Wife" |
Run the same thing a few minutes later and I get:
|
Wild. I guess now we know why we're getting the diffs. What the heck is going on? Could they be serving advertisements randomly to people? |
Looks like it isn't actually grabbing the body consistently. I'm not seeing a way to really tweak Readability either.
|
I've put the torstar account on pause until we can figure this one out since it's putting out so many false positives. |
That's a wise move. Definitely leaving it open because I bet we run up against this type of issue with other sites. |
@edsu I think things have resolved themselves for the most part with the recent commits. What we were seeing before is now coming through like this tweet: https://twitter.com/torstar_diff/status/842119916958453762 -- So, maybe resolving #28 might fully resolve this issue? |
@ruebot noticed a series of odd updates like this which led to the discovery that readability returns very little content sometimes. For example:
returns (at the moment):
Perhaps there should be a configurable threshold below which the content will be ignored, or at least not tweeted? Could readability be tuned in this case to return more appropriate content, like the text of the AP press release?
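To make the threshold idea concrete, here is a minimal sketch of what such a check might look like. It is not taken from the project's code: the function name `is_plausible_extraction` and the parameters `min_chars` and `min_ratio` are hypothetical, and the default values are guesses that would need tuning per feed.

```python
def is_plausible_extraction(text, previous_text=None, min_chars=200, min_ratio=0.5):
    """Return True if extracted article text looks complete enough to diff.

    Hypothetical helper: `min_chars` is an absolute floor on the extracted
    length, and `min_ratio` rejects a version that shrank sharply relative
    to the previously stored one. Both defaults are assumptions.
    """
    # Collapse runs of whitespace so padding does not inflate the length.
    normalized = " ".join(text.split())
    if len(normalized) < min_chars:
        return False

    if previous_text is not None:
        previous = " ".join(previous_text.split())
        # A sudden large drop suggests the extractor grabbed the wrong node.
        if previous and len(normalized) < min_ratio * len(previous):
            return False

    return True
```

A caller could skip storing or tweeting any version for which this returns `False`, so that a short fragment picked up from an embedded script would not be treated as a real change to the article.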