Merge pull request #1 from awiouy/56.27pre-20160719
56.27pre 20160719
awiouy authored Jul 20, 2016
2 parents a6930ef + f301251 commit 376bd65
Showing 929 changed files with 106,409 additions and 0 deletions.
Empty file removed README.md
Empty file.
Binary file added WebGrab+Plus/WG2MP.exe
Binary file not shown.
Binary file added WebGrab+Plus/WebGrab+Plus.exe
Binary file not shown.
Binary file added WebGrab+Plus/xmltv.dll
Binary file not shown.
Binary file added WebGrab+Plus/xmltv_time_correct.exe
Binary file not shown.
116 changes: 116 additions & 0 deletions config/WebGrab++.config.xml
@@ -0,0 +1,116 @@
<?xml version="1.0"?>
<!-- Configuration file for WebGrab+Plus, the incremental Electronic-Program-Guide web grabber
by Jan van Straaten, December 2011
Version V1.1.1 -->

<settings>
<!-- filename
The path (required) + filename where the EPG guide xml file is or will be located. It must include drive and folder, like C:\ProgramData\ServerCare\WebGrab\guide.xml
If the file already exists (from the last run or from another xmltv source), it is read and whatever fits the requested output is reused; the file is then updated. If no such file exists, it is created.
Change the following to your own needs :
-->
<filename>/storage/.kodi/userdata/addon_data/service.webgrabplus/guide.xml</filename>

<!-- modes:
d or debug saves the output xmltv file in a file with -debug added to the file name. The original xmltv file will be kept.
m or measure measures the time for each updated show or new show added
n or nomark disables the update-type marking (n) (c) (g) (r) at the end of the description
v or verify verifies the result following a channel update
w or wget uses wget as grab engine (might improve site recognition in rare cases)
Note that modes can be combined in one line, separated by commas or spaces, or both.
-->
<mode>m</mode>
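<!-- Illustrative sketch only, not part of the shipped defaults: as noted above, several modes can be combined in one entry, separated by commas, spaces, or both. Measuring and verifying in one run might look like this:
<mode>m, v</mode>
-->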

<!-- postprocess:
Optional; specifies which of the available postprocesses should run.
syntax: <postprocess run="" grab="">processname</postprocess>
(optional) grab="yes" or "y" or "true" or "on" : grabs epg first (default) ; "no" or "n" or "false" or "off" : skip epg grabbing
(optional) run="yes" or "y" or "true" or "on" : runs the postprocess (default) ; "no" or "n" or "false" or "off" : do not run post process
processname: the process to run :
processname = mdb runs a built-in movie database grabber (read / adapt ...\mdb\mdb.config.xml)
processname = rex runs a postprocess that re-allocates xmltv elements (read / adapt ...\rex\rex.config.xml)
examples:
<postprocess run="on" grab="on">mdb</postprocess> grabs first, then runs mdb
<postprocess>mdb</postprocess> same as above (uses defaults for grab and run)
<postprocess grab="no">rex</postprocess> runs rex without grab (existing xmltv file)
-->
<postprocess run="y" grab="y">mdb</postprocess>

<!-- proxy:
This setting is only required if your computer connects to the internet through a proxy.
Specify the proxy address as ip:port, like <proxy>192.168.2.4:8080</proxy>,
or as <proxy>automatic</proxy>, which attempts to read the proxy address from your connection settings. If your proxy requires a username and password, add them like
<proxy user="username" password="password">192.168.2.4:8080</proxy>
<proxy>192.168.2.2:8080</proxy>
-->
<proxy>automatic</proxy>

<!-- user agent:
The user agent string that is sent to the tvguide website. Some sites require this. Valid values are either random, in which case the program generates a random string, or any other string like <user-agent>Mozilla/5.0 (Windows; U; MSIE 9.0; WIndows NT 9.0; en-US)</user-agent>
<user-agent>random</user-agent>
<user-agent>Mozilla/5.0 (Windows; U; MSIE 9.0; WIndows NT 9.0; en-US)</user-agent>
<user-agent>Mozilla/5.0 (Linux; U; Android 0.5; en-us) AppleWebKit/522+ (KHTML, like Gecko) Safari/419.3</user-agent>
-->
<user-agent>Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; WOW64; Trident/5.0; yie9)</user-agent>


<!-- logging:
Enter 'on' to enable logging; any other value turns it off.
-->
<logging>on</logging>

<!--retry
The simplest form of retry defines the number of times the grabber engine should attempt to capture a web page before giving up and continuing with the next page, like <retry>4</retry>
It is also the place to specify delays between retries and between the grabbing of html pages, with the following attributes:
time-out : the delay between retries (default is 10 sec)
channel-delay : the delay between subsequent channels (default is 0)
index-delay : the delay between the grabbing of index pages (default is 0)
show-delay : the delay between the grabbing of detail show pages (default is 0)
In the most complete version it will look like this:
<retry time-out="5" channel-delay="5" index-delay="1" show-delay="1">4</retry>
-->
<retry time-out="5">4</retry>

<!--skip
It takes two values H,m separated by a comma:
The first, H : if a show takes more than H hours, it is either tellsell or other commercial fluff, or simply a mistake or error; such shows are skipped.
The second, m : if a show lasts m minutes or less, it is probably an announcement, in any case not a real show.
When entered as <skip></skip> the defaults are 12 hours, 1 minute, same as <skip>12,1</skip>. To disable this function enter <skip>noskip</skip> or just leave out this entry completely.
<skip>14, 1</skip>
<skip>16,1</skip>
-->

<skip>noskip</skip>

<!--timespan
The timespan for which shows will be grabbed.
It takes one or two values separated by a comma or a space. The first is the number of days to download beyond today, so 0 means today only. The second (optional) is a time between 0:00 and 24:00, which reduces the download to only the one show (per day) that is scheduled around the specified time. Any value from the start time (inclusive) up to the stop time will do.
This one-show-only mode is helpful if a SiteIni file needs to be debugged.
-->
<timespan>0</timespan>
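<!-- Illustrative sketch only, assuming the value format described above (day count first, optional time second) and that 0 means today only:
<timespan>2</timespan> grabs today and the two following days
<timespan>0 20:30</timespan> grabs only the show scheduled around 20:30 today
-->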

<!-- update mode
i or incremental only updates changes, gaps, repairs and new shows
l or light forces update of today and new shows, rest as incremental
s or smart forces update of today and tomorrow and new shows, rest as light
f or full or force forces full update
If one of these values is entered here it will apply to all channels selected for update
(see channel). This value overrules the value of 'update' in the individual channels.
If no value is entered here, the individual 'update' values from the channel list are taken.
-->
<update></update>

<!-- The channel-list :
Each channel to be grabbed has a separate entry in the list, the most common form is:
<channel update=.. site=.. site_id=.. xmltv_id=.. >display-name</channel>
Besides this form, there is a possibility to specify special channels like 'combi-channels' and 'timeoffset-channels', see further down for more information-->
<!-- Channel list files :
The easiest way to compose this channel-list is to copy the required channels from the channel-list files which can be found in the SiteIni.Pack for nearly every supported tvguide site. -->
<!-- update :
The mode values here can be set for each channel differently if not overruled by the general update setting (see above). Allowed values are the same as for the general update setting. Any other value will be ignored. If any of the allowed values of 'update' is entered, this channel will be updated; with no value there is no update! In that case the epg data of that channel will remain as it is. -->
<!-- site:
The website to be used to get the EPG from. The value entered here is the name of the .ini file that supplies the specific parameters for the site without .ini extension.
e.g. tvgids.nl.ini becomes site="tvgids.nl" and gids.publiekeomroep.nl.ini becomes site="gids.publiekeomroep.nl".-->
<!-- site_id:
This is the number or text used by the site as reference to the correct html page for this channel. It is used by the program to compose the url for the shows for a channel. For nearly all sites supported by the program a channel-list file is provided in the siteini-pack. It lists most of the available channels, including this site_id -->
<!-- xmltv_id :
The xmltv_id can be any string that suits your needs, you will find it back as the "channel" in your xml file as in :
<programme start="20100218072500 +0200" stop="20100218075500 +0200" channel="RTL7-id"> -->
<!-- display-name: This will be used in the xmltv file as the channel's display name, that is, the name the EPG program will use to display the channel. Give it any value you like. It is no problem if site_id, xmltv_id and display-name are equal -->
<!-- Important !
Be aware that all channels entered here will be included in the xmltv channel table even if no update is requested. This allows the update of individual channels without affecting the data of the others in the list. A channel not in this list will be removed from your xmltv listing, together with all of its show data, if WebGrab+Plus finds it there. (If you use WebGrab+Plus with an xmltv input file from another source, it will remove all data from channels not in this list and create an entry for new channels.)
WebGrab+Plus uses the xmltv_id to identify a channel in an existing xmltv file.
-->
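<!-- Illustrative sketch only: a channel entry in the common form described above. The site and xmltv_id values are taken from the examples in these comments; the site_id "34" and the display name are hypothetical placeholders, so copy the real values from the channel-list file in the siteini-pack.
<channel update="i" site="tvgids.nl" site_id="34" xmltv_id="RTL7-id">RTL 7</channel>
-->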
</settings>
12 changes: 12 additions & 0 deletions config/chans2correct.xml
@@ -0,0 +1,12 @@
<channels>
<!--
This file specifies the channels to be 'time' corrected by xmltv_time_correct.exe
Syntax of this file:
<channel time_error="+1">channel-name</channel>
- time_error : the number of (decimal) hours the channel is 'off'
(so, if you want all shows of a channel 1 hour later in the output xmltv file, specify time_error="-1";
for 1 hour 30 minutes earlier, specify time_error="1.5")
- channel-name : the xmltv_id, as in WebGrab++.config.xml, of the channel you want to correct
Examples:
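(illustrative only; the channel ids below are hypothetical and must match the xmltv_id values in your own config)
<channel time_error="-1">RTL7-id</channel> shifts all shows of RTL7-id 1 hour later
<channel time_error="1.5">Example-id</channel> shifts all shows of Example-id 1 hour 30 minutes earlier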
-->
</channels>
3 changes: 3 additions & 0 deletions config/guide.xml
@@ -0,0 +1,3 @@
<?xml version="1.0" encoding="UTF-8"?>
<tv generator-info-name="WebGrab+Plus/w MDB &amp; REX Postprocess -- version V1.56.27 -- Jan van Straaten" generator-info-url="http://www.webgrabplus.com">
</tv>
61 changes: 61 additions & 0 deletions config/mdb/imdb.com.ask.ini
@@ -0,0 +1,61 @@
* WebGrab+Plus ini for grabbing IMDB data from TvGuide websites
* Site : imdb.com, primary search with ask.com
* revision : 1 correction in production date
* Jan van Straaten, 14/04/2012
*
site {url=imdb.com|cultureinfo=en-GB|charset=UTF-8|matchfactor=60|searchsite=ask}
* primary search:
http://www.ask.com/web?&q=imdb%2bDer+grosse+Edison%2b%2bClarence+Brown&/NCR
*url_primarysearch {url(urlencode=1,2,3,4,5,6)|http://www.ask.com/web?&q=|imdb+|'title'|+|'productiondate'|+|'credit'|&/NCR}
url_primarysearch {url(debug urlencode=1,2,3,4)|http://www.ask.com/web?&q=|imdb+|'title'|+|'credit'|&/NCR}
show_id.scrub {multi|primary|imdb|/tt|/|onmousedown}
*
show_id.modify {remove| } * remove spaces
* filter showid (7 char long):
mdb_temp_1.modify {calculate(type=element format=F0)|'show_id' #} * number of show_id's = loop index
loop {('mdb_temp_1' > "0" max=50)|4}
mdb_temp_1.modify {calculate(format=F0)|1 -} * decrease index
mdb_temp_2.modify {substring(type=element)|'show_id' 'mdb_temp_1' 1} * the showid to inspect
mdb_temp_3.modify {calculate(type=char format=F0)|'mdb_temp_2' #} * how many chars in this show_id?
show_id.modify {remove('mdb_temp_3' not "7" type=element)|'show_id' 'mdb_temp_1' 1} * remove this show_id if not 7 chars
* end loop
* filter showid (only numbers and < 2500000):
mdb_temp_1.modify {calculate(type=element format=F0)|'show_id' #} * number of show_id's = loop index
loop {('mdb_temp_1' > "0" max=50)|5}
mdb_temp_1.modify {calculate(format=F0)|1 -} * decrease index
mdb_temp_2.modify {substring(type=element)|'show_id' 'mdb_temp_1' 1} * the showid to inspect
mdb_temp_3.modify {calculate(format=F0)|'mdb_temp_2'} * convert to number
show_id.modify {remove('mdb_temp_3' "0" type=element)|'show_id' 'mdb_temp_1' 1} * remove this show_id if not only numbers
show_id.modify {remove('mdb_temp_3' > "2500000" type=element)|'show_id' 'mdb_temp_1' 1} * remove this show_id if > 2500000
* end loop
*
* imdb url's:
url_mdb_p1 {url|primary|http://www.imdb.com/title/tt|show_id|/}
*url_mdb_p1 {url|primary|http://www.imdb.com/find?q=tt|show_id|&s=all}
*http://www.imdb.com/find?q=tt2200000&s=all
*url_mdb_p2.modify {addstart|'url_mdb_p1'plotsummary}
*url_mdb_p3.modify {addstart|'url_mdb_p1'releaseinfo#akas}
*url_mdb_p4.modify {addstart|'url_mdb_p1'reviews}
*url_mdb_p5.modify {addstart|'url_mdb_p1'fullcredits#cast}
*
url_mdb_p2 {url|primary|http://www.imdb.com/title/tt|show_id|/plotsummary}
url_mdb_p3 {url|primary|http://www.imdb.com/title/tt|show_id|/releaseinfo#akas}
url_mdb_p4 {url|primary|http://www.imdb.com/title/tt|show_id|/reviews}
url_mdb_p5 {url|primary|http://www.imdb.com/title/tt|show_id|/fullcredits#cast}
*
* imdb elements
mdb_title.scrub {single|p1|<span class="title-extra">||<i>|</span>} * original title when redirected
mdb_title.scrub {single(separator=" - " exclude="IMDb" include=first)|p1|<head>|<title>|(|</title>}
mdb_title.scrub {multi(separator=" - ")|p3|<h5><a name="akas">Also Known As (AKA)</a></h5>|<tr>\n<td>|</td>|</table>} *aka's
*mdb_productiondate.scrub {single|p1|<title>|(|)|</title>}
mdb_productiondate.scrub {single|p1|<title>||</title>|</title>}
mdb_actor.scrub {multi|p1|itemprop="actors"|>|</a>|</div>}
mdb_actor.scrub {multi(exclude="<img src=")|p5|castlist/position|;">|</a>|</table>} * full list
mdb_director.scrub {multi|p1|itemprop="director"|>|</a>|</div>}
mdb_director.scrub {multi|p5|Directed by</a>|/">|</a>|</a>} * full list
mdb_starrating.scrub {single|p1|Ratings:|itemprop="ratingValue">|</span>|from}
mdb_starratingvotes.scrub {single|p1|Ratings:|itemprop="ratingCount">|</span>|users</a>}
mdb_commentsummary.scrub {multi(max=5 exclude="This review may contain spoilers")|p4|<a href="reviews-index?">|<b>|</b>|Add another review}
mdb_review.scrub {multi(exclude="SPOILERS ARE INCLUDED" include=first)|p4|<a href="reviews-index?">|</p>\n<p>|</p>\n\n|<div class="yn"}
mdb_plot.scrub {single|p2|<p class="plotpar">||<i>|</i>}
mdb_description.scrub {single|p1|<meta name="description"|content="|" />|<meta}
57 changes: 57 additions & 0 deletions config/mdb/imdb.com.bing.ini
@@ -0,0 +1,57 @@
* WebGrab+Plus ini for grabbing IMDB data from TvGuide websites
* Site : imdb.com, primary search with bing.com
* revision : 1 correction in productiondate
* Jan van Straaten, 14/04/2012
*
site {url=imdb.com|cultureinfo=en-GB|charset=UTF-8|matchfactor=60|searchsite=bing}
* primary search:
url_primarysearch {url|http://www.bing.com/search?q=|imdb+title/tt+|'title'|+|'productiondate'|+|'credit'|&scope=web&setmkt=en-US&qs=ns&form=QBRE&qb=2}
*scope=web&setmkt=es-ES&setlang=match
show_id.scrub {multi(exclude="AND")|primary|imdb|/tt|/|onmousedown}
*
* filter showid (7 char long):
show_id.modify {remove| } * remove spaces
mdb_temp_1.modify {calculate(type=element format=F0)|'show_id' #} * number of show_id's = loop index
loop {('mdb_temp_1' > "0" max=50)|4}
mdb_temp_1.modify {calculate(format=F0)|1 -} * decrease index
mdb_temp_2.modify {substring(type=element)|'show_id' 'mdb_temp_1' 1} * the showid to inspect
mdb_temp_3.modify {calculate(type=char format=F0)|'mdb_temp_2' #} * how many chars in this show_id?
show_id.modify {remove('mdb_temp_3' not "7" type=element)|'show_id' 'mdb_temp_1' 1} * remove this show_id if not 7 chars
* end loop
* filter showid (only numbers and < 2500000):
mdb_temp_1.modify {calculate(type=element format=F0)|'show_id' #} * number of show_id's = loop index
loop {('mdb_temp_1' > "0" max=50)|5}
mdb_temp_1.modify {calculate(format=F0)|1 -} * decrease index
mdb_temp_2.modify {substring(type=element)|'show_id' 'mdb_temp_1' 1} * the showid to inspect
mdb_temp_3.modify {calculate(format=F0)|'mdb_temp_2'} * convert to number
show_id.modify {remove('mdb_temp_3' "0" type=element)|'show_id' 'mdb_temp_1' 1} * remove this show_id if not only numbers
show_id.modify {remove('mdb_temp_3' > "2500000" type=element)|'show_id' 'mdb_temp_1' 1} * remove this show_id if > 2500000
* end loop
*
* imdb url's:
url_mdb_p1 {url|primary|http://www.imdb.com/title/tt|show_id|/}
*url_mdb_p2.modify {addstart|'url_mdb_p1'plotsummary}
*url_mdb_p3.modify {addstart|'url_mdb_p1'releaseinfo#akas}
*url_mdb_p4.modify {addstart|'url_mdb_p1'reviews}
*url_mdb_p5.modify {addstart|'url_mdb_p1'fullcredits#cast}
*
url_mdb_p2 {url|primary|http://www.imdb.com/title/tt|show_id|/plotsummary}
url_mdb_p3 {url|primary|http://www.imdb.com/title/tt|show_id|/releaseinfo#akas}
url_mdb_p4 {url|primary|http://www.imdb.com/title/tt|show_id|/reviews}
url_mdb_p5 {url|primary|http://www.imdb.com/title/tt|show_id|/fullcredits#cast}
*
* imdb elements
mdb_title.scrub {single|p1|<span class="title-extra">||<i>|</span>} * original title when redirected
mdb_title.scrub {single(separator=" - " exclude="IMDb" include=first)|p1|<head>|<title>|(|</title>}
mdb_title.scrub {multi(separator=" - ")|p3|<h5><a name="akas">Also Known As (AKA)</a></h5>|<tr>\n<td>|</td>|</table>} *aka's
mdb_productiondate.scrub {single|p1|<title>||</title>|</title>}
mdb_actor.scrub {multi|p1|itemprop="actors"|>|</a>|</div>}
mdb_actor.scrub {multi(exclude="<img src=")|p5|castlist/position|;">|</a>|</table>} * full list
mdb_director.scrub {multi|p1|itemprop="director"|>|</a>|</div>}
mdb_director.scrub {multi|p5|Directed by</a>|/">|</a>|</a>} * full list
mdb_starrating.scrub {single|p1|Ratings:|itemprop="ratingValue">|</span>|from}
mdb_starratingvotes.scrub {single|p1|Ratings:|itemprop="ratingCount">|</span>|users</a>}
mdb_commentsummary.scrub {multi(max=5 exclude="This review may contain spoilers")|p4|<a href="reviews-index?">|<b>|</b>|Add another review}
mdb_review.scrub {multi(exclude="SPOILERS ARE INCLUDED" include=first)|p4|<a href="reviews-index?">|</p>\n<p>|</p>\n\n|<div class="yn"}
mdb_plot.scrub {single|p2|<p class="plotpar">||<i>|</i>}
mdb_description.scrub {single|p1|<meta name="description"|content="|" />|<meta}