Steve edited this page Mar 29, 2018 · 38 revisions

Get an article extract

The get_query() method fetches two article extracts: a lightweight HTML extract (extract) and a Markdown text extract (extext).

>>> page = wptools.page('Ella Fitzgerald')
>>> page.get_query()
en.wikipedia.org (query) Ella Fitzgerald
en.wikipedia.org (imageinfo) File:Ella Fitzgerald (Gottlieb 02871...
Ella Fitzgerald (en) data
{
  extext: <str(2002)> **Ella Jane Fitzgerald** (April 25, 1917J...
  extract: <str(2067)> <p><b>Ella Jane Fitzgerald</b> (April 25, 1...
  ...
}

Compare to RESTBase extracts:

>>> page.get_restbase('/page/summary/')
en.wikipedia.org (restbase) /page/summary/Ella Fitzgerald
Ella Fitzgerald (en) data
{
  exhtml: <str(1455)> <p><b>Ella Jane Fitzgerald</b> (April 25, 19...
  exrest: <str(1424)> Ella Jane Fitzgerald (April 25, 1917June ...
  ...
}
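
If you only need plain text, one option is to prefer the RESTBase text extract and fall back to the Markdown extract with its bold markers removed. This is a minimal sketch that works on a plain dict shaped like page.data; the plain_extract() helper is hypothetical (not part of wptools), and the sample string is the truncated extext shown above.

```python
import re


def plain_extract(data):
    """Return a plain-text extract from a page.data-like dict, or None."""
    # Prefer the RESTBase text extract when it is present.
    if data.get('exrest'):
        return data['exrest']
    # Otherwise fall back to the Markdown extract, dropping **bold** markers.
    if data.get('extext'):
        return re.sub(r'\*\*(.+?)\*\*', r'\1', data['extext'])
    return None


# Truncated sample taken from the get_query() output above.
sample = {'extext': '**Ella Jane Fitzgerald** (April 25, 1917'}
print(plain_extract(sample))
```

With a live page you would pass page.data after calling get_query() or get_restbase().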

Get a representative image

A representative image for a page can come from the MediaWiki API, from an Infobox, from Wikidata property P18, or from RESTBase. See the Images wiki page for details.

>>> page = wptools.page('Frida Kahlo')
>>> page.get_query()
en.wikipedia.org (query) Frida Kahlo
en.wikipedia.org (imageinfo) File:Frida Kahlo, by Guillermo Kahlo...
Frida Kahlo (en) data
{
  image: <list(2)> {'kind': 'query-pageimage', u'descriptionshortu...
  ...
}
>>> page.images(['kind','url'])
[{'kind': 'query-pageimage',
  u'url': u'https://upload.wikimedia.org/wikipedia/commons/0/06/Frida_Kahlo%2C_by_Guillermo_Kahlo.jpg'},
 {'kind': 'query-thumbnail',
  'url': u'https://upload.wikimedia.org/wikipedia/commons/thumb/0/06/Frida_Kahlo%2C_by_Guillermo_Kahlo.jpg/160px-Frida_Kahlo%2C_by_Guillermo_Kahlo.jpg'}]
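
To pick one image by its kind, you can scan the list that page.images(['kind', 'url']) returns. The sketch below uses the two entries shown above as literal data; image_url() is a hypothetical helper, not a wptools function.

```python
def image_url(images, kind):
    """Return the URL of the first image of the given kind, or None."""
    for image in images:
        if image.get('kind') == kind:
            return image.get('url')
    return None


# The list printed by page.images(['kind', 'url']) in the example above.
images = [
    {'kind': 'query-pageimage',
     'url': 'https://upload.wikimedia.org/wikipedia/commons/0/06/'
            'Frida_Kahlo%2C_by_Guillermo_Kahlo.jpg'},
    {'kind': 'query-thumbnail',
     'url': 'https://upload.wikimedia.org/wikipedia/commons/thumb/0/06/'
            'Frida_Kahlo%2C_by_Guillermo_Kahlo.jpg/'
            '160px-Frida_Kahlo%2C_by_Guillermo_Kahlo.jpg'},
]

thumbnail = image_url(images, 'query-thumbnail')
print(thumbnail)
```

Returning None for a missing kind lets callers fall back to another source, such as a Wikidata or RESTBase image.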


Get page HTML

The most performant way to get article HTML is via RESTBase.

>>> page = wptools.page('Buddha')
>>> page.get_restbase('/page/html/')
en.wikipedia.org (restbase) /page/html/Buddha
Buddha (en) data
{
  html: <str(628054)> <!DOCTYPE html><html prefix="dc: http://purl...
}

Get Infobox data

Getting data from Infoboxes may be unavoidable, but getting Wikidata (via get_wikidata()) is preferred. Wikidata is structured but sometimes data poor, while Infoboxen are unstructured and frequently data rich. Please consider updating Wikidata if the information you want is only available in a MediaWiki instance so that others may benefit from open linked data.

>>> page = wptools.page('Fela Kuti')
>>> page.get_parse()
en.wikipedia.org (parse) Fela Kuti
en.wikipedia.org (imageinfo) File:Fela Kuti.jpg
Fela Kuti (en) data
{
  infobox: <dict(17)> website, associated_acts, death_place, image...
  ...
}
>>> page.data['infobox']['instrument']
'Saxophone, vocals, keyboards, trumpet, guitar, drums'
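
Infobox values are free text, so turning them into structured data usually takes some per-field cleanup. As a minimal sketch, the value above can be split on commas. The split_field() helper is hypothetical, and comma splitting is an assumption that happens to fit this value; other Infobox fields may use wikitext markup or different separators.

```python
def split_field(value):
    """Split a comma-separated Infobox value into a list of trimmed items."""
    return [part.strip() for part in value.split(',') if part.strip()]


# The value of page.data['infobox']['instrument'] shown above.
instruments = split_field('Saxophone, vocals, keyboards, trumpet, guitar, drums')
print(instruments)
```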

Get cover images

Most media (album, book, film, etc.) cover images on Wikipedia appear in an Infobox. For convenience, we put "cover" files from infoboxes in the image attribute.

>>> page = wptools.page('Blue Train (album)')
>>> page.get_parse()
en.wikipedia.org (parse) Blue Train (album)
en.wikipedia.org (imageinfo) File:John Coltrane - Blue Train.jpg
Blue Train (album) (en) data
{
  image: <list(1)> {'kind': 'parse-cover', u'descriptionshorturl':...
  infobox: <dict(16)> Name, Language, Artist, Cover, Recorded, Lab...
  ...
}
>>> page.images()[0]['url']
u'https://upload.wikimedia.org/wikipedia/en/6/68/John_Coltrane_-_Blue_Train.jpg'


Get Wikidata

We put Wikidata page claims in data['claims'], fetch entity labels into data['labels'], and combine the two into a human-readable mapping in data['wikidata']. See the Wikidata page in our wiki for more details.

>>> page = wptools.page('Stephen Fry')
>>> page.get_wikidata()
www.wikidata.org (wikidata) Stephen Fry
www.wikidata.org (labels) P1220|Q6625963|P2387|P434|Q1860|P2469|P...
www.wikidata.org (labels) P106|P268|P269|P27|P26|P21|Q4927100|P86...
www.wikidata.org (labels) P1050|P1969|Q765642
en.wikipedia.org (imageinfo) File:Stephen Fry cropped.jpg
Stephen Fry (en) data
{
  aliases: <list(1)> Stephen John Fry
  claims: <dict(74)> P646, P1220, P2387, P434, P648, P3192, P1050,...
  description: English comedian, actor, writer, presenter, and activist
  image: <list(1)> {'kind': 'wikidata-image', u'descriptionshortur...
  label: Stephen Fry
  labels: <dict(103)> P1220, Q6625963, P2387, P434, Q1860, P2469, ...
  modified: <dict(1)> wikidata
  pageid: 191035
  requests: <list(5)> wikidata, labels, labels, labels, imageinfo
  title: Stephen_Fry
  what: human
  wikibase: Q192912
  wikidata: <dict(74)> Tumblr ID (P3943), MovieMeter director ID (...
  wikidata_url: https://www.wikidata.org/wiki/Q192912
}
>>> page.data['wikidata']
{u'AllMovie artist ID (P2019)': u'p25206',
 u'AlloCin\xe9 person ID (P1266)': u'11671',
 u'BIBSYS ID (P1015)': u'90862409',
 u'BNE ID (P950)': u'XX1358358',
 u'BnF ID (P268)': u'13191060q',
 u'CONOR ID (P1280)': u'39805539',
 u'Commons category (P373)': u'Stephen Fry',
 u'DNF person ID (P2626)': u'66163',
 u'Discogs artist ID (P1953)': u'289153',
 u'Elonet person ID (P2387)': u'241363',
 u'Encyclop\xe6dia Britannica Online ID (P1417)': u'biography/Stephen-Fry',
 u'FAST ID (P2163)': u'313699',
 u'Filmportal ID (P2639)': u'8844ffd4f8964001a39a1c136dceea04',
 u'Freebase ID (P646)': u'/m/0h0yt',
 u'GND ID (P227)': u'115765646',
 u'IMDb ID (P345)': u'nm0000410',
 u'ISFDB author ID (P1233)': u'3347',
 u'ISNI (P213)': [u'0000 0001 2129 064X', u'0000 0004 2241 3148'],
 u'Instagram username (P2003)': u'stephenfryactually',
 u'Internet Broadway Database person ID (P1220)': u'84850',
 u'Kinopoisk person ID (P2604)': u'16465',
 u'Last.fm music ID (P3192)': u'Stephen+Fry',
 u'Library of Congress authority ID (P244)': u'n92115518',
 u'MovieMeter director ID (P1969)': u'12195',
 u'Munzinger IBA (P1284)': u'00000022844',
 u'MusicBrainz artist ID (P434)': u'fad46635-5d90-484e-bcf9-5a8e3c1f8830',
 u'NE.se ID (P3222)': u'stephen-fry',
 u'NKCR AUT ID (P691)': u'jn19981001266',
 u'NLR (Romania) ID (P1003)': u'RUNLRAUTH7766127',
 u'NNDB people ID (P1263)': u'345/000055180',
 u'NUKAT (WarsawU) authorities (P1207)': u'n96100436',
 u'NYT topic ID (P3221)': u'person/stephen-fry',
 u'National Portrait Gallery (London) person ID (P1816)': u'mp06527',
 u'National Thesaurus for Author Names ID (P1006)': u'074121065',
 u'Open Library ID (P648)': u'OL231965A',
 u'PORT person ID (P2435)': u'11384',
 u'PTBNP ID (P1005)': u'1469418',
 u'Perlentaucher ID (P866)': u'stephen-fry',
 u'Quora topic ID (P3417)': u'Stephen-Fry-actor',
 u'SFDb person ID (P2168)': u'186797',
 u'SUDOC authorities (P269)': u'035462418',
 u'Scope.dk person ID (P2519)': u'4776',
 u'Songkick artist ID (P3478)': u'81644',
 u'Theatricalia person ID (P2469)': u'13fc',
 u'Tumblr ID (P3943)': u'stephen-fry-me',
 u'Twitter username (P2002)': u'stephenfry',
 u'VIAF ID (P214)': [u'39518907', u'305718028'],
 u'WikiTree ID (P2949)': u'Fry-2606',
 u"audio recording of the subject's spoken voice (P990)": u'Stephen Fry voice.flac',
 u'country of citizenship (P27)': u'United Kingdom (Q145)',
 u'date of birth (P569)': u'+1957-08-24T00:00:00Z',
 u'educated at (P69)': u"Queen's College (Q765642)",
 u'employer (P108)': u'BBC (Q9531)',
 u'given name (P735)': u'Stephen (Q4927100)',
 u'image (P18)': u'Stephen Fry cropped.jpg',
 u'instance of (P31)': u'human (Q5)',
 u'languages spoken, written or signed (P1412)': u'English (Q1860)',
 u'medical condition (P1050)': u'bipolar disorder (Q131755)',
 u'movement (P135)': u'atheism (Q7066)',
 u'name in native language (P1559)': u'Stephen John Fry',
 u'nominated for (P1411)': [u'British Academy Television Award for Best Entertainment Performance (Q4969372)',
  u'Tony Award for Best Featured Actor in a Play (Q1474410)',
  u'Kentucky colonel (Q632482)'],
 u'occupation (P106)': [u'actor (Q33999)',
  u'comedian (Q245068)',
  u'television presenter (Q947873)',
  u'screenwriter (Q28389)',
  u'autobiographer (Q18814623)',
  u'writer (Q36180)',
  u'director (Q3455803)',
  u'television actor (Q10798782)',
  u'novelist (Q6625963)',
  u'stage actor (Q2259451)',
  u'science fiction writer (Q18844224)',
  u'film actor (Q10800557)'],
 u'official website (P856)': u'http://www.stephenfry.com',
 u'page banner (P948)': u'StephenFryWorldPride.jpg',
 u'place of birth (P19)': u'Hampstead (Q25610)',
 u'religion (P140)': u'atheism (Q7066)',
 u'sex or gender (P21)': u'male (Q6581097)',
 u'sexual orientation (P91)': u'homosexuality (Q6636)',
 u'signature (P109)': u'Stephen Fry signature.svg',
 u'spouse (P26)': u'Elliott Spencer (Q22808271)',
 u"topic's main category (P910)": u'Category:Stephen Fry (Q8817795)',
 u'website account on (P553)': u'Quora (Q51711)',
 u'work period (start) (P2031)': u'+1982-00-00T00:00:00Z',
 u'\u010cSFD person ID (P2605)': u'5127'}
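
The keys and many of the values in data['wikidata'] follow the pattern "label (ID)". If you would rather look claims up by property ID, a small parser can split the two parts. This is a sketch under that assumption; split_label() and claims_by_property() are hypothetical helpers, and the sample entries are copied from the output above.

```python
import re

# Matches "some label (P123)" or "some label (Q123)"; the label itself
# may contain parentheses, as in "work period (start) (P2031)".
LABELLED = re.compile(r'^(?P<label>.*) \((?P<id>[PQ]\d+)\)$')


def split_label(text):
    """Split 'label (ID)' into (label, ID); return (text, None) if no ID."""
    match = LABELLED.match(text)
    if match is None:
        return text, None
    return match.group('label'), match.group('id')


def claims_by_property(wikidata):
    """Re-key a data['wikidata']-like dict by bare property ID."""
    result = {}
    for key, value in wikidata.items():
        _, prop_id = split_label(key)
        if prop_id is not None:
            result[prop_id] = value
    return result


sample = {
    'instance of (P31)': 'human (Q5)',
    'ISNI (P213)': ['0000 0001 2129 064X', '0000 0004 2241 3148'],
    'work period (start) (P2031)': '+1982-00-00T00:00:00Z',
}

by_id = claims_by_property(sample)
print(split_label(by_id['P31']))
```

Note that data['claims'] is already keyed by property ID, so this is mainly useful when you want the resolved values from data['wikidata'].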

Minimize Wikidata requests

You can minimize the number of Wikidata (labels) requests by specifying only the labels you want with wanted_labels(). In the example below, we would normally make three or more calls for Wikidata labels, but let's assume we only want the gender property and the corresponding label ('sex or gender (P21)': 'female (Q6581072)'):

>>> page = wptools.page('Simone de Beauvoir')
>>> page.wanted_labels(['P21', 'Q6581072'])
>>> page.get_wikidata()
www.wikidata.org (wikidata) Simone de Beauvoir
www.wikidata.org (labels) P21|P31|Q5|Q6581072
Simone de Beauvoir (en) data
{
  aliases: <list(10)> Simone-Lucie-Ernestine-Marie Bertrand de Bea...
  claims: <dict(83)> P646, P723, P535, P800, P373, P648, P1273, P2...
  description: <str(106)> French writer, intellectual, existential...
  label: Simone de Beauvoir
  labels: <dict(4)> P21, P31, Q5, Q6581072
  modified: <dict(1)> wikidata
  pageid: 8373
  requests: <list(2)> wikidata, labels
  title: Simone_de_Beauvoir
  what: human
  wikibase: Q7197
  wikidata: <dict(2)> instance of (P31), sex or gender (P21)
  wikidata_url: https://www.wikidata.org/wiki/Q7197
}

All the original claims are still there, but we've resolved far fewer labels, so data['wikidata'] holds only the items we asked for. We always get 'instance of (P31)' so that we know what we're looking at.

>>> page.data['wikidata']
{u'instance of (P31)': u'human (Q5)',
 u'sex or gender (P21)': u'female (Q6581072)'}
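
The request logs on this page are consistent with label IDs being fetched in batches: the Stephen Fry example resolves 103 labels in three (labels) requests, while the example above resolves 4 labels in one. The sketch below illustrates that arithmetic. The batch size of 50 is an assumption inferred from those counts, not something this page states, and batches() is a hypothetical helper.

```python
def batches(ids, size=50):
    """Split a sequence of IDs into consecutive batches of at most `size`."""
    ids = list(ids)
    return [ids[start:start + size] for start in range(0, len(ids), size)]


# Number of requests needed under the assumed batch size.
print(len(batches(range(103))))  # 3 batches for 103 labels

# The four labels from the wanted_labels() example fit in a single batch.
wanted = ['P21', 'P31', 'Q5', 'Q6581072']
print('|'.join(batches(wanted)[0]))
```

Under this assumption, trimming the wanted labels to fewer than one batch is what brings the label requests down to one.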

Get all the page info

Simply calling get() on a page will automagically fetch extracts, images, infobox data, wikidata, and other metadata via the MediaWiki, Wikidata, and RESTBase APIs.

>>> page = wptools.page('Gandhi')
>>> page.get()
en.wikipedia.org (query) Gandhi
en.wikipedia.org (parse) 19379
www.wikidata.org (wikidata) Q1001
www.wikidata.org (labels) Q1280678|P535|Q18338317|P434|Q1860|P376...
www.wikidata.org (labels) P18|P19|P1066|P509|P345|Q16382|P1006|P3...
www.wikidata.org (labels) Q2140674|Q1282294|Q21200566|P409|Q26490...
www.wikidata.org (labels) P3417|P4431|P2949|P69|Q129286|Q9441|P42...
en.wikipedia.org (restbase) /page/summary/Mahatma_Gandhi
en.wikipedia.org (imageinfo) File:Portrait Gandhi.jpg|File:MKGandhi.jpg
Mahatma Gandhi (en) data
{
  aliases: <list(10)> M K Gandhi, Mohandas Gandhi, Bapu, Gandhi, M...
  claims: <dict(106)> P646, P535, P906, P434, P648, P3762, P1273, ...
  description: <str(67)> pre-eminent leader of Indian nationalism ...
  exhtml: <str(1144)> <p>Mahātmā <b>Mohandas Karamchand Gandhi</b>...
  exrest: <str(907)> Mahātmā Mohandas Karamchand Gandhi (; Hindust...
  extext: <str(2999)> Mahātmā **Mohandas Karamchand Gandhi** ( ; H...
  extract: <str(3292)> <p>Mahātmā <b>Mohandas Karamchand Gandhi</b...
  image: <list(6)> {'kind': 'query-pageimage', u'descriptionshortu...
  infobox: <dict(25)> known_for, other_names, image, signature, bi...
  iwlinks: <list(10)> https://biblio.wiki/wiki/Mohandas_K._Gandhi,...
  label: Mahatma Gandhi
  labels: <dict(163)> Q1280678, P535, Q18338317, Q131149, P434, Q1...
  length: 262,058
  links: <list(500)> 10 Janpath, 14th Dalai Lama, 1915 Singapore M...
  modified: <dict(2)> wikidata, page
  pageid: 19379
  parsetree: <str(330951)> <root><template><title>Redirect</title>...
  random: Salt
  redirected: <list(1)> {u'to': u'Mahatma Gandhi', u'from': u'Gandhi'}
  redirects: <list(53)> {u'ns': 0, u'pageid': 55342, u'title': u'M...
  requests: <list(9)> query, parse, wikidata, labels, labels, labe...
  title: Mahatma_Gandhi
  url: https://en.wikipedia.org/wiki/Mahatma_Gandhi
  url_raw: https://en.wikipedia.org/wiki/Mahatma_Gandhi?action=raw
  watchers: 1,770
  what: human
  wikibase: Q1001
  wikidata: <dict(105)> Geni.com profile ID (P2600), National Libr...
  wikidata_url: https://www.wikidata.org/wiki/Q1001
  wikitext: <str(260607)> {{Redirect|Gandhi}}{{pp-move-indef}}{{pp...
}

You can also call get_more() to get further page data, such as page files, categories, languages, contributors, and average daily views. This results in a more expensive (slower) query:

>>> page.get_more()
en.wikipedia.org (querymore) Gandhi
Mahatma Gandhi (en) data
{
  categories: <list(68)> Category:1869 births, Category:1948 death...
  contributors: 2,608
  files: <list(52)> File:Aum Om red.svg, File:Commons-logo.svg, Fi...
  languages: <list(167)> {u'lang': u'af', u'title': u'Mahatma Gand...
  views: 24,565
}

Get page info in many languages

You can get page info in any language supported by the target MediaWiki site. Pass the language code with the lang keyword argument, like this:

Arabic page

>>> page = wptools.page('مهاتما غاندي', lang='ar')
>>> page.get()
ar.wikipedia.org (query) مهاتما غاندي
ar.wikipedia.org (parse) 24528
www.wikidata.org (wikidata) Q1001
www.wikidata.org (labels) Q1280678|P535|Q18338317|P434|Q1860|P376...
www.wikidata.org (labels) P18|P19|P1066|P509|P345|Q16382|P1006|P3...
www.wikidata.org (labels) Q2140674|Q1282294|Q21200566|P409|Q26490...
www.wikidata.org (labels) P3417|P4431|P2949|P69|Q129286|Q9441|P42...
ar.wikipedia.org (restbase) /page/summary/مهاتما_غاندي
ar.wikipedia.org (imageinfo) File:Portrait Gandhi.jpg
مهاتما غاندي (ar) data
{
  claims: <dict(106)> P646, P535, P906, P434, P648, P3762, P1273, ...
  exhtml: <str(1149)> <p><span></span></p><p><b>موهانداس كرمشاند غ...
  exrest: <str(1070)> موهانداس كرمشاند غاندي (بالإنجليزية:Mohandas...
  extext: <str(1914)> **موهانداس كرمشاند غاندي** (بالإنجليزية:Moha...
  extract: <str(2024)> <p><span></span></p><p><b>موهانداس كرمشاند ...
  image: <list(4)> {'kind': 'query-pageimage', u'descriptionshortu...
  iwlinks: <list(6)> https://ar.wikiquote.org/wiki/%D9%85%D9%87%D8...
  label: مهاتما غاندي
  labels: <dict(163)> Q1280678, P535, Q18338317, Q131149, P434, Q1...
  length: 43,426
  links: <list(372)> 16, 1869, 1906, 1908, 1915, 1918, 1922, 1924,...
  modified: <dict(2)> wikidata, page
  pageid: 24528
  parsetree: <str(28894)> <root><template><title>Coord</title><par...
  random: اتماسينت (امجاو تشوقت)
  redirects: <list(8)> {u'ns': 0, u'pageid': 24547, u'title': u'\u...
  requests: <list(9)> query, parse, wikidata, labels, labels, labe...
  title: مهاتما_غاندي
  url: https://ar.wikipedia.org/wiki/مهاتما_غاندي
  url_raw: https://ar.wikipedia.org/wiki/مهاتما_غاندي?action=raw
  watchers: 71
  what: إنسان
  wikibase: Q1001
  wikidata: <dict(49)> تصنيف مكتبة الكونغرس (P1149), الصورة (P18),...
  wikidata_url: https://www.wikidata.org/wiki/Q1001
  wikitext: <str(24746)> {{Coord|28.6415|N|77.2483|E|display=title...
}

French page

>>> page = wptools.page('Gandhi', lang='fr')
>>> page.get()
fr.wikipedia.org (query) Gandhi
fr.wikipedia.org (parse) 6874
www.wikidata.org (wikidata) Q1001
www.wikidata.org (labels) Q1280678|P535|Q18338317|P434|Q1860|P376...
www.wikidata.org (labels) P18|P19|P1066|P509|P345|Q16382|P1006|P3...
www.wikidata.org (labels) Q2140674|Q1282294|Q21200566|P409|Q26490...
www.wikidata.org (labels) P3417|P4431|P2949|P69|Q129286|Q9441|P42...
fr.wikipedia.org (restbase) /page/summary/Mohandas_Karamchand_Gandhi
fr.wikipedia.org (imageinfo) File:Gandhi smiling 1942.jpg|File:Po...
Mohandas Karamchand Gandhi (fr) data
{
  aliases: <list(1)> Gandhi
  claims: <dict(106)> P646, P535, P906, P434, P648, P3762, P1273, ...
  description: leader politique et religieux indien
  exhtml: <str(3532)> <p><b>Mohandas Karamchand Gandhi</b> (en guj...
  exrest: <str(1158)> Mohandas Karamchand Gandhi (en gujarati મોહન...
  extext: <str(3334)> **Mohandas Karamchand Gandhi** (en gujarati ...
  extract: <str(5838)> <p><b>Mohandas Karamchand Gandhi</b> (en gu...
  image: <list(6)> {'kind': 'query-pageimage', u'descriptionshortu...
  infobox: <dict(16)> nom, lieu de naissance, nationalité, surnom,...
  iwlinks: <list(30)> https://biblio.wiki/wiki/Mohandas_K._Gandhi,...
  label: Mohandas Karamchand Gandhi
  labels: <dict(163)> Q1280678, P535, Q18338317, Q131149, P434, Q1...
  length: 211,912
  links: <list(500)> 10 mars, 11 septembre, 1869, 1906, 1919, 1922...
  modified: <dict(2)> wikidata, page
  pageid: 6874
  parsetree: <str(250187)> <root><template><title>SPE</title></tem...
  random: Primula vulgaris
  redirected: <list(1)> {u'to': u'Mohandas Karamchand Gandhi', u'f...
  redirects: <list(9)> {u'ns': 0, u'pageid': 11276, u'title': u'Ma...
  requests: <list(9)> query, parse, wikidata, labels, labels, labe...
  title: Mohandas_Karamchand_Gandhi
  url: https://fr.wikipedia.org/wiki/Mohandas_Karamchand_Gandhi
  url_raw: <str(67)> https://fr.wikipedia.org/wiki/Mohandas_Karamc...
  watchers: 155
  what: être humain
  wikibase: Q1001
  wikidata: <dict(100)> identifiant LibriVox d'auteur (P1899), ide...
  wikidata_url: https://www.wikidata.org/wiki/Q1001
  wikitext: <str(205463)> {{SPE}}{{Entête label|AdQ}}{{Voir homony...
}

Chinese (Traditional) page

>>> page = wptools.page('Gandhi', lang='zh')
>>> page.get()
zh.wikipedia.org (query) Gandhi
zh.wikipedia.org (parse) 9516
www.wikidata.org (wikidata) Q1001
www.wikidata.org (labels) Q1280678|P535|Q18338317|P434|Q1860|P376...
www.wikidata.org (labels) P18|P19|P1066|P509|P345|Q16382|P1006|P3...
www.wikidata.org (labels) Q2140674|Q1282294|Q21200566|P409|Q26490...
www.wikidata.org (labels) P3417|P4431|P2949|P69|Q129286|Q9441|P42...
zh.wikipedia.org (restbase) /page/summary/圣雄甘地
zh.wikipedia.org (imageinfo) File:Portrait Gandhi.jpg|File:MKGandhi.jpg
圣雄甘地 (zh) data
{
  claims: <dict(106)> P646, P535, P906, P434, P648, P3762, P1273, ...
  exhtml: <str(697)> <p><b>莫罕達斯·卡拉姆昌德·甘地</b>古吉拉特語<span lang="gu"...
  exrest: <str(406)> 莫罕達斯·卡拉姆昌德·甘地古吉拉特語મોહનદાસ કરમચંદ ગાંધી印地語:...
  extext: <str(456)> **莫罕達斯·卡拉姆昌德·甘地**古吉拉特語**મોહનદાસ કરમચંદ ગા...
  extract: <str(701)> <p><b>莫罕達斯·卡拉姆昌德·甘地</b>古吉拉特語<span lang="gu...
  image: <list(5)> {'kind': 'query-pageimage', u'descriptionshortu...
  infobox: <dict(18)> website, known_for, death_place, image, sign...
  iwlinks: <list(77)> https://en.wikipedia.org/wiki/Absolute_ideal...
  label: 圣雄甘地
  labels: <dict(163)> Q1280678, P535, Q18338317, Q131149, P434, Q1...
  length: 28,524
  links: <list(428)> 1869, 1948, 19世纪哲学, Aesthetic e...
  modified: <dict(2)> wikidata, page
  pageid: 9516
  parsetree: <str(19692)> <root><template><title>more footnotes</t...
  random: 
  redirected: <list(1)> {u'to': u'\u5723\u96c4\u7518\u5730', u'fro...
  redirects: <list(10)> {u'ns': 0, u'pageid': 9514, u'title': u'\u...
  requests: <list(9)> query, parse, wikidata, labels, labels, labe...
  title: 圣雄甘地
  url: https://zh.wikipedia.org/wiki/圣雄甘地
  url_raw: https://zh.wikipedia.org/wiki/圣雄甘地?action=raw
  watchers: 71
  what: 人類
  wikibase: Q1001
  wikidata: <dict(53)> FAST編號 (P2163), 出生日期 (P569), SELIBR識別碼 (P90...
  wikidata_url: https://www.wikidata.org/wiki/Q1001
  wikitext: <str(14070)> {{more footnotes|time=2014-01-09T02:16:42...
}

Chinese (Simplified) page

The variant keyword is not officially recognized/supported, but does appear to work for some versions of Chinese Wikipedia:

>>> page = wptools.page('Gandhi', lang='zh', variant='zh-cn')
>>> page.get()
zh.wikipedia.org (query) Gandhi
API warning: {u'main': {u'warnings': u'Unrecognized parameter: variant.'}}
zh.wikipedia.org (parse) 9516
API warning: {u'main': {u'warnings': u'Unrecognized parameter: variant.'}}
www.wikidata.org (wikidata) Q1001
www.wikidata.org (labels) Q1280678|P535|Q18338317|P434|Q1860|P376...
www.wikidata.org (labels) P18|P19|P1066|P509|P345|Q16382|P1006|P3...
www.wikidata.org (labels) Q2140674|Q1282294|Q21200566|P409|Q26490...
www.wikidata.org (labels) P3417|P4431|P2949|P69|Q129286|Q9441|P42...
zh.wikipedia.org (restbase) /page/summary/圣雄甘地
zh.wikipedia.org (imageinfo) File:Portrait Gandhi.jpg|File:MKGandhi.jpg
圣雄甘地 (zh) data
{
  claims: <dict(106)> P646, P535, P906, P434, P648, P3762, P1273, ...
  exhtml: <str(697)> <p><b>莫罕達斯·卡拉姆昌德·甘地</b>古吉拉特語<span lang="gu"...
  exrest: <str(406)> 莫罕達斯·卡拉姆昌德·甘地古吉拉特語મોહનદાસ કરમચંદ ગાંધી印地語:...
  extext: <str(456)> **莫罕达斯·卡拉姆昌德·甘地**古吉拉特语**મોહનદાસ કરમચંદ ગા...
  extract: <str(701)> <p><b>莫罕达斯·卡拉姆昌德·甘地</b>古吉拉特语<span lang="gu...
  image: <list(5)> {'kind': 'query-pageimage', u'descriptionshortu...
  infobox: <dict(18)> website, known_for, death_place, image, sign...
  iwlinks: <list(77)> https://en.wikipedia.org/wiki/Absolute_ideal...
  label: 圣雄甘地
  labels: <dict(163)> Q1280678, P535, Q18338317, Q131149, P434, Q1...
  length: 28,524
  links: <list(428)> 1869, 1948, 19世纪哲学, Aesthetic e...
  modified: <dict(2)> wikidata, page
  pageid: 9516
  parsetree: <str(19692)> <root><template><title>more footnotes</t...
  random: 毛縣
  redirected: <list(1)> {u'to': u'\u5723\u96c4\u7518\u5730', u'fro...
  redirects: <list(10)> {u'ns': 0, u'pageid': 9514, u'title': u'\u...
  requests: <list(9)> query, parse, wikidata, labels, labels, labe...
  title: 圣雄甘地
  url: https://zh.wikipedia.org/wiki/圣雄甘地
  url_raw: https://zh.wikipedia.org/wiki/圣雄甘地?action=raw
  watchers: 71
  what: 人类
  wikibase: Q1001
  wikidata: <dict(35)> 出生日期 (P569), Quora主题代码 (P3417), 所属组织 (P463)...
  wikidata_url: https://www.wikidata.org/wiki/Q1001
  wikitext: <str(14070)> {{more footnotes|time=2014-01-09T02:16:42...
}
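
The url fields in the outputs above share one pattern: the language code becomes the subdomain and spaces in the resolved title become underscores. A minimal sketch of that pattern follows. The page_url() helper is hypothetical, and it assumes a Wikipedia host; other MediaWiki sites will use different domains.

```python
def page_url(title, lang='en'):
    """Build a Wikipedia page URL from a resolved title and language code."""
    return 'https://%s.wikipedia.org/wiki/%s' % (lang, title.replace(' ', '_'))


print(page_url('Mahatma Gandhi'))
print(page_url('Mohandas Karamchand Gandhi', lang='fr'))
```

Use the resolved title (for example 'Mahatma Gandhi'), not the redirect you requested ('Gandhi'), if you want the result to match the url field.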

Get category members

Get the members of a MediaWiki category:

>>> cat = wptools.category('Category:Music')
>>> cat.get_members()
en.wikipedia.org (categorymembers) Category:Music
Category:Music (en) data
{
  members: <list(43)> {u'ns': 0, u'pageid': 18839, u'title': u'Mus...
  requests: <list(1)> category
}
>>> cat.data['members'][:10]
[{u'ns': 0, u'pageid': 18839, u'title': u'Music'},
 {u'ns': 118, u'pageid': 55885548, u'title': u'Draft:Damian Bos'},
 {u'ns': 2, u'pageid': 55885620, u'title': u'User:Damian Bos/sandbox'},
 {u'ns': 100, u'pageid': 1474047, u'title': u'Portal:Music'},
 {u'ns': 14, u'pageid': 17320591, u'title': u'Category:Music by genre'},
 {u'ns': 14, u'pageid': 45332038, u'title': u'Category:Music by media franchise'},
 {u'ns': 14, u'pageid': 43461764, u'title': u'Category:Music by geographical categorization'},
 {u'ns': 14, u'pageid': 32108655, u'title': u'Category:Music by source'},
 {u'ns': 14, u'pageid': 29156321, u'title': u'Category:Music by theme'},
 {u'ns': 14, u'pageid': 1520543, u'title': u'Category:Music-related lists'}]
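
Category members mix articles, subcategories, and other page types, distinguished by the ns (namespace) number. The sketch below groups member titles by namespace using a few entries from the output above; group_by_namespace() is a hypothetical helper, and the namespace numbers used (0 for articles, 14 for categories) are the ones visible in that output.

```python
from collections import defaultdict


def group_by_namespace(members):
    """Group member titles by their 'ns' (namespace) number."""
    groups = defaultdict(list)
    for member in members:
        groups[member['ns']].append(member['title'])
    return dict(groups)


# A few entries copied from cat.data['members'] above.
members = [
    {'ns': 0, 'pageid': 18839, 'title': 'Music'},
    {'ns': 118, 'pageid': 55885548, 'title': 'Draft:Damian Bos'},
    {'ns': 100, 'pageid': 1474047, 'title': 'Portal:Music'},
    {'ns': 14, 'pageid': 17320591, 'title': 'Category:Music by genre'},
    {'ns': 14, 'pageid': 29156321, 'title': 'Category:Music by theme'},
]

groups = group_by_namespace(members)
articles = groups.get(0, [])
subcategories = groups.get(14, [])
print(articles, subcategories)
```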

Get site info

Get information about a MediaWiki site:

>>> site = wptools.site()
>>> site.get_info('ja.wikipedia.org')
ja.wikipedia.org (query) siteinfo|siteviews|mostviewed
ja.wikipedia.org (query) siteviews:uniques
Wikipedia (ja) data
{
  activeusers: 12,838
  admins: 46
  articles: 1,085,297
  edits: 67,419,719
  images: 84,389
  info: <dict(50)> invalidusernamechars, phpversion, imagewhitelis...
  jobs: 13
  mostviewed: <list(500)> {u'count': 424335, u'ns': 0, u'title': u...
  pages: 3,197,229
  queued-massmessages: 0
  requests: <list(2)> siteinfo, sitevisitors
  site: jawiki
  siteviews: 32,197,661
  users: 1,271,721
  visitors: 9,777,018
}

List most popular articles

Get the top articles for a site:

>>> site = wptools.site()
>>> site.top('nl.wikipedia.org')
nl.wikipedia.org (query) siteinfo|siteviews|mostviewed
nl.wikipedia.org (query) siteviews:uniques
nlwiki mostviewed articles:
1. Hoofdpagina (178,350)
2. Madonna (zangeres) (39,323)
3. Black Friday (31,406)
4. Thanksgiving Day (19,185)
5. Rik Felderhof (11,403)
6. Thijs van Leer (7,638)
7. Staten van de Verenigde Staten (6,754)
8. Sinterklaas (6,643)
9. Fred Teeven (5,693)
10. Axelle Red (5,230)
11. 24 november (3,776)
12. Sinterklaasjournaal (3,128)
13. Julia Roberts (3,085)
14. Yotam Ottolenghi (3,037)
15. Robbert Dijkgraaf (3,026)
16. Anouk (2,897)
17. René Klijn (2,880)
18. Aubergine (plant) (2,874)
19. Victoria van het Verenigd Koninkrijk (2,817)
20. Soefisme (2,794)
21. Dolly Parton (2,704)
22. Groningen (provincie) (2,644)
23. Lijst van chemische elementen (2,639)
24. Periodiek systeem (2,549)
25. Taurine (2,546)
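
If you capture the printed ranking as text, each row can be parsed back into a rank, title, and view count. This is a sketch that assumes the "rank. title (views)" layout shown above; parse_row() is a hypothetical helper, not part of wptools.

```python
import re

# rank, a title that may itself contain parentheses, then the view count.
ROW = re.compile(r'^(\d+)\.\s+(.+)\s+\(([\d,]+)\)$')


def parse_row(line):
    """Parse '2. Madonna (zangeres) (39,323)' into (2, title, 39323)."""
    match = ROW.match(line.strip())
    if match is None:
        return None  # header lines and anything else that is not a row
    rank, title, views = match.groups()
    return int(rank), title, int(views.replace(',', ''))


lines = [
    'nlwiki mostviewed articles:',
    '1. Hoofdpagina (178,350)',
    '2. Madonna (zangeres) (39,323)',
    '11. 24 november (3,776)',
]

rows = [row for row in map(parse_row, lines) if row is not None]
print(rows)
```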