IBM Debater® Wikipedia Category Stance

The dataset contains:

132 concepts
4603 Wikipedia categories and lists annotated for stance (Pro/Con) towards the concepts

The released data file has 4 columns:

Label (-/P/C/?)	Concept	Category/List Page Title	URL
...	...	...	...
...	...	...	...
...	...	...	...

Column A: the label
Column B: the concept
Column C: the page title of the category or list in Wikipedia
Column D: the URL of the category/list page

For each category or list page, the label is one of the following:

“-” : the category does not denote a group of people
“P” : pro stance (supporting the concept)
“C” : con stance (opposing the concept)
“?” : the stance cannot be determined based on the category name, or the category is not relevant
“X” : unresolved case: each of the 3 annotators gave a different label
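
When working with the file programmatically, the label codes can be kept in a small lookup table. The sketch below is illustrative only: `LABEL_MEANINGS`, `describe_label`, and `has_stance` are hypothetical helper names, not part of the dataset or of any IBM tooling, and the descriptions simply restate the list above.

```python
# Hypothetical lookup table; the descriptions restate the label list above.
LABEL_MEANINGS = {
    "-": "not a person group category",
    "P": "pro stance (supporting the concept)",
    "C": "con stance (opposing the concept)",
    "?": "stance cannot be determined, or category not relevant",
    "X": "unresolved: each of the 3 annotators gave a different label",
}


def describe_label(label):
    """Return the description for a label code; raise ValueError if unknown."""
    code = label.strip()
    if code not in LABEL_MEANINGS:
        raise ValueError("Unknown label: {!r}".format(label))
    return LABEL_MEANINGS[code]


def has_stance(label):
    """Return True only for rows with an explicit Pro or Con stance."""
    return label.strip() in ("P", "C")


print(describe_label("P"))
print(has_stance("?"))
```

Separating `has_stance` from the lookup makes it easy to drop the “-”, “?” and “X” rows when only Pro/Con examples are needed.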

Load Dataset

import requests # External dependency: pip install requests
import tarfile

# Downloading the dataset
url_base = 'https://dax.cdn.appdomain.cloud/dax-wikipedia-category-stance'
version = '1.0.2'
fname = 'wikipedia-category-stance.tar.gz'
url = "{}/{}/{}".format(url_base, version, fname)
r = requests.get(url)

# Stop on a failed download instead of writing an error response to disk
if not r.ok:
    raise RuntimeError("Failed to download {} (HTTP status {})".format(url, r.status_code))

with open(fname, 'wb') as f:
    f.write(r.content)

# Extracting the dataset
with tarfile.open(fname, 'r:*') as f:
    f.extractall()
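
Before extracting a downloaded archive, it can be useful to look at what it contains. The following is a minimal sketch using only the standard `tarfile` module; `list_archive_members` and `find_unsafe_members` are hypothetical helper names, and the path check is a simple heuristic, not a complete safety audit.

```python
import os
import tarfile


def list_archive_members(archive_path):
    """Return the member names of a tar archive without extracting it."""
    with tarfile.open(archive_path, "r:*") as archive:
        return archive.getnames()


def find_unsafe_members(names):
    """Return names that are absolute or contain '..' path components."""
    unsafe = []
    for name in names:
        parts = name.replace("\\", "/").split("/")
        if name.startswith("/") or ".." in parts:
            unsafe.append(name)
    return unsafe


if __name__ == "__main__":
    # Same file name as in the download step; skipped if the file is absent.
    fname = "wikipedia-category-stance.tar.gz"
    if os.path.exists(fname):
        names = list_archive_members(fname)
        print(names)
        print(find_unsafe_members(names))
```

Listing the members also shows where the CSV file lands after extraction, which helps when setting the path used to read it.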

Read Dataset

import os
import pandas as pd # External dependency: pip install pandas

data_path = "WikipediaCategoriesResults.csv"
# Fail early if the extracted file is missing or unreadable
if not os.access(data_path, os.R_OK):
    raise IOError("Cannot read the target file: {}".format(data_path))

data = pd.read_csv(data_path)
data.head(20)

	Label (-/P/C/?)	Concept	Category/List Page Title	URL
0	P	Abortion	Category:American_abortion_providers	https://en.wikipedia.org/wiki/Category:America...
1	P	Abortion	Category:American_pro-choice_activists	https://en.wikipedia.org/wiki/Category:America...
2	C	Abortion	Category:American_pro-life_activists	https://en.wikipedia.org/wiki/Category:America...
3	-	Abortion	Category:Anti-abortion_violence_in_the_United_...	https://en.wikipedia.org/wiki/Category:Anti-ab...
4	P	Abortion	Category:British_pro-choice_activists	https://en.wikipedia.org/wiki/Category:British...
5	P	Abortion	Category:Canadian_pro-choice_activists	https://en.wikipedia.org/wiki/Category:Canadia...
6	C	Abortion	Category:Canadian_pro-life_activists	https://en.wikipedia.org/wiki/Category:Canadia...
7	P	Abortion	Category:French_pro-choice_activists	https://en.wikipedia.org/wiki/Category:French_...
8	P	Abortion	Category:Irish_pro-choice_activists	https://en.wikipedia.org/wiki/Category:Irish_p...
9	C	Abortion	Category:New_Zealand_pro-life_activists	https://en.wikipedia.org/wiki/Category:New_Zea...
10	P	Abortion	Category:Victims_of_anti-abortion_violence_in_...	https://en.wikipedia.org/wiki/Category:Victims...
11	-	Abortion	List_of_songs_about_abortion	https://en.wikipedia.org/wiki/List_of_songs_ab...
12	?	Adultery	Category:People_executed_for_adultery	https://en.wikipedia.org/wiki/Category:People_...
13	-	Adventism	Category:Adventism	https://en.wikipedia.org/wiki/Category:Adventism
14	P	Adventism	Category:American_Seventh-day_Adventist_clergy	https://en.wikipedia.org/wiki/Category:America...
15	P	Adventism	Category:American_Seventh-day_Adventists	https://en.wikipedia.org/wiki/Category:America...
16	-	Adventism	Category:Andrews_University	https://en.wikipedia.org/wiki/Category:Andrews...
17	P	Adventism	Category:Australian_Seventh-day_Adventists	https://en.wikipedia.org/wiki/Category:Austral...
18	P	Adventism	Category:British_Seventh-day_Adventists	https://en.wikipedia.org/wiki/Category:British...
19	-	Adventism	Category:Church_of_God_(Armstrong)	https://en.wikipedia.org/wiki/Category:Church_...
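
A natural next step is to count how the labels are distributed within each concept. The sketch below uses only the standard library and a handful of rows copied from the preview above (label, concept, page title); `SAMPLE_ROWS`, `count_labels_by_concept`, and `stance_rows` are hypothetical names chosen for illustration.

```python
from collections import Counter, defaultdict

# Sample rows copied from the preview above: (label, concept, page title).
SAMPLE_ROWS = [
    ("P", "Abortion", "Category:American_abortion_providers"),
    ("P", "Abortion", "Category:American_pro-choice_activists"),
    ("C", "Abortion", "Category:American_pro-life_activists"),
    ("P", "Abortion", "Category:British_pro-choice_activists"),
    ("-", "Abortion", "List_of_songs_about_abortion"),
    ("?", "Adultery", "Category:People_executed_for_adultery"),
    ("-", "Adventism", "Category:Adventism"),
    ("P", "Adventism", "Category:American_Seventh-day_Adventists"),
]


def count_labels_by_concept(rows):
    """Map each concept to a Counter of its label codes."""
    counts = defaultdict(Counter)
    for label, concept, _title in rows:
        counts[concept][label] += 1
    return dict(counts)


def stance_rows(rows):
    """Keep only rows labelled Pro ('P') or Con ('C')."""
    return [row for row in rows if row[0] in ("P", "C")]


counts = count_labels_by_concept(SAMPLE_ROWS)
for concept, label_counts in counts.items():
    print(concept, dict(label_counts))
```

With the DataFrame `data` loaded above, a roughly equivalent summary should be `data.groupby("Concept")["Label (-/P/C/?)"].value_counts()`, assuming the column names match those shown in the preview.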