IBM Debater® Wikipedia Category Stance

The dataset contains:

  • 132 concepts
  • 4603 Wikipedia categories and lists annotated for stance (Pro/Con) towards the concepts

The released data file has 4 columns:

Label (-/P/C/?)   Concept   Category/List Page Title   URL
...               ...       ...                        ...
...               ...       ...                        ...
...               ...       ...                        ...
  • Column A: the label
  • Column B: the concept
  • Column C: the page title of the category or list in Wikipedia
  • Column D: the URL of the category/list page

For each category or list, the label is one of the following:

  • “-” : the category does not denote a group of people (it is not a person group category)
  • “P” : pro stance (supporting the concept)
  • “C” : con stance (opposing the concept)
  • “?” : the stance cannot be determined based on the category name, or the category is not relevant
  • “X” : unresolved case: each of the 3 annotators gave a different label
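The label codes above can be kept in a small lookup table when post-processing the data. The following is a minimal sketch, not part of the released dataset: the names `LABEL_MEANINGS`, `describe_label` and `has_stance` are hypothetical helpers introduced here for illustration, and the descriptions are copied from the list above.

```python
# Label codes and their meanings, as documented in the list above.
LABEL_MEANINGS = {
    "-": "not a person group category",
    "P": "pro stance (supporting the concept)",
    "C": "con stance (opposing the concept)",
    "?": "stance cannot be determined from the category name, "
         "or the category is not relevant",
    "X": "unresolved case: each of the 3 annotators gave a different label",
}


def describe_label(label):
    """Return the documented meaning of a label code (hypothetical helper)."""
    code = label.strip()
    if code not in LABEL_MEANINGS:
        raise ValueError("Unknown label code: {!r}".format(label))
    return LABEL_MEANINGS[code]


def has_stance(label):
    """Return True only for rows annotated with a definite Pro or Con stance."""
    return label.strip() in ("P", "C")
```

A helper like `has_stance` may be useful for keeping only the rows labeled “P” or “C” before training or evaluating a stance classifier.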

Load Dataset

In [1]:
import requests # External dependency: pip install requests
import tarfile

# Downloading the dataset
url_base = 'https://dax.cdn.appdomain.cloud/dax-wikipedia-category-stance'
version = '1.0.2'
fname = 'wikipedia-category-stance.tar.gz'
url = "{}/{}/{}".format(url_base, version, fname)
r = requests.get(url)

# Raise on an HTTP error so that a failed response is never written to disk
r.raise_for_status()
In [2]:
with open(fname, 'wb') as f:
    f.write(r.content)

# Extracting the dataset
with tarfile.open(fname, 'r:*') as f:
    f.extractall()
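
Before reading the data, it can help to confirm which CSV files the archive actually contains. The sketch below is an optional check and assumes nothing about the archive layout; `list_csv_members` is a hypothetical helper name, and it uses only the standard `tarfile` module.

```python
import tarfile


def list_csv_members(archive_path):
    """Return the names of regular files ending in .csv inside a tar archive."""
    with tarfile.open(archive_path, "r:*") as tar:
        return [
            member.name
            for member in tar.getmembers()
            if member.isfile() and member.name.lower().endswith(".csv")
        ]
```

In this notebook it would be called as `list_csv_members(fname)` after the download step; the returned names show where the data file lands after extraction.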

Read Dataset

In [3]:
import os
import pandas as pd # External dependency: pip install pandas

data_path = "WikipediaCategoriesResults.csv"
if not os.access(data_path, os.R_OK):
    print("Failed to read the target file: {}".format(data_path))
In [4]:
data = pd.read_csv(data_path)
data.head(20)
Out[4]:
    Label (-/P/C/?)  Concept    Category/List Page Title                           URL
0   P                Abortion   Category:American_abortion_providers               https://en.wikipedia.org/wiki/Category:America...
1   P                Abortion   Category:American_pro-choice_activists             https://en.wikipedia.org/wiki/Category:America...
2   C                Abortion   Category:American_pro-life_activists               https://en.wikipedia.org/wiki/Category:America...
3   -                Abortion   Category:Anti-abortion_violence_in_the_United_...  https://en.wikipedia.org/wiki/Category:Anti-ab...
4   P                Abortion   Category:British_pro-choice_activists              https://en.wikipedia.org/wiki/Category:British...
5   P                Abortion   Category:Canadian_pro-choice_activists             https://en.wikipedia.org/wiki/Category:Canadia...
6   C                Abortion   Category:Canadian_pro-life_activists               https://en.wikipedia.org/wiki/Category:Canadia...
7   P                Abortion   Category:French_pro-choice_activists               https://en.wikipedia.org/wiki/Category:French_...
8   P                Abortion   Category:Irish_pro-choice_activists                https://en.wikipedia.org/wiki/Category:Irish_p...
9   C                Abortion   Category:New_Zealand_pro-life_activists            https://en.wikipedia.org/wiki/Category:New_Zea...
10  P                Abortion   Category:Victims_of_anti-abortion_violence_in_...  https://en.wikipedia.org/wiki/Category:Victims...
11  -                Abortion   List_of_songs_about_abortion                       https://en.wikipedia.org/wiki/List_of_songs_ab...
12  ?                Adultery   Category:People_executed_for_adultery              https://en.wikipedia.org/wiki/Category:People_...
13  -                Adventism  Category:Adventism                                 https://en.wikipedia.org/wiki/Category:Adventism
14  P                Adventism  Category:American_Seventh-day_Adventist_clergy     https://en.wikipedia.org/wiki/Category:America...
15  P                Adventism  Category:American_Seventh-day_Adventists           https://en.wikipedia.org/wiki/Category:America...
16  -                Adventism  Category:Andrews_University                        https://en.wikipedia.org/wiki/Category:Andrews...
17  P                Adventism  Category:Australian_Seventh-day_Adventists         https://en.wikipedia.org/wiki/Category:Austral...
18  P                Adventism  Category:British_Seventh-day_Adventists            https://en.wikipedia.org/wiki/Category:British...
19  -                Adventism  Category:Church_of_God_(Armstrong)                 https://en.wikipedia.org/wiki/Category:Church_...
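
A natural first look at the data is to count how many pages carry each label per concept. The sketch below is one possible way to do this with the standard library only; `stance_counts` is a hypothetical helper, and it assumes each row is a tuple in the column order A to D described earlier (label, concept, page title, URL). The sample rows are copied from the output above, with the truncated URLs left empty.

```python
from collections import Counter, defaultdict


def stance_counts(rows):
    """Count labels per concept.

    `rows` is an iterable of (label, concept, page_title, url) tuples,
    assumed to follow the column order A-D of the released file.
    """
    counts = defaultdict(Counter)
    for label, concept, _title, _url in rows:
        counts[concept][label] += 1
    return counts


# Sample rows taken from the first 20 lines of the dataset shown above.
sample = [
    ("P", "Abortion", "Category:American_abortion_providers", ""),
    ("P", "Abortion", "Category:American_pro-choice_activists", ""),
    ("C", "Abortion", "Category:American_pro-life_activists", ""),
    ("-", "Abortion", "List_of_songs_about_abortion", ""),
    ("?", "Adultery", "Category:People_executed_for_adultery", ""),
    ("-", "Adventism", "Category:Adventism", ""),
    ("P", "Adventism", "Category:American_Seventh-day_Adventists", ""),
]

for concept, labels in stance_counts(sample).items():
    print(concept, dict(labels))
```

With the DataFrame loaded above, the same function could be applied to the full dataset via `stance_counts(data.itertuples(index=False, name=None))`, assuming the file contains exactly the four columns in the order listed.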