Leveraging Linked Data using Python and SPARQL

Nabanita Roy

Case Study Data Science Databases Natural Language Processing Web

Wikipedia is the digital encyclopedia that we use daily to look up facts and information. What could be better than tapping into its wealth of crowd-sourced knowledge without resorting to traditional web scrapers? Various community-driven projects extract knowledge from Wikipedia and store it in structured form, retrievable using SPARQL. This data can be mined for a range of Data Science projects. In this talk, I will walk through the basics of the Open Web and show how to use Python to query this huge open database.
The agenda includes the following:
• Introduction to DBpedia and Wikidata
• Introduction to Linked Data
• W3C Web Ontology Language (OWL) and Uniform Resource Identifier (URI)
• How to query DBpedia/Wikidata
o Build SPARQL Query
o Use Python’s SPARQLWrapper
• Python Code Walkthrough to create
o A Tabular Dataset
o A Corpus for Language Models
o A Knowledge Graph
• A Data Science Use-Case using data from DBpedia
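
To give a flavour of the querying and tabular-dataset steps in the agenda, here is a minimal sketch, not the code from the talk itself. It assumes the public DBpedia endpoint at https://dbpedia.org/sparql is reachable and that the SPARQLWrapper package is installed. The query (cities with their English labels and populations) and the helper names build_query, run_query and bindings_to_rows are illustrative choices made for this example only.

```python
"""Sketch: query DBpedia with SPARQL and flatten the result into table rows."""

# Public DBpedia SPARQL endpoint (assumed to be reachable).
DBPEDIA_ENDPOINT = "https://dbpedia.org/sparql"


def build_query(limit=5):
    """Return an illustrative SPARQL query: cities, English labels, populations."""
    return f"""
    PREFIX dbo: <http://dbpedia.org/ontology/>
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    SELECT ?city ?name ?population WHERE {{
        ?city a dbo:City ;
              rdfs:label ?name ;
              dbo:populationTotal ?population .
        FILTER (lang(?name) = "en")
    }}
    LIMIT {int(limit)}
    """


def run_query(query, endpoint=DBPEDIA_ENDPOINT):
    """Send the query with SPARQLWrapper and return the parsed JSON result."""
    # Imported lazily so the pure helpers work without the package installed.
    from SPARQLWrapper import SPARQLWrapper, JSON

    sparql = SPARQLWrapper(endpoint)
    sparql.setQuery(query)
    sparql.setReturnFormat(JSON)
    return sparql.query().convert()


def bindings_to_rows(results):
    """Flatten a SPARQL JSON result into a list of plain dicts, one per row."""
    columns = results["head"]["vars"]
    rows = []
    for binding in results["results"]["bindings"]:
        # A variable may be unbound in a given row; record it as None.
        rows.append({col: binding.get(col, {}).get("value") for col in columns})
    return rows
```

Calling bindings_to_rows(run_query(build_query())) should yield a list of dictionaries that can be passed straight to pandas.DataFrame to obtain a tabular dataset. The flattening step is kept separate from the network call so that it can be tried on a saved response without contacting the endpoint.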

Prerequisites – Basic knowledge of Python programming, Natural Language Processing, and SQL

Type: Talk (45 mins); Python level: Intermediate; Domain level: Beginner


Nabanita Roy

ACI Worldwide

Data Scientist @ ACI Worldwide | Edu Co-Lead @ Women in AI Ireland | Python and Data Science Instructor @ WAIA | ❤ NLP