Improved web scraping with Python tools and Bash utilities.

How to do web scraping efficiently (And how to make it less tedious)

Riya

Command-Line, HTML5, Linux, Python 3, Web Crawling

Schedule: Fri, Jul 30, 11:10-11:55 CEST (45 min)

What is web scraping and why should you learn how to do it?

I will talk about what it means to scrape data from the web and what is wrong with the usual copy-paste approach. I will also cover the benefits of web scraping, not only for work but also for making everyday life simpler: job searching, product price monitoring, and collecting data to train ML models.

The talk will first cover the basics, such as using CSS selectors to pick the data you need out of a page.

I will show how to use the browser's developer tools to find the tag, class, or id that identifies the required data.
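As a rough illustration of targeting data by tag and class, here is a minimal sketch that uses only Python's standard-library `html.parser`. The HTML fragment, the `ClassTextCollector` class, and the `collect_text` helper are all hypothetical names made up for this example; they are not taken from the talk, which uses tools such as BeautifulSoup instead.

```python
from html.parser import HTMLParser

# Hypothetical page fragment, standing in for markup inspected in developer tools.
SAMPLE_HTML = """
<div id="listing">
  <h2 class="title">Blue Widget</h2>
  <span class="price">19.99</span>
  <h2 class="title">Red Widget</h2>
  <span class="price">24.50</span>
</div>
"""


class ClassTextCollector(HTMLParser):
    """Collect the text of every element with a given tag and class."""

    def __init__(self, tag, class_name):
        super().__init__()
        self.tag = tag
        self.class_name = class_name
        self.matches = []
        self._inside = False

    def handle_starttag(self, tag, attrs):
        # The class attribute may hold several space-separated class names.
        classes = (dict(attrs).get("class") or "").split()
        if tag == self.tag and self.class_name in classes:
            self._inside = True
            self.matches.append("")

    def handle_data(self, data):
        # Only keep text that appears inside a matching element.
        if self._inside:
            self.matches[-1] += data

    def handle_endtag(self, tag):
        # Simplification: assumes matching elements are not nested in each other.
        if tag == self.tag:
            self._inside = False


def collect_text(html, tag, class_name):
    """Return the stripped text of each <tag class="class_name"> element."""
    parser = ClassTextCollector(tag, class_name)
    parser.feed(html)
    parser.close()
    return [match.strip() for match in parser.matches]


titles = collect_text(SAMPLE_HTML, "h2", "title")
prices = collect_text(SAMPLE_HTML, "span", "price")
print(titles, prices)
```

The sketch deliberately ignores nested elements of the same tag; a dedicated parsing library handles those cases and supports full CSS selectors.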

I will also go through some basics of regular expressions (regex), which can be very useful for targeting the required data.
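To give a feel for how a regex can pull values out of scraped text, here is a small sketch using Python's built-in `re` module. The sample string and both patterns are assumptions chosen for illustration; the email pattern in particular is intentionally loose and not a complete validator.

```python
import re

# Hypothetical text, as it might look after stripping the HTML tags from a page.
SAMPLE_TEXT = "Blue Widget now 19.99 EUR (was 24.50 EUR), contact sales@example.com"

# A number with exactly two decimals, followed by the currency code.
PRICE_RE = re.compile(r"(\d+\.\d{2})\s*EUR")

# A deliberately simple email pattern; real addresses can be more varied.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")

# findall returns the captured group for PRICE_RE and the whole match for EMAIL_RE.
prices = [float(p) for p in PRICE_RE.findall(SAMPLE_TEXT)]
emails = EMAIL_RE.findall(SAMPLE_TEXT)
print(prices, emails)
```

Regex works best on small, predictable fragments like these; for navigating the structure of a whole page, selecting by tag, class, or id is usually more robust.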

After laying the groundwork for web scraping, I'll show you some of the major tools, including requests, BeautifulSoup, Selenium, and Scrapy, in an interactive manner so that you can follow along as we build up from the most user-friendly tools, like requests and BeautifulSoup, to more specialized ones.

We'll then go a step further and take a brief look at more complex topics. I will cover the problems you are likely to come across while scraping, such as asynchronous loading and client-side rendering, authentication, redirects, and captchas, along with possible workarounds. Finally, I'll show how to automate web-scraping tasks using cron (Linux) and Task Scheduler (Windows).

You’ll leave the talk with a good understanding of the techniques of web scraping, and a library of useful tools you can use to write your own scrapers.

Type: Talk (45 mins); Python level: Beginner; Domain level: Beginner

Riya

I am an undergraduate majoring in electronics and communication. My educational aspirations are to acquire a deep understanding of software development and machine learning, which would help me pursue a Master's in Computer Science. My career vision is to be an asset to the software development field, someone responsible enough to help solve serious real-world problems.
