Web Scraping with Python Training Course
Web scraping is a technique for extracting data from a website and saving it to a local file or database.
This instructor-led, live training (online or onsite) is aimed at developers who wish to use Python to automate the process of crawling many websites to extract data for processing and analysis.
By the end of this training, participants will be able to:
- Install and configure Python and all relevant packages.
- Retrieve and parse data stored across many websites.
- Understand how websites work and how their HTML is structured.
- Construct spiders to crawl the web at scale.
- Use Selenium to crawl AJAX-driven web pages.
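As a rough illustration of the retrieve-and-parse objective above, the sketch below extracts the rows of an HTML table and serializes them as CSV. It is a minimal sketch rather than the course's own material: it uses only Python's standard library (`html.parser`, `csv`) in place of BeautifulSoup or Scrapy, and the sample HTML, the `TableParser` class, and the `rows_to_csv` helper are hypothetical names invented for this example. A real scraper would download the page over HTTP instead of reading an inline string.

```python
import csv
import io
from html.parser import HTMLParser

# Hypothetical sample page; a real scraper would download this over HTTP.
SAMPLE_HTML = """
<table>
  <tr><th>Name</th><th>Price</th></tr>
  <tr><td>Widget</td><td>9.99</td></tr>
  <tr><td>Gadget</td><td>24.50</td></tr>
</table>
"""


class TableParser(HTMLParser):
    """Collect the text of every <th>/<td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # completed rows
        self._row = None   # row currently being built, or None
        self._cell = None  # text fragments of the current cell, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def rows_to_csv(rows):
    """Serialize rows to CSV text; swap io.StringIO for open(...) to save a file."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


parser = TableParser()
parser.feed(SAMPLE_HTML)
csv_text = rows_to_csv(parser.rows)
print(csv_text)
```

Keeping parsing (`TableParser`) separate from storage (`rows_to_csv`) means the same extracted rows could later be written to a database instead of a file.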
Format of the Course
- Interactive lecture and discussion.
- Lots of exercises and practice.
- Hands-on implementation in a live-lab environment.
Course Customization Options
- This course assumes knowledge of programming.
- To request a customized training for this course, please contact us to arrange.
Course Outline
Introduction
Setting up the Development Environment
Python Primer: Data Structures, Conditionals, File Handling, etc.
Python Packages for Web Scraping: Scrapy and BeautifulSoup
How a Website Works
How HTML is Structured
Making a Web Request
Scraping an HTML Page
Working with XPath and CSS
Filtering Data Using Regular Expressions
Creating a Web Crawler
Crawling AJAX and JavaScript Pages with Selenium
Web Scraping Best Practices
Troubleshooting
Summary and Conclusion
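The crawling and filtering topics in the outline above can be sketched together as follows. This is an illustrative sketch under stated assumptions, not the course's material: the "website" is a hypothetical in-memory dictionary (`FAKE_SITE`) so that the example runs without network access, links are pulled out with a simple regular expression rather than XPath or CSS selectors, and `fetch`, `crawl`, and `extract_prices` are made-up helper names. A regular expression is adequate for this toy markup but is generally too brittle for real-world HTML.

```python
import re
from collections import deque

# Hypothetical in-memory site: URL path -> HTML body.
FAKE_SITE = {
    "/": '<a href="/products">Products</a> <a href="/about">About</a>',
    "/products": '<a href="/products/1">One</a> <a href="/products/2">Two</a> <a href="/">Home</a>',
    "/products/1": "<p>Widget costs $9.99</p>",
    "/products/2": "<p>Gadget costs $24.50</p>",
    "/about": "<p>No prices here.</p>",
}

LINK_RE = re.compile(r'href="([^"]+)"')
PRICE_RE = re.compile(r"\$(\d+\.\d{2})")


def fetch(url):
    """Stand-in for an HTTP request; returns None for unknown pages."""
    return FAKE_SITE.get(url)


def crawl(start, max_pages=100):
    """Breadth-first crawl from start; return {url: html} for pages visited."""
    seen = {start}          # URLs already queued, to avoid revisiting
    queue = deque([start])
    pages = {}
    while queue and len(pages) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        pages[url] = html
        for link in LINK_RE.findall(html):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return pages


def extract_prices(pages):
    """Return {url: [price, ...]} for pages that mention at least one price."""
    found = {}
    for url, html in pages.items():
        prices = [float(p) for p in PRICE_RE.findall(html)]
        if prices:
            found[url] = prices
    return found


pages = crawl("/")
prices = extract_prices(pages)
print(sorted(pages))
print(prices)
```

The `seen` set keeps the crawler from looping on pages that link back to each other, and `max_pages` bounds the crawl; a crawler pointed at real sites would also need to respect robots.txt and rate limits.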
Requirements
- Programming experience, preferably in Python. If participants have programming experience in a language other than Python, the training can be extended to include more introductory Python exercises.
Audience
- Developers
Testimonials (1)
Many different examples and topics were covered, from basic investigation to login management and dynamic page management.
Daniele Tagliaferro - Creditsafe Italia Srl
Course - Web Scraping with Python
Related Courses
BDD with Python and Behave
7 Hours
This instructor-led, live training in Portugal begins with a discussion of BDD and how the Behave framework can be used to carry out BDD testing for web applications. Participants are given ample opportunity to interact with the instructor and peers while implementing the concepts and tactics learned in this hands-on, practice-based lab environment.
By the end of this training, participants will have a firm understanding of BDD and Behave, as well as the necessary practice to implement these techniques and tools in real-world test scenarios.
Scaling Data Analysis with Python and Dask
14 Hours
This instructor-led, live training in Portugal (online or onsite) is aimed at data scientists and software engineers who wish to use Dask with the Python ecosystem to build, scale, and analyze large datasets.
By the end of this training, participants will be able to:
- Set up the environment to start building big data processing with Dask and Python.
- Explore the features, libraries, tools, and APIs available in Dask.
- Understand how Dask accelerates parallel computing in Python.
- Learn how to scale the Python ecosystem (NumPy, SciPy, and Pandas) using Dask.
- Optimize the Dask environment to maintain high performance in handling large datasets.
Data Analysis with Python, Pandas and NumPy
14 Hours
This instructor-led, live training in Portugal (online or onsite) is aimed at intermediate-level Python developers and data analysts who wish to enhance their skills in data analysis and manipulation using Pandas and NumPy.
By the end of this training, participants will be able to:
- Set up a development environment that includes Python, Pandas, and NumPy.
- Create a data analysis application using Pandas and NumPy.
- Perform advanced data wrangling, sorting, and filtering operations.
- Conduct aggregate operations and analyze time series data.
- Visualize data using Matplotlib and other visualization libraries.
- Debug and optimize their data analysis code.
FARM (FastAPI, React, and MongoDB) Full Stack Development
14 Hours
This instructor-led, live training (online or onsite) is aimed at developers who wish to use the FARM (FastAPI, React, and MongoDB) stack to build dynamic, high-performance, and scalable web applications.
By the end of this training, participants will be able to:
- Set up the necessary development environment that integrates FastAPI, React, and MongoDB.
- Understand the key concepts, features, and benefits of the FARM stack.
- Learn how to build REST APIs with FastAPI.
- Learn how to design interactive applications with React.
- Develop, test, and deploy applications (front end and back end) using the FARM stack.
Developing APIs with Python and FastAPI
14 Hours
This instructor-led, live training in Portugal (online or onsite) is aimed at developers who wish to use FastAPI with Python to build, test, and deploy RESTful APIs more easily and quickly.
By the end of this training, participants will be able to:
- Set up the necessary development environment to develop APIs with Python and FastAPI.
- Create APIs more quickly and easily using the FastAPI library.
- Learn how to create data models and schemas based on Pydantic and OpenAPI.
- Connect APIs to a database using SQLAlchemy.
- Implement security and authentication in APIs using the FastAPI tools.
- Build container images and deploy web APIs to a cloud server.
Machine Learning with Python – 2 Days
14 Hours
The aim of this course is to provide basic proficiency in applying Machine Learning methods in practice. Using the Python programming language and its various libraries, and drawing on a multitude of practical examples, this course teaches how to use the most important building blocks of Machine Learning, how to make data modeling decisions, how to interpret the outputs of the algorithms, and how to validate the results.
Our goal is to give you the skills to understand and use the most fundamental tools from the Machine Learning toolbox confidently and to avoid the common pitfalls of data science applications.
Machine Learning with Python – 4 Days
28 Hours
The aim of this course is to provide general proficiency in applying Machine Learning methods in practice. Using the Python programming language and its various libraries, and drawing on a multitude of practical examples, this course teaches how to use the most important building blocks of Machine Learning, how to make data modeling decisions, how to interpret the outputs of the algorithms, and how to validate the results.
Our goal is to give you the skills to understand and use the most fundamental tools from the Machine Learning toolbox confidently and to avoid the common pitfalls of data science applications.
Accelerating Python Pandas Workflows with Modin
14 Hours
This instructor-led, live training in Portugal (online or onsite) is aimed at data scientists and developers who wish to use Modin to build and implement parallel computations with Pandas for faster data analysis.
By the end of this training, participants will be able to:
- Set up the necessary environment to start developing Pandas workflows at scale with Modin.
- Understand the features, architecture, and advantages of Modin.
- Know the differences between Modin, Dask, and Ray.
- Perform Pandas operations faster with Modin.
- Implement the entire Pandas API and functions.
Python for Natural Language Generation (NLG)
21 Hours
In this instructor-led, live training in Portugal, participants will learn how to use Python to produce high-quality natural language text by building their own NLG system from scratch. Case studies will also be examined and the relevant concepts will be applied to live lab projects for generating content.
By the end of this training, participants will be able to:
- Use NLG to automatically generate content for various industries, from journalism, to real estate, to weather and sports reporting.
- Select and organize source content, plan sentences, and prepare a system for automatic generation of original content.
- Understand the NLG pipeline and apply the right techniques at each stage.
- Understand the architecture of a Natural Language Generation (NLG) system.
- Implement the most suitable algorithms and models for analysis and ordering.
- Pull data from publicly available data sources as well as curated databases to use as material for generated text.
- Replace manual and laborious writing processes with computer-generated, automated content creation.
Unit Testing with Python
21 Hours
In this instructor-led, live training in Portugal, participants will learn how to use pytest to write short, maintainable tests that are elegant, expressive, and readable.
By the end of this training, participants will be able to:
- Write readable and maintainable tests without the need for boilerplate code.
- Use the fixture model to write small tests.
- Scale tests up to complex functional testing for applications, packages, and libraries.
- Understand and apply pytest features such as hooks, assert rewriting, and plug-ins.
- Reduce test times by running tests in parallel and across multiple processors.
- Run tests in a continuous integration environment, together with other utilities such as tox, mock, coverage, unittest, doctest and Selenium.
- Use Python to test non-Python applications.
Advanced Machine Learning with Python
21 Hours
In this instructor-led, live training in Portugal, participants will learn the most relevant and cutting-edge machine learning techniques in Python as they build a series of demo applications involving image, music, text, and financial data.
By the end of this training, participants will be able to:
- Implement machine learning algorithms and techniques for solving complex problems.
- Apply deep learning and semi-supervised learning to applications involving image, music, text, and financial data.
- Push Python algorithms to their maximum potential.
- Use libraries and packages such as NumPy and Theano.
Python: Automate the Boring Stuff
14 Hours
This instructor-led, live training in Portugal is based on the popular book, "Automate the Boring Stuff with Python", by Al Sweigart. It is aimed at beginners and covers essential Python programming concepts through practical, hands-on exercises and discussions. The focus is on learning to write code to dramatically increase office productivity.
By the end of this training, participants will know how to program in Python and apply this new skill for:
- Automating tasks by writing simple Python programs.
- Writing programs that can do text pattern recognition with "regular expressions".
- Programmatically generating and updating Excel spreadsheets.
- Parsing PDFs and Word documents.
- Crawling websites and pulling information from online sources.
- Writing programs that send out email notifications.
- Using Python's debugging tools to quickly resolve bugs.
- Programmatically controlling the mouse and keyboard to click and type for you.
Advanced Python - 4 Days
28 Hours
This instructor-led, live training in Portugal (online or onsite) is aimed at developers who wish to learn advanced Python programming techniques, including how to apply this versatile language to solve problems in areas such as distributed applications, data analysis and visualization, UI programming and maintenance scripting.
Python Programming - 4 days
28 Hours
This course is designed for those wishing to learn the Python programming language. The emphasis is on the Python language, the core libraries, and a selection of the best and most useful libraries developed by the Python community. Python drives businesses and is used by scientists all over the world – it is one of the most popular programming languages.
The course can be delivered using the latest Python 3.x release, with practical exercises that make use of the language's full power. This course can be delivered on any operating system (all flavours of UNIX, including Linux and Mac OS X, as well as Microsoft Windows).
Practical exercises make up about 70% of the course time, and demonstrations and presentations account for around 30%. Participants can ask questions and start discussions throughout the course.
Note: the training can be tailored to specific needs upon prior request ahead of the proposed course date.
Test Automation with Selenium and Python
14 Hours
In this instructor-led, live training in Portugal, participants combine the power of Python with Selenium to automate the testing of a sample web application. By combining theory with practice in a live lab environment, participants will gain the knowledge and practice needed to automate their own web testing projects using Python and Selenium.