Use an HTML parsing library such as lxml:

from lxml import etree
s = """<table>
  <tr><th>Event</th><th>Start Date</th><th>End Date</th></tr>
  <tr><td>a</td><td>b</td><td>c</td></tr>
  <tr><td>d</td><td>e</td><td>f</td></tr>
  <tr><td>g</td><td>h</td><td>i</td></tr>
</table>
"""
table = etree.HTML(s).find("body/table")
rows = iter(table)
headers = [col.text for col in next(rows)]
for row in rows:
    values = [col.text for col in row]
    print(dict(zip(headers, values)))

prints (on Python 3.7+, where dicts keep insertion order)

{'Event': 'a', 'Start Date': 'b', 'End Date': 'c'}
{'Event': 'd', 'Start Date': 'e', 'End Date': 'f'}
{'Event': 'g', 'Start Date': 'h', 'End Date': 'i'}
Answer from Sven Marnach on Stack Overflow
🌐
Python
docs.python.org › 3 › library › html.parser.html
html.parser — Simple HTML and XHTML parser
Encountered a start tag: html
Encountered a start tag: head
Encountered a start tag: title
Encountered some data : Test
Encountered an end tag : title
Encountered an end tag : head
Encountered a start tag: body
Encountered a start tag: h1
...
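The event-style callbacks shown in that snippet can also be used to pull rows out of a table without any third-party dependency. The following is a minimal sketch, not code from the Python documentation: the class name `TableRowParser` and the helper `table_to_dicts` are hypothetical, and it assumes a single, well-formed, non-nested table whose first row holds the headers.

```python
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Collect the text of every <th>/<td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside an open cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def table_to_dicts(html):
    """Return one dict per data row, keyed by the first row's cell text."""
    parser = TableRowParser()
    parser.feed(html)
    parser.close()
    if not parser.rows:
        return []
    headers, *data = parser.rows
    return [dict(zip(headers, row)) for row in data]


html = """<table>
  <tr><th>Event</th><th>Start Date</th><th>End Date</th></tr>
  <tr><td>a</td><td>b</td><td>c</td></tr>
  <tr><td>d</td><td>e</td><td>f</td></tr>
</table>"""

records = table_to_dicts(html)
print(records)
```

This trades robustness for zero dependencies: unlike lxml or BeautifulSoup, `html.parser` does not repair broken markup, so a missing closing tag would silently drop a cell in this sketch.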
Discussions

Any way to parse HTML tables?
You can probably find some JavaScript routine that can convert html table data to json. Then run the JavaScript in a URL action. I saw this ability to run JavaScript here and tried it out yesterday: https://www.reddit.com/r/shortcuts/comments/9hpplv/is_it_possible_to_read_the_contents_of_an_xml_file/?st=JNYO7T87&sh=bf71ea97 But just saw that Pretty Print also does this: https://www.reddit.com/r/shortcuts/comments/9mk9br/pretty_print_dictionary/?st=JNYOAYQW&sh=0a9cdf3d More on reddit.com
🌐 r/shortcuts
10
4
November 1, 2018
How to scrap a html table with bs4?
See how to write CSV: https://realpython.com/python-csv/ Although I would create a loop over each tr and, inside it, a loop over each td:

for tr in soup.find_all("tr"):
    for td in tr.find_all("td"):
        print(td.text)  # handle td.text here

You might also want to consider pandas here: https://www.geeksforgeeks.org/convert-html-table-into-csv-file-in-python/ More on reddit.com
🌐 r/learnpython
14
0
October 23, 2020
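The nested tr/td loop suggested in the r/learnpython thread above can be sketched end to end as follows. This is an illustrative example rather than the poster's code: the sample HTML is made up, the built-in "html.parser" backend is an assumption (any installed BeautifulSoup backend should behave the same here), and the CSV is written to an in-memory buffer instead of a file so the snippet is self-contained.

```python
import csv
import io

from bs4 import BeautifulSoup

# Hypothetical sample table used only for this sketch.
html = """<table>
  <tr><th>Event</th><th>Start Date</th><th>End Date</th></tr>
  <tr><td>a</td><td>b</td><td>c</td></tr>
  <tr><td>d</td><td>e</td><td>f</td></tr>
</table>"""

soup = BeautifulSoup(html, "html.parser")

buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")

# Outer loop over rows, inner collection over header and data cells.
for tr in soup.find_all("tr"):
    cells = [cell.get_text(strip=True) for cell in tr.find_all(["th", "td"])]
    if cells:  # skip rows that contain no cells
        writer.writerow(cells)

csv_text = buffer.getvalue()
print(csv_text)
```

To write a real file, replace the buffer with `open("table.csv", "w", newline="")` (the file name is a placeholder).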
Guide to Scrape HTML Table Using Python
1.5M subscribers in the Python community. The largest Python community for Reddit! Stay up to date with the latest news, packages, and meta… More on reddit.com
🌐 r/Python
5
21
October 29, 2022
How to read a table in beautiful soup, and parse the elements
tables = soup.findAll('table')[0].findAll('tr')

findAll('table') gets you a list of tables. [0] indexes the first item in the list. You could loop through each table:

for table in soup.findAll('table'):

or use pandas. pandas has a read_html function; you could also try:

tables = pandas.read_html(page.content)

It returns a list of DataFrames, one per table found. More on reddit.com
🌐 r/learnpython
12
2
January 5, 2022
People also ask

Which Python library is best for beginners to parse HTML?
BeautifulSoup is the best starting point. It's simple, well-documented, and quite forgiving of broken HTML. Pair it with the lxml parser for a balance of speed and flexibility.
🌐
scrapingbee.com
scrapingbee.com › blog › python-html-parsers
How to parse HTML in Python: A step-by-step guide for beginners ...
What's the fastest Python library for parsing HTML?
lxml is the fastest option since it's written in C and supports XPath for precise queries. It's ideal for large-scale or performance-sensitive scraping projects.
Can I parse XML or JSON with the same tools?
BeautifulSoup and lxml can handle XML, but JSON requires Python's built-in json module. Many sites provide JSON APIs, which are often easier to use than scraping HTML.
🌐
Scott Rome
srome.github.io › Parsing-HTML-Tables-in-Python-with-BeautifulSoup-and-pandas
Parsing HTML Tables in Python with BeautifulSoup and pandas
May 30, 2016 - In the next bit of code, we define a website that is simply the HTML for a table. We load it into BeautifulSoup and parse it, returning a pandas data frame of the contents. As you can see, we grab all the tr elements from the table, followed by grabbing the td elements one at a time. We use the “get_text()” method from the td element (called a column in each iteration) and put it into our python object representing a table (it will eventually be a pandas dataframe).
🌐
ZenRows
zenrows.com › homepage › tutorial › how to parse html tables using python + top 3 parsers
How to Parse HTML Tables Using Python + Top 3 Parsers - ZenRows
September 12, 2024 - Cool! You've just scraped an HTML table with BeautifulSoup in Python. However, a simpler way to achieve this task with less coding is to use a web scraping API like ZenRows. You'll see how it works in the next section. Parsing tables with ZenRows is a straightforward process.
🌐
Practical Business Python
pbpython.com › pandas-html-table.html
Reading HTML tables with Pandas - Practical Business Python
In this article, I will discuss how to use pandas read_html() to read and clean several Wikipedia HTML tables so that you can use them for further numeric analysis. For the first example, we will try to parse this table from the Politics section on the Minnesota wiki page.
🌐
PyPI
pypi.org › project › html-table-parser-python3
html-table-parser-python3 · PyPI
A small and simple HTML table parser not requiring any external dependency.
» pip install html-table-parser-python3
Published: Dec 06, 2022
Version: 0.3.1
Find elsewhere
🌐
TutorialsPoint
tutorialspoint.com › article › how-to-parse-html-pages-to-fetch-html-tables-with-python
How to Parse HTML pages to fetch HTML tables with Python?
November 9, 2020 - Status code: {response.status_code}")
        return []

    # Parse HTML content
    soup = BeautifulSoup(response.text, 'html.parser')

    # Find all tables
    tables = soup.find_all('table')
    extracted_tables = []

    for i, table in enumerate(tables):
        try:
            # Convert to DataFrame
            df = pd.read_html(str(table))[0]
            extracted_tables.append({
                'table_index': i + 1,
                'dataframe': df,
                'shape': df.shape
            })
        except Exception as e:
            print(f"Error processing table {i + 1}: {e}")

    return extracted_tables

# Example usage
url = "https://www.tutorialspoint.com/python/python_basic_operators.htm"
all_tables = extract_tables_from_url(url)
print(f"Successfully extracted {len(all_tables)} tables")

for table_info in all_tables[:2]:  # Show first 2 tables
    print(f"\nTable {table_info['table_index']} - Shape: {table_info['shape']}")
    print(table_info['dataframe'].head(3))
🌐
Tchut-Tchut Blog
beenje.github.io › blog › posts › parsing-html-tables-in-python-with-pandas
Parsing HTML Tables in Python with pandas | Tchut-Tchut Blog
March 27, 2018 -
---------------------------------------------------------------------------
HTTPError                                 Traceback (most recent call last)
<ipython-input-17-7e6b50c9f1f3> in <module>()
----> 1 pd.read_html('https://httpbin.org/basic-auth/myuser/mypasswd')

~/miniconda3/envs/jupyter/lib/python3.6/site-packages/pandas/io/html.py in read_html(io, match, flavor, header, index_col, skiprows, attrs, parse_dates, tupleize_cols, thousands, encoding, decimal, converters, na_values, keep_default_na)
    913         thousands=thousands, attrs=attrs, encoding=encoding,
    914         decimal=decimal, converters=converters, na_values=na_values,
-->
🌐
Robocorp
robocorp.com › portal › robot › robocorp › example-html-table-robot
Working with HTML Tables - Robocorp Portal
The get_html_table function returns the example HTML table markup from https://www.w3schools.com/html/html_tables.asp. The read_table_from_html is provided by the html_tables.py library.
🌐
ScrapingBee
scrapingbee.com › blog › python-html-parsers
How to parse HTML in Python: A step-by-step guide for beginners | ScrapingBee
January 16, 2026 - Want to parse HTML in Python right away? Here's the fastest working setup: one version using plain old requests for static pages, and another using ScrapingBee for the real world, where sites throw JavaScript and anti-bot nonsense at you. ...

import requests
from bs4 import BeautifulSoup

# Fetch the page directly
url = "https://example.com"
html = requests.get(url).text

# Parse the HTML with BeautifulSoup + lxml
soup = BeautifulSoup(html, "lxml")

# Extract the title and all links
print(soup.title.get_text())
for link in soup.select("a[href]"):
    print(link["href"])
🌐
Zyte
zyte.com › home › blog › how to extract data from html table
How to extract data from an HTML table - Zyte #1 Web Scraping Service
September 13, 2022 - One such method, read_html(), is available in the popular Python pandas library. It accepts numerous arguments that let you customize how the table is parsed. You can call it with a URL, a file, or an HTML string. For example, you might do it like this:
🌐
Pandas
pandas.pydata.org › docs › reference › api › pandas.read_html.html
pandas.read_html — pandas 3.0.3 documentation - PyData
The default value will return all tables contained on a page. This value is converted to a regular expression so that there is consistent behavior between Beautiful Soup and lxml.
flavor : {“lxml”, “html5lib”, “bs4”} or list-like, optional
The parsing engine (or list of parsing engines) to use.
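A short sketch of how the two parameters described in that snippet might be used together. The sample HTML and variable names are made up for illustration, the HTML is wrapped in `StringIO` so it is passed as a file-like object, and `flavor="lxml"` assumes the lxml package is installed alongside pandas.

```python
from io import StringIO

import pandas as pd

# Hypothetical page containing two tables.
html = """
<table>
  <tr><th>Event</th><th>Start Date</th><th>End Date</th></tr>
  <tr><td>a</td><td>b</td><td>c</td></tr>
</table>
<table>
  <tr><th>Name</th><th>Score</th></tr>
  <tr><td>x</td><td>1</td></tr>
</table>
"""

# Default match: every table on the page comes back as a DataFrame.
all_tables = pd.read_html(StringIO(html), flavor="lxml")

# match keeps only tables whose text matches the given string or regex.
event_tables = pd.read_html(StringIO(html), match="Event", flavor="lxml")

print(len(all_tables), len(event_tables))
print(event_tables[0])
```

Leaving `flavor` unset lets pandas pick a parser itself; pinning it here simply makes the dependency explicit.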
🌐
Bright Data
brightdata.com › blog › web-data › how-to-scrape-html-tables
Guide on How to Scrape HTML Tables With Python
September 16, 2025 -

# Parse the HTML content using BeautifulSoup
soup = BeautifulSoup(response.content, 'html.parser')

Next, locate the table element in the HTML with the id attribute "example2".
🌐
ScraperAPI
scraperapi.com › home › blog › how to scrape html tables using python
How To Scrape HTML Tables Using Python
March 31, 2026 - Let’s enter the table’s URL (https://datatables.net/examples/styling/stripe.html) in our browser and inspect the page to see what’s happening under the hood. This is why this is a great page to practice scraping tabular data with Python. There’s a clear <table> tag pair opening and closing the table and all the relevant data is inside the <tbody> tag.
🌐
DEV Community
dev.to › chrisgreening › effortlessly-scrape-html-tables-into-python-using-pdreadhtml-559p
Effortlessly scrape HTML tables into Python using pd.read_html! - DEV Community
August 24, 2023 - Here's a step-by-step guide to using this function to get tables from a webpage right into our Python environments: Import pandas: First let's import pandas into our script: ... Specify the source and call pd.read_html: Determine where pd.read_html should look for the HTML content. It could be a URL or a string containing HTML code. For this example let's pull some tables off of the Python Wiki page:
🌐
AskPython
askpython.com › home › how to read html tables using python?
How to read HTML tables using Python? - AskPython
January 18, 2023 - Sometimes, you might want the data types of some columns from the table to be of a specific type. In such cases, you can typecast them using the read_html() function. Recall that in the above example, the data type of ‘Salary’ was ‘int64’.
🌐
Scrapfly
scrapfly.io › blog › answers › how-to-scrape-tables-with-beautifulsoup
How to scrape tables with BeautifulSoup?
April 18, 2026 - To scrape HTML tables using BeautifulSoup and Python, the find_all() method can be used with common table parsing algorithms. Here's how to do it.
🌐
Substack
substack.com › home › post › p-151645890
How-To Parse HTML Tables in Python Using Pandas
November 15, 2024 - This method relies on the lxml, BeautifulSoup, and html5lib libraries to parse the HTML page, so make sure to install them if you haven’t done so already. ... Next, identify a website you want to extract the data from; let’s use the List of video games featuring Mario Wikipedia entry as an example.

import pandas as pd

url = 'https://en.wikipedia.org/wiki/List_of_video_games_featuring_Mario'
tables = pd.read_html(url)
print(len(tables))
# CONTINUE YOUR ANALYSIS HERE
🌐
ZenRows
zenrows.com › homepage › tutorial › how to parse tables using beautifulsoup (+2 better ways)
How to Parse Tables Using BeautifulSoup (+2 better ways) - ZenRows
September 27, 2024 - We'll show you how to parse tables using BeautifulSoup. And you'll also learn two better ways to make the task easier. BeautifulSoup is mainly used to parse HTML in Python. The first step to parsing tables is to locate the table's HTML tag.
🌐
GitHub
github.com › fmilthaler › HTMLParser
GitHub - fmilthaler/HTMLParser: Python class to scrap and parse a webpage (using requests, BeautifulSoup4), mainly for converting tables to pandas.DataFrame · GitHub
Here we scrap a page from Wikipedia, parse it for tables, and convert the first table found into a pandas.DataFrame.

from htmlparser import HTMLParser
import pandas

# Here we scrap a page from Wikipedia, parse it for tables, and convert the ...
Author   fmilthaler