You need to extract text from the PDF pages using extract_text:

import PyPDF2

with open('dummy.pdf', 'rb') as file:
    reader = PyPDF2.PdfReader(file)

    for page in reader.pages:
        print(page.extract_text())

Check the documentation here

Answer from حمزة نبيل on Stack Overflow
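As a slightly more defensive variant of the loop above, the following sketch gathers each page's text into a list instead of printing it. The helper names collect_page_text and read_pdf_text are hypothetical, and the `or ""` guard is an assumption to cope with pages that yield no text. PdfReader, .pages, and extract_text() are the same calls used in the answer.

```python
def collect_page_text(pages):
    """Return one string per page; pages with no extractable text give ''."""
    # `or ""` guards against a page returning None or an empty value.
    return [(page.extract_text() or "") for page in pages]


def read_pdf_text(path):
    """Open the PDF at `path` and return its text, one entry per page."""
    # Imported lazily so collect_page_text works without PyPDF2 installed.
    import PyPDF2

    with open(path, "rb") as file:
        reader = PyPDF2.PdfReader(file)
        # Extract while the file is still open, since pages are read on demand.
        return collect_page_text(reader.pages)
```

It might be used as `for number, text in enumerate(read_pdf_text("dummy.pdf"), start=1): print(number, text)`, where dummy.pdf is the file name from the answer.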
Readthedocs
pypdf2.readthedocs.io › en › 3.x
Welcome to PyPDF2 — PyPDF2 documentation
PyPDF2 is a free and open source pure-python PDF library capable of splitting, merging, cropping, and transforming the pages of PDF files. It can also add custom data, viewing options, and passwords to PDF files.
Extract Text from a PDF
Then there are PDF files that contain an image and a text layer in the background. That typically happens when a document was scanned. Although the scanning software (OCR) is pretty good today, it still fails once in a while. PyPDF2 is no OCR software; it will not be able to detect those failures.
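Since PyPDF2 does no OCR, one rough check is to list the pages that return no text at all. This is only a sketch: pages_without_text and find_textless_pages are hypothetical names, and an empty result is treated as a hint that the page may be image-only, not as proof. A scanned page that does carry an OCR text layer will not be flagged, even if that layer contains recognition errors.

```python
def pages_without_text(pages):
    """Return 1-based numbers of pages whose extracted text is empty or whitespace."""
    missing = []
    for number, page in enumerate(pages, start=1):
        text = page.extract_text() or ""
        if not text.strip():
            missing.append(number)
    return missing


def find_textless_pages(path):
    """Report pages of the PDF at `path` that yield no extractable text."""
    # Imported lazily so pages_without_text works without PyPDF2 installed.
    import PyPDF2

    with open(path, "rb") as file:
        return pages_without_text(PyPDF2.PdfReader(file).pages)
```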
Installation
There are several ways to install PyPDF2.
The PdfReader Class
This page is about PyPDF2.
The PdfWriter Class
This page is about PyPDF2.
PyPI
pypi.org › project › PyPDF2
PyPDF2 · PyPI
PyPDF2 is a free and open-source pure-python PDF library capable of splitting, merging, cropping, and transforming the pages of PDF files. It can also add custom data, viewing options, and passwords to PDF files.
pip install PyPDF2
Published: Dec 31, 2022
Version: 3.0.1
Discussions

python - How do I use PyPDF2 to read and display the contents of my PDF when ran? - Stack Overflow
I have a dummy pdf that has words on it. The course I am using to learn uses PyPDF2 on python. Is there a way for PyPDF2 to actually read the words on the pdf rather than give me objects? This is the More on stackoverflow.com
PyPDF vs PyPDF2 vs PyPDF3 vs PyPDF4 vs others
I prefer to use PyMuPDF - https://pymupdf.readthedocs.io/en/latest/ Extracting text from a PDF is hit and miss as it can be fragmented with each line or even separate words being in their own "block". Adobe has an API that they say can understand the structure and output text that is usable. https://blog.developer.adobe.com/adobe-pdf-extract-api-output-demystified-ff69841c4ed3 More on reddit.com
r/learnpython
February 20, 2023
python - "no module named PyPDF2" error - Stack Overflow
I use Spyder with Python 2.7 on Windows 10. I was able to install the PyPDF2 package with a conda command from my prompt. It said installation complete. Yet if I try to run a simple import comm... More on stackoverflow.com
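The question is truncated, so the cause here is unknown. One common reason for this kind of error is that the package was installed into a different Python environment than the one the IDE runs. The sketch below, with the hypothetical helpers module_available and describe_environment, reports which interpreter is running and whether it can see the module. It uses only the standard library.

```python
import importlib.util
import sys


def module_available(name):
    """Return True if the top-level module `name` is importable here."""
    return importlib.util.find_spec(name) is not None


def describe_environment(name="PyPDF2"):
    """Report the running interpreter and whether `name` is visible to it."""
    status = "found" if module_available(name) else "NOT found"
    return f"{sys.executable}: {name} {status}"


print(describe_environment())
```

If the module is reported as not found, installing with that same interpreter (for example by running it with `-m pip install PyPDF2`) is one way to make sure the package lands in the environment that is actually in use.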
Failing to write some PDFs with PyPDF2
There's no chance of anybody solving that with the info you've provided. Generally, you really need to know what you're doing when constructing Dynamic XFA forms, and many PDF viewers either can't handle them or only partially implement Dynamic XFA handling. You can forget trying to read them reliably with browser-plugin PDF readers, for example. So it all depends on which Dynamic XFA features you're trying to implement and if your particular reader can handle that. (And it's almost always the wrong tool for the job, anyway.) More on reddit.com
r/learnpython
March 6, 2021
GeeksforGeeks
geeksforgeeks.org › python › introduction-to-python-pypdf2-library
Introduction to Python PyPDF2 Library - GeeksforGeeks
July 23, 2025 - PyPDF2 is a Python library that helps in working and dealing with PDF files. It allows us to read, manipulate, and extract information from PDFs without the need for complex software.
Reddit
reddit.com › r/learnpython › pypdf vs pypdf2 vs pypdf3 vs pypdf4 vs others
r/learnpython on Reddit: PyPDF vs PyPDF2 vs PyPDF3 vs PyPDF4 vs others
February 20, 2023

Initially I just googled for a way to get the number of pages in a pdf file. First result was PyPDF2 so I just used that.

After a while I got an error from a file, so I started looking around and realized that there are 4 different forks of this library!

What is going on here? Why are there so many forks?

In other news, later on I will be scraping some text from some pdf files. Which library would you recommend? I won't be needing OCR, the text is already in the files.

Thanks!
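For the page-count task mentioned in the post, a minimal sketch with PyPDF2 3.x could look like this. The helper names count_pages and count_pdf_pages are hypothetical; the count itself is just the length of the reader's pages sequence.

```python
def count_pages(reader):
    """Return the number of pages exposed by a PdfReader-like object."""
    return len(reader.pages)


def count_pdf_pages(path):
    """Open the PDF at `path` and return how many pages it has."""
    # Imported lazily so count_pages works without PyPDF2 installed.
    import PyPDF2

    with open(path, "rb") as file:
        return count_pages(PyPDF2.PdfReader(file))
```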

GitHub
github.com › niklasb › PyPDF2
GitHub - niklasb/PyPDF2: A utility to read and write PDFs with Python
PyPDF2 is a pure-python PDF library capable of splitting, merging together, cropping, and transforming the pages of PDF files. It can also add custom data, viewing options, and passwords to PDF files.
Author: niklasb
Anaconda.org
anaconda.org › conda-forge › pypdf2
pypdf2 - conda-forge
Organization created on Apr 11, 2015 · A community-led collection of recipes, build infrastructure, and distributions for the conda package manager
GitHub
github.com › talumbau › PyPDF2
GitHub - talumbau/PyPDF2: A utility to read and write pdfs with Python · GitHub
PyPDF2 ------------------------------------------------- PyPDF2 is a pure-python PDF library capable of splitting, merging together, cropping, and transforming the pages of PDF files.
Starred by 20 users
Forked by 5 users
Snyk
security.snyk.io › snyk vulnerability database › pip
PyPDF2 vulnerabilities | Snyk
An important project maintenance signal to consider for PyPDF2 is that it hasn't seen any new versions released to PyPI in the past 12 months, and could be considered as a discontinued project, or that which receives low attention from its maintainers.
UnoGeeks
unogeeks.com › home › blog › pypdf2
PyPDF2 - Python Docs
January 11, 2024 - PyPDF2 is a Python library used for working with PDF files. It can be used to extract information from PDFs, as well as to manipulate them by merging, splitting, and transforming pages.
Kanaries
docs.kanaries.net › topics › Python › pypdf2-python-pdf
PyPDF2: The Ultimate Python Library for PDF Manipulation – Kanaries
August 17, 2023 - PyPDF2 is a free and open-source library for working with PDFs in Python. Split, merge, crop, transform, encrypt and decrypt PDFs easily. Supports PDF 1.4 to 1.7 with no dependencies other than the Python standard library.
Nanonets
nanonets.com › blog › pypdf2-library-working-with-pdf-files-in-python
PYPDF2 Library: How Can You Work With PDF Files in Python?
July 11, 2025 - The best library for working with PDFs in Python is PyPDF2. It’s lightweight, fast, and well-documented.
Fedora
packages.fedoraproject.org › pkgs › python-PyPDF2 › python3-PyPDF2
python3-PyPDF2 - Fedora Packages
View python3-PyPDF2 in the Fedora package repositories. python3-PyPDF2: Python PDF toolkit and library
Medium
medium.com › @klogic › pypdf2-manipulate-pdf-with-python-529ed8d8e70
PyPDF2 — Manipulate PDF with Python | by Narongsak Keawmanee | Medium
December 21, 2021 - When working on a small project, I found myself struggling with how to transform data in a PDF: how to manipulate it, rotate it by 90 degrees, split it in half, and combine …
Oreate AI
oreateai.com › blog › resolving-the-modulenotfounderror-no-module-named-pypdf2-in-python › fa4b80fd3668776a2d6a666ba04eec95
Resolving the 'ModuleNotFoundError: No Module Named PyPDF2' in Python - Oreate AI Blog
January 21, 2026 - The first step in resolving this problem is understanding what PyPDF2 actually does. It’s a popular library used for manipulating PDF files—whether it’s reading content, merging documents, or extracting information from them.
PyPDF
pypdf.readthedocs.io › en › stable › user › installation.html
Installation — pypdf 6.9.1 documentation
There are several ways to install pypdf. The most common option is to use pip · pypdf requires Python 3.9+ to run
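The Python 3.9+ requirement can be checked before installing. The following is a small sketch using only the standard library; meets_minimum and MINIMUM_PYTHON are hypothetical names, and the version floor is the one quoted in the snippet above.

```python
import sys

MINIMUM_PYTHON = (3, 9)  # floor quoted in the pypdf installation snippet


def meets_minimum(version_info, minimum=MINIMUM_PYTHON):
    """Return True if a (major, minor, ...) version tuple is at least `minimum`."""
    # Only major and minor are compared; patch level is ignored.
    return tuple(version_info[:2]) >= tuple(minimum)


if not meets_minimum(sys.version_info):
    print("This interpreter is older than Python 3.9; pypdf may not install.")
```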
Konfuzio
konfuzio.com › start › blog › python › pypdf2 - python tutorial for pdf manipulation
PyPDF2 - Python Tutorial for PDF Manipulation
May 24, 2025 - Merge, split, encrypt, decrypt and more PDF files with PyPDF2. This Python tutorial shows you how.
DEV Community
dev.to › kojo_ben1 › how-to-merge-pdf-files-using-the-pypdf2-module-in-python-3lhl
How to merge PDF files using the PyPDF2 module in python - DEV Community
January 17, 2021 - PyPDF2 is a python library used to work with PDF files. You can use it to extract document information, split document page by page, merge multiple pages, encrypt and decrypt, etc.
GitHub
github.com › sylvainpelissier › PyPDF2
GitHub - sylvainpelissier/PyPDF2: A utility to read and write PDFs with Python · GitHub
PyPDF2 is a free and open-source pure-python PDF library capable of splitting, merging, cropping, and transforming the pages of PDF files. It can also add custom data, viewing options, and passwords to PDF files.
Author: sylvainpelissier