Intermediate Python: PDF Manipulating, Extracting, and Combining

Read in a PDF

PdfReader is a class in the PyPDF2 library for Python that reads the contents of a PDF file. It lets developers extract information from the file, such as text, images, and metadata.

PdfReader is useful for tasks such as parsing PDF documents, searching a file for specific keywords or phrases, and generating reports or summaries from a document's contents. Because all of this happens in code, these tasks can be automated in a script instead of being done by hand.

Overall, PdfReader is a core component of PyPDF2 and the starting point for most tasks that involve reading PDF files in Python.


  1. Import PyPDF2;
  2. Open a PDF file in binary read mode ('rb') as pdfFileObj;
  3. Read pdfFileObj by passing it to PyPDF2.PdfReader;
  4. Print out the number of pages. You can access the pages of a file using the .pages attribute, so its length is the page count.
