How to count the number of words in a PDF file from the Linux CLI

Using pdftotext:

  1. Installation:

    • If it’s not installed, install the poppler-utils package, which includes pdftotext. On Debian or Ubuntu:
    sudo apt install poppler-utils

    or, on RHEL, CentOS, or Fedora:

    sudo yum install poppler-utils

  2. Usage:

    • Once installed, you can convert a PDF to text and then count the words as follows:
    pdftotext input.pdf - | wc -w

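    Here, input.pdf is your source PDF file. The - in pdftotext tells it to write the extracted text to stdout instead of a file, and that text is piped into wc. wc -w then counts whitespace-separated words, so the result is an approximation: it includes any page numbers, headers, and footers that pdftotext extracts along with the body text.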
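
If you run this often, or on several files, the pipeline above can be wrapped in a small shell function. This is only a sketch: the name pdf_word_count is made up for illustration, and it assumes pdftotext is on your PATH.

```shell
# pdf_word_count: print the approximate word count of one PDF file.
# Hypothetical helper name; it relies on pdftotext from poppler-utils.
pdf_word_count() {
    if [ ! -f "$1" ]; then
        echo "pdf_word_count: no such file: $1" >&2
        return 1
    fi
    if ! command -v pdftotext >/dev/null 2>&1; then
        echo "pdf_word_count: pdftotext not found (install poppler-utils)" >&2
        return 1
    fi
    # "-" sends the extracted text to stdout; wc -w counts whitespace-separated words.
    # tr removes the padding that some wc implementations print around the number.
    pdftotext "$1" - | wc -w | tr -d ' '
}

# Example usage (uncomment to print a count for every PDF in the current directory):
# for f in *.pdf; do
#     printf '%s\t%s\n' "$f" "$(pdf_word_count "$f")"
# done
```

Paste the function into your shell or your ~/.bashrc, then call it as pdf_word_count input.pdf.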