From Video - Extract Hardsub

When subtitles are burned into the video, they become pixels. Your computer doesn’t see “words” — it sees a pattern of light and dark pixels. Extracting text requires an OCR engine to recognize characters, which is prone to errors.

We will be using a Python library called videocr. It is a wrapper that combines the power of OpenCV (for image processing) and Tesseract-OCR (the industry-standard open-source OCR engine).
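As a rough sketch of what a videocr call can look like, the snippet below wraps the library's `save_subtitles_to_file` function. The function name and the `lang`, `conf_threshold`, and `sim_threshold` parameters follow the videocr README and may differ between versions, so check the one you have installed. The wrapper name `extract_hardsubs`, the file names, and the threshold values are illustrative assumptions, and Tesseract itself has to be installed separately.

```python
def extract_hardsubs(video_path, srt_path="subtitle.srt", lang="eng"):
    """Run OCR over a video and write the recognized subtitles to an SRT file.

    Hypothetical helper: it only forwards its arguments to videocr.
    """
    # Imported lazily so this module can be loaded even if videocr is missing.
    from videocr import save_subtitles_to_file

    save_subtitles_to_file(
        video_path,
        file_path=srt_path,
        lang=lang,            # Tesseract language code(s), e.g. "eng"
        conf_threshold=65,    # assumed value: drop low-confidence OCR results
        sim_threshold=90,     # assumed value: merge near-identical consecutive lines
    )
```

Calling `extract_hardsubs("video.mp4")` would write the result to `subtitle.srt`. On Windows, the README asks that the call sit under an `if __name__ == "__main__":` guard.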

Subtitles baked into the picture are called hardcoded subtitles (or hardsubs), and unlike soft subtitles, you can't just toggle them off or export them with a single click.

Most tools allow you to draw a crop box around the specific area where subtitles appear, which prevents the OCR from trying to read other on-screen graphics.
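To make the crop box idea concrete, here is a minimal pure-Python sketch. It assumes the subtitles sit in a band at the bottom of the frame and that a frame is a list of pixel rows. The function names and the default fractions are hypothetical, not taken from any particular tool.

```python
def subtitle_crop_box(frame_width, frame_height,
                      band_fraction=0.25, side_margin_fraction=0.05):
    """Return (left, top, right, bottom) pixel bounds of a bottom band."""
    if not 0 < band_fraction <= 1:
        raise ValueError("band_fraction must be in (0, 1]")
    if not 0 <= side_margin_fraction < 0.5:
        raise ValueError("side_margin_fraction must be in [0, 0.5)")
    # The band starts this far down from the top edge of the frame.
    top = int(round(frame_height * (1 - band_fraction)))
    # Trim the same margin from the left and right edges.
    left = int(round(frame_width * side_margin_fraction))
    right = frame_width - left
    return left, top, right, frame_height


def crop_frame(frame, box):
    """Keep only the pixels inside the box; frame is a list of rows."""
    left, top, right, bottom = box
    return [row[left:right] for row in frame[top:bottom]]


box = subtitle_crop_box(1920, 1080)
print(box)  # (96, 810, 1824, 1080)
```

Only the cropped region would then be handed to the OCR engine, so logos, watermarks, and other text elsewhere in the frame never reach it.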