lhfrank wrote:
It's a 288 page document, 15.7 MB. It's a very yellowed copy of a book of music. I thought the filters would work on documents too!
Quartz filters do work on PDF documents. However, they are designed for images. I don't have any documents with yellowed pages to test with, so I made one with 2 pages, and it worked fine. But it wouldn't surprise me if it failed on a 288-page document. For very technical reasons, it just isn't designed for that.
But it's really a moot point. Even if it had worked, the output would be unusable. Black and white means exactly what it says: black and white, nothing in between. You'll be left with 288 white pages, each filled with a random pattern of connected dots. No text.
If you want to extract the text, I recommend Preview's built-in text extraction tool. It has no problem with yellowed pages. Unfortunately, it does have a problem with multi-page documents: it only seems to work on one page at a time. And it doesn't seem to work on PDF documents at all (or at least, not on all PDF documents).
Remember, this is Apple "Preview", not Apple Photoshop. Apple understands that people want more; they just bought Pixelmator. But even Pixelmator is designed for images, not documents. Apple simply doesn't support this kind of data extraction or conversion at scale. There is a reason why people sue Google and the Internet Archive for this, and not Apple.
I'm afraid you're going to have to search the internet "grey market" of command-line PDF tools with OCR capabilities.
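To give an idea of what such a tool looks like in practice, here is a rough sketch that assumes OCRmyPDF, an open-source command-line tool. That choice is mine for illustration only; I haven't run it on your document, and the file names are made up. It just wraps the command in a small Python script, so it does nothing if the tool isn't installed.

```python
import shutil
import subprocess
from pathlib import Path


def build_ocr_command(src: Path, dst: Path, text_out: Path) -> list[str]:
    """Build an OCRmyPDF command line (the tool choice is an assumption)."""
    return [
        "ocrmypdf",
        "--deskew",                  # straighten crooked scans before OCR
        "--sidecar", str(text_out),  # also write the recognized text to a plain-text file
        str(src),                    # scanned input PDF
        str(dst),                    # output PDF with a searchable text layer
    ]


def run_ocr(src: Path, dst: Path, text_out: Path) -> bool:
    """Run OCR if ocrmypdf is installed; return True only on success."""
    if shutil.which("ocrmypdf") is None:
        return False  # tool not installed, nothing was done
    result = subprocess.run(build_ocr_command(src, dst, text_out))
    return result.returncode == 0


if __name__ == "__main__":
    # Hypothetical file names, not taken from the original post.
    ok = run_ocr(
        Path("music-book.pdf"),
        Path("music-book-ocr.pdf"),
        Path("music-book.txt"),
    )
    print("OCR finished" if ok else "OCR did not run or failed")
```

The point of a tool like this is that it works through the whole file in one run rather than one page at a time. On 288 pages, expect it to take a while.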