So you want to parse a PDF?
Out of 3,977 real-world PDFs, 0.5% broke during xref pointer parsing. Not a huge number, unless you're the one parsing them. The top culprit? Junk data before the start pointer. Classic. Other file weirdness: broken xref tables, bad object offsets, and inconsistent xref chains...
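
To make the failure mode concrete, here is a minimal Python sketch of a tolerant lookup. It assumes "start pointer" means the `startxref` offset near the end of the file, and that the junk shifts every byte offset so the pointer no longer lands on the `xref` keyword. The function name `find_xref_offset` and the `tail_size` parameter are made up for illustration, and the sketch only handles classic xref tables, not xref streams.

```python
import re

# "startxref", some whitespace, then the decimal byte offset of the xref section.
STARTXREF_RE = re.compile(rb"startxref\s+(\d+)")
# The bare "xref" keyword, excluding the one embedded in "startxref".
XREF_KEYWORD_RE = re.compile(rb"(?<!start)xref")


def find_xref_offset(data: bytes, tail_size: int = 1024):
    """Return (offset, trusted) for the last xref table, or None if not found.

    trusted is True when the declared startxref offset really points at
    the "xref" keyword, and False when we had to fall back to scanning.
    """
    # The pointer lives near the end of the file, so only scan the tail.
    tail_start = max(0, len(data) - tail_size)
    matches = list(STARTXREF_RE.finditer(data, tail_start))
    if not matches:
        return None

    # Incremental updates append new sections, so the last pointer wins.
    declared = int(matches[-1].group(1))

    # Happy path: the declared offset lands exactly on the keyword.
    if data[declared:declared + 4] == b"xref":
        return declared, True

    # Fallback (an assumption, not a spec-mandated recovery): take the
    # last standalone "xref" keyword anywhere in the file.
    fallback = None
    for match in XREF_KEYWORD_RE.finditer(data):
        fallback = match.start()
    if fallback is None:
        return None
    return fallback, False


if __name__ == "__main__":
    clean = (
        b"%PDF-1.4\n"
        b"xref\n0 1\n0000000000 65535 f \n"
        b"trailer\n<<>>\nstartxref\n9\n%%EOF\n"
    )
    print(find_xref_offset(clean))
    print(find_xref_offset(b"JUNK\n" + clean))
```

On the clean sample the declared offset is trusted as-is. With five bytes of junk prepended, the pointer misses and the function falls back to scanning, flagging the result as untrusted so the caller can decide how much of the rest of the file to believe.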