I was trying to download a free book on pandas from a Chinese university website, and I noticed that the PDF file size was 39MB for about 390 pages. This made me a bit nervous about possible malicious content hidden in the file, so I cancelled the download. Is it common for a plain text PDF of that length to be 39MB, and can executable code be hidden in PDFs?
3 Answers
If you're wary about the source, don't stress too much about file size as an indicator of safety. A file can be small and still harmful. Trust your gut about where you're downloading from!
Yeah, a 390-page PDF can definitely hit 39MB, especially if it includes embedded fonts for Chinese characters. PDFs can pack in quite a bit of info, and sizes can vary a lot depending on what's included.
True, the size doesn't tell you about hidden threats like malicious code; those can be tiny. A rough estimate is around 100KB per page for high-quality scans, which is almost exactly what 39MB over 390 pages works out to, so this could well be a scanned book. If the file were text with no images, it shouldn't be that big, but it's possible with added security features.
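To put numbers on the per-page estimate, here is a quick sketch of the arithmetic. The helper name `kb_per_page` is made up for illustration, and it assumes 1 MB = 1000 KB, which is close enough for a sanity check.

```python
def kb_per_page(size_mb: float, pages: int) -> float:
    """Average size per page in KB, treating 1 MB as 1000 KB."""
    return size_mb * 1000 / pages


# Figures from the question: 39 MB for about 390 pages.
avg = kb_per_page(39, 390)
print(f"{avg:.0f} KB per page")  # prints "100 KB per page"
```

That lands right on the ~100KB-per-page figure for scans, so the size by itself looks ordinary for a scanned book and says nothing either way about safety.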
Right, and sometimes the OCR (Optical Character Recognition) isn't perfect, which can also affect the file size.

Exactly! And if it's mostly images, it could balloon in size fast. Always good to double-check what you're downloading.
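If you want a rough way to double-check a file you already have, one option is to look for the PDF name tokens tied to active content. This is only a sketch: the marker list is my own pick and is not exhaustive, `count_markers` and `scan_file` are hypothetical helper names, and it reads raw bytes only, so anything inside compressed streams or written with hex-escaped names will be missed. A count of zero does not prove the file is safe.

```python
import re

# PDF name tokens associated with scripts, automatic actions, and attachments.
# Assumed list for illustration; not exhaustive.
MARKERS = [b"/JavaScript", b"/JS", b"/OpenAction", b"/AA", b"/Launch", b"/EmbeddedFile"]


def count_markers(data: bytes) -> dict[str, int]:
    """Count occurrences of each marker in the raw bytes of a PDF."""
    counts = {}
    for marker in MARKERS:
        # Match the name only when it is not followed by another letter or
        # digit, so "/AA" does not match inside a longer name.
        pattern = re.escape(marker) + rb"(?![A-Za-z0-9])"
        counts[marker.decode()] = len(re.findall(pattern, data))
    return counts


def scan_file(path: str) -> dict[str, int]:
    """Read a file from disk without opening it in a viewer, then count markers."""
    with open(path, "rb") as f:
        return count_markers(f.read())
```

Calling `scan_file("book.pdf")` (the path is a placeholder) returns a dict of counts. Non-zero counts for `/JavaScript` or `/Launch` are a reason to look closer, not proof of anything malicious.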