I'm exploring a different way of publishing images online and would appreciate some technical feedback. Most web setups serve the original image file, or a lightly processed version of it, at a direct URL, which makes automated scraping and reuse easy. Rather than layering on deterrents like compression or watermarking, I want to change the entire delivery model.
My idea is to avoid serving image files directly. Instead, I'd publish images as tiles along with a manifest. This way, a client-side viewer dynamically constructs the image by loading only what's needed for the current viewport and zoom level, meaning the original file isn't requested after publication.
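To make that concrete, here's roughly the shape I'm imagining for the published manifest and the tile addressing. Every field name and the URL template here are placeholders, not a settled design:

```typescript
// Hypothetical manifest published alongside the tiles; nothing here
// is a fixed spec, just one possible shape.
interface ImageManifest {
  id: string;
  width: number;           // full-resolution pixel dimensions
  height: number;
  tileSize: number;        // e.g. 256
  maxZoom: number;         // deepest level of the tile pyramid
  tileUrlTemplate: string; // e.g. "/tiles/{id}/{z}/{col}_{row}.webp"
}

// The viewer resolves individual tile URLs from the template; the
// original full image is never addressable.
function tileUrl(m: ImageManifest, z: number, col: number, row: number): string {
  return m.tileUrlTemplate
    .replace("{id}", m.id)
    .replace("{z}", String(z))
    .replace("{col}", String(col))
    .replace("{row}", String(row));
}
```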
This approach aims not to impede user behavior or enforce DRM, but rather to minimize automated scraping and uncontrolled reuse by eliminating direct access to the original asset.
I'm curious about the tradeoffs this architecture might have, especially regarding scalability, CDN behavior, caching and invalidation, and whether there are potential failure modes or risks that might arise in a production environment. I'd love to hear thoughts from anyone experienced with image-heavy systems.
5 Answers
Have you thought about signed URLs on AWS S3 that expire after a few seconds? That gives you direct control over access to the files, though I take your point about making direct access harder by changing the delivery model itself.
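For what it's worth, a short-lived presigned GET with the AWS SDK for JavaScript v3 is only a few lines; the bucket, key, and region below are placeholders, and the tiny expiresIn is the whole point:

```typescript
import { S3Client, GetObjectCommand } from "@aws-sdk/client-s3";
import { getSignedUrl } from "@aws-sdk/s3-request-presigner";

const s3 = new S3Client({ region: "us-east-1" });

// Returns a URL that stops working after 30 seconds, so a scraped
// link goes stale almost immediately.
async function presignImageUrl(bucket: string, key: string): Promise<string> {
  const command = new GetObjectCommand({ Bucket: bucket, Key: key });
  return getSignedUrl(s3, command, { expiresIn: 30 });
}
```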
You might want to look at the IIIF Image API; it's built on similar principles, namely tiled delivery and dynamically requested regions and sizes. Your approach aligns well with those ideas, changing not just how images are served but how much of the original is ever exposed.
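For reference, IIIF tile requests follow the spec's {region}/{size}/{rotation}/{quality}.{format} URL pattern, so a request helper is trivial (the base URL below is a placeholder):

```typescript
// Builds an IIIF Image API request for one region of the image,
// scaled to the requested width (height is derived proportionally).
function iiifTileUrl(
  base: string,                                // e.g. "https://example.org/iiif/my-image"
  x: number, y: number, w: number, h: number,  // region in full-resolution pixels
  scaledWidth: number                          // output width in pixels
): string {
  return `${base}/${x},${y},${w},${h}/${scaledWidth},/0/default.jpg`;
}
```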
I think the first step, before diving into this architecture, is to clarify the problem you're actually solving. It's not just about users saving images; it's about high-value assets being trivially scrapeable because each one sits at a single direct URL. By changing how images are delivered, publishing tiles plus a manifest and reconstructing the view client-side, you can meaningfully reduce uncontrolled reuse at scale. There's no longer a single fetchable asset, which shifts the economics of scraping... but it won't stop a determined user from downloading. It's a clever way to add friction, though!
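As a rough sketch of what that client-side reconstruction looks like (the tile size, coordinate space, and URL layout here are all assumptions on my part):

```typescript
const TILE_SIZE = 256; // assumed tile edge length in pixels

interface Viewport {
  x: number;      // top-left of the visible area, in the zoom level's pixel space
  y: number;
  width: number;
  height: number;
}

// The viewer only ever requests the tiles intersecting the current
// viewport at the current zoom level; no request for the full image
// is ever made.
function visibleTiles(viewport: Viewport, zoom: number): string[] {
  const urls: string[] = [];
  const firstCol = Math.floor(viewport.x / TILE_SIZE);
  const lastCol = Math.floor((viewport.x + viewport.width) / TILE_SIZE);
  const firstRow = Math.floor(viewport.y / TILE_SIZE);
  const lastRow = Math.floor((viewport.y + viewport.height) / TILE_SIZE);
  for (let row = firstRow; row <= lastRow; row++) {
    for (let col = firstCol; col <= lastCol; col++) {
      urls.push(`/tiles/${zoom}/${col}_${row}.webp`);
    }
  }
  return urls;
}
```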
Exactly! That's the key point: it's not about outright prevention but about adding enough friction that automated tools stop being worth running. I'm rooting for the angle you're taking!
Another idea is to generate a fixed set of sizes per image and serve only those through signed URLs, keeping the original offline for regenerating renditions or for archival. It gives you control over how users access the files without ever exposing the originals. Just a thought!
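If you're on Node, sharp makes the rendition pipeline a few lines; the widths, output format, and paths here are just examples:

```typescript
import sharp from "sharp";

const WIDTHS = [320, 640, 1280, 2560]; // example rendition ladder

// Pre-generates fixed-size renditions from the original, which then
// stays offline; only these derivatives sit behind signed URLs.
async function buildRenditions(originalPath: string, outDir: string): Promise<void> {
  for (const width of WIDTHS) {
    await sharp(originalPath)
      .resize({ width, withoutEnlargement: true }) // never upscale
      .webp({ quality: 80 })
      .toFile(`${outDir}/w${width}.webp`);
  }
}
```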
While I appreciate your innovative approach, I have to say that focusing on reducing scraping is akin to a form of DRM. No matter what you implement, if the images are available publicly, they can be scraped. The challenge is that if someone really wants to download the images, they could reverse-engineer your approach. It’s a difficult problem to solve completely!
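To make the point concrete: if the manifest and tiles are publicly fetchable, reassembling the original in the browser takes a dozen lines. The manifest shape below just mirrors the hypothetical layout from the question:

```typescript
// Minimal manifest shape a scraper would infer from network traffic.
interface TileManifest {
  width: number;
  height: number;
  tileSize: number;
  tileUrl: (col: number, row: number) => string;
}

// Fetches every tile and composites them onto a canvas, recovering
// the full image the tiling was meant to keep unaddressable.
async function stitch(m: TileManifest): Promise<Blob> {
  const canvas = document.createElement("canvas");
  canvas.width = m.width;
  canvas.height = m.height;
  const ctx = canvas.getContext("2d")!;
  const cols = Math.ceil(m.width / m.tileSize);
  const rows = Math.ceil(m.height / m.tileSize);
  for (let row = 0; row < rows; row++) {
    for (let col = 0; col < cols; col++) {
      const res = await fetch(m.tileUrl(col, row));
      const bitmap = await createImageBitmap(await res.blob());
      ctx.drawImage(bitmap, col * m.tileSize, row * m.tileSize);
    }
  }
  return new Promise((resolve) => canvas.toBlob((b) => resolve(b!), "image/png"));
}
```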

For sure! IIIF has some great concepts that can influence your tiling and viewport strategy. Just remember, your goal is to decrease exposure, and blending these ideas into your architecture could be a solid route!