I'm looking for insights on an alternative way to publish images that could help curb automated scraping and unauthorized reuse. Even with CDNs and image compression in place, we typically still serve the original image files (or lightly altered versions) at direct URLs. This setup makes it really easy for scrapers to fetch high-value assets.
Instead of focusing on typical protection strategies, I've been considering a different delivery model: stop serving image files outright. Images would be published server-side as tiles, accompanied by a manifest. On the client side, a viewer would then assemble the image on the fly, loading only the tiles needed for the current viewport and zoom level. This way, the original image is never delivered to the client as a single file after publishing.
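To make the viewer side concrete, here is a minimal sketch of how a client could decide which tiles to request. The `Manifest` fields, the power-of-two level scheme, and the `visible_tiles` name are all assumptions for illustration, not an existing format or library.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Manifest:
    """Hypothetical manifest describing one published image."""
    width: int      # full-resolution width in pixels
    height: int     # full-resolution height in pixels
    tile_size: int  # edge length of a square tile in pixels
    levels: int     # level 0 is full resolution; each next level halves it


def visible_tiles(m, level, x, y, w, h):
    """Return (col, row) indices of the tiles that intersect a viewport.

    The viewport (x, y, w, h) is given in pixel coordinates of `level`.
    """
    if not 0 <= level < m.levels:
        raise ValueError("level out of range")
    scale = 2 ** level
    level_w = math.ceil(m.width / scale)
    level_h = math.ceil(m.height / scale)
    # Clamp the viewport to the image bounds at this level.
    x0, y0 = max(0, x), max(0, y)
    x1, y1 = min(level_w, x + w), min(level_h, y + h)
    if x0 >= x1 or y0 >= y1:
        return []  # viewport lies entirely outside the image
    c0, c1 = x0 // m.tile_size, (x1 - 1) // m.tile_size
    r0, r1 = y0 // m.tile_size, (y1 - 1) // m.tile_size
    return [(c, r) for r in range(r0, r1 + 1) for c in range(c0, c1 + 1)]


manifest = Manifest(width=4096, height=3072, tile_size=256, levels=4)
tiles = visible_tiles(manifest, level=0, x=300, y=200, w=500, h=300)
```

With these numbers the viewer would request 6 tiles out of the 192 that make up the full-resolution level, which is the property I'm hoping to exploit.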
It's important to clarify that this isn't about enforcing DRM or blocking user behavior; rather, it's focused on minimizing uncontrolled reuse by eliminating direct access to the original files. I'm curious to hear thoughts on the tradeoffs of this approach concerning scalability, CDN behavior, caching strategies, and potential operational complexities. Are there specific risks or failure modes that could arise if this were implemented in a production environment? I welcome perspectives from anyone experienced with image-rich systems.
5 Answers
Have you thought about using something like signed URLs? If you set them to expire after a few seconds, that would limit how long any given link to the original files stays usable. Just a thought!
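A rough sketch of what expiring signed URLs can look like, using an HMAC over the path and expiry time. The secret, the `expires` and `sig` parameter names, and the example path are assumptions; CDNs that offer signed URLs each define their own format.

```python
import hashlib
import hmac
import time
from urllib.parse import parse_qs, urlencode, urlparse

SECRET = b"replace-with-a-real-secret"  # hypothetical key, kept server-side


def _signature(path, expires):
    msg = f"{path}?expires={expires}".encode()
    return hmac.new(SECRET, msg, hashlib.sha256).hexdigest()


def sign_url(path, ttl_seconds, now=None):
    """Return `path` with an expiry timestamp and signature appended."""
    issued = time.time() if now is None else now
    expires = int(issued) + ttl_seconds
    query = urlencode({"expires": expires, "sig": _signature(path, expires)})
    return f"{path}?{query}"


def verify_url(url, now=None):
    """Accept the URL only if it is unexpired and the signature matches."""
    parsed = urlparse(url)
    params = parse_qs(parsed.query)
    try:
        expires = int(params["expires"][0])
        sig = params["sig"][0]
    except (KeyError, ValueError):
        return False
    current = time.time() if now is None else now
    if current > expires:
        return False
    expected = _signature(parsed.path, expires)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(sig.encode(), expected.encode())


url = sign_url("/images/original/1234.jpg", ttl_seconds=5)
```

The same scheme would work for tile URLs if you go the tiling route. Note that it only limits link reuse: a scraper that fetches within the validity window still gets the file.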
It seems like you've really nailed down the core issue, which is that most systems expose images via direct file URLs. This leads to easy scraping and reuse. Your method of delivering images in tiles is fascinating because it breaks that one-file-per-image access model. Instead of getting a full asset from a single request, scrapers would have to fetch and stitch the tiles together, which makes the job more complex for them. Your approach might really help in reducing uncontrolled reuse, although determined users could still find a way around it. Switching up the delivery strategy like this could meaningfully raise the cost of bulk scraping!
Honestly, your idea sounds pretty cool, but I have to say it does feel a bit like DRM. While you might limit straightforward downloading, anyone determined enough could still reconstruct the full image from the tiles. It’s all about making the process trickier for those who want to scoop them up.
You might find inspiration in the IIIF Image API; it uses similar tiling ideas, with clients requesting regions of an image by URL. Your focus on changing the delivery architecture could effectively reduce direct exposure of the assets, which is smart. Just be prepared for some pushback during implementation!
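For reference, an IIIF Image API request follows the pattern `{base}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}`. Here is a small sketch that builds a region request; the base URL and identifier are made up, and the helper name is just for illustration.

```python
from urllib.parse import quote


def iiif_region_url(base, identifier, x, y, w, h, out_width):
    """Build an IIIF Image API URL for one rectangular region.

    region is "x,y,w,h" in full-image pixels; size "out_width," scales the
    region to that width and keeps the aspect ratio.
    """
    region = f"{x},{y},{w},{h}"
    size = f"{out_width},"
    # Identifiers must be percent-encoded, including any slashes.
    encoded_id = quote(identifier, safe="")
    return f"{base.rstrip('/')}/{encoded_id}/{region}/{size}/0/default.jpg"


# Hypothetical server and identifier, for illustration only.
url = iiif_region_url("https://example.org/iiif", "maps/sheet 1",
                      x=1024, y=0, w=512, h=512, out_width=256)
```

A tiling viewer issues many requests of this shape, one per visible tile, which is close to the delivery model you describe.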
I think your approach of using tiles rather than serving full files is very forward-thinking. However, you might also want to consider how browsers handle new media types. If the format gets too obscure, you might have a challenge on your hands getting browsers to render it seamlessly, especially if you're relying on custom JavaScript. But overall, it sounds like a worthwhile experiment!

Totally agree! While it won't make scraping impossible, it will certainly add layers of complexity that could deter a lot of would-be scrapers.