The picture element is clearly the work of a lot of very smart people working hard to solve a complex problem. If you spend a little time thinking about it, and you’re familiar with the pitfalls that surround this issue, you can see the reasons for many of the decisions they made. But it’s still not what the web needs.

Before I get into all the reasons why picture is bad, let me say this: I don’t have a different solution. Responsive images are a hard problem, and I don’t know what the right solution is. I’m not even saying the picture element is the wrong solution for today, but I hope it doesn’t have to be the solution we all live with for the next decade.

1. The markup is really confusing… and *unsemantic* (oh no he didn’t)

Before you jump straight to the comments to tell me what an idiot I am, let me say that I’ve read the spec. I “get” how it works. I get the reasons it “had” to be the way that it is. That doesn’t mean it’s ok. Look at this:

<picture>
<source media="(min-width: 1200px)" sizes="calc((100vw / 3) - 100px)" srcset="pic1200.jpg 1200w, pic1600.jpg 1600w, pic2000.jpg 2000w">
<source media="(min-width: 1024px)" srcset="pic800.jpg 1x, pic1200.jpg 1.5x, pic1600.jpg 2x">
<source media="(min-width: 320px)" sizes="100vw" srcset="pic320.jpg 320w, pic640.jpg 640w, pic800.jpg 800w">
<img src="pic320.jpg" alt="Some alt text">
</picture>

That’s a real use case. We have media-queried sources, because at different device widths we use the image differently in the layout. We have a srcset for each one, because we don’t know exactly how large the viewing screen is and we don’t know the device pixel density (one of the biggest challenges in responsive images). This covers a broad range of devices across 3 layout breakpoints. And it’s horrifying.

The real problem with the markup, however, is that we have a layout-relevant element inside what is essentially a meta-data wrapper. The thing you have to understand about the picture element is that <picture> is not the thing you style in your CSS. The <img> is. The picture element is just a wrapper for the sources. It’s basically invisible in the DOM. That’s insane.
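To make that concrete, here is a minimal sketch. The `hero` class and the file names are hypothetical, picked only for illustration, and the comment about default display behavior reflects my understanding of how browsers treat the wrapper rather than anything quoted from the spec.

```html
<style>
  /* As far as I know, browsers treat <picture> as a plain inline box
     by default, so a width set here does not size the image. */
  picture { width: 300px; }

  /* Rules that change how the image renders have to target the <img>. */
  img.hero { width: 300px; height: auto; border-radius: 4px; }
</style>

<picture>
  <source media="(min-width: 1024px)" srcset="pic1200.jpg">
  <img class="hero" src="pic320.jpg" alt="Some alt text">
</picture>
```

Whichever source is chosen, the styles that matter are the ones on the nested `<img>`.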

With the video and audio elements, the metadata goes inside the element that has layout consequences, which makes sense and compartmentalizes the information nicely. Why are we designing the HTML spec around fallback support? The markup is bad, it’s confusing, and it doesn’t make sense structurally.
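A side-by-side sketch of the two structures shows the difference. The file names and the `width` value are made up for the example.

```html
<!-- video: the element you lay out and style is also the element
     that owns the <source> children. -->
<video controls width="640" poster="poster.jpg">
  <source src="clip.webm" type="video/webm">
  <source src="clip.mp4" type="video/mp4">
  Your browser does not support the video element.
</video>

<!-- picture: the wrapper owns the <source> children, but the nested
     <img> is what actually gets laid out and styled. -->
<picture>
  <source media="(min-width: 1024px)" srcset="pic1200.jpg">
  <img src="pic320.jpg" alt="Some alt text" width="640">
</picture>
```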

2. Presentation details in the markup

HTML is for creating structure and meaning in the content of pages. CSS is for controlling the presentation of that content across the many devices on which it’s displayed. So why are we creating, supporting, and polyfilling an element that makes it impossible to divorce the HTML content from its CSS presentation?

I know why the sizes and media queries have to be in the HTML. I know that the whole point of the picture spec is to increase speed and decrease latency, and that waiting for the CSS to parse would defeat the purpose. The point of this spec is to give parsers the info they need to download only the smallest image that works, and maybe to load different images on the fly when the layout changes. But there has to be a way to do that without violating one of the core tenets of CSS: separation of concerns.
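Here is a small sketch of the duplication I mean. The `.teaser` class and the CSS rules are hypothetical, written to match the 1024px breakpoint and 980px width from the example earlier in this post.

```html
<style>
  /* The layout decision, stated once in CSS... */
  .teaser img { width: 100vw; }
  @media (min-width: 1024px) {
    .teaser img { width: 980px; }
  }
</style>

<!-- ...and stated again in the markup, so the browser can pick a file
     before any CSS has been parsed. -->
<picture class="teaser">
  <source media="(min-width: 1024px)" sizes="980px"
          srcset="pic800.jpg 800w, pic1200.jpg 1200w, pic1600.jpg 1600w">
  <source sizes="100vw"
          srcset="pic320.jpg 320w, pic640.jpg 640w, pic800.jpg 800w">
  <img src="pic320.jpg" alt="Some alt text">
</picture>
```

Change the 980px in the stylesheet and every matching `media` and `sizes` attribute in the content has to change with it.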

We have to be able to do better than this. Imagine WordPress implementing the picture element natively. What a UI and code logic nightmare.

3. This isn’t going to be supported for years… so why are we settling?

This isn’t going to have broad enough support to use in production for a really, really long time (yeah, except for picturefill). And when it does, we’ll be stuck with supporting it forever. Think about that. Is this how you want to add images to your web pages for the next decade? What happens when it’s time for a redesign? I hope you use a CMS that implemented picture smartly, or your life is going to be a manual find-and-replace hell.

To reiterate, I’m not opposed to the values of the picture element. Performance. Network conservation. Returning control to the user. Thinking about responsive web design as more than just linearly scaling up the UX quality with device width. But this implementation is not good enough. We have to dig deeper. We have to try something else.

How do we solve it?

Like I said before, I don’t know what the solution is. There are a lot of people thinking about this who are much smarter than I am. That’s part of the reason I’m so bummed that this solution isn’t what we need it to be. I know we, as web developers, designers, and spec authors, can do better. We have to.