How to handle event handler attributes #226

lukewarlow · 2024-04-23T16:05:11Z

Afaik the current plan for the sanitizer API is that there would be a list of all known attributes and from that you'd subtract an underlying blocklist of unsafe attributes, event handlers are tricky because there's lots of them. If we have a list of event handlers to remove it would be useful if it was defined separately from other attributes so that the trusted types specification could piggy back off of it.

Having said that all current ones follow the on* pattern, should we just carve out on* attributes as being unsafe and block them all? I'm no perf specialist but I imagine that would be faster to process (both WebKit and Chromium already have some paths that do the on prefix check for example).

The text was updated successfully, but these errors were encountered:

annevk · 2024-04-23T16:08:17Z

I think from a re-parsing perspective or letting something through in one browser that's blocked in another this would also be an improvement. Although it would also be a tiny bit of magic again that cannot currently be explained through syntax, but that seems reasonable.

lukewarlow · 2024-04-23T16:16:56Z

Has their been any discussion about allowing stuff like regex in the configs? Not needed for a v1 but if we did allow some form of pattern based matching then it would solve the fact we can't currently define it in syntax.

domenic · 2024-04-24T03:40:22Z

I worry that this restricts the space of future HTML attribute design in a way that's hard to understand. E.g. we could not use attributes named things like once="", only="", online="", ongoing="", onto="".

Maybe it would still be OK as long as we'd consider it a backward-compatible change to modify the sanitization algorithm in the future. E.g. today we'd strip all attributes starting with on, but in the future when we add an only="" attribute, we strip all attributes starting with on except we let only="" through.

annevk · 2024-04-24T06:53:54Z

I think we do have the expectation that what is being sanitized and how will subtly change over time due to the HTML (including SVG/MathML) language evolving. New "safe" attributes getting safelisted seems in line with that.

otherdaniel · 2024-04-24T13:42:26Z

I'm not fond of this, and would prefer us to be precise with the event handler attributes.

With Trusted Types, we initially had a similar logic, but kept getting bug reports about sites using their own on* attributes that would get erroneously blocked by TT. So we eventually re-implemented this with the actual list of actual event handlers, which solves that set of problems. I'd prefer to not repeat that experience.

Examples:

Also, I don't quite understand why this would be needed at all: Our default-config -- where these matter -- are allow-lists, and in an allow-lists the event handlers are very much expected to be absent. That is, no on* attribute -- whether a browser-defined event handler or not -- should show up in any built-in config. So why would we care? Or why would it make a perf difference?

For the config pre-processing, we do need to decide whether an attribute is known or not -- otherwise one could never override the built-in default. That indeed requires a list of known attributes, including the known event handlers. But that's required only during config normalization; and that requires every single HTML attribute ever. I doubt the event handlers make that much of a difference there.

annevk · 2024-04-24T14:30:38Z

Fair enough. List it is then.

annevk · 2024-11-29T09:22:13Z

I do have some lingering worry here that each time a browser introduces a new event handler attribute, the sanitization won't be the same across browsers and this could result in some issues. How to square that with sites using their own on* attributes is unclear though.

mozfreddyb · 2024-11-29T09:55:35Z

If we solved this by disallowing and listing known event handlers in the "no XSS possible" mode and disallowing all handlers starting with "on" in the default, we'd basically allow browsers to bypass the default because a user might create a Sanitizer that allows a new onfoo attribute that the browser supports but Sanitizer is just not aware of yet.

What if we disallow all on* and hardcode exceptions for those known to be widely used? Seems also a bit icky imho.

What if we just disallow all on and tell people to use unsafe if they are unhappy with it? Luxury of designing a new API is that we can decide not just to support harmless on* attributes... (admittedly a bit dismissive of people's use cases)

annevk · 2024-11-29T10:07:09Z

@cure53 how does DOMPurify deal with event handler attributes?

cure53 · 2024-11-29T10:13:07Z

@cure53 how does DOMPurify deal with event handler attributes?

@annevk We work with an allow-list on which event handlers are simply not present. If folks decide they want to permit them, they can do so using the config, i.e. via ADD_ATTR.

Other than that, they receive no extra treatment, and get handled as any other attribute 🙂

lukewarlow · 2024-11-29T10:30:22Z

What if we just disallow all "on*" and tell people to use unsafe if they are unhappy with it?

My personal thoughts are we should just do this. Google decided against it for trusted types (which now makes it difficult to change retrospecitvely) but I think we should just take the slight annoyance from certain websites and do it for sanitizer like should have been done for trusted types.

As you say worst case they can use unsafe if they really need their badly named attribute to work.

I think we (whatwg) should also agree to leave onXYZ to only event handlers and if someone wants some other attribute to start with 'on' we simply get them to rename it.

otherdaniel · 2024-12-09T17:17:29Z

As Luke said, Trusted Types initially blocked any "on*", but we got too many complaints about that and then changed to an actual list, which is derived from the IDLs.

Examples problems are:

I think Sanitizer has an easier time with this, because there's always the "Unsafe" variants. But that sure isn't pretty.

As it presently stands the defaults would be an allow list, where this isn't a problem because no event handler is going to be in the default allow list anyhow. But removeUnsafe, plus the implied removeUnsafe in safe versions, would make this observable.

mozfreddyb · 2024-12-09T20:00:55Z

A couple of the linked issues are not public. I hope we can get away with this by a) codifying that on is for event handlers and assumed unsafe and b) supporting this anyway with the unsafe methods :-)

I understand that we would be dismissive of existing code and use cases for custom attributes... 😕

annevk · 2024-12-10T07:56:53Z

I'm a bit surprised that it seems the Trusted Types decision was made unilaterally by Chromium. Was there no discussion about this in a specification setting?

mozfreddyb · 2024-12-10T09:06:20Z

I mean, the alternative is to somewhere have a canonical list of event handler attributes but that seems tricky in two ways. First, it appears that there is no such list and the HTML spec might not be the best source to find all of them. Secondly, most implementations don't really know or can figure out at build time either.

cure53 · 2024-12-10T09:19:45Z

I mean, the alternative is to somewhere have a canonical list of event handler attributes but that seems tricky in two ways. First, it appears that there is no such list and the HTML spec might not be the best source to find all of them. Secondly, most implementations don't really know or can figure out at build time either.

Historically, as far as I know, there was only one case where a non-event handler started with "on" and if I remember correctly, that was for the VML implementation in MSIE.

Other than that, I am not aware of any attributes starting with "on" that do not trigger JavaScript execution.

mozfreddyb · 2024-12-10T09:24:23Z

Well, see links above. Lots of frameworks do funky stuff with on= or one=, that do not lead to script execution (but may cause script gadget funkiness)

lukewarlow · 2024-12-10T09:47:29Z

The current behaviour also makes it pretty hard to test trusted types coverage of event handlers (and would similarly be difficult for sanitizer). You can read from spec web idl but not all browsers implement all event handlers so that can show up failures that aren't failures. It also doesn't cover any non-standard event handlers, nor does it necessarily catch all existing ones if they're not on the IDL constructs you expect.

You can maybe loop through the element classes on the global and from their loop through their properties that start with 'on' and test against each of those, by converting the class name to a tag name creating the element and trying to set the attribute. Though just because the property exists on the prototype doesn't mean it actually executes script as a content attribute.

otherdaniel · 2024-12-10T16:30:27Z

I'm a bit surprised that it seems the Trusted Types decision was made unilaterally by Chromium. Was there no discussion about this in a specification setting?

The change was in May 2022, when no browser other than Chromium was implementing or interested in implementing Trusted Types. I don't think anyone else paid attention to the TT spec at that time.

I'd also like to clarify that this was an implementation change; not a spec change. The spec specified from the very beginning that TT would check "event handler content attributes". Which isn't correct spec language, but it surely doesn't mean "on*". Blocking "on*" was never allowed, per spec. (Old spec versions are preserved in https://github.com/w3c/trusted-types. I double checked for revisions going back to 2020.)

In the Chromium implementation I had taken a shortcut to just block "on*" because I didn't want to do the work to obtain a list of event handler names. Then, as TT was being picked up by users, we got a number of complaints about this. It became clear that my shortcut is untenable, and I got told to pretty please fix it. Which I did, and which is the change we're talking about here.

otherdaniel · 2024-12-10T17:11:13Z

The current behaviour also makes it pretty hard to test trusted types coverage of event handlers (and would similarly be difficult for sanitizer). You can read from spec web idl but not all browsers implement all event handlers so that can show up failures that aren't failures. It also doesn't cover any non-standard event handlers, nor does it necessarily catch all existing ones if they're not on the IDL constructs you expect.

I don't understand why this is hard to test. My usual test pattern is to pick a few block and a few non-block cases and test those. If we want to test exhaustively -- whatever that would even mean for the "on*" case -- I'd take a list of candidates and check whether they're event handlers to determine what the expected outcome under TT would be.

(Something like: div.setAttribute(probe, "return"); const expect_to_throw_with_tt = (typeof div[probe] == "function"); The underlying behaviour -- converting a string to a function, or not -- is spec mandated, I believe.)

I think for IDL tests, WPT runs tests against spec-derived baseline, and whoever does something differently will have to maintain a list of expectations themselves. That puts the maintenance burden on whoever does something different. That might also work.

I wish we could somehow re-use or hook into the webref / IDL tests. That seems like a fairly comprehensive and well-tested mechanism for testing APIs against a spec baseline. I'm admittedly not seeing an obvious way to do that.

I do understand the interop concern; but don't really understand the testing one.

noamr · 2024-12-10T21:50:21Z

Can we rely on the globally exposed WebIDL interfaces somehow here? As in:

By default, in unsafe mode, reject all the on* properties that are listed in WebIDLs exposed on this global object
Allow some WebIDL keyword to opt-out of this behavior, e.g. [Safe] or [TrustMeThisIsNotAnEvent] or whatever if we ever want to introduce attributes like once
This wouldn't reject author-defined on*, because those don't come from WebIDL

annevk · 2024-12-11T08:58:58Z

@otherdaniel FWIW, I have been paying attention since 2019 at least, from a cursory look at old issues in the relevant repository, but I did miss that the specification was already requiring that.

I've been trying to think of how to best decide on this issue. I think it comes down to how likely it is that sanitized content in one browser ends up being reflected in another. While that concern also impacts other features, event handler attributes are much more commonly introduced. Overall though that seems like a fairly minor concern and any website that does it that way will also run into other issues I suspect. So I think I'm leaning towards using the same backing list as Trusted Types.

mozfreddyb · 2024-12-11T09:53:15Z

We discussed this in our sanitizer meeting and found both approaches doable. However, we agreed on the following:

in the spec, use the lists of specified event handler attributes from HTML
in implementations, make sure that there is an exhaustive mechanism to get all the supported attributes on top of that
provide no defense-in-depth for authors that sanitize in one browser and then use the result in another browser, because we can only ever match the parsing/sanitization rules of the current browser.

lukewarlow mentioned this issue Apr 23, 2024

Upcoming WHATNOT meeting on 4/25/2024 whatwg/html#10287

Closed

lukewarlow mentioned this issue Apr 23, 2024

Rewrite metadata functions w3c/trusted-types#457

Merged

bkardell mentioned this issue Dec 5, 2024

MathML support in the HTML Sanitizer API w3c/mathml-core#227

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

How to handle event handler attributes #226

How to handle event handler attributes #226

lukewarlow commented Apr 23, 2024

annevk commented Apr 23, 2024

lukewarlow commented Apr 23, 2024

domenic commented Apr 24, 2024

annevk commented Apr 24, 2024

otherdaniel commented Apr 24, 2024

annevk commented Apr 24, 2024

annevk commented Nov 29, 2024

mozfreddyb commented Nov 29, 2024

annevk commented Nov 29, 2024

cure53 commented Nov 29, 2024

lukewarlow commented Nov 29, 2024 •

edited by mozfreddyb

Loading

otherdaniel commented Dec 9, 2024

mozfreddyb commented Dec 9, 2024

annevk commented Dec 10, 2024

mozfreddyb commented Dec 10, 2024

cure53 commented Dec 10, 2024

mozfreddyb commented Dec 10, 2024

lukewarlow commented Dec 10, 2024

otherdaniel commented Dec 10, 2024

otherdaniel commented Dec 10, 2024

noamr commented Dec 10, 2024 •

edited

Loading

annevk commented Dec 11, 2024

mozfreddyb commented Dec 11, 2024

How to handle event handler attributes #226

How to handle event handler attributes #226

Comments

lukewarlow commented Apr 23, 2024

annevk commented Apr 23, 2024

lukewarlow commented Apr 23, 2024

domenic commented Apr 24, 2024

annevk commented Apr 24, 2024

otherdaniel commented Apr 24, 2024

annevk commented Apr 24, 2024

annevk commented Nov 29, 2024

mozfreddyb commented Nov 29, 2024

annevk commented Nov 29, 2024

cure53 commented Nov 29, 2024

lukewarlow commented Nov 29, 2024 • edited by mozfreddyb Loading

otherdaniel commented Dec 9, 2024

mozfreddyb commented Dec 9, 2024

annevk commented Dec 10, 2024

mozfreddyb commented Dec 10, 2024

cure53 commented Dec 10, 2024

mozfreddyb commented Dec 10, 2024

lukewarlow commented Dec 10, 2024

otherdaniel commented Dec 10, 2024

otherdaniel commented Dec 10, 2024

noamr commented Dec 10, 2024 • edited Loading

annevk commented Dec 11, 2024

mozfreddyb commented Dec 11, 2024

lukewarlow commented Nov 29, 2024 •

edited by mozfreddyb

Loading

noamr commented Dec 10, 2024 •

edited

Loading