The input images would be part of the driver, not provided by the controller or its ROM.
And Microsoft are the only ones that can arbitrate it, as they're the ones who both certify drivers and create the APIs that games use in Windows.
Microsoft did create an API for all sorts of controllers as part of DirectX (DirectInput). They just haven't updated it since the early 00s. When the Xbox 360 launched, they created a different API (XInput) for that controller and, presumably, just for that controller. And there's not been anything since. There is no universal API to handle input in Windows. There doesn't seem to be a common functionality target for manufacturers to hit either (compared to how graphics drivers can target levels of the DirectX API, for example).
Microsoft can absolutely set out a series of standards that drivers must meet to pass certification - declare inputs, allow Windows to handle mapping, and provide a visual representation of each input. Microsoft originally recognised controller input as a gaming issue and put it under the purview of DirectX, but then they forgot about it. Nowadays you can buy a graphics card that advertises itself as being DirectX 12 compatible. Is your controller DirectX 12 compatible? No, because there's no such thing, because Microsoft didn't bother.
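To make those three requirements concrete, here's a rough sketch of what a driver might have to hand over. To be clear, none of this is a real Windows or DirectX API: `DriverManifest`, `DeclaredInput`, `build_mapping` and the glyph paths are all names I've made up for illustration, and Python is only standing in for whatever format a real certification spec would use.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List


class InputKind(Enum):
    BUTTON = "button"
    AXIS = "axis"


@dataclass(frozen=True)
class DeclaredInput:
    """One physical input the (hypothetical) driver declares to the OS."""
    input_id: str    # stable identifier chosen by the driver
    kind: InputKind  # button or analogue axis
    glyph: str       # image shipped with the driver, not the controller ROM


@dataclass
class DriverManifest:
    """Hypothetical manifest a driver would submit for certification."""
    device_name: str
    inputs: List[DeclaredInput]


def build_mapping(manifest: DriverManifest,
                  bindings: Dict[str, str]) -> Dict[str, DeclaredInput]:
    """OS-side mapping from game actions to declared inputs.

    Raises KeyError if a binding refers to an input the driver never declared.
    """
    declared = {i.input_id: i for i in manifest.inputs}
    mapping = {}
    for action, input_id in bindings.items():
        if input_id not in declared:
            raise KeyError(f"{input_id!r} is not declared by {manifest.device_name}")
        mapping[action] = declared[input_id]
    return mapping


# Made-up example device with one button and one axis.
example_pad = DriverManifest(
    device_name="Example Pad",
    inputs=[
        DeclaredInput("south", InputKind.BUTTON, "glyphs/south.png"),
        DeclaredInput("left_x", InputKind.AXIS, "glyphs/left_stick.png"),
    ],
)

# The game only names actions; the OS resolves them and supplies the glyph.
mapping = build_mapping(example_pad, {"jump": "south", "steer": "left_x"})
print(mapping["jump"].glyph)
```

The point is that the game never needs to know which controller is plugged in. It asks for "jump", and the OS tells it which input that is and which picture to show for it.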
Maybe from Microsoft's perspective, this isn't an issue, because they think the Xbox controller is the best one ever and nobody should ever use anything different. And maybe they're right! But it's still asinine and out of character for a company that is all about the plurality of choice.