Because x86 instructions are variable-length and not self-synchronizing, decode can eat up to 15% of your core’s power budget when you aren’t running out of the small cache of decoded instructions, or at least that was the figure a few generations ago when I last heard. That isn’t huge, but it does mean x86 architects have to think carefully about how wide to make the decoder; they can’t just size it so that it’s never a bottleneck, the way ARM designers can.
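To make the "variable length and not self-synchronizing" point concrete, here is a toy sketch in Python. The encoding is made up for illustration (the low two bits of the first byte give the length) and is not real x86 or ARM; `toy_length`, `variable_boundaries`, and `fixed_boundaries` are hypothetical names. It only shows why finding instruction starts is a serial problem in one case and trivial in the other.

```python
# Toy illustration only: this is NOT real x86 or ARM encoding.
# Assumed rule: the low 2 bits of an instruction's first byte give its
# length minus one, so instructions are 1 to 4 bytes long.

def toy_length(first_byte: int) -> int:
    return (first_byte & 0b11) + 1

def variable_boundaries(code: bytes, start: int = 0) -> list:
    """Instruction start offsets for the toy variable-length encoding."""
    offsets = []
    pc = start
    while pc < len(code):
        offsets.append(pc)
        # Serial dependency: the next start is unknown until this
        # instruction's length has been worked out.
        pc += toy_length(code[pc])
    return offsets

def fixed_boundaries(code: bytes, width: int = 4) -> list:
    """With fixed-width instructions every start is known up front."""
    return list(range(0, len(code), width))

stream = bytes([0x01, 0x02, 0x00, 0x03, 0x00, 0x00, 0x00, 0x00])

print(variable_boundaries(stream))     # parse from the true start
print(variable_boundaries(stream, 1))  # parse from one byte in
print(fixed_boundaries(stream))        # fixed width: no scan needed
```

Starting one byte in gives a different parse that looks just as valid, which is what "not self-synchronizing" means here: nothing in a byte tells you whether it begins an instruction. A wide decoder has to either resolve that chain of lengths or guess at many offsets in parallel and throw away the wrong guesses, and that extra work is where the power goes.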