Matthew Garrett, a well-known Linux kernel developer who once received an award from the Free Software Foundation for his contributions to free software, has explained the SBAT (Secure Boot Advanced Targeting) mechanism, which was created to block vulnerable boot loaders without revoking their digital signatures, and its role in the recent incident in which a Windows update stopped some Linux distributions from booting when installed alongside Windows on systems with UEFI Secure Boot enabled. In short, the blame is shared: Microsoft did not fully test the update and applied it to systems where it should not have been applied, and the developers of some Linux distributions did not update the GRUB boot loader and its SBAT generation number when vulnerabilities were discovered in GRUB.
Below is a translation of Garrett's note:
When the UEFI Secure Boot specification was being developed, everyone involved was, let's say, a little naïve. Secure Boot's core security model is that all code that runs in a privileged, kernel-level environment must be validated before execution: the firmware checks the boot loader, the boot loader checks the kernel, the kernel checks any additional kernel code loaded at runtime, and we now have a trusted environment in which to enforce any other security policy we want. Obviously, people make mistakes, so the specification provided a way to revoke signed components that turned out to be untrustworthy: simply add a hash of the untrustworthy code to a variable, and then refuse to load anything with that hash, even if it is signed with a trusted key.
Unfortunately, as it turned out, the problem is scale. Every Linux distribution in the Secure Boot ecosystem builds its own boot loader binaries, and each binary has its own hash. If a vulnerability is found in the source code of such a loader, a large number of different binaries must be revoked. And there is only a limited amount of space to store the variable that contains all these hashes. There's just not enough room to add a new set of hashes every time it turns out that GRUB (a boot loader originally written at a time when boot security wasn't a consideration, which has several separate image parsers as well as a font parser, and you know where that leads) has yet another way for an attacker to make it execute arbitrary code, so a different solution was needed.
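As an illustration of the hash-based revocation described above and why it scales poorly, here is a minimal Python sketch. It is not part of Garrett's note and not how firmware is implemented: real Secure Boot hashes the signed image in a specific way rather than hashing the raw file, and the build names and contents below are hypothetical. The point is only that one bug in shared source code requires one deny-list entry per distinct binary.

```python
import hashlib


def image_hash(image: bytes) -> str:
    """Return a hex SHA-256 digest of the image bytes (simplified stand-in
    for the image hash that firmware would compute)."""
    return hashlib.sha256(image).hexdigest()


def is_revoked(image: bytes, revoked_hashes: set[str]) -> bool:
    """An image is refused if its hash is on the deny list,
    regardless of whether its signature is valid."""
    return image_hash(image) in revoked_hashes


# Hypothetical builds: the same vulnerable source, built by three distributions.
# Each build differs slightly, so each has a different hash.
builds = {
    "distro-a": b"vulnerable loader source + distro-a patches",
    "distro-b": b"vulnerable loader source + distro-b patches",
    "distro-c": b"vulnerable loader source + distro-c patches",
}

# Revoking the vulnerability means adding one entry per binary.
revoked = {image_hash(image) for image in builds.values()}

print(len(revoked))                           # one entry per build
print(is_revoked(builds["distro-a"], revoked))
print(is_revoked(b"fixed loader build", revoked))
```

Every additional distribution, architecture, or rebuild adds another entry, which is what exhausts the limited storage for the revocation variable.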
That solution was SBAT. The general concept of SBAT is quite simple. Each important component in the boot chain declares a security generation that is included in the signed binary. When a vulnerability is discovered and fixed, this generation is incremented. An update can then be released that defines the minimum generation: each boot component looks at the next item in the chain, compares its name and generation number with those stored in a firmware variable, and decides on that basis whether to execute it. Instead of revoking a large number of individual hashes, you can release a single update that simply says, "Any version of GRUB with a security generation below this number is considered untrusted."
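The generation comparison described above can be sketched in a few lines of Python. This is a simplified illustration, not Shim's actual code: it assumes both the component's SBAT metadata and the minimum levels are CSV text whose first two fields are a component name and an integer generation, and the function names and sample data are hypothetical.

```python
import csv
import io


def parse_levels(text: str) -> dict[str, int]:
    """Parse CSV lines of the form 'name,generation[,...]' into a
    name -> generation mapping. Extra fields are ignored."""
    levels = {}
    for row in csv.reader(io.StringIO(text)):
        if len(row) < 2:
            continue  # skip blank or malformed lines
        levels[row[0].strip()] = int(row[1])
    return levels


def is_allowed(component_sbat: str, minimum_levels: str) -> bool:
    """Refuse the component if any name it declares has a generation
    below the minimum recorded for that name."""
    declared = parse_levels(component_sbat)
    minimum = parse_levels(minimum_levels)
    for name, generation in declared.items():
        if name in minimum and generation < minimum[name]:
            return False
    return True


# Hypothetical metadata embedded in two GRUB builds.
old_grub = "grub,1,Example Vendor,grub,2.04\n"
new_grub = "grub,3,Example Vendor,grub,2.06\n"

# Hypothetical minimum levels, as an update might set them.
minimum = "grub,2\n"

print(is_allowed(old_grub, minimum))  # generation 1 < 2, so refused
print(is_allowed(new_grub, minimum))  # generation 3 >= 2, so allowed
```

A single `grub,2` entry covers every vulnerable build from every distribution, which is the space saving over per-binary hashes.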
Why has this suddenly become relevant? SBAT was developed jointly by the Linux community and Microsoft, and Microsoft decided to release a Windows update that told systems not to trust versions of GRUB with a security generation below a certain level. This was done because those versions of GRUB had real security vulnerabilities that would allow attackers to compromise the Windows Secure Boot chain, and we've seen real-world examples of malware that wanted to do this (Black Lotus exploited a vulnerability in the Windows boot loader, but a vulnerability in GRUB would be just as effective). Viewed purely from a security standpoint, this is a completely legitimate thing to want.
Now, regarding the "Something went completely wrong" message and the failure to boot that results from this update. The message is displayed by Shim, not by any code from Microsoft. Shim honors SBAT updates so as not to undermine the security assumptions of other boot loaders on the system, so although Microsoft released the SBAT update, it is the Linux boot loader that refuses to run older versions of GRUB as a result. Everything is working as designed.
The problem people have run into is that several Linux distributions had not shipped versions of GRUB with a newer security generation, so those versions of GRUB are considered insecure (it's worth noting that GRUB is signed by the distributions themselves, not by Microsoft, so there is no externally introduced lag). Microsoft's plan was for Windows Update to apply the SBAT update only to Windows-only systems, while dual-boot installations would remain vulnerable until the installed distribution updated GRUB and raised its SBAT generation. Unfortunately, as is now obvious, this did not work as intended: at least some dual-boot systems had the update applied, and the distribution's Shim then refused to boot the distribution's own GRUB.
So what happened? Microsoft (understandably) did not want Windows to be attacked through a vulnerable version of GRUB that could be tricked into executing arbitrary code and then injecting a bootkit into the Windows kernel at boot time. Microsoft addressed this by releasing a Windows update that updated the SBAT variable to specify that vulnerable versions of GRUB should not boot on these systems. Shim, the first-stage loader, would read this variable, read the SBAT section from the installed copy of GRUB, see that the two conflicted, and refuse to load GRUB with the message "Something went completely wrong". This update wasn't supposed to be applied to dual-boot systems, but it was applied anyway.
In summary:
1) Microsoft applied the update to systems to which it should not have been applied.
2) Some Linux distributions did not update the GRUB bootloader and the SBAT security generation when vulnerabilities were discovered in GRUB.
As a result, some people are unable to boot their systems. I think there is plenty of blame to go around here. Microsoft should have done more testing to ensure that dual-boot installations could be identified accurately. But distributions that ship signed boot loaders also need to make sure that they update them and bump the security generation accordingly, because otherwise they provide an attack vector that can be used to compromise other operating systems, and that's kind of a violation of the social contract around all of this.
Unfortunately, the victims here are mostly end users who are faced with a system that suddenly refuses to boot the OS they want to boot. This should never happen. I don't think asking end users whether they want Secure Boot updates would lead to a good outcome, and while I'm vaguely inclined to believe that UEFI Secure Boot doesn't benefit most end users, it's also not something you want to find out you needed only after the fact. So I'm sympathetic to it being enabled by default, and I agree with Microsoft's choices here, apart from the failed attempt to avoid applying the update to dual-boot systems.
Anyway, I was heavily involved in implementing this mechanism for Linux in 2012 and wrote the first prototype of Shim (which is now a significantly better boot loader maintained by a wider group of people, and which I haven't touched in years), so if you want to blame one person, please feel free to blame me. This is something that shouldn't have happened, and if you're not Microsoft or a Linux distribution, then it's not your fault. I'm sorry.