The Linux kernel project becoming its own CVE Numbering Authority (CNA) is wasting resources that would previously have gone towards more and higher-quality backporting. We’ve noticed a drop in both quantity and quality for the stable/longterm branches, and particularly for the Android Generic Kernel Image (GKI) LTS branches.

We’ve had around 2.5 years to evaluate the impact of Generic Kernel Images. Our conclusion is that they have caused more harm than good to GrapheneOS.

Generic Kernel Images are supposed to make kernel updates easier via a stable ABI, but Pixels update all drivers for GKI updates anyway.

The stability of the ABI isn’t perfect, and many changes get reverted because they break the ABI. It also causes even the GKI LTS branch with the latest merges of LTS releases to lag behind, particularly recently. We attribute some of that to the resources wasted on their CNA work.

The CVE system did not work for the Linux kernel either way, but it’s certainly not fixed by turning nearly every backport into a CVE and ignoring anything not backported. We don’t particularly care about the CVE system itself; our concern is scarce resources being wasted on something useless.

Barely any resources are dedicated to stable Linux kernel releases. There’s very little testing and review. Multiple filesystem corruption bugs have been backported to ext4 and f2fs recently. Some never existed in mainline but instead came from backports missing interdependent changes.

The GKI LTS branch reverting a bunch of ABI-changing commits, working around the changed ABI in other cases, and lagging behind is making it harder for us to deal with these issues. It’d be smoother to simply upgrade the kernel and fix API/ABI conflicts. The ABI isn’t fully stable anyway.

Android had reached the point where mainline kernels were usable aside from needing out-of-tree drivers for hardware, and the Tensor Pixel drivers are way less invasive and easier to port to new releases. GKI has made a mess of it, and it doesn’t even make things easier for Pixels but harder.

The 5.10 kernel drivers for the Pixel 6 were ported to 5.15, 6.1 and 6.6. They simply haven’t decided to move to a newer branch yet. The kernel for the Pixel 8 doesn’t bother having a device kernel tree anyway but rather uses generic sources for GKI and all the drivers, so what’s the point?

We’re increasingly scared of updating LTS revisions. It doesn’t help that the GKI LTS branch is lagging a bit behind, since the lag isn’t due to any further stabilization but rather a lack of resources to keep up. Any LTS revision with f2fs changes is terrifying now.

Unlike the stock Pixel OS, we’ve avoided shipping common f2fs corruption bugs in production by being way ahead on LTS adoption while narrowly avoiding shipping new serious issues. It has been way too close for comfort, and we have low confidence in any LTS release with f2fs changes.

Generic Kernel Images have directly interfered with both hardening and performance due to the impact of vendor hooks working around not being able to change core kernel code. We don’t want dynamic kernel modules but we’re essentially forced into using them to avoid init bugs.

They’ve made the usual mistake of burning resources on branches by having 2 variants of each LTS branch (Android 12/13 variants of 5.10, Android 13/14 variants of 5.15, Android 14/15 variants of 6.1, etc.) and then making many overlapping branches from those to stabilize them.

We’re unconvinced that the Linux kernel is headed in the right direction. It’s not truly getting more robust or secure. The accelerating complexity and churn are opposed to both, as are the culture and tools. We’re hitting more issues, including on our workstations and servers.

  • KindnessInfinityOPM
    8 months ago

    This would be:

    In the long term, GrapheneOS aims to move beyond a hardened fork of the Android Open Source Project. Achieving the goals requires moving away from relying on the Linux kernel as the core of the OS and foundation of the security model. It needs to move towards a microkernel-based model with a Linux compatibility layer, with many stepping stones leading towards that goal including adopting virtualization-based isolation.

    The initial phase for the long-term roadmap of moving away from the current foundation will be to deploy and integrate a hypervisor like Xen to leverage it for reinforcing existing security boundaries. Linux would be running inside the virtual machines at this point, inside and outside of the sandboxes being reinforced. In the longer term, Linux inside the sandboxes can be replaced with a compatibility layer like gVisor, which would need to be ported to arm64 and given a new backend alongside the existing KVM backend. Over the longer term, i.e. many years from now, Linux can fade away completely and so can the usage of virtualization. The anticipation is that many other projects are going to be interested in this kind of migration, so it’s not going to be solely a GrapheneOS project, as demonstrated by the current existence of the gVisor project and various other projects working on virtualization deployments for mobile. Having a hypervisor with verified boot still intact will also provide a way to achieve some of the goals based on extensions to Trusted Execution Environment (TEE) functionality even without having GrapheneOS hardware.

    Hardware and firmware security are core parts of the project, but it’s currently limited to research and submitting suggestions and bug reports upstream. In the long term, the project will need to move into the hardware space.
