  1. Mar 23, 2022
  2. Mar 11, 2022
  3. Feb 14, 2022
  4. Feb 13, 2022
    • kbuild: Add environment variables for userprogs flags · f67695c9
      Elliot Berman authored
      
      Allow additional arguments to be passed to userprogs compilation.
      Reproducible clang builds need to provide a sysroot and gcc path to
      ensure the same toolchain is used across hosts. KCFLAGS is not currently
      used for any user programs compilation, so add new USERCFLAGS and
      USERLDFLAGS which serve a similar purpose to HOSTCFLAGS/HOSTLDFLAGS.
      
      Clang might detect GCC installation on hosts which have it installed
      to a default location in /. With addition of these environment
      variables, you can specify flags such as:
      
      $ make USERCFLAGS=--sysroot=/path/to/sysroot
      
      This can also be used to specify different sysroots such as musl or
      bionic which may be installed on the host in paths that the compiler
      may not search by default.
      
      Signed-off-by: Elliot Berman <quic_eberman@quicinc.com>
      Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
      Reviewed-by: Fangrui Song <maskray@google.com>
      Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
  5. Feb 11, 2022
  6. Feb 06, 2022
    • net: initialize init_net earlier · 9c1be193
      Eric Dumazet authored
      While testing a patch that will follow later
      ("net: add netns refcount tracker to struct nsproxy")
      I found that devtmpfs_init() was called before init_net
      was initialized.
      
      This is a bug, because devtmpfs_setup() calls
      ksys_unshare(CLONE_NEWNS);
      
      This has the effect of increasing init_net refcount,
      which will be later overwritten to 1, as part of setup_net(&init_net)
      
      We had too many prior patches [1] trying to work around the root cause.
      
      Really, make sure init_net is in BSS section, and that net_ns_init()
      is called earlier at boot time.
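
      The ordering problem described above can be illustrated with a toy
      model. This is only a sketch under stated assumptions: struct toy_ns,
      toy_get(), toy_setup() and refcount_after_boot() are hypothetical
      names invented for illustration, and none of this is kernel code.

```c
/* Toy stand-in for a refcounted namespace object (hypothetical). */
struct toy_ns {
    int refcount;
};

/* An early user takes a reference. */
static void toy_get(struct toy_ns *ns)
{
    ns->refcount++;
}

/* Setup unconditionally sets the count to 1, as the message describes. */
static void toy_setup(struct toy_ns *ns)
{
    ns->refcount = 1;
}

/*
 * Returns the final refcount for the two possible boot orders.
 * setup_first == 0 models the buggy order (user before setup);
 * setup_first != 0 models the fixed order (setup before user).
 */
int refcount_after_boot(int setup_first)
{
    struct toy_ns ns = { 0 };   /* zero-initialized, like an object in BSS */

    if (setup_first) {
        toy_setup(&ns);
        toy_get(&ns);
    } else {
        toy_get(&ns);
        toy_setup(&ns);         /* overwrites the reference taken above */
    }
    return ns.refcount;
}
```

      In this model the buggy order ends with a count of 1 even though two
      holders exist, while the fixed order ends with 2.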
      
      Note that another patch ("vfs: add netns refcount tracker
      to struct fs_context") will also need net_ns_init() to be called
      before vfs_caches_init().
      
      As a bonus, this patch saves around 4KB in .data section.
      
      [1]
      
      f8c46cb3 ("netns: do not call pernet ops for not yet set up init_net namespace")
      b5082df8 ("net: Initialise init_net.count to 1")
      734b6541 ("net: Statically initialize init_net.dev_base_head")
      
      v2: fixed a build error reported by kernel build bots (CONFIG_NET=n)
      
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  7. Feb 02, 2022
  8. Jan 22, 2022
    • lib/stackdepot: allow optional init and stack_table allocation by kvmalloc() · 2dba5eb1
      Vlastimil Babka authored
      Currently, enabling CONFIG_STACKDEPOT means its stack_table will be
      allocated from memblock, even if stack depot ends up not actually used.
      The default size of stack_table is 4MB on 32-bit, 8MB on 64-bit.
      
      This is fine for use-cases such as KASAN which is also a config option
      and has overhead on its own.  But it's an issue for functionality that
      has to be actually enabled on boot (page_owner) or depends on hardware
      (GPU drivers) and thus the memory might be wasted.  This was raised as
      an issue [1] when attempting to add stackdepot support for SLUB's debug
      object tracking functionality.  It's common to build kernels with
      CONFIG_SLUB_DEBUG and enable slub_debug on boot only when needed, or
      create only specific kmem caches with debugging for testing purposes.
      
      It would thus be more efficient if stackdepot's table was allocated only
      when actually going to be used.  This patch thus makes the allocation
      (and whole stack_depot_init() ca...
  9. Jan 11, 2022
    • module: add in-kernel support for decompressing · b1ae6dc4
      Dmitry Torokhov authored
      Current scheme of having userspace decompress kernel modules before
      loading them into the kernel runs afoul of LoadPin security policy, as
      it loses link between the source of kernel module on the disk and binary
      blob that is being loaded into the kernel. To solve this issue let's
      implement decompression in kernel, so that we can pass a file descriptor
      of compressed module file into finit_module() which will keep LoadPin
      happy.
      
      To let userspace know what compression/decompression scheme kernel
      supports it will create /sys/module/compression attribute. kmod can read
      this attribute and decide if it can pass compressed file to
      finit_module(). New MODULE_INIT_COMPRESSED_DATA flag indicates that the
      kernel should attempt to decompress the data read from file descriptor
      prior to trying to load the module.
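
      The userspace decision described above might look roughly like the
      following sketch. It assumes the attribute text is a scheme name
      followed by a newline, and it hard-codes a fallback value for
      MODULE_INIT_COMPRESSED_DATA; both are assumptions, not taken from
      this message. module_init_flags() is a hypothetical helper, not part
      of kmod or the kernel.

```c
#include <stddef.h>
#include <string.h>

/* Assumed fallback value; prefer the definition from kernel headers. */
#ifndef MODULE_INIT_COMPRESSED_DATA
#define MODULE_INIT_COMPRESSED_DATA 4
#endif

/*
 * attr:        text read from /sys/module/compression, or NULL if the
 *              attribute does not exist (assumed format: "<scheme>\n").
 * file_scheme: compression scheme of the module file on disk.
 *
 * Returns the flag to pass to finit_module() when the compressed file
 * can be handed over as-is, or 0 when userspace must decompress first.
 */
int module_init_flags(const char *attr, const char *file_scheme)
{
    size_t n;

    if (attr == NULL || file_scheme == NULL)
        return 0;

    /* Compare only up to the trailing newline, if any. */
    n = strcspn(attr, "\n");
    if (n == strlen(file_scheme) && strncmp(attr, file_scheme, n) == 0)
        return MODULE_INIT_COMPRESSED_DATA;

    return 0;
}
```

      A caller would then pass the returned flags together with the file
      descriptor of the module file to finit_module().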
      
      To simplify things kernel will only implement single decompression
      method matching compression method selected when generating modules.
      This patch implements gzip ...
  10. Jan 08, 2022
    • kbuild: do not quote string values in include/config/auto.conf · 129ab0d2
      Masahiro Yamada authored
      The previous commit fixed up all shell scripts to not include
      include/config/auto.conf.
      
      Now that include/config/auto.conf is only included by Makefiles,
      we can change it into a more Make-friendly form.
      
      Previously, Kconfig output string values enclosed with double-quotes
      (both in the .config and include/config/auto.conf):
      
          CONFIG_X="foo bar"
      
      Unlike shell, Make handles double-quotes (and single-quotes as well)
      verbatim. We must rip them off when used.
      
      There are some patterns:
      
        [1] $(patsubst "%",%,$(CONFIG_X))
        [2] $(CONFIG_X:"%"=%)
        [3] $(subst ",,$(CONFIG_X))
        [4] $(shell echo $(CONFIG_X))
      
      These are not only ugly, but also fragile.
      
      [1] and [2] do not work if the value contains spaces, like
         CONFIG_X=" foo bar "
      
      [3] does not work correctly if the value contains double-quotes like
         CONFIG_X="foo\"bar"
      
      [4] seems to work better, but has a cost of forking a process.
      
      Anyway, quoted strings were always PITA for our Makefiles.
      
      T...
  11. Jan 05, 2022
  12. Dec 09, 2021
  13. Dec 02, 2021
  14. Nov 24, 2021
  15. Nov 23, 2021
  16. Nov 17, 2021
  17. Nov 14, 2021
  18. Nov 11, 2021
    • mm: allow only SLUB on PREEMPT_RT · 252220da
      Ingo Molnar authored
      Memory allocators may disable interrupts or preemption as part of the
      allocation and freeing process.  For PREEMPT_RT it is important that
      these sections remain deterministic and short and therefore don't depend
      on the size of the memory to allocate/free or the inner state of the
      algorithm.
      
      Until v3.12-RT the SLAB allocator was an option but involved several
      changes to meet all the requirements.  The SLUB design fits better with
      PREEMPT_RT model and so the SLAB patches were dropped in the 3.12-RT
      patchset.  Comparing the two allocators, SLUB outperformed SLAB in both
      throughput (time needed to allocate and free memory) and the maximal
      latency of the system measured with cyclictest during hackbench.
      
      SLOB was never evaluated since it was unlikely that it performs better
      than SLAB.  During a quick test, the kernel crashed with SLOB enabled
      during boot.
      
      Disable SLAB and SLOB on PREEMPT_RT.
      
      [bigeasy@linutronix.de: commit description]
      
      Link: https://lkml.kernel.org/r/202110...
    • preempt: Restore preemption model selection configs · a8b76910
      Valentin Schneider authored
      Commit c597bfdd ("sched: Provide Kconfig support for default dynamic
      preempt mode") changed the selectable config names for the preemption
      model. This means a config file must now select
      
        CONFIG_PREEMPT_BEHAVIOUR=y
      
      rather than
      
        CONFIG_PREEMPT=y
      
      to get a preemptible kernel. This means all arch config files would need to
      be updated - right now they'll all end up with the default
      CONFIG_PREEMPT_NONE_BEHAVIOUR.
      
      Rather than touch a good hundred config files, restore usage of
      CONFIG_PREEMPT{_NONE, _VOLUNTARY}. Make them configure:
      o The build-time preemption model when !PREEMPT_DYNAMIC
      o The default boot-time preemption model when PREEMPT_DYNAMIC
      
      Add siblings of those configs with the _BUILD suffix to unconditionally
      designate the build-time preemption model (PREEMPT_DYNAMIC is built with
      the "highest" preemption model it supports, aka PREEMPT). Downstream
      configs should by now all be depending / selected by CONFIG_PREEMPTION
      rather than CONFIG_...
  19. Nov 09, 2021
  20. Nov 06, 2021
  21. Oct 18, 2021
  22. Oct 10, 2021
  23. Sep 22, 2021
  24. Sep 19, 2021
    • init: don't panic if mount_nodev_root failed · 40c8ee67
      Leon Romanovsky authored
      Attempting to mount a 9p file system as root gives the following kernel panic:
      
       9pnet_virtio: no channels available for device root
       Kernel panic - not syncing: VFS: Unable to mount root "root" (9p), err=-2
       CPU: 2 PID: 1 Comm: swapper/0 Not tainted 5.15.0-rc1+ #127
       Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
       Call Trace:
        dump_stack_lvl+0x45/0x59
        panic+0x1e2/0x44b
        ? __warn_printk+0xf3/0xf3
        ? free_unref_page+0x2d4/0x4a0
        ? trace_hardirqs_on+0x32/0x120
        ? free_unref_page+0x2d4/0x4a0
        mount_root+0x189/0x1e0
        prepare_namespace+0x136/0x165
        kernel_init_freeable+0x3b8/0x3cb
        ? rest_init+0x2e0/0x2e0
        kernel_init+0x19/0x130
        ret_from_fork+0x1f/0x30
       Kernel Offset: disabled
       ---[ end Kernel panic - not syncing: VFS: Unable to mount root "root" (9p), err=-2 ]---
      
      QEMU command line:
       "qemu-system-x86_64 -append root=/dev/root rw rootfstype=9p rootflags=trans=virtio ..."
      
      This error occurs because root_device_name is truncated in prepare_namespace()
      from "/dev/root" to "root" prior to the call to mount_nodev_root().
      
      As a solution, don't treat errors in mount_nodev_root() as errors that
      require panics, and allow fallback to the mount flow that existed before
      the patch cited in the Fixes tag.
      
      Fixes: f9259be6 ("init: allow mounting arbitrary non-blockdevice filesystems as root")
      Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • init/do_mounts.c: Harden split_fs_names() against buffer overflow · b51593c4
      Vivek Goyal authored
      
      split_fs_names() currently takes a comma-separated list of filesystems
      and converts it into individual filesystem strings. It places these
      strings in the input buffer passed by the caller and returns the number
      of strings.
      
      If the caller manages to pass an input string bigger than the buffer,
      then we can write beyond the buffer. Or if the string just fits the
      buffer, we will still write beyond the buffer as we append a '\0' byte
      at the end.
      
      Pass size of input buffer to split_fs_names() and put enough checks
      in place so such buffer overrun possibilities do not occur.
      
      This patch does a few things.
      
      - Add a parameter "size" to split_fs_names(). This specifies size
        of input buffer.
      
      - Use strlcpy() (instead of strcpy()) so that we can't go beyond
        buffer size. If input string "names" is larger than passed in
        buffer, input string will be truncated to fit in buffer.
      
      - Stop appending extra '\0' character at the end and avoid one
        possibility of going beyond the input buffer size.
      
      - Do not use extra loop to count number of strings.
      
      - Previously if one passed "rootfstype=foo,,bar", split_fs_names()
        would return only 1 string "foo" (and "bar" would be truncated
        due to the extra ,). After this patch, split_fs_names() will
        return 3 strings ("foo", zero-sized-string, and "bar").
      
        Callers of split_fs_names() have been modified to check for
        zero sized string and skip to next one.
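
      The behaviour listed above can be sketched in userspace C as follows.
      This is an illustration only, not the kernel function:
      split_names_sketch() is a hypothetical name, and a manual bounded
      copy stands in for strlcpy(), which is not available in every
      userspace C library.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy "names" into "buf" (at most size - 1 bytes plus a terminator),
 * replace each ',' with '\0', and return the number of resulting
 * strings. Empty strings between consecutive commas are counted.
 */
int split_names_sketch(char *buf, size_t size, const char *names)
{
    size_t len;
    int count = 1;
    char *p;

    if (size == 0)
        return 0;

    /* Bounded copy with truncation, standing in for strlcpy(). */
    len = strlen(names);
    if (len >= size)
        len = size - 1;
    memcpy(buf, names, len);
    buf[len] = '\0';

    /* Split and count in a single pass; no extra terminator is appended. */
    for (p = buf; *p; p++) {
        if (*p == ',') {
            *p = '\0';
            count++;
        }
    }
    return count;
}
```

      A caller can walk the result by advancing strlen(p) + 1 bytes per
      string and skipping any string whose first byte is '\0', which
      matches the zero-sized-string check described above.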
      
      Reported-by: xu xin <xu.xin16@zte.com.cn>
      Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
  25. Sep 14, 2021
  26. Sep 08, 2021