In the filesystem track of the 2023 Linux Storage, Filesystem, Memory-Management and BPF Summit, Amir Goldstein led a session on using fanotify for hierarchical storage management (HSM). Linux had some support for HSM in the XFS filesystem's implementation of the data management API (DMAPI), but that code was removed back in 2010. Goldstein has done some work on using fanotify for HSM features, but he has run into some problems with deadlocks that he wanted to discuss with attendees.
A complete stack trace is needed for a number of debugging and optimization tasks, but getting such traces reliably can be surprisingly challenging. At the 2023 Linux Storage, Filesystem, Memory-Management and BPF Summit, Steve Rostedt and Indu Bhagat described a mechanism called SFrame that enables the creation of reliable user-space stack traces in the kernel without the memory and run-time overhead of some other solutions.
The kernel developers try hard to avoid duplicating functionality in the kernel, which is enough of a challenge to maintain as it is. So it has often seemed out of character for the kernel to support three different slab allocators (called SLAB, SLOB, and SLUB), all of which handle the management of small memory allocations in similar ways. At the 2023 Linux Storage, Filesystem, Memory-Management and BPF Summit, slab maintainer Vlastimil Babka updated the group on progress toward the goal of reducing the number of slab allocators in the kernel and gave an overview of what to expect in that area.
The kernel's swapping code tends to not get much love. Users try to avoid it, and developers often find better things to do with their time than trying to improve it. At the 2023 Linux Storage, Filesystem, Memory-Management and BPF Summit, though, Yosry Ahmed dedicated a memory-management-track session to the problem of the swap layer and what might be done to make it better.
Security updates have been issued by Debian (cups-filters, imagemagick, libwebp, sqlite, and texlive-bin), Fedora (chromium and vim), Gentoo (librecad, mediawiki, modsecurity-crs, snakeyaml, and tinyproxy), Mageia (apache-mod_security, cmark, dmidecode, freetype2, glib2.0, libssh, patchelf, python-sqlparse, sniproxy, suricata, and webkit2), Oracle (apr-util and firefox), Red Hat (git), SUSE (containerd, openvswitch, python-Flask, runc, terraform-provider-aws, and terraform-provider-null), and Ubuntu (tar).
Memory control groups (or "memcgs") allow an administrator to manage the memory resources given to the processes running on a system. Often, though, memcgs seem to have memory-use problems of their own, and that has made them into a recurring Linux Storage, Filesystem, and Memory-Management Summit topic since at least 2019. The topic returned at the 2023 event with a focus on the handling of shared, anonymous memory. The quirks associated with this memory type, it seems, can subject systems to an unpleasant sort of zombie invasion; a session in the memory-management track led by T.J. Mercier, Yosry Ahmed, and Chris Li discussed possible solutions.
Bernd Schubert led a session at the 2023 Linux Storage, Filesystem, Memory-Management and BPF Summit on the intersection of FUSE and io_uring. He works for DDN Storage, which is using FUSE for two network-storage products; he has found FUSE to be a bottleneck for those filesystems. That could perhaps be improved by using io_uring, which is something he has been working on and wanted to discuss.
The "scatterlist" is a core-kernel data structure used to describe DMA I/O operations from the point of view of both the CPU and the peripheral device. Over the years, the shortcomings of scatterlists have become more apparent, but there has not been a viable replacement on the horizon. During a memory-management session at the 2023 Linux Storage, Filesystem, Memory-Management and BPF Summit, Jason Gunthorpe described a possible alternative, known variously as "phyr", "physr", or "rlist", that might improve on scatterlists for at least some use cases.
Memory management is tricky enough on its own, but virtualization adds another twist: now there are two kernels (host and guest) managing the same memory. This duplicated effort can be wasteful if not implemented carefully, so it is not surprising that a lot of effort, from both hardware and software developers, has gone into this problem. As Pasha Tatashin pointed out during a memory-management-track session at the 2023 Linux Storage, Filesystem, Memory-Management and BPF Summit, though, there are still ways in which these systems run less efficiently than they could. He has put some effort into improving that situation.
Security updates have been issued by Fedora (cups-filters, kitty, mingw-LibRaw, nispor, rust-ybaas, and rust-yubibomb), Mageia (kernel-linus), Red Hat (jenkins and jenkins-2-plugins), SUSE (openvswitch and ucode-intel), and Ubuntu (linux-azure, linux-azure-4.15, linux-gcp, linux-gcp-5.15, linux-gke, linux-gke-5.15, linux-gkeop, linux-oracle-5.15, linux-ibm, linux-oracle, and linux-oem-6.0).
Joel Fernandes introduced himself to the memory-management track at the 2023 Linux Storage, Filesystem, Memory-Management and BPF Summit as a co-maintainer of the read-copy-update (RCU) subsystem and an implementer of the "lazy RCU" functionality. Lazy RCU can improve performance, especially on systems that are not heavily utilized, but it also has some implications for memory management that he wanted to discuss with the group.
The memory-management subsystem has the unenviable task of trying to predict which pages of memory will be needed in the near future. Since predictions tend to be difficult, the code relies heavily on the heuristic that memory used in the recent past is likely to be used again in the near future. However, even knowing which memory has been recently used can be a challenge. At the 2023 Linux Storage, Filesystem, Memory-Management and BPF Summit, Aneesh Kumar and Wei Xu, both presenting remotely, discussed some ways to use the increasingly capable hardware counters that are provided by current and upcoming CPUs.
The buffer head is a kernel data structure that dates back to the first Linux release; for much of the time since then, kernel developers have been hoping to get rid of it. Hannes Reinecke started a plenary session at the 2023 Linux Storage, Filesystem, Memory-Management and BPF Summit by saying that everybody agrees that buffer heads are a bad idea, but there is less agreement on how to take them out of the kernel. The core functionality they provide — facilitating sector-size I/O operations to a block device underlying a filesystem — must be provided somehow.
When OpenAI made its chatbot ChatGPT available to the public in November 2022, it immediately became a hit. However, despite the company's name, the underlying algorithm isn't open. Furthermore, ChatGPT users require a connection to OpenAI's cloud service and face usage restrictions. In the meantime, several open-source or freely available alternatives have emerged, with some even able to run on consumer hardware. Although they can't match ChatGPT's performance yet, rapid advancements are occurring in this field, to the extent that some people at the companies developing these artificial intelligence (AI) models have begun to worry.
Version 2.39 of the util-linux tool collection has been released. The most significant change, perhaps, is support for the new filesystem-mounting API, which enables a number of new features, including ID-mapped mounts.
There are some filesystems that use the Filesystem in Userspace (FUSE) framework but only to provide a different view of an underlying filesystem, such as different file metadata, a changed directory hierarchy, or other changes of that sort. The read-only filtered filesystem, which simply filters the view of which files are available, is one example; the file data could come directly from the underlying filesystem, but currently needs to traverse the FUSE user-space server process. Finding a way to bypass the server, so that the file I/O operations go directly from the application to the underlying filesystem, would be beneficial. In a filesystem session at the 2023 Linux Storage, Filesystem, Memory-Management and BPF Summit, Miklos Szeredi wanted to explore different options for adding such a mechanism, which was referred to as a "FUSE passthrough"—though "bypass" might be a better alternative.
The conversion of the kernel's memory-management subsystem over to folios was never going to be done in a day. At a plenary session at the start of the second day of the 2023 Linux Storage, Filesystem, Memory-Management and BPF Summit, Matthew Wilcox discussed the current state and future direction of this work. Quite a lot of progress has been made — and a lot of work remains to be done.
A new development in the NVMe world was the subject of a combined storage and filesystem session led by Stephen Bates at the 2023 Linux Storage, Filesystem, Memory-Management and BPF Summit. Computational storage namespaces will allow NVMe devices to offer various types of computation—anything from simple compression through complex queries and data manipulations—to be performed on the data stored on the device.
The use of huge pages can make memory management more efficient in a number of ways, but it can also impose costs in the form of internal fragmentation and I/O amplification. At the 2023 Linux Storage, Filesystem, Memory-Management and BPF Summit, James Houghton ran a session on a scheme to get the best of both worlds: using huge pages while maintaining base-page mappings within them.
The 6.3.3, 6.2.16, 6.1.29, 5.15.112, 5.10.180, 5.4.243, 4.19.283, and 4.14.315 stable kernels have all been released; each contains another set of important fixes. Note that 6.2.16 will be the final update for the 6.2 kernel.
In a plenary session on the first day of the 2023 Linux Storage, Filesystem, Memory-Management and BPF Summit, Stephen Bates led a discussion about peer-to-peer DMA (P2PDMA). The idea is to remove the host system's participation in a transfer of data from one PCIe-connected device to another. The feature was originally aimed at NVMe SSDs so that data could simply be copied directly to and from the storage device without needing to move it to system memory and then from there to somewhere else.
DAMON is a framework that allows user space to influence and control the kernel's memory-management operations. It first entered the kernel with the 5.15 release, and has been gaining capabilities ever since. At the 2023 Linux Storage, Filesystem, Memory-Management and BPF Summit, DAMON author Seongjae Park provided an overview of the current status of DAMON development and where it can be expected to go in the near future.
In a remotely presented, memory-management-track session at the 2023 Linux Storage, Filesystem, Memory-Management and BPF Summit, Frank van der Linden pointed out that the line dividing resources controlled by the kernel from those managed by user space has moved back and forth over the years. He is currently interested in making it possible for user space to take more control over the management of memory resources. A proposal was discussed in general terms, but it will require some real scrutiny on its way toward the mainline, if it ever gets there.
Sourceware.org, which has long played host to many important projects, has announced that it has become a member project of the Software Freedom Conservancy — a move that has been in the works for some time.
Overcommitting memory is a longstanding tradition in the Linux world (and beyond); it is rare that an application uses all of the memory allocated to it, so overcommitting can help to improve overall memory utilization. In situations where memory has been overcommitted, though, it may be necessary to respond quickly to ensure that applications have the memory they actually need, even when those needs change. At the 2023 Linux Storage, Filesystem, Memory-Management and BPF Summit, T.J. Alumbaugh (in the room) and Yuanchu Xie (remotely) presented a new mechanism intended to help hosts provide containerized guests with the memory resources they need.
Virtual-machine hosting can be a fickle business; once a virtual machine has been placed on a physical host, there may arise a desire to move it to a different host. The problem with migrating virtual machines, though, is that there is a period during which the machine is not running; that can be disruptive even if it is brief. At the 2023 Linux Storage, Filesystem, Memory-Management and BPF Summit, Dragan Stancevic, presenting remotely, showed how CXL shared memory can be used to migrate virtual machines with no offline time.
Security updates have been issued by Debian (golang-websocket, kernel, postgresql-11, and thunderbird), Fedora (firefox, kernel, libreswan, libssh, tcpreplay, and thunderbird), SUSE (dcmtk, gradle, libraw, postgresql12, postgresql13, postgresql14, and postgresql15), and Ubuntu (firefox, nova, and thunderbird).
The second 6.4 kernel prepatch is out for testing. "This being rc2, it's been a fairly calm week as people are only starting to find any issues from the merge window, but it all looks fine."
The Linux CPU scheduler will let realtime tasks hog the CPU to the exclusion of everything else — except when it doesn't. At the 2023 Open Source Summit North America, Joel Fernandes covered the problems with the kernel's realtime throttling mechanism and a couple of potential solutions. As a bonus, since the room was unscheduled for the following slot, attendees were treated to a spontaneous session on adaptive spinning in user space run by André Almeida.
Memory tiering is the practice of dividing physical memory into separate levels according to its performance characteristics, then allocating that memory in a (hopefully) optimal manner for the workload the system is running. The subject came up repeatedly during the 2023 Linux Storage, Filesystem, Memory-Management and BPF Summit. One session, led by David Rientjes, focused directly on tiering and how it might be better supported by the Linux kernel.
Kyungsan Kim began his talk at the 2023 Linux Storage, Filesystem, Memory-Management and BPF Summit with a claim that the Compute Express Link (CXL) technology is leading to fundamental changes in computer architecture. The kernel will have to respond with changes of its own, including in its memory-management layer. Drawing on some experience gained at Samsung, Kim had a few suggestions on the form those changes should take — suggestions that ran into some disagreement from other memory-management developers.
In part one of the tale, Brandt Bucher looked specifically at the CPython optimizations that went into Python 3.11 as part of the Faster CPython project. More of that work will be appearing in future Python versions, but on day two of PyCon 2023 in Salt Lake City, Utah, Mark Shannon provided an overall picture of CPython optimizations, including efforts made over the last decade or more, with an eye toward the other areas that have been optimized, such as the memory layout for the internal C data structures of the interpreter. He also described some additional optimization techniques that will be used in Python 3.12 and beyond.
Security updates have been issued by Debian (postgresql-13 and webkit2gtk), Fedora (git), SUSE (helm and skopeo), and Ubuntu (cinder, nova, python-glance-store, and python-os-brick).
Mike Rapoport has put a considerable amount of effort into solving the problem of direct-map fragmentation over the years; this has resulted in proposals like __GFP_UNMAPPED and a session at the 2022 Linux Storage, Filesystem, Memory-Management, and BPF Summit. Rapoport returned at the 2023 Summit to revisit this issue, but he started with a somewhat surprising spoiler.
Greg Kroah-Hartman has announced the release of the 6.3.2, 6.2.15, 6.1.28, and 5.15.111 stable kernels. These all contain important fixes throughout the kernel tree, as usual.
Storage technology may seem like a slow-moving area, but there is, instead, a lot of development activity happening there. An early session at the 2023 Linux Storage, Filesystem, Memory-Management and BPF Summit, led by Martin Petersen and Vincent Haché, updated the assembled group on the latest changes to the storage landscape, with an emphasis on the Compute Express Link (CXL) 3.0 specification.
The MicroPython programming language implements a sizable subset of Python that can run on microcontrollers, thus bringing Python's easy-to-learn syntax, readability, and versatility to the embedded world. With its recent 1.20 release, MicroPython introduces a new package manager, reduces its code size, and adds support for many new boards, including the Raspberry Pi Pico W. The project has come a long way since its inception ten years ago, making it an easy-to-use tool for developing software for resource-constrained environments.