The 4.17-rc4 kernel prepatch is out. "Two thirds of the 4.17-rc4 patch is drivers, which sounds about right. Media, networking, rdma, input, nvme, usb. A little bit of everything, in other words." The codename has been changed, for the first time since 4.10, to "Merciless Moray".
The mount() system call suffers from a number of different shortcomings that have led some to consider a different API. At last year's Linux Storage, Filesystem, and Memory-Management Summit (LSFMM), that someone was Miklos Szeredi, who led a session to discuss his ideas for a new filesystem mounting API. Since then, David Howells has been working with Szeredi and VFS maintainer Al Viro on this API; at the 2018 LSFMM, he presented that work.
The DMA zone (ZONE_DMA) is a memory-management holdover from the distant past. Once upon a time, many devices (those on the ISA bus in particular) could only use 24 bits for DMA addresses, and were thus limited to the bottom 16MB of memory. Such devices are hard to find on contemporary computers. Luis Rodriguez scheduled the last memory-management-track session of the 2018 Linux Storage, Filesystem, and Memory-Management Summit to discuss whether the time has come to remove ZONE_DMA altogether.
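The 16MB figure follows directly from the 24-bit address width. A minimal sketch of that arithmetic follows; the function name is hypothetical and is not taken from the kernel.

```python
MIB = 1024 * 1024


def dma_addressable_bytes(address_bits: int) -> int:
    """Return how many bytes a device can reach with this many address bits."""
    return 1 << address_bits


# A 24-bit DMA address can name 2**24 distinct bytes.
isa_limit = dma_addressable_bytes(24)
print(isa_limit // MIB, "MiB")
```

Any buffer handed to such a device must sit entirely below that limit, which is the constraint that ZONE_DMA was created to satisfy.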
A system's page tables are organized into a tree that is as many as five levels deep. In many ways those levels are all similar, but the kernel treats them all as being different, with the result that page-table manipulations include a fair amount of repetitive code. During the memory-management track of the 2018 Linux Storage, Filesystem, and Memory-Management Summit, Kirill Shutemov proposed reworking how page tables are maintained. The idea was popular, but the implementation is likely to be tricky.
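To see why the levels look so much alike, consider how a virtual address is split into per-level indexes. The sketch below assumes the x86-64 five-level layout (4KB pages, 512 entries per table), which the summary above does not specify; the function name is hypothetical, and this is an illustration of the arithmetic rather than kernel code.

```python
PAGE_SHIFT = 12       # assumed: 4 KiB pages
BITS_PER_LEVEL = 9    # assumed: 512 entries per table
LEVELS = ("pgd", "p4d", "pud", "pmd", "pte")  # top of the tree to bottom


def split_virtual_address(vaddr: int) -> dict:
    """Split a virtual address into one table index per level plus a page offset."""
    parts = {"offset": vaddr & ((1 << PAGE_SHIFT) - 1)}
    shift = PAGE_SHIFT
    # Walk from the lowest level (pte) up to the top (pgd).
    # The same shift-and-mask step is applied at every level.
    for name in reversed(LEVELS):
        parts[name] = (vaddr >> shift) & ((1 << BITS_PER_LEVEL) - 1)
        shift += BITS_PER_LEVEL
    return parts


example = (1 << 48) | (2 << 39) | (3 << 30) | (4 << 21) | (5 << 12) | 6
print(split_virtual_address(example))
```

The loop body is identical for every level; only the shift changes. That is the sort of regularity that a unified treatment of page-table levels could exploit.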
Security updates have been issued by Debian (jackson-databind, quassel, and redmine), Fedora (community-mysql and php), Red Hat (chromium-browser), Scientific Linux (java-1.7.0-openjdk), and Slackware (seamonkey).
At a plenary session held relatively early during the 2018 Linux Storage, Filesystem, and Memory-Management Summit, the developers discussed a number of problems with the kernel's get_user_pages() interface. During the waning hours of LSFMM, a tired (but dedicated) set of developers convened again in the memory-management track to continue the discussion and try to push it toward a real solution.
Chris Mason and Josef Bacik led a brief discussion on the block-I/O controller for control groups (cgroups) in the filesystem track at the 2018 Linux Storage, Filesystem, and Memory-Management Summit. Mostly they were just aiming to get feedback on the approach they have taken. They are trying to address the needs of their employer, Facebook, with regard to the latency of I/O operations.
Memory hotplugging is one of the least-loved areas of the memory-management subsystem; there are many use cases for it, but nobody has taken ownership of it. A similar situation exists for hardware page poisoning, a somewhat neglected mechanism for dealing with memory errors. At the 2018 Linux Storage, Filesystem, and Memory-Management Summit, Michal Hocko and Mike Kravetz dedicated a pair of brief memory-management track sessions to problems that have been encountered in these subsystems, one of which seems more likely to get the attention it needs than the other.
The memory-management subsystem is a central point that handles all of the system's memory, so it is naturally subject to scalability problems as systems grow larger. Two sessions during the memory-management track of the 2018 Linux Storage, Filesystem, and Memory-Management Summit looked at specific contention points: the zone locks and the mmap_sem semaphore.
Security updates have been issued by CentOS (firefox, java-1.7.0-openjdk, java-1.8.0-openjdk, librelp, patch, and python-paramiko), Debian (kernel and quassel), Gentoo (chromium, hesiod, and python), openSUSE (corosync, dovecot22, libraw, patch, and squid), Oracle (java-1.7.0-openjdk), Red Hat (go-toolset-7 and go-toolset-7-golang, java-1.7.0-openjdk, and rh-php70-php), and SUSE (corosync and patch).
At the 2018 Linux Storage, Filesystem, and Memory-Management Summit, Mimi Zohar gave a presentation in the filesystem track on the Linux integrity subsystem. There is a lot of talk that the integrity subsystem (usually referred to as "IMA", which is the integrity measurement architecture, though there is more to the subsystem) is complex and not documented well, she said. So she wanted to give an overview of the subsystem and then to discuss some filesystem-related concerns.
Greg Kroah-Hartman has released a full set of stable kernels: 4.16.7, 4.14.39, 4.9.98, 4.4.131, and 3.18.108. All of them contain important fixes and users should update.
Containers are, of course, all the rage these days; in fact, during his 2018 Legal and Licensing Workshop (LLW) talk, Dirk Hohndel said with a grin that he hears "containers may take off". But, while containers are easy to set up and use, license compliance for containers is "incredibly hard". He has been spending "way too much time" thinking about container compliance recently and, beyond the standard "let's go shopping" solution to hard problems, has come up with some ideas. Hohndel is a longtime member of the FOSS community who is now the chief open source officer at VMware, a company that ships some container images.
Google has announced the open-sourcing of gVisor, a sandboxed container runtime. "gVisor is more lightweight than a VM while maintaining a similar level of isolation. The core of gVisor is a kernel that runs as a normal, unprivileged process that supports most Linux system calls. This kernel is written in Go, which was chosen for its memory- and type-safety. Just like within a VM, an application running in a gVisor sandbox gets its own kernel and set of virtualized devices, distinct from the host and other sandboxes."
Security updates have been issued by Debian (kernel), Fedora (haproxy), openSUSE (flac, GraphicsMagick, and quassel), Oracle (kernel), Red Hat (python-paramiko and redhat-virtualization-host), and SUSE (corosync).
There is a new initiative in the Fedora community based on what used to be called "Fedora Atomic Workstation". From this whitepaper [PDF]: "The descriptive name for this product is image-mode container-based Fedora Workstation based on rpm-ostree, which is clear but terrible for branding. Therefore, we call it Team Silverblue. The long-term goal for this effort is to transform Fedora Workstation into an image-based system where applications are separate from the OS and updates are atomic."
Version 8.1 of the GCC compiler suite is out. "Are you tired of your existing compilers? Want fresh new language features and better optimizations? Make your day with the new GCC 8.1!" See this page for a complete list of changes in this release.
Martin Pitt describes his experience running a fully free-software Android phone. "I previously used Opera as a web browser, because it is relatively lightweight (important on my previous phone) and the really good builtin adblocker. But these days Firefox is really fast and good enough, so I replaced it with Fennec, which is more or less Firefox with some non-free bits removed. After installing uBlock Origin I’ve never looked back."
Christoph Lameter works in a different computing environment than most of us; he supports high-volume trading applications that need every bit of performance that the fastest hardware can give them. Even then, it seems that isn't fast enough. In a memory-management-track session at the 2018 Linux Storage, Filesystem, and Memory-Management Summit, Lameter described some of the problems he has encountered and approaches he is considering to address them.
Allocating chunks of memory that are both large and physically contiguous has long been a difficult thing to do in the kernel. But there are times where there is no alternative. Two sessions in the memory-management track of the 2018 Linux Storage, Filesystem, and Memory-Management Summit explored ways of making those allocations more reliable. It turns out that some use cases have a rather larger value of "large" than others.
Memory control groups allow the system administrator to impose memory-use limits on the members of control groups. In many ways, these limits behave like the overall limit on available memory, but there are also some differences. The behavior of the memory controller also changed with the advent of the version-2 control-group API, creating problems for at least one significant user. Three sessions held in the memory-management track of the Linux Storage, Filesystem, and Memory-Management Summit explored some of these problems.
The recent fsync() woes experienced by PostgreSQL led to a session on the first day (April 23) of the 2018 Linux Storage, Filesystem, and Memory-Management Summit (LSFMM). Those problems also led to a second-day session with PostgreSQL developer Andres Freund who gave an overview of how PostgreSQL does I/O and where that ran aground on some assumptions that had been made. The session led to a fair amount of discussion with the filesystem-track developers; real solutions seem to be in the offing.
One of the core jobs of the memory-management subsystem is to make memory available to other parts of the kernel when the need arises. The memory-management track of the 2018 Linux Storage, Filesystem, and Memory-Management Summit hosted a pair of sessions on new or improved allocation functions for the kernel covering the slab allocators and protectable memory.
The Fedora 28 release has been announced. "The headline feature for Fedora 28 Server is the inclusion of the new Modular repository. This lets you select between different versions of software like NodeJS or Django, so you can chose the stack you need for your software." Some users will also appreciate that proprietary blobs (such as the NVIDIA drivers) are now easier to obtain and install.
Matthew "Willy" Wilcox has been doing a fair amount of work in the memory-management area recently. He showed up at the 2018 Linux Storage, Filesystem, and Memory-Management Summit with a list of discussion topics related to that work; it was enough to fill a plenary session with some spillover into the memory-management track the next day. Some of his topics were fairly straightforward; others look to be somewhat more involved.
The kernel's memory-management subsystem has to manage a great deal of concurrency; that leads to an ongoing series of locking challenges that sometimes seem intractable. Two recurring locking issues (the LRU locks and the mmap_sem lock) were the topic of sessions held during the memory-management track of the 2018 Linux Storage, Filesystem, and Memory-Management Summit. In both cases, it quickly became clear that, while some interesting ideas are being pursued, easy solutions are not on offer.
When kernel code needs to work directly with user-space pages, it often calls get_user_pages() (or one of several variants) to fault those pages into RAM and pin them there. This function is not entirely easy to use, though, and recent changes have made it harder to use safely. Jan Kara and Dan Williams led a plenary session at the 2018 Linux Storage, Filesystem, and Memory-Management Summit to discuss potential solutions, but it is not entirely clear that any were found.
Stable kernels 4.16.6, 4.14.38, 4.9.97, 4.4.130, and 3.18.107 have been released. They all contain important fixes throughout the tree and users should upgrade.
The 4.17-rc3 kernel prepatch is out. "And by now, I think we've fixed all the nastiest fall-out from the merge window. In particular, the PTI large-page fallout that hit some people with particular configurations should all be good."
The memory-management subsystem is maintained by a small but dedicated group of developers. How healthy is that development community? Michal Hocko raised that question during the memory-management track at the 2018 Linux Storage, Filesystem, and Memory-Management Summit. Hocko is worried, but it appears that his concerns are not universally felt.
The non-uniform memory architecture (NUMA) was designed around the idea that there are two types of memory on complex systems: local (faster) and remote (slower). During the memory-management track of the 2018 Linux Storage, Filesystem, and Memory-Management Summit, Anshuman Khandual asserted that the situation has since become rather more complicated. Perhaps, he said, the time has come to rethink how we view NUMA systems.
Storage devices are in a period of extensive change. As they get faster and become byte-addressable by the CPU, they tend to look increasingly like ordinary memory. But they aren't memory, so it still isn't clear what the best model for accessing them should be. Adam Manzanares led a session during the memory-management track of the 2018 Linux Storage, Filesystem, and Memory-Management Summit, where his proposal of a new access mechanism ran into some skepticism.
At the 2018 Linux Storage, Filesystem, and Memory Management Summit, Ted Ts'o introduced an integrity feature akin to dm-verity that targets Android, at least to start with. It is meant to protect the integrity of files on the system so that any tampering would be detectable. The initial use case would be for a certain special type of Android file, but other systems may find uses for it as well.
Security updates have been issued by Debian (wordpress), Fedora (boost), openSUSE (perl and zsh), Oracle (kernel), Red Hat (apr), and Slackware (openvpn).
Ubuntu 18.04, a long-term-support release, is out. "Codenamed 'Bionic Beaver', 18.04 LTS continues Ubuntu's proud tradition of integrating the latest and greatest open source technologies into a high-quality, easy-to-use Linux distribution. The team has been hard at work through this cycle, introducing new features and fixing bugs." It features a 4.15 kernel, a new GNOME-based desktop environment, and more. See the release notes and this overview for details.
Christian Schaller looks forward to the Fedora 28 release (which will evidently be the first on-time Fedora release ever). "The Spectre/Meltdown situation did hammer home to a lot of people the need to have firmware updates easily available and easy to update. We created the Linux Vendor Firmware service for Fedora Workstation users with that in mind and it was great to see the service paying off for many Linux users, not only on Fedora, but also on other distributions who started using the service we provided. I would like to call out to Dell who was a critical partner for the Linux Vendor Firmware effort from day 1 and thus their users got the most benefit from it when Spectre and Meltdown hit. Spectre and Meltdown also helped get a lot of other vendors off the fence or to accelerate their efforts to support LVFS and Richard Hughes and Peter Jones have been working closely with a lot of new vendors during this cycle to get support for their hardware and devices into LVFS."
Security updates have been issued by Debian (drupal7, gcc-4.9-backport, ghostscript, and openslp-dfsg), Fedora (anki, composer, perl, and perl-Module-CoreList), Red Hat (kernel and rh-mysql56-mysql), and SUSE (kernel, kvm, and zsh).
Once a niche feature, memory encryption is becoming mainstream with support in both Intel and AMD processors, Kirill Shutemov said at the beginning of his session during the memory-management track of the 2018 Linux Storage, Filesystem, and Memory-Management Summit. Memory encryption can harden the system against attack, but it also presents some interesting challenges for the kernel.
Security updates have been issued by Debian (lucene-solr and psensor), Oracle (librelp and PackageKit), Red Hat (kernel, librelp, and PackageKit), Scientific Linux (librelp), and Ubuntu (mysql-5.5 and packagekit).
After a session at last year's Linux Storage, Filesystem, and Memory Management Summit (LSFMM), Jeff Layton was able to make some improvements to block-layer error handling. Those changes, which added a new errseq_t type to hold an error number and sequence number, seemed to help and were well received, except by the PostgreSQL developers. So Layton led a session at the 2018 LSFMM to discuss ways to improve things further; it would be followed later in the week with a session by one of the PostgreSQL developers to look at the specifics of the problem from their perspective.
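The idea of pairing an error number with a sequence number can be illustrated with a toy model. Everything in the sketch below (the class and method names included) is hypothetical; the kernel's errseq_t is a C type whose details this model ignores. It only shows how a sequence number lets each observer notice an error that was recorded after its own last look.

```python
class ErrSeq:
    """Toy model: the most recent error number plus a sequence counter."""

    def __init__(self):
        self.err = 0   # most recent error number (0 means none recorded)
        self.seq = 0   # incremented every time an error is recorded

    def set(self, err: int) -> None:
        """Record a new error and advance the sequence number."""
        self.err = err
        self.seq += 1

    def sample(self) -> int:
        """Return a cursor that an observer keeps for later comparison."""
        return self.seq

    def check_and_advance(self, since: int):
        """Return (error, new_cursor); error is 0 if nothing new was recorded."""
        if self.seq != since:
            return self.err, self.seq
        return 0, since


state = ErrSeq()
first_observer = state.sample()
second_observer = state.sample()
state.set(5)  # an I/O error is recorded once

err, first_observer = state.check_and_advance(first_observer)
print("first observer sees:", err)
err, second_observer = state.check_and_advance(second_observer)
print("second observer sees:", err)
```

Because each observer holds its own cursor, one observer reading the error does not hide it from the other, and neither sees the same error twice.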
Using the kernel thread (kthread) freezer has been a longtime problem for a variety of reasons. It is meant as a way to suspend kthreads on the way toward system suspend, but in practice has proved problematic to the point that it came up at both the 2015 and 2016 Kernel Summits (as well as on the mailing lists over the years); the intent is to try to remove the kthread freezer entirely. To that end, Luis Rodriguez led a discussion in the filesystem track of the 2018 Linux Storage, Filesystem, and Memory-Management Summit on the problems and possible solutions.
Dave Hansen did much of the work to get kernel page-table isolation (PTI) into the kernel in response to the Meltdown CPU vulnerability. In the memory-management track of the 2018 Linux Storage, Filesystem, and Memory-Management Summit, he ran a discussion on how PTI came about, what the costs are, and what can be done to minimize its performance impact.
Ever since kernel page-table isolation (PTI) was introduced as a mitigation for the Meltdown CPU vulnerability, users have worried about how it affects the performance of their systems. Most of that concern has been directed toward its impact on computing performance, but I/O performance also matters. At the 2018 Linux Storage, Filesystem, and Memory-Management Summit, Ming Lei presented some preliminary work he has done to try to quantify how severely PTI affects block I/O operations.