Monday, October 22, 2007

Seven Areas Where Linux Could Get Better

Not all of these features will get in, which shows the stop-and-go road improvements must travel to make their way into the Linux kernel.

To a business user of Linux, the development of its kernel may appear so Byzantine, with dozens of people maintaining different pieces and hundreds more volunteers submitting code, that it's hard to see where new features are headed.

There is no Linux road map, per se. To give a glimpse of the process, here are seven areas of development worth watching, based on interviews with developers and kernel maintainers, and time spent on www.kernelnewbies.org. Not all are moving ahead smoothly, illustrating the stop-and-go path improvements must travel to get into the kernel.

1. Virtualization

Recognizing virtualization as a "megatrend" of the decade, Linux kernel maintainers have made it a priority to add virtualization features to the kernel at a rapid pace. The hypervisor KVM, contributed by Avi Kivity of startup Qumranet, was included in the kernel of late 2006 and updated in last month's release. But it's an example of the conflict between rapid kernel releases and the slower-advancing enterprise editions.

"KVM is a very good example of things we think are not enterprise-ready," says Holger Dryoff, VP of management at Novell. KVM needs more testing on how it interacts with kernel subsystems, including the scheduler, he adds, before it gets into SUSE Linux Enterprise Server.

XenSource, the commercial open source virtualization company recently bought by Citrix Systems for $500 million, has lobbied to get the Xen hypervisor in the kernel with its own architecture, much as a new chip would. Kernel maintainers contend that's a maintenance-heavy way to add a virtualization feature. XenSource engineers have conceded, but work remains to get Xen aligned with the kernel's operations. It hasn't made it into the kernel, beyond just-added support that lets Linux recognize when it's running in a virtualized environment.

Other virtualization features are moving faster, including KVM and Lguest, a minimalist 5,000-line hypervisor written by IBM engineer Rusty Russell that's included in the most recent kernel. Like KVM, it taps the virtualization hooks in the latest chips from Intel and Advanced Micro Devices. Unlike VMware's ESX Server, however, Lguest creates a virtual machine whose operating system realizes it has been virtualized. This architecture lets the operating system more efficiently pass some calls for CPU cycles straight to the hardware instead of slowing things down by acting as an intermediary.

2. Real-Time Operations

Linux has been rapidly improving in real-time operations and is now frequently used as the embedded operating system in cell phones and other devices. But the recently issued 2.6.23 kernel shows "a little bit of regression" in real-time operations, says Jim Ready, CTO and founder of MontaVista, a maker of commercial embedded Linux. Its new process scheduler tilts more toward "fairness"--the notion that tasks the end user tells the processor to do should get more priority.

"A real-time guy doesn't want fairness," says Ready, since real-time advocates want the operating system to interrupt whatever the CPU is doing and assert a new priority over it. A simple example is that the software in a medical device monitoring a patient's breathing should send an immediate alert if breathing stops, interrupting whatever process the software's doing. MontaVista won't incorporate the new kernel into its product line until performance is restored, Ready says. Gartner analyst George Weiss predicts standard Linux will be competitive as a real-time system in 2008.

3. Interrupt Handlers

One reason Weiss can say that is that kernel developers are working on giving the scheduler another real-time characteristic. One key role for the operating system is to manage interrupts--to decide which tasks should grab the CPU's attention and how to prioritize different actions. If all the interrupt handlers can be combined into their own thread, that thread can be scheduled and prioritized instead of occurring unpredictably and delaying real-time responses.
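
The idea of moving interrupt handlers into a schedulable thread can be pictured with a toy model. The sketch below is plain Python, not kernel code; the class name, the priority numbers and the device names are all made up for illustration. Each interrupt only records its work, and a single thread later runs the handlers in priority order.

```python
import heapq


class InterruptThread:
    """Toy model: handlers are queued, then run later in priority order."""

    def __init__(self):
        self._queue = []      # heap of (priority, arrival order, name)
        self._arrivals = 0    # tie-breaker so equal priorities stay in order

    def raise_interrupt(self, name, priority):
        # The interrupt itself only records the work to be done.
        # Lower numbers mean more urgent work (an assumption of this sketch).
        heapq.heappush(self._queue, (priority, self._arrivals, name))
        self._arrivals += 1

    def run_pending(self):
        # The thread drains the queue, most urgent handler first.
        handled = []
        while self._queue:
            _, _, name = heapq.heappop(self._queue)
            handled.append(name)
        return handled


irq_thread = InterruptThread()
irq_thread.raise_interrupt("disk", priority=5)
irq_thread.raise_interrupt("breathing-monitor", priority=0)
irq_thread.raise_interrupt("network", priority=5)

order = irq_thread.run_pending()
print(order)  # ['breathing-monitor', 'disk', 'network']
```

The urgent handler runs first even though it arrived second, which is the predictability real-time users are after.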

Work on such an approach has been going on for three years. MontaVista's Sven-Thorsten Dietrich submitted code back in 2004 in hopes of preventing interrupt handlers from tying up the kernel for routine tasks, since they hurt real-time response. But that code was too disruptive to get past Ingo Molnar, the kernel's scheduler domain expert. The code trespassed on a key kernel feature, spinlocks, which tie up the CPU as a process waits for a needed bit of data or an event. Many processes rely on spinlocks. Dietrich's code reduced hundreds of spinlocks to 30; Molnar's revision kept 90 spinlocks, a less disruptive change.
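
For readers unfamiliar with spinlocks, here is a minimal user-space sketch of the busy-waiting idea, written in Python rather than kernel C. The SpinLock class and the worker function are hypothetical names for this illustration, and the kernel's real spinlocks are implemented very differently. The point is only that a waiter keeps retrying, burning CPU time, instead of going to sleep.

```python
import threading


class SpinLock:
    """Illustrative busy-wait lock built on a non-blocking acquire."""

    def __init__(self):
        self._lock = threading.Lock()

    def acquire(self):
        # Keep retrying rather than sleeping until the lock is free.
        # This loop is the "tying up the CPU" part.
        while not self._lock.acquire(blocking=False):
            pass

    def release(self):
        self._lock.release()


totals = {"count": 0}
spin = SpinLock()


def worker(iterations):
    for _ in range(iterations):
        spin.acquire()
        try:
            totals["count"] += 1   # protected critical section
        finally:
            spin.release()


threads = [threading.Thread(target=worker, args=(1000,)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(totals["count"])  # 2000
```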

The collection of interrupt handlers into a separate thread now appears ready to go into the kernel. "Ingo replaced what we did, but his work is good," says Ready. MontaVista wouldn't mind more credit for the work it did, but Ready knows this is how open source collaboration works, and he'll settle for real-time changes progressing into the kernel.

4. Security

Everyone wants more secure systems. Novell distributes AppArmor with its SUSE Linux Enterprise Server 10 as a way of limiting how much of the operating system an application can access, thus limiting the damage if the app is accessed without authorization. Still, it's not likely to be included in the kernel any time soon.

A key Linux security authority, Stephen Smalley--developer of another security scheme, SELinux--argues that AppArmor couldn't be merged into the kernel because its protective mechanism is based on a "pathname" approach, essentially a whitelisting scheme in which AppArmor allows access to only those named files for an application, and all others are excluded. According to a report last year by Jonathan Corbet, Smalley believes an artful intruder could use the approved pathnames to guess additional names, creating an unwanted exposure.
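
A pathname-based whitelist of the kind described above can be sketched in a few lines of Python. This is not AppArmor's profile syntax or code; the ALLOWED list, the file paths and the is_allowed function are assumptions made up for illustration. Access is granted only when the requested name matches an entry on the list.

```python
import posixpath
from fnmatch import fnmatchcase

# Hypothetical whitelist for one application.
ALLOWED = [
    "/etc/myapp/config",
    "/var/log/myapp/*.log",
]


def is_allowed(path, patterns=ALLOWED):
    # Collapse "." and ".." first so a request cannot climb out of an
    # approved directory by spelling the name differently.
    clean = posixpath.normpath(path)
    # Note: fnmatch's "*" also matches "/" characters.
    return any(fnmatchcase(clean, pattern) for pattern in patterns)


print(is_allowed("/etc/myapp/config"))                    # True
print(is_allowed("/var/log/myapp/today.log"))             # True
print(is_allowed("/etc/passwd"))                          # False
print(is_allowed("/var/log/myapp/../../../etc/passwd"))   # False
```

Because the decision rests entirely on the name, everything depends on names being a trustworthy handle for the file, which is the heart of the objection.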

Kernel maintainer Andrew Morton agrees that this fundamental objection to the pathname approach has kept AppArmor out of the kernel. "I'm not a security programmer," he says. "I don't know how to get that one unstuck."

5. System Diagnostics

Solaris has DTrace for probing what's going on in the heart of the operating system, but Linux is short of user-friendly diagnostic tools. One of the few existing tools is ptrace, which lets one process track the actions of another. But ptrace is clumsy to use and prone to error. A replacement, utrace, has now made it as far as Morton's memory management tree, one of the last hurdles before submission to Linus Torvalds. Utrace can track the behavior of a process as it's executed by a program, without some of ptrace's problems, but it still causes locking problems in the kernel. Corbet predicts inclusion is unlikely in the next kernel.

6. File Systems

The Reiser4 file system has been under consideration for addition to the kernel, which already contains 30 file systems. It's a large file management system, good at handling a large number of small files while using the minimum of disk space, according to Hans Reiser's documentation.

The file system requires a file operation to either be completed or disallowed, eliminating the hazard of files corrupted by half-completed operations. It would seem ideal for many Linux uses, but after years of debate, Reiser4 hasn't made it into the kernel. It doesn't fit well with parts of the kernel, and Reiser has dropped out as lead developer. "It will need a new champion if it is to eventually become part of mainline Linux," wrote Corbet in his forecast on its prospects earlier this month.

ZFS, Sun Microsystems' 128-bit file system, multiplies Linux's address space beyond the needs of the largest systems in use today. Parties that admire it point out that its open source code should be considered for the kernel. But its current license isn't compatible with Linux's GPL.

7. Power Management

Linux lags in power management, where Windows laptops have scored impressive gains, spurring Intel engineers, kernel developers Molnar and Thomas Gleixner, and others to push for progress. A year ago, the kernel got "tick-less idle," telling the processor to stay in an idle state when there's no work to be done. Without it, the CPU's clock would ask the kernel 1,000 times per second for something to do, chewing up electricity.
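
The saving is easy to see with rough arithmetic. The sketch below uses the 1,000-per-second figure from the paragraph above; the function names, the two-second idle period and the timer deadlines are invented for illustration. It compares how often an idle CPU is woken by a fixed tick with how often it is woken when it sleeps until the next timer that is actually due.

```python
def periodic_wakeups(idle_seconds, hz=1000):
    # With a fixed tick, the CPU is interrupted hz times every second,
    # whether or not there is work to do.
    return int(idle_seconds * hz)


def tickless_wakeups(idle_seconds, timer_deadlines):
    # Without the tick, the CPU wakes only for timers that fall due
    # inside the idle period.
    return sum(1 for deadline in timer_deadlines if 0 <= deadline < idle_seconds)


idle = 2.0                   # seconds with nothing to run
timers = [0.25, 1.5, 3.0]    # hypothetical pending timers, in seconds from now

print(periodic_wakeups(idle))           # 2000
print(tickless_wakeups(idle, timers))   # 2
```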

Dirk Hohndel, chief Linux technologist at Intel, expects more improvements to power management. But any change between the kernel and the system clock threatens many other interactions. "These things can be very difficult and take a long time," he says. "I think that's the right way to do it."

cheers Aurobindo
Courtesy: informationweek.com
