
Linux sendfile and Apache servers: How an underused feature can offset overused resources

The sendfile() system call is an underused tool that can bring surprising performance benefits to an Apache server that transfers large payloads to multiple clients.

Apache handles itself well when it comes to serving content on demand. But you can still tweak several underlying parameters to improve performance. One area that deserves special attention is how Apache handles data transfers when servicing multiple clients, especially when dealing with large payloads. What better way to eke out a few extra performance points than to take advantage of the operating system's efficient data transfer routines?

Enter the sendfile() system call.

The sendfile() system call copies data from one file descriptor (a numeric value corresponding to an open file handle) to another entirely within the Linux kernel, making more efficient use of resources than the typical read()/write() loop, which needlessly pulls the data into userspace. Routing on-disk data through userspace buffers just to hand it straight back to the kernel carries a penalty on every transfer, and the cost grows with the number of connections and the length of each transfer. A sendfile call, which is not unique to Linux, works from the kernel's cache of on-disk data instead of copying buffers back and forth between kernel and userspace, so the application needs no buffer of its own. When properly configured, this approach gives the Apache server a little more torque off the line and all the way through the power band.
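To make the contrast concrete, here is a minimal sketch in plain Python, outside Apache, that moves the same file to a socket both ways. It assumes a Linux or similar Unix host where os.sendfile() is available; the function names and the demo() helper are invented for this illustration and are not part of Apache or of any module discussed here.

```python
import os
import socket
import tempfile


def copy_with_read_write(in_fd, out_sock, size, chunk=65536):
    """Classic loop: every chunk passes through a userspace buffer."""
    done = 0
    while done < size:
        buf = os.read(in_fd, min(chunk, size - done))  # kernel -> userspace copy
        if not buf:
            break  # unexpected end of file
        out_sock.sendall(buf)  # userspace -> kernel copy
        done += len(buf)
    return done


def copy_with_sendfile(in_fd, out_sock, size):
    """Kernel-side copy: the payload never enters a userspace buffer."""
    offset = 0
    while offset < size:
        sent = os.sendfile(out_sock.fileno(), in_fd, offset, size - offset)
        if sent == 0:
            break  # unexpected end of file
        offset += sent
    return offset


def demo(copy_fn, payload=b"x" * 4096):
    """Send a temporary file through a local socket pair and verify it arrives."""
    with tempfile.NamedTemporaryFile() as tmp:
        tmp.write(payload)
        tmp.flush()
        in_fd = os.open(tmp.name, os.O_RDONLY)
        sender, receiver = socket.socketpair()
        try:
            sent = copy_fn(in_fd, sender, len(payload))
            received = b""
            while len(received) < sent:
                received += receiver.recv(65536)
        finally:
            os.close(in_fd)
            sender.close()
            receiver.close()
    return sent, received == payload
```

Calling demo(copy_with_sendfile) and demo(copy_with_read_write) should deliver identical bytes; the difference lies only in where the copying happens. The payload is kept small so it fits in the socket buffer without a separate reader thread.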

Although sendfile has been implemented in both Linux and Apache for some time now, its debut was marred by problems transferring large files (2 GB and above) and a host of other ailments. Today, the EnableSendfile directive in Apache's server configuration file is on by default through Apache 2.2; from version 2.3.9 onward the default is Off, so the directive has to be set explicitly. Apache's http_core module uses a default handler to send these files directly from disk to client. It can handle nearly any file: static markup content, plain text documents, compressed archives and images of virtually any format. Invoking this behavior from Perl or Python (using mod_perl or mod_python, respectively) is a simple matter, as shown below.

Perl Example

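use strict;
use Apache::Constants qw(:common);
use Apache::File ();

sub handler {
  my $r = shift;  # the Apache request object
  # Open the file the request maps to; refuse the request if it cannot be opened
  my $fh = Apache::File->new($r->filename) or return FORBIDDEN;
  $r->send_http_header;
  $r->send_fd($fh);  # hand the open file to Apache to deliver to the client
  close $fh;
  return OK;
}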

Python Example

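import os
from mod_python import apache

def req_sendfile(req):
  # Assumption: send the whole file that the request maps to
  filename = req.filename
  offs = 0
  size = os.path.getsize(filename)
  while size > 0:
    sent = req.sendfile(filename, offs, size)
    if sent <= 0:
      # handle error case (a zero-byte send would otherwise loop forever)
      raise OSError(5, "Error using sendfile")
    offs += sent
    size -= sent
  return apache.OK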

Normally, a request like this would require a userspace buffer and, for big transfers, a long series of alternating read() and write() calls to push the data out over the network. Each of those calls forces a context switch as data is copied between the kernel and userspace, an expensive event in terms of performance. Most file transfers require that nothing be done with the content other than passing it on to the requesting client, so copying between these two layers is wasteful.
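A rough back-of-the-envelope model shows how quickly those calls add up. The sketch below is illustrative only: it assumes the best case for both approaches (every read() fills its buffer, every sendfile() call moves as much as Linux allows in one call, 0x7ffff000 bytes), and the function names are invented for this example. Real transfers over slow or non-blocking sockets will need more calls on both sides.

```python
SENDFILE_MAX = 0x7FFFF000  # most bytes Linux moves in a single sendfile() call


def ceil_div(a, b):
    """Integer division, rounded up."""
    return (a + b - 1) // b


def read_write_cost(file_size, buf_size):
    """Best case for the loop: every read() fills the buffer completely."""
    chunks = ceil_div(file_size, buf_size)
    return {
        # one read() and one write() per chunk, plus the final read() that hits EOF
        "syscalls": 2 * chunks + 1,
        # each byte is copied into the userspace buffer and back out again
        "user_copies_bytes": 2 * file_size,
    }


def sendfile_cost(file_size):
    """Best case for sendfile(): each call moves as much as the kernel allows."""
    return {
        "syscalls": ceil_div(file_size, SENDFILE_MAX),
        "user_copies_bytes": 0,
    }


GIB = 2 ** 30
loop = read_write_cost(GIB, 64 * 1024)  # 1 GiB file, 64 KiB buffer
direct = sendfile_cost(GIB)
```

Under these assumptions, a 1 GiB file sent through a 64 KiB buffer costs 32,769 system calls and 2 GiB of copying across the kernel boundary, against a single call and no userspace copying for sendfile().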

To make better use of resources and improve performance, make sure the EnableSendfile (and possibly EnableMMAP) directive is turned on in the Apache configuration file, and use the Apache module of your choice. Your clients and your server will appreciate the resulting boost in performance.
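One quick way to see where these directives are set is to scan the configuration text for them. The sketch below is a hypothetical helper, not an Apache tool: the sample configuration, the /mnt/nfs/share path and the find_directive() name are all invented for illustration, and the scan does not follow Include files or work out which scope finally applies. The per-directory override reflects the usual advice to switch sendfile off for network-mounted file systems.

```python
import re

# Hypothetical httpd.conf excerpt used only for this example
SAMPLE_CONF = """\
# Hypothetical httpd.conf excerpt
EnableSendfile On
EnableMMAP On
<Directory "/mnt/nfs/share">
    EnableSendfile Off
</Directory>
"""


def find_directive(conf_text, name):
    """Return (line_number, value) for every uncommented use of a directive."""
    # Apache directive names are case-insensitive
    pattern = re.compile(r"^\s*" + re.escape(name) + r"\s+(\S+)", re.IGNORECASE)
    hits = []
    for lineno, line in enumerate(conf_text.splitlines(), start=1):
        match = pattern.match(line)
        if match:
            hits.append((lineno, match.group(1)))
    return hits


sendfile_settings = find_directive(SAMPLE_CONF, "EnableSendfile")
mmap_settings = find_directive(SAMPLE_CONF, "enablemmap")
```

An empty result for EnableSendfile means the server is running on its built-in default, which depends on the Apache version in use.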

Justin Korelc is a long-time Linux hacker and system administrator who concentrates on hardware and software security, virtualization and high-performance Linux systems. Ed Tittel is a full-time freelance writer based in Austin, Tex., who specializes in markup languages, information security, networking and IT certification. Justin and Ed have contributed to books on Home Theater PCs and the Linux-based MythTV environment, and they write regularly about Linux for various TomsHardware sites.
