sendfile (2)
Copyright (c) 2003, David G. Lawrence. All rights reserved.
NAME
sendfile - send a file to a socket

LIBRARY
Standard C Library (libc, -lc)

SYNOPSIS
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>

int
sendfile(int fd, int s, off_t offset, size_t nbytes,
    struct sf_hdtr *hdtr, off_t *sbytes, int flags);

DESCRIPTION
The sendfile() system call sends a regular file specified by descriptor fd out a stream socket specified by descriptor s. The offset argument specifies where to begin in the file. If offset falls beyond the end of the file, the system returns success and reports 0 bytes sent, as described below. The nbytes argument specifies how many bytes of the file should be sent, with 0 having the special meaning of send until the end of file has been reached.
An optional header and/or trailer can be sent before and after the file data by specifying a pointer to a struct sf_hdtr, which has the following definition:

	struct sf_hdtr {
		struct iovec *headers;  /* pointer to header iovecs */
		int hdr_cnt;            /* number of header iovecs */
		struct iovec *trailers; /* pointer to trailer iovecs */
		int trl_cnt;            /* number of trailer iovecs */
	};
The headers and trailers pointers, if non-NULL, point to arrays of struct iovec structures. See the writev(2) system call for information on the iovec structure. The number of iovecs in these arrays is specified by hdr_cnt and trl_cnt.
If sbytes is non-NULL, the system will write the total number of bytes sent on the socket to the variable it points to.
The Fa flags argument is a bitmap of these values:
- SF_NODISKIO
- This flag causes any sendfile() call which would block on disk I/O to instead return EBUSY. Busy servers may benefit by transferring requests that would block to a separate I/O worker thread.
- SF_MNOWAIT
- Do not wait for some kernel resource to become available, in particular, mbuf and sf_buf. The flag does not make the sendfile() system call truly non-blocking, since other resources are still allocated in a blocking fashion.
- SF_SYNC
- Sleep until the network stack no longer references the VM pages of the file, making subsequent modifications to it safe. Note that this is not a guarantee that the data has actually been sent.
When using a socket marked for non-blocking I/O, sendfile() may send fewer bytes than requested. In this case, the number of bytes successfully written is returned in *sbytes (if specified), and the error EAGAIN is returned.
IMPLEMENTATION NOTES
The FreeBSD implementation of sendfile() is "zero-copy", meaning that it has been optimized so that copying of the file data is avoided.

TUNING
On some architectures, this system call internally uses a special sendfile() buffer (struct sf_buf) to handle sending file data to the client. If the sending socket is blocking, and there are not enough sendfile() buffers available, sendfile() will block and report a state of "sfbufa". If the sending socket is non-blocking and there are not enough sendfile() buffers available, the call will fail and set errno to EAGAIN.

The number of sf_buf's allocated should be proportional to the number of nmbclusters used to send data to a client via sendfile(). Busy installations that make extensive use of sendfile() may want to increase these values to be in line with their kern.ipc.nmbclusters (see tuning(7) for details).
The number of sendfile() buffers available is determined at boot time by either the kern.ipc.nsfbufs loader.conf(5) variable or the NSFBUFS kernel configuration tunable. The number of sendfile() buffers scales with kern.maxusers. The kern.ipc.nsfbufsused and kern.ipc.nsfbufspeak read-only sysctl(8) variables show current and peak sendfile() buffer usage, respectively. These values may also be viewed through netstat -m.
If a value of zero is reported for kern.ipc.nsfbufs, your architecture does not need to use sendfile() buffers, because their task can be efficiently performed by the generic virtual memory structures.
RETURN VALUES
The sendfile() function returns the value 0 if successful; otherwise the value -1 is returned and the global variable errno is set to indicate the error.

ERRORS
- [EAGAIN]
- The socket is marked for non-blocking I/O and not all data was sent due to the socket buffer being filled. If specified, the number of bytes successfully sent will be returned in *sbytes.
- [EBADF]
- The fd argument is not a valid file descriptor.
- [EBADF]
- The s argument is not a valid socket descriptor.
- [EBUSY]
- Completing the entire transfer would have required disk I/O, so it was aborted. Partial data may have been sent. (This error can only occur when SF_NODISKIO is specified.)
- [EFAULT]
- An invalid address was specified for an argument.
- [EINTR]
- A signal interrupted sendfile() before it could be completed. If specified, the number of bytes successfully sent will be returned in *sbytes.
- [EINVAL]
- The fd argument is not a regular file.
- [EINVAL]
- The s argument is not a SOCK_STREAM type socket.
- [EINVAL]
- The offset argument is negative.
- [EIO]
- An error occurred while reading from fd.
- [ENOBUFS]
- The system was unable to allocate an internal buffer.
- [ENOTCONN]
- The s argument points to an unconnected socket.
- [ENOTSOCK]
- The s argument is not a socket.
- [EOPNOTSUPP]
- The file system for descriptor fd does not support sendfile().
- [EPIPE]
- The socket peer has closed the connection.
SEE ALSO
netstat(1), open(2), send(2), socket(2), writev(2), tuning(7)

K. Elmeleegy, A. Chanda, A. L. Cox, and W. Zwaenepoel, "A Portable Kernel Abstraction for Low-Overhead Ephemeral Mapping Management", Proceedings of the 2005 USENIX Annual Technical Conference, pp. 223-236, 2005.