DECLARE_CC_MODULE (9)
Leading comments
Copyright (c) 2008-2009 Lawrence Stewart <lstewart@FreeBSD.org> Copyright (c) 2010-2011 The FreeBSD Foundation All rights reserved. Portions of this documentation were written at the Centre for Advanced Internet Architectures, Swinburne University of Technology, Melbourne, Australia by David Hayes and Lawrence Stewart under sponsorship from the FreeBSD Foundation. Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following condi...
NAME
mod_cc, DECLARE_CC_MODULE, CCV - Modular Congestion Control
SYNOPSIS
#include <netinet/tcp.h>
#include <netinet/cc/cc.h>
#include <netinet/cc/cc_module.h>
DECLARE_CC_MODULE(ccname, ccalgo);
CCV(ccv, what);
DESCRIPTION
The mod_cc framework allows congestion control algorithms to be implemented as dynamically loadable kernel modules via the kld(4) facility. Transport protocols can select from the list of available algorithms on a connection-by-connection basis, or use the system default (see mod_cc(4) for more details).
mod_cc modules are identified by an ascii(7) name and a set of hook functions encapsulated in a struct cc_algo, which has the following members:
struct cc_algo {
	char	name[TCP_CA_NAME_MAX];
	int	(*mod_init) (void);
	int	(*mod_destroy) (void);
	int	(*cb_init) (struct cc_var *ccv);
	void	(*cb_destroy) (struct cc_var *ccv);
	void	(*conn_init) (struct cc_var *ccv);
	void	(*ack_received) (struct cc_var *ccv, uint16_t type);
	void	(*cong_signal) (struct cc_var *ccv, uint32_t type);
	void	(*post_recovery) (struct cc_var *ccv);
	void	(*after_idle) (struct cc_var *ccv);
	int	(*ctl_output)(struct cc_var *, struct sockopt *, void *);
};
The name field identifies the unique name of the algorithm, and should be no longer than TCP_CA_NAME_MAX-1 characters in length (the TCP_CA_NAME_MAX define lives in <netinet/tcp.h> for compatibility reasons).
The mod_init function is called when a new module is loaded into the system but before the registration process is complete. It should be implemented if a module needs to set up some global state prior to being available for use by new connections. Returning a non-zero value from mod_init will cause the loading of the module to fail.
The mod_destroy function is called prior to unloading an existing module from the kernel. It should be implemented if a module needs to clean up any global state before being removed from the kernel. The return value is currently ignored.
The cb_init function is called when a TCP control block (struct tcpcb) is created. It should be implemented if a module needs to allocate memory for storing private per-connection state. Returning a non-zero value from cb_init will cause the connection setup to be aborted, terminating the connection as a result.
The cb_destroy function is called when a TCP control block (struct tcpcb) is destroyed. It should be implemented if a module needs to free memory allocated in cb_init.
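The cb_init and cb_destroy pairing can be sketched in userland as follows. This is an illustration only: struct cc_var is a trimmed stand-in for the kernel structure, struct example_state and the example_* function names are hypothetical, and calloc(3)/free(3) stand in for the kernel allocator that a real module would use.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Trimmed userland stand-in for the kernel's struct cc_var. */
struct cc_var {
	void	*cc_data;	/* per-connection private state */
};

/* Hypothetical private state kept by an example algorithm. */
struct example_state {
	uint32_t	acked_bytes;
};

/* cb_init: allocate private state; a non-zero return aborts setup. */
static int
example_cb_init(struct cc_var *ccv)
{
	struct example_state *st;

	/* Assumption: a kernel module would call the kernel allocator here. */
	st = calloc(1, sizeof(*st));
	if (st == NULL)
		return (ENOMEM);
	ccv->cc_data = st;
	return (0);
}

/* cb_destroy: release whatever cb_init attached. */
static void
example_cb_destroy(struct cc_var *ccv)
{
	free(ccv->cc_data);
	ccv->cc_data = NULL;
}
```

Keeping allocation and release in this one pair of hooks means the other hooks can assume cc_data is valid for the lifetime of the control block.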
The conn_init function is called when a new connection has been established and variables are being initialised. It should be implemented to initialise congestion control algorithm variables for the newly established connection.
The ack_received function is called when a TCP acknowledgement (ACK) packet is received. Modules use the type argument as an input to their congestion management algorithms. The ACK types currently reported by the stack are CC_ACK and CC_DUPACK. CC_ACK indicates that the received ACK acknowledges previously unacknowledged data. CC_DUPACK indicates that the received ACK acknowledges data that has already been acknowledged.
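A hook that distinguishes the two ACK types might look like the sketch below. It is a userland illustration under several assumptions: the numeric values of CC_ACK and CC_DUPACK are placeholders (kernel code takes them from <netinet/cc/cc.h>), struct cc_var is trimmed to two fields, struct example_state is hypothetical, and the window growth rule is a toy policy rather than any shipped algorithm.

```c
#include <stdint.h>

/* Placeholder values for this sketch only. */
#define	CC_ACK		0x0001
#define	CC_DUPACK	0x0002

/* Hypothetical private state, reached through cc_data. */
struct example_state {
	uint32_t	cwnd;		/* toy congestion window, in bytes */
	uint32_t	dupacks;	/* consecutive duplicate ACKs seen */
};

/* Trimmed userland stand-in for the kernel's struct cc_var. */
struct cc_var {
	void	*cc_data;
	int	bytes_this_ack;
};

static void
example_ack_received(struct cc_var *ccv, uint16_t type)
{
	struct example_state *st = ccv->cc_data;

	switch (type) {
	case CC_ACK:
		/* New data acknowledged: reset the counter, grow the window. */
		st->dupacks = 0;
		if (ccv->bytes_this_ack > 0)
			st->cwnd += (uint32_t)ccv->bytes_this_ack;
		break;
	case CC_DUPACK:
		/* Nothing new acknowledged: just count it. */
		st->dupacks++;
		break;
	default:
		break;
	}
}
```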
The cong_signal function is called when a congestion event is detected by the TCP stack. Modules use the type argument as an input to their congestion management algorithms. The congestion event types currently reported by the stack are CC_ECN, CC_RTO, CC_RTO_ERR and CC_NDUPACK. CC_ECN is reported when the TCP stack receives an explicit congestion notification (RFC3168). CC_RTO is reported when the retransmission timeout timer fires. CC_RTO_ERR is reported if the retransmission timeout timer fired in error. CC_NDUPACK is reported if N duplicate ACKs have been received back-to-back, where N is the fast retransmit duplicate ACK threshold (N=3 currently as per RFC5681).
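One way to dispatch on the four event types is sketched below. This is a userland illustration, not the behaviour of any shipped module: the constant values are placeholders (kernel code takes them from <netinet/cc/cc.h>), struct example_state is hypothetical, and the halving and restore rules are a simple NewReno-like toy policy chosen for the example.

```c
#include <stdint.h>

/* Placeholder values for this sketch only. */
#define	CC_ECN		0x01u
#define	CC_RTO		0x02u
#define	CC_RTO_ERR	0x04u
#define	CC_NDUPACK	0x08u

/* Hypothetical private state, reached through cc_data. */
struct example_state {
	uint32_t	cwnd, ssthresh;		/* toy window state, bytes */
	uint32_t	saved_cwnd, saved_ssthresh;
	uint32_t	mss;
};

/* Trimmed userland stand-in for the kernel's struct cc_var. */
struct cc_var {
	void	*cc_data;
};

/* Half the current window, but never below two segments. */
static uint32_t
example_half_window(const struct example_state *st)
{
	uint32_t half = st->cwnd / 2, min_wnd = 2 * st->mss;

	return (half > min_wnd ? half : min_wnd);
}

static void
example_cong_signal(struct cc_var *ccv, uint32_t type)
{
	struct example_state *st = ccv->cc_data;

	switch (type) {
	case CC_ECN:
	case CC_NDUPACK:
		/* Mild signal: halve the window. */
		st->ssthresh = example_half_window(st);
		st->cwnd = st->ssthresh;
		break;
	case CC_RTO:
		/* Remember the old values in case the timeout was spurious. */
		st->saved_cwnd = st->cwnd;
		st->saved_ssthresh = st->ssthresh;
		st->ssthresh = example_half_window(st);
		st->cwnd = st->mss;
		break;
	case CC_RTO_ERR:
		/* The timer fired in error: undo the CC_RTO response. */
		st->cwnd = st->saved_cwnd;
		st->ssthresh = st->saved_ssthresh;
		break;
	default:
		break;
	}
}
```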
The post_recovery function is called after the TCP connection has recovered from a congestion event. It should be implemented to adjust state as required.
The after_idle function is called when data transfer resumes after an idle period. It should be implemented to adjust state as required.
The ctl_output function is called when getsockopt(2) or setsockopt(2) is called on a tcp(4) socket. It is passed the struct sockopt pointer, forwarded unmodified from the TCP control path, and a void pointer to an algorithm-specific argument.
The DECLARE_CC_MODULE() macro provides a convenient wrapper around the DECLARE_MODULE(9) macro, and is used to register a mod_cc module with the mod_cc framework. The ccname argument specifies the module's name. The ccalgo argument points to the module's struct cc_algo.
mod_cc modules must instantiate a struct cc_algo, but are only required to set the name field, and optionally any of the function pointers. The stack will skip calling any function pointer which is NULL, so there is no requirement to implement any of the function pointers. Using the C99 designated initialiser feature to set fields is encouraged.
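The designated initialiser style and the NULL-hook rule can be sketched in userland as follows. The structures here are trimmed stand-ins rather than the kernel definitions, the value given to TCP_CA_NAME_MAX is a placeholder, and example_conn_init, example_cc_algo and call_conn_init are hypothetical names; call_conn_init only mimics the skip-if-NULL behaviour described above.

```c
#include <stddef.h>

/* Placeholder; kernel code gets the real value from <netinet/tcp.h>. */
#define	TCP_CA_NAME_MAX	16

/* Trimmed userland stand-ins for the kernel structures. */
struct cc_var {
	void	*cc_data;
};

struct cc_algo {
	char	name[TCP_CA_NAME_MAX];
	int	(*mod_init)(void);
	void	(*conn_init)(struct cc_var *ccv);
	void	(*after_idle)(struct cc_var *ccv);
};

static int conn_inits;	/* counts calls, for demonstration only */

static void
example_conn_init(struct cc_var *ccv)
{
	(void)ccv;
	conn_inits++;
}

/* Only name and conn_init are set; every other hook is implicitly NULL. */
static struct cc_algo example_cc_algo = {
	.name = "example",
	.conn_init = example_conn_init,
};

/* Mimics the stack skipping a hook that was left NULL. */
static void
call_conn_init(struct cc_algo *algo, struct cc_var *ccv)
{
	if (algo->conn_init != NULL)
		algo->conn_init(ccv);
}

/*
 * In a kernel module the structure would then be registered with:
 *	DECLARE_CC_MODULE(example, &example_cc_algo);
 */
```

Because unset members of a designated initialiser are zeroed, adding a new hook to the structure later does not require touching modules that do not implement it.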
Each function pointer which deals with congestion control state is passed a pointer to a struct cc_var, which has the following members:
struct cc_var {
	void		*cc_data;
	int		bytes_this_ack;
	tcp_seq		curack;
	uint32_t	flags;
	int		type;
	union ccv_container {
		struct tcpcb		*tcp;
		struct sctp_nets	*sctp;
	} ccvc;
};
struct cc_var groups congestion control related variables into a single, embeddable structure and adds a layer of indirection to accessing transport protocol control blocks. The eventual goal is to allow a single set of mod_cc modules to be shared between all congestion aware transport protocols, though currently only tcp(4) is supported.
To aid the eventual transition towards this goal, direct use of variables from the transport protocol's data structures is strongly discouraged. However, access to some of these variables is currently unavoidable, and so the CCV() macro exists as a convenience accessor. The ccv argument points to the struct cc_var passed into the function by the mod_cc framework. The what argument specifies the name of the variable to access.
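The accessor pattern can be sketched in userland as follows. The CCV() definition here is a simplified stand-in written for this example (the real macro lives in <netinet/cc/cc.h>), struct tcpcb is reduced to two fields, snd_cwnd and snd_ssthresh are used purely as example variable names, and the after_idle policy is a toy.

```c
#include <stdint.h>

/* Stand-in control block with two example fields. */
struct tcpcb {
	uint32_t	snd_cwnd;
	uint32_t	snd_ssthresh;
};

/* Trimmed userland stand-in for the kernel's struct cc_var. */
struct cc_var {
	union ccv_container {
		struct tcpcb	*tcp;
	} ccvc;
};

/* Simplified stand-in for the accessor, for illustration only. */
#define	CCV(ccv, what)	((ccv)->ccvc.tcp->what)

/* Toy policy: after idle, clamp the window to the slow start threshold. */
static void
example_after_idle(struct cc_var *ccv)
{
	if (CCV(ccv, snd_cwnd) > CCV(ccv, snd_ssthresh))
		CCV(ccv, snd_cwnd) = CCV(ccv, snd_ssthresh);
}
```

Routing every access through the macro keeps the module source free of direct references to the control block layout, which is the indirection the text above argues for.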
Apart from the type and ccv_container fields, the remaining fields in struct cc_var are for use by mod_cc modules.
The cc_data field is available for algorithms requiring additional per-connection state to attach a dynamic memory pointer to. The memory should be allocated and attached in the module's cb_init hook function.
The bytes_this_ack field specifies the number of new bytes acknowledged by the most recently received ACK packet. It is only valid in the ack_received hook function.
The curack field specifies the sequence number of the most recently received ACK packet. It is only valid in the ack_received, cong_signal and post_recovery hook functions.
The flags field is used to pass useful information from the stack to a module. The CCF_ABC_SENTAWND flag is relevant in ack_received and is set when appropriate byte counting (RFC3465) has counted that a window's worth of bytes has been sent. It is the module's responsibility to clear the flag after it has processed the signal. The CCF_CWND_LIMITED flag is relevant in ack_received and is set when the connection's ability to send data is currently constrained by the value of the congestion window. Algorithms should not grow the congestion window when this flag is not set, to avoid accumulating a large difference between the congestion window and send window.
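The two flags might be consumed as in the following userland sketch. The flag values are placeholders (kernel code takes them from <netinet/cc/cc.h>), struct cc_var is trimmed to two fields, struct example_state is hypothetical, and growing by one segment per window is a toy policy chosen for the example.

```c
#include <stdint.h>

/* Placeholder values for this sketch only. */
#define	CCF_ABC_SENTAWND	0x0001u
#define	CCF_CWND_LIMITED	0x0002u

/* Hypothetical private state, reached through cc_data. */
struct example_state {
	uint32_t	cwnd;	/* toy congestion window, in bytes */
	uint32_t	mss;
};

/* Trimmed userland stand-in for the kernel's struct cc_var. */
struct cc_var {
	void		*cc_data;
	uint32_t	flags;
};

static void
example_ack_received(struct cc_var *ccv, uint16_t type)
{
	struct example_state *st = ccv->cc_data;

	(void)type;

	/* Not limited by cwnd: growing it would only widen the gap. */
	if ((ccv->flags & CCF_CWND_LIMITED) == 0)
		return;

	if (ccv->flags & CCF_ABC_SENTAWND) {
		/* A full window has been sent: grow by one segment. */
		st->cwnd += st->mss;
		/* The module, not the stack, clears the flag. */
		ccv->flags &= ~CCF_ABC_SENTAWND;
	}
}
```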
SEE ALSO
cc_cdg(4), cc_chd(4), cc_cubic(4), cc_hd(4), cc_htcp(4), cc_newreno(4), cc_vegas(4), mod_cc(4), tcp(4)
ACKNOWLEDGEMENTS
Development and testing of this software were made possible in part by grants from the FreeBSD Foundation and Cisco University Research Program Fund at Community Foundation Silicon Valley.
FUTURE WORK
Integrate with sctp(4).
HISTORY
The modular Congestion Control (CC) framework first appeared in FreeBSD 9.0.
The framework was first released in 2007 by James Healy and Lawrence Stewart whilst working on the NewTCP research project at Swinburne University of Technology's Centre for Advanced Internet Architectures, Melbourne, Australia, which was made possible in part by a grant from the Cisco University Research Program Fund at Community Foundation Silicon Valley. More details are available at:
AUTHORS
The mod_cc framework was written by Lawrence Stewart <lstewart@FreeBSD.org>, James Healy <jimmy@deefa.com> and David Hayes <david.hayes@ieee.org>.
This manual page was written by David Hayes <david.hayes@ieee.org> and Lawrence Stewart <lstewart@FreeBSD.org>.