Installing and Administering LAN/9000 Software

New for the HP-UX 11i Release


The 11i HP-UX release has the following transport (IP, TCP, and UDP) changes:

  • IP address subnet identifiers can now be all 1s or all 0s (RFC 1812).

  • The IP PMTU (Path Maximum Transmission Unit) discovery algorithm reverts to the HP-UX 10.20 behavior.

  • Virtual IP (VIP) addresses can now be configured for a system. A system VIP address is not tied to a specific physical interface and can be used to send and receive IP packets on any physical interface of the system.

  • TCP performance enhancements.

    TCP was enhanced to support the following features and RFCs:

    • Selective Acknowledgment* (SACK, RFC 2018)

    • Extensions for High Performance (scaled windows** and timestamps*, RFC 1323)

    • Increasing TCP's Initial Window (RFC 2414)

    • TCP Congestion Control (RFC 2581) and NewReno Modification for Fast Recovery (response to partial acknowledgments, RFC 2582)

    • Socket structure caching

  • TCP FIN WAIT 2 timer*.

  • System-wide limits for TCP and UDP socket buffer sizes.

  • OLAR

Many of the above changes are controlled by new ndd parameters that are not documented in the online help text for ndd shipped with 11i. A summary of the new ndd parameters and parameter changes is provided at the end of this article and this information will be included in a later version of ndd.

*By default, many of the TCP performance enhancements will be used only if the remote system initiates the use of them. Refer to the subsections below for information on how to configure the system to initiate the use of these enhancements.

**A scaling factor of 1 (same as no scaling) is used unless the application has a receive buffer greater than 2**16. Refer to the "Scaled Windows" section below for more information.

IP Subnet Mask

The ifconfig subnet mask default now allows all 1s or all 0s in the masked part of the subnet field. This provides up to twice as many IP addresses as before.

The subnet field (the portion of an IP address that identifies the subnet beyond the network portion of the address) can now be all 0s or all 1s, as described in RFC 1878. For example, a class A IP address used with the mask 255.192.0.0 (0xffc00000) has a two-bit subnet field.

ifconfig can now assign the following IP address and subnet mask to an interface, even though the subnet field (subnet portion of the address) is all ones:

IP address:   15.192.1.1
Subnet mask: 255.192.0.0 (0xffc00000)

In binary:

00001111 11 000000 00000001 00000001
11111111 11 000000 00000000 00000000

ifconfig can now also assign the following IP address and subnet mask to an interface, even though the subnet field is all zeroes:

IP address:   15.1.1.1
Subnet mask: 255.192.0.0 (0xffc00000)

In binary:

00001111 00 000001 00000001 00000001
11111111 11 000000 00000000 00000000

To disallow subnet fields with all ones or all zeroes (revert to RFC 1122 behavior), set the ndd parameter ip_check_subnet_addr to 1 in the nddconf file (/etc/rc.config.d/nddconf).
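
The two examples above can be checked mechanically. The following is a minimal Python sketch, not part of HP-UX: the function names `subnet_field` and `classify` are hypothetical, and the code assumes classful addressing (class A, B, or C) and a contiguous subnet mask.

```python
import ipaddress

def subnet_field(address, mask):
    """Return (value, width) of the subnet field of a classful IPv4 address."""
    addr = int(ipaddress.IPv4Address(address))
    mask_int = int(ipaddress.IPv4Address(mask))
    first_octet = addr >> 24
    # Classful network prefix length: class A = 8, class B = 16, class C = 24.
    if first_octet < 128:
        net_bits = 8
    elif first_octet < 192:
        net_bits = 16
    elif first_octet < 224:
        net_bits = 24
    else:
        raise ValueError("class D/E addresses have no subnet field")
    prefix_len = bin(mask_int).count("1")  # assumes a contiguous mask
    width = prefix_len - net_bits
    if width <= 0:
        raise ValueError("mask does not extend beyond the network portion")
    # Shift away the host bits, then keep only the subnet bits.
    value = (addr >> (32 - prefix_len)) & ((1 << width) - 1)
    return value, width

def classify(address, mask):
    """Describe the subnet field as 'all ones', 'all zeroes', or 'mixed'."""
    value, width = subnet_field(address, mask)
    if value == 0:
        return "all zeroes"
    if value == (1 << width) - 1:
        return "all ones"
    return "mixed"

print(classify("15.192.1.1", "255.192.0.0"))  # all ones
print(classify("15.1.1.1", "255.192.0.0"))    # all zeroes
```

Addresses classified as "all ones" or "all zeroes" are the ones that would be rejected when ip_check_subnet_addr is set to 1.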

IP PMTU Discovery

The value 2 for the ndd parameter ip_pmtu_strategy is no longer supported. This was previously the default value for this parameter. The new default value for ip_pmtu_strategy is 1, which causes the system to use the HP-UX 10.20 IP PMTU behavior. If the nddconf file has the value 2 for this parameter, the new default value (1) will be used.

A description of the IP PMTU strategy is provided in the ndd online help facility.

Virtual IP (VIP) Addresses

Systems can have Virtual IP (VIP) addresses--addresses that are not permanently assigned to a single, specific physical interface. The system will accept a packet addressed to its VIP (or VIPs) regardless of the physical interface on which it was received. This allows a system to have a "system IP" address that is available as long as one interface stays usable.

To configure VIPs, associate the VIP address with a secondary loopback interface (lo0:n, where n is 1 or greater, such as lo0:1). The VIP address does not have to be in the same subnet (or network) of the addresses used for the physical interfaces.

In the example below, the system has two LAN interfaces. One is attached to the 15.n.n.n network and has the address 15.1.1.1. The second LAN is attached to the 16.n.n.n network and has the address 16.1.1.1. The VIP address is 17.1.1.1.

Figure 1: System with two LAN interfaces (15.1.1.1 and 16.1.1.1) and the VIP address 17.1.1.1 (image not available)

Note that the infrastructure of the network (routers, switches) must allow IP packets with the address 17.1.1.1 to be properly routed to this system's interfaces on the 15.n.n.n and 16.n.n.n networks for this configuration to be useful.

/etc/rc.config.d/netconf file statements for the above VIP:

INTERFACE_NAME[2]=lo0:1
IP_ADDRESS[2]=17.1.1.1
:
:

ifconfig command for the above VIP:

ifconfig lo0:1 inet 17.1.1.1

Note that you cannot assign VIPs to the primary loopback interface (lo0, also written lo0:0).

TCP Performance Enhancements

  • TCP Selective Acknowledgment (SACK, RFC 2018)

    TCP selective acknowledgment (SACK) defines a new TCP option that allows recipients to selectively acknowledge out-of-sequence data. The TCP sender can then retransmit only the lost segments and adjust its send window to reflect the actual amount of received data.

    The use of TCP SACK is controlled by the ndd parameter tcp_sack_enable.

    Supported parameter values are:

    0: Local system never uses SACK.

    1: Local system sends the SACK option in the SYN packet (initiates the use of SACK).

    2: Local system enables SACK only if the remote system negotiates the use of SACK in its SYN packet (default).

    Negotiation of the use of SACK is done by sending TCP SYN packets with an Option Kind value of 4 to indicate that the system can receive (and process) SACKs.

    TCP packets with SACK information will have an Option Kind value of 5.
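
The effect of the three tcp_sack_enable values on a connection between two systems can be modeled as follows. This is an illustrative Python sketch only: the function name `sack_negotiated` is hypothetical, and it assumes that a system set to 1 also accepts SACK when the remote system offers it, which the text above does not state explicitly.

```python
NEVER, INITIATE, ALLOW = 0, 1, 2  # tcp_sack_enable values described above

def sack_negotiated(initiator_setting, responder_setting):
    """Return True if SACK ends up in use on the connection.

    initiator_setting: tcp_sack_enable on the system that sends the first SYN.
    responder_setting: tcp_sack_enable on the system that answers with SYN ACK.
    """
    # Only a system set to 1 puts the SACK option in its own SYN.
    offered = initiator_setting == INITIATE
    # Assumption: values 1 and 2 both accept an offer made by the remote system.
    accepted = responder_setting in (INITIATE, ALLOW)
    return offered and accepted

# Two systems left at the default (2) never use SACK with each other,
# because neither side initiates it.
print(sack_negotiated(ALLOW, ALLOW))     # False
print(sack_negotiated(INITIATE, ALLOW))  # True
```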

  • Scaled Windows and Timestamps (RFC 1323)

    RFC 1323 (TCP Extensions for High Performance) specifies TCP extensions intended to benefit high-speed networks and networks with both large bandwidths and long delays, such as high capacity satellite networks or long-distance fiber-optic networks.

    Scaled windows and timestamps were available beginning in release 10.30, but these features were not documented, and the ndd parameters to disable them were not supported.

    Scaled Windows

    Window scaling increases the maximum window size from 2**16 bytes (64 KB) to 2**32 bytes by allowing a scale factor (multiplier) to be applied to the window field.

    TCP will always offer to use Window Scaling in SYN and SYN ACK packets. The scale factor will be 1 (window size remains 2**16) unless the application is using a receive buffer that is larger than 2**16.

    Window Scale options have the Option Kind value of 3.
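
As a rough illustration of how a scale factor relates to the receive buffer size, the Python sketch below picks the smallest multiplier that lets the 16-bit window field cover a given buffer. It follows RFC 1323, which encodes the multiplier as a power of two with a shift count of at most 14; the function name `window_shift` is hypothetical, and the algorithm HP-UX actually uses is not described in this document.

```python
MAX_UNSCALED_WINDOW = 0xFFFF  # largest value the 16-bit window field can hold
MAX_SHIFT = 14                # largest shift count permitted by RFC 1323

def window_shift(recv_buffer_bytes):
    """Return the smallest shift count whose scaled window covers the buffer."""
    shift = 0
    while shift < MAX_SHIFT and (MAX_UNSCALED_WINDOW << shift) < recv_buffer_bytes:
        shift += 1
    return shift

for size in (32768, 65536, 262144):
    shift = window_shift(size)
    # The scale factor (multiplier) is 2**shift; a shift of 0 means a factor of 1.
    print(size, "bytes -> scale factor", 2 ** shift)
```

A buffer of 32768 bytes yields a scale factor of 1 (no scaling), while buffers above 2**16 bytes need a larger factor.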

    Timestamps

    RFC 1323 defines a timestamps option that can be sent with every segment. The timestamps in the option are used for two purposes:

    • More accurate RTTM (Round Trip Time Measurement), that is, measurement of the interval between the time a TCP segment is sent and the time the returning acknowledgment arrives.

    • PAWS (Protect Against Wrapped Sequences) on very high-speed networks. On connections with large transmission rates where the sequence number may wrap, the timestamps are used to detect old packets.

    The use of timestamps is controlled by the ndd parameter tcp_ts_enable.

    Supported parameter values are:

    0: Never timestamp

    1: Always initiate

    2: Allow but don't initiate (Default)

    Use of timestamps is requested by the initiator of a TCP connection by sending a timestamps option (Option Kind 8) in the initial TCP SYN packet.
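
The Option Kind values mentioned above (3 for Window Scale, 4 and 5 for SACK, 8 for timestamps) can be recognized in a captured TCP header with a small parser. The Python sketch below is illustrative only: the name `parse_tcp_options` and the sample bytes are made up for this example, and the code assumes the standard kind/length/value option layout, with kind 0 ending the list and kind 1 used as one-byte padding.

```python
OPTION_NAMES = {
    3: "window scale",
    4: "SACK permitted",
    5: "SACK",
    8: "timestamps",
}

def parse_tcp_options(raw):
    """Split the options area of a TCP header into (kind, name, data) tuples."""
    options = []
    i = 0
    while i < len(raw):
        kind = raw[i]
        if kind == 0:   # end of option list
            break
        if kind == 1:   # no-operation padding, a single byte
            i += 1
            continue
        if i + 1 >= len(raw):
            raise ValueError("truncated option")
        length = raw[i + 1]  # length covers the kind and length bytes too
        if length < 2 or i + length > len(raw):
            raise ValueError("bad option length")
        data = raw[i + 2:i + length]
        options.append((kind, OPTION_NAMES.get(kind, "other"), data))
        i += length
    return options

# Hypothetical SYN options: MSS 1460, SACK permitted, timestamps, NOP, window scale.
sample = bytes.fromhex("020405b4" "0402" "080a0000000100000000" "01" "030302")
for kind, name, data in parse_tcp_options(sample):
    print(kind, name, data.hex())
```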

  • Initial TCP Congestion Window

    RFC 2414 defines a formula for calculating the sender's initial congestion window that usually results in a larger window than in previous releases.

    The default initial congestion window is now calculated using the following formula:

    min((4 * MSS), max(2 * MSS, 4380))

    where MSS is the maximum segment size for the underlying link.

    With the new congestion window formula, it is possible for TCP to send a large, initial block of data without waiting for acknowledgments. This is useful in networks with large bandwidth and low error rates and particularly useful for short-lived connections that only need to send ~4Kbytes of data or less.

    To modify the initial congestion window, configure the ndd TCP parameter tcp_cwnd_init. TCP will calculate the initial congestion window using the following formula:

    min((tcp_cwnd_init * MSS), max(2 * MSS, 4380))

    Default: 4 (TCP implements RFC 2414)

    Range: 1 - 4

    In prior releases, the initial TCP congestion window was (2 * MSS).
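
The formula above can be evaluated for a few segment sizes to see its effect. This Python sketch simply restates the formula; the function name `initial_cwnd` and the sample MSS values are chosen for illustration.

```python
def initial_cwnd(mss, tcp_cwnd_init=4):
    """Initial congestion window in bytes, per the formula given above."""
    if not 1 <= tcp_cwnd_init <= 4:
        raise ValueError("tcp_cwnd_init must be in the range 1 - 4")
    return min(tcp_cwnd_init * mss, max(2 * mss, 4380))

for mss in (536, 1460, 4352):
    # Compare the default (tcp_cwnd_init = 4) with the prior-release value 2 * MSS.
    print(mss, initial_cwnd(mss), 2 * mss)
```

With an MSS of 1460 bytes the default yields 4380 bytes (three segments) instead of the 2920 bytes used in prior releases; with an MSS of 4352 bytes the result equals 2 * MSS, so nothing changes.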

  • TCP Congestion Control (RFC 2581) and Fast Recovery (RFC 2582)

    RFC 2581 and 2582 may improve TCP performance, particularly in congested networks or networks with errors. These RFCs specify refinements to the TCP protocol for setting transmission windows in congested networks and recovering from lost packets.

    RFC 2581 specifies new TCP congestion control algorithms for Slow Start, Congestion Avoidance, Fast Retransmit and Fast Recovery, including provisions for TCP SACK (Selective Acknowledgments).

    RFC 2582 specifies refinements to the Fast Recovery algorithm (the NewReno modification) that govern how the sender responds to partial acknowledgments and adjusts its send window when SACK is not used.

    No configuration is needed to enable the implementation of RFC 2581 or RFC 2582.

  • TCP Socket Structure Caching

    The ndd TCP parameter tcp_conn_strategy is used to enable socket caching. Its value determines how many cached data structures for BSD TCP sockets the system keeps. Caching can improve performance considerably on systems that handle many short-lived connections.

    The default value of 0 (zero) disables the feature. Any value from 1 through 512 results in the minimum cache size of 512. A value above 512 is used as specified.

    This feature was available but undocumented in HP-UX 11.00.
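
The mapping from the tcp_conn_strategy setting to the resulting cache size, as described above, can be written out as a short Python sketch. The function name `cached_socket_structures` is hypothetical and the code only restates the rule in the text.

```python
MIN_CACHE_SIZE = 512  # smallest cache size used once the feature is enabled

def cached_socket_structures(tcp_conn_strategy):
    """Return the number of cached socket structures for a given setting."""
    if tcp_conn_strategy < 0:
        raise ValueError("tcp_conn_strategy cannot be negative")
    if tcp_conn_strategy == 0:
        return 0  # feature disabled (the default)
    # Values 1..512 are raised to the minimum; larger values are used as given.
    return max(tcp_conn_strategy, MIN_CACHE_SIZE)

for setting in (0, 1, 512, 1000):
    print(setting, "->", cached_socket_structures(setting))
```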

TCP FIN WAIT 2 Timer

The ndd TCP parameter tcp_fin_wait_2 determines how long a TCP connection will be in FIN_WAIT_2.

Normally one end of a connection initiates the close of its end of the connection (indicates that it has no more data to send) by sending a FIN. When the remote TCP acknowledges the FIN, TCP goes to the FIN_WAIT_2 state and will remain in that state until the remote TCP sends a FIN.

If the FIN_WAIT_2 timer is used, TCP will close the connection when it has remained in the FIN_WAIT_2 state for the length of the timer value.

The FIN_WAIT_2 timer must be used with caution because the remote is still allowed to send data while TCP is in the FIN_WAIT_2 state. In addition, if the remote TCP would otherwise terminate normally (it is neither hung nor terminating abnormally), closing the connection because of the FIN_WAIT_2 timer may close it prematurely.

Data may be lost if the remote sends a window update or FIN after the local TCP has closed the connection. In this situation, the local TCP will send a RESET. According to the TCP protocol specification, the remote TCP should flush its receive queue when it receives the RESET. This may cause data to be lost.

Default: 0 (indefinite)

Range: 0 - 2147483647

Units: Milliseconds
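
Because the timer is specified in milliseconds, it is easy to be off by a factor of 1000 when setting it. The Python sketch below converts a value in seconds and checks it against the range above. The helper names are hypothetical, and the generated command line assumes the usual `ndd -set <device> <parameter> <value>` form with /dev/tcp as the device, which this document does not show.

```python
MAX_TIMER_MS = 2147483647  # upper end of the documented range

def fin_wait_2_ms(seconds):
    """Convert a timeout in seconds to a valid tcp_fin_wait_2 value."""
    ms = int(round(seconds * 1000))
    if not 0 <= ms <= MAX_TIMER_MS:
        raise ValueError("timer value out of range: %d ms" % ms)
    return ms

def ndd_set_command(ms):
    """Build the command line (assumed syntax) that would apply the value."""
    return "ndd -set /dev/tcp tcp_fin_wait_2 %d" % ms

# A 10-minute timeout expressed in milliseconds.
print(ndd_set_command(fin_wait_2_ms(600)))
```

A value of 0 keeps the default behavior, in which the connection stays in FIN_WAIT_2 indefinitely.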

System-wide Limits for TCP and UDP Buffer Sizes

System administrators can now set system-wide limits on TCP send and receive buffers and UDP receive buffers. This can prevent situations where processes consume excessive amounts of memory by requesting large send or receive buffers and then filling them, either by not reading data from the socket or by sending large amounts of data. This feature was added to 11.00 to solve a system hang problem.

The ndd parameters for setting buffer size limits are:

tcp_xmit_hiwater_max
tcp_recv_hiwater_max
udp_recv_hiwater_max

The default and maximum value for each of these parameters is 2147483647 bytes.

tcp_xmit_hiwater_max limits the send buffer size for TCP sockets or communication endpoints specified in a SO_SNDBUF option of a setsockopt() call or XTI_SNDBUF option in a t_optmgmt() call.

tcp_recv_hiwater_max and udp_recv_hiwater_max limit the receive buffer size for TCP and UDP sockets or communication endpoints specified in a SO_RCVBUF option of a setsockopt() call or XTI_RCVBUF option in a t_optmgmt() call.

A setsockopt() call with a SO_SNDBUF or SO_RCVBUF option that exceeds the corresponding kernel parameter value will fail and return the errno value EINVAL.

A t_optmgmt() call with an XTI_SNDBUF or XTI_RCVBUF option that exceeds the corresponding kernel parameter value will fail and return the t_errno value TBADOPT.
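
An application that requests large buffers may want to handle the EINVAL failure described above rather than abort. The Python sketch below shows one possible approach using the standard socket module: it retries with progressively smaller sizes. The function names and the halving strategy are illustrative choices, not something this document prescribes, and other operating systems may silently adjust an oversized request instead of failing.

```python
import errno
import socket

def request_recv_buffer(sock, size):
    """Try to set SO_RCVBUF; return the resulting size, or None on EINVAL."""
    try:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, size)
    except OSError as exc:
        if exc.errno == errno.EINVAL:
            return None  # request exceeded the system-wide limit
        raise
    # Report what the system actually configured for this socket.
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)

def largest_accepted(sock, wanted, floor=4096):
    """Halve the request until the system accepts it or the floor is reached."""
    size = wanted
    while size >= floor:
        effective = request_recv_buffer(sock, size)
        if effective is not None:
            return size, effective
        size //= 2
    return None

if __name__ == "__main__":
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        print(largest_accepted(s, 1 << 20))
```

The same pattern applies to SO_SNDBUF, which is limited by tcp_xmit_hiwater_max.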

OLAR - Online Addition and Replacement

This manual does not contain the procedures for adding and replacing PCI cards using OLA/R (On Line Addition and Replacement). OLA/R is the ability to add a PCI I/O card to, or replace one in, an HP-UX computer system designed to support this feature, without completely shutting down and rebooting the system and without affecting other system components.

If you want to use the OLAR feature that your system provides, refer to the appropriate chapter in Configuring HP-UX for Peripherals for HP 9000 Computers (Part Number B2355-90698).

© 2002 Hewlett-Packard Development Company, L.P.