Introduction
This is in reference to the following article:
http://www.codeproject.com/Articles/85602/PortQry-Implementation-using-TcpClient-Socket-and
It's been a few years since I looked at this, and I recently received a notification that someone had posted a response. First, I would like to agree with emilio_grv's response: application programmers should be very careful about handling timeouts within the application. As with any application development, make sure you clean up unused resources as soon as possible, especially sockets, or you will exhaust the source ports available for new connections. In Windows environments, the default number of such ports is 3977.
Background
Because a connection timeout is not a parameter of the TcpClient.BeginConnect() or TcpClient.Connect() functions, an issue arises for those who have large-scale processes that must be accomplished in a timely fashion.
With the use of firewalls in the network we make a compromise: we trade visibility for security. Once a packet has to pass a firewall, we no longer get a response from the TCP stack at the far end, and sometimes not even an ICMP message, whether there is a problem or not. Even more frustrating, we may have no choice in the matter, as another group or organization could be managing the firewalls, so a policy change can be difficult, if not impossible.
Layer 4 (TCP Session) Visibility Tools
There used to be an easy way to check for port availability programmatically. With tools like L4Trace and L4Ping, the programmers used RAW sockets to create TCP headers with empty payloads, attempt to establish a session, time out the connection early, and terminate the session with a FIN. This allowed in-depth analysis of a path where firewalls were in place: by sending TCP session establishment requests with incrementing TTLs and noting the last hop to respond, you could see who was blocking the port.
Enter the Black Hats
Various malware manufacturers and other ingenious sorts used this capability to SYN flood hosts in an attempt to break into them, or to manufacture malformed TCP session conditions that crash the server or overrun buffers. This resulted in Microsoft restricting RAW sockets in Windows XP Service Pack 2, so that TCP data can no longer be sent over them, hence breaking L4Trace and L4Ping as well as other very useful tools.
Response to Concerns of Dangling Connections
The code provided (in the previously noted article) does terminate the connection through the API provided by the Windows TCP/IP stack when the specified timeout period expires. It's not ideal that this should even have to be done. Microsoft, IMHO, should have provided two parameters in the TcpClient.BeginConnect() function: the first to set the timeout (noted x in the following flow chart), and the second to set the number of retries before calling the connection dead. At the very least, poor or malicious coding would result in the elimination of available sockets on the client system, as the socket would not be released (it stays locked) until the 30s TCP cleanup period expires after the attempted disposal of a TcpClient or a call to EndConnect().
Unfortunately, we don't have that luxury, and more importantly, within the Windows infrastructure there is (as far as I've seen) no way to even generate your own TCP packet without reimplementing all of WinSock, as RAW sockets are no longer available. With Windows 7 this becomes even more complicated, as you are required to build a kernel driver in order to interface with the network card or with packets before WinSock processes them.
Current Flow Diagram showing WinSock TCP Establishment:
AVAILABLE TCP PORT          UNAVAILABLE TCP PORT        FIREWALLED TCP PORT
TcpClient.BeginConnect()    TcpClient.BeginConnect()    TcpClient.BeginConnect()
CLIENT..............SERVER  CLIENT..............SERVER  CLIENT..............SERVER
SYN --------------------->  SYN --------------------->  SYN --------------------->
<----------------- SYN-ACK  <----------------- RST-ACK  ......WAIT 3 SECONDS......
ACK --------------------->  SYN --------------------->
                            <----------------- RST-ACK  SYN --------------------->
..x*3 SECONDS TRANSPIRED..  SYN --------------------->  ......WAIT 6 SECONDS......
                            <----------------- RST-ACK
                                                        SYN --------------------->
..CONNECTION ESTABLISHED..  ..x*6 SECONDS TRANSPIRED..  .....WAIT 12 SECONDS......
OS Triggers Async Callback  OS Triggers Async Callback  OS Triggers Async Callback
TcpClient.EndConnect()
FIN --------------------->
<----------------- FIN-ACK
ACK --------------------->
OS Triggers Async Callback
...WAIT 30s FOR CLEANUP...  ...WAIT 30s FOR CLEANUP...  ...WAIT 30s FOR CLEANUP...
TcpClient.Dispose()         TcpClient.Dispose()         TcpClient.Dispose()
Local and Foreign TCP Port  Local and Foreign TCP Port  .Local TCP Port freed for..
are available again for     are available again for     ......reuse by system......
use.                        use.
Dissemination
With a perfect path (no data loss) and a 3-second RTT (which is all TCP accounts for, and really the upper bound within which TCP works effectively), the turnaround time in an application is 9 seconds if the port is up and available and we simply terminate the connection once it is established.
connect-time
(x*3) = ?, (3*3) = 9 seconds
When the port is unavailable (no application is attached to the port at the server) but unfiltered by a firewall, we receive the TCP resets back, giving us a response within 18 seconds.
connect-reset-time
(x*6) = ?, (3*6) = 18 seconds
When the port is unavailable, there is a route to the server in the routing table, the router hosting the server's subnet has an ARP entry to translate the IP to a MAC address, and a firewall sits between the client and server, the total time to return a connection failure to the application is 21 seconds.
tcp-timeout + (tcp-timeout*(2^1)) + (tcp-timeout*(2^2))
(3) + (6) + (12) = 21 seconds
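The filtered-port figure is a doubling backoff summed over the connection attempts. A minimal Python sketch of that arithmetic, assuming the 3-second initial timeout and three attempts used above (the function name `syn_wait_schedule` is made up for illustration and is not part of WinSock or .NET):

```python
def syn_wait_schedule(initial_timeout, attempts):
    """Return the wait after each unanswered SYN, doubling on every attempt."""
    return [initial_timeout * 2 ** n for n in range(attempts)]


waits = syn_wait_schedule(3, 3)
print(waits)       # [3, 6, 12]
print(sum(waits))  # 21
```

With a single attempt the schedule collapses to one wait of `initial_timeout` seconds, which is why cutting retries shortens the filtered case so sharply.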
Looking at the other extreme, let's assume a 4 millisecond turnaround:
Available TCP Port:(0.004*3) + (0.004*3) = 24ms
Unavailable TCP Port:(0.004*6) = 24ms
Filtered TCP Port:(3) + (6) + (12) = 21 seconds
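The comparison above can be sketched as a small Python function. It only restates the article's model (x per packet leg, three SYN/RST-ACK exchanges for a refused port, and a fixed retry schedule for a filtered one); the function name and dictionary keys are illustrative assumptions.

```python
def turnaround_seconds(leg_time, timeout=3, attempts=3):
    """Estimate time to a result for each outcome in the article's model.

    leg_time is the per-packet figure the article calls x, in seconds.
    """
    return {
        # SYN, SYN-ACK, ACK
        "connect": leg_time * 3,
        # the three legs above plus FIN, FIN-ACK, ACK
        "connect_and_close": leg_time * 3 + leg_time * 3,
        # three SYN / RST-ACK exchanges
        "reset": leg_time * 6,
        # no replies at all, so only the local retry timers matter
        "filtered": sum(timeout * 2 ** n for n in range(attempts)),
    }


for x in (3, 0.004):
    print(x, turnaround_seconds(x))
```

The point the numbers make is that the filtered case does not depend on `leg_time` at all: however fast the network is, a silently dropped SYN costs the full retry schedule.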
Application
In today's day and age, a slow response is a 700-900ms RTT, with satellite links being the slowest due to the physical distance the signal has to travel from earth to geostationary orbit and back.
I ran a report on our network and 96% of the host-latency (RTT) from the management system was within 60ms, with 92% less than 40ms, and 99% less than 100ms. We have over 2M addresses in use within the network of the 19M addresses available within RFC1918 (Private) space.
Say I were to poll every single device on the 25 most common ports using the standard TCP timeouts defined within WinSock, from a multithreaded application on a system that can establish 3900 TCP connections at once (WinSock's default dynamic port limit, rounded down), and say we average the latency at 40ms for hosts that respond. Let's also say that only 20% of the ports on active hosts are accessible due to firewall policy, and that 60% of the non-responsive endpoints are not behind a firewall.
Parameters:
RFC1918_HOSTS:    = | 19,000,000 | Total RFC1918 Addressing |
ACTIVE_HOSTS:     = |  2,000,000 | Active Hosts |
SOCKETS_POLLED:   = |         25 | Polled Ports per Host |
%_RESP_SOCKETS:   = |        20% | Responsive Ports on Active Host |
LATENCY:          = |         40 | Milliseconds, Host Majority Latency |
TIMEOUT:          = |          3 | WinSock TCP Timeout Default (seconds) |
RETRIES:          = |          3 | WinSock TCP Retry Default |
PORTS_AVAILABLE:  = |       3900 | WinSock Open Dynamic Source TCP Connections Default (Rounded Down) |
UNFILTERED_HOSTS: = |        60% | Percentage of Non-Filtered Unused Endpoints |

Derived counts:
DEAD_HOSTS:
(RFC1918_HOSTS - ACTIVE_HOSTS) = | 17,000,000 | Unused Addressing |
RESPONSIVE_SOCKETS:
(ACTIVE_HOSTS * SOCKETS_POLLED * %_RESP_SOCKETS) = | 10,000,000 | Responsive Ports |
UNRESPONSIVE_SOCKETS:
(ACTIVE_HOSTS * SOCKETS_POLLED * (100% - %_RESP_SOCKETS)) + (DEAD_HOSTS * SOCKETS_POLLED) = | 465,000,000 | Unresponsive Ports |

In the formulas below, S(f, a, b, N) is the sum of f for N running from a to b.

Poll time with the WinSock defaults (TIMEOUT = 3, RETRIES = 3):
RESPONSIVE_SOCKETS_TIME:
(RESPONSIVE_SOCKETS / PORTS_AVAILABLE) * ((LATENCY / 1000) * 3) = | 308 | Time in Seconds to Poll Responsive Endpoints |
RESET_SOCKETS_TIME:
(UNRESPONSIVE_SOCKETS / PORTS_AVAILABLE * UNFILTERED_HOSTS) * ((LATENCY / 1000) * 6) = | 17,169 | Time in Seconds to Poll Reset-Responsive Hosts |
UNRESPONSIVE_SOCKETS_TIME:
(UNRESPONSIVE_SOCKETS / PORTS_AVAILABLE * (100% - UNFILTERED_HOSTS)) * S(TIMEOUT * 2^(N), 0, RETRIES - 1, N) = | 1,001,538 | Time in Seconds to Poll Unresponsive (Filtered) Endpoints |
Total:
| 1,019,015 | seconds |
|    16,984 | minutes |
|       283 | hours |
|        12 | days |

Poll time with a single attempt per connection (TIMEOUT = 3, RETRIES = 1):
RESPONSIVE_SOCKETS_TIME:
(RESPONSIVE_SOCKETS / PORTS_AVAILABLE) * ((LATENCY / 1000) * 3) = | 308 | Time in Seconds to Poll Responsive Endpoints |
RESET_SOCKETS_TIME:
(UNRESPONSIVE_SOCKETS / PORTS_AVAILABLE * UNFILTERED_HOSTS) * ((LATENCY / 1000) * RETRIES * 2) = | 5,723 | Time in Seconds to Poll Reset-Responsive Hosts |
UNRESPONSIVE_SOCKETS_TIME:
(UNRESPONSIVE_SOCKETS / PORTS_AVAILABLE * (100% - UNFILTERED_HOSTS)) * S(TIMEOUT * 2^(N), 0, RETRIES - 1, N) = | 143,077 | Time in Seconds to Poll Unresponsive (Filtered) Endpoints |
Total:
| 149,108 | seconds |
|   2,485 | minutes |
|      41 | hours |
|       2 | days |
As depicted above, the first set of calculations shows the time it would take to perform a basic TCP session poll of the entire address space using the previously defined parameters with WinSock's defaults for timeout and retries. By simply dropping the retries, so that each connection gets a single attempt with a 3-second timeout, we can cut the time it takes to poll every device within RFC1918 space from 12 days to only 2.
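The two totals can be reproduced with a short Python sketch. The formulas below are the ones that yield the tabulated figures: unresponsive ports include every port on the unused addresses, the reset case applies to the unfiltered share of them, and the filtered case to the remainder. The second total corresponds to a single attempt with a 3-second timeout. The function and parameter names are illustrative assumptions, not anything from WinSock.

```python
def poll_time_seconds(timeout, attempts,
                      rfc1918_hosts=19_000_000, active_hosts=2_000_000,
                      ports_per_host=25, responsive_share=0.20,
                      latency_ms=40, ports_available=3900,
                      unfiltered_share=0.60):
    """Reproduce the article's poll-time estimate, in seconds per category."""
    latency = latency_ms / 1000
    dead_hosts = rfc1918_hosts - active_hosts
    responsive = active_hosts * ports_per_host * responsive_share
    unresponsive = (active_hosts * ports_per_host * (1 - responsive_share)
                    + dead_hosts * ports_per_host)
    # Dividing by ports_available models polling in batches that each use
    # every available source port at once.
    responsive_time = responsive / ports_available * latency * 3
    reset_time = (unresponsive / ports_available * unfiltered_share
                  * latency * attempts * 2)
    filtered_time = (unresponsive / ports_available * (1 - unfiltered_share)
                     * sum(timeout * 2 ** n for n in range(attempts)))
    total = responsive_time + reset_time + filtered_time
    return {"responsive": responsive_time, "reset": reset_time,
            "filtered": filtered_time, "total": total}


for label, result in (("defaults", poll_time_seconds(3, 3)),
                      ("single attempt", poll_time_seconds(3, 1))):
    days = result["total"] / 86400
    print(f"{label}: {result['total']:,.0f} seconds, about {days:.1f} days")
```

Almost all of the time in either run comes from the filtered endpoints, so the timeout and retry count dominate the result far more than network latency does.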
I personally believe this provides plenty of merit for the ability either to adjust the TCP parameters within the TcpClient.BeginConnect and TcpClient.Connect functions, or to implement an exterior timing mechanism that tracks and terminates stale sessions at the application layer.
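To illustrate what an application-chosen connect timeout looks like, here is a minimal sketch in Python rather than .NET, since Python's socket module lets the caller set the limit directly. It is not the code from the referenced article; the function name `probe_tcp_port`, the result labels, and the IPv4-only socket are assumptions made for the example.

```python
import socket


def probe_tcp_port(host, port, timeout=2.0):
    """Attempt one TCP connect and classify the outcome.

    Returns "open", "closed" (connection refused), "filtered" (no answer
    within `timeout` seconds) or "error" for anything else, such as an
    unreachable network or a name that does not resolve.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)  # application-chosen limit for connect()
    try:
        sock.connect((host, port))
        return "open"
    except ConnectionRefusedError:
        return "closed"
    except socket.timeout:
        return "filtered"
    except OSError:
        return "error"
    finally:
        sock.close()  # release the socket promptly, whatever the outcome


if __name__ == "__main__":
    print(probe_tcp_port("127.0.0.1", 80, timeout=1.0))
```

Closing the socket in the `finally` clause reflects the earlier point about cleanup: every probe hands its source port back as soon as the result is known, instead of leaving it to a later disposal.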
I hope this was of some value to those reading. And I appreciate the response by emilio_grv. It gave me the opportunity to go back and look both at the WinSock implementation of TCP/IP and Sockets as well as get some study time on TCP itself.
History
Created: 4/9/2012 by xk2600.