Monday, June 19, 2006

TCP/IP for Gnubees

In /etc/services on your Ubuntu box, you'll find tcp and udp both paired with standard port-numbers for protocols your kernel understands.
jennifer@edubuntu:~$ sed -n 27,35p
> < /etc/services
> | awk '{print $1 " : " $2}'
ftp-data : 20/tcp
ftp : 21/tcp
fsp : 21/udp
ssh : 22/tcp
ssh : 22/udp
telnet : 23/tcp
smtp : 25/tcp
time : 37/tcp
time : 37/udp
The more application-specific protocols, such as ftp (file transfer : 21), nntp (usenet : 119), smtp (email : 25) and www (web : 80), depend on these tcp and udp protocols to get the job done.
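
The name, port and protocol pairing shown in that listing is simple enough to read from Python as well. What follows is a minimal sketch, not a full parser: SAMPLE is a hand-copied stand-in for the real file (whose contents vary by system), and parse_services is a name made up for this illustration.

```python
# Sample lines in /etc/services format; the real file differs by system.
SAMPLE = """\
ftp-data        20/tcp
ftp             21/tcp
fsp             21/udp          fspd
ssh             22/tcp                          # SSH Remote Login Protocol
ssh             22/udp
telnet          23/tcp
smtp            25/tcp          mail
"""

def parse_services(text):
    """Return a dict mapping (service, protocol) to a port number."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        fields = line.split()
        if len(fields) < 2 or "/" not in fields[1]:
            continue  # skip anything not shaped like "name port/proto"
        name = fields[0]
        port, proto = fields[1].split("/", 1)
        table[(name, proto)] = int(port)
    return table

services = parse_services(SAMPLE)
print(services[("ssh", "tcp")])
print(services[("fsp", "udp")])
```

To try it against a live system, pass in open("/etc/services").read() instead of SAMPLE.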

tcp is checked-receipt, meaning the receiver acknowledges the data it gets, so the sender knows what arrived and can resend what didn't.

udp doesn't wait for acknowledgement and is favored by gamers competing on multiuser servers, where tcp's acknowledgements and retransmissions add delay.

A tcp sender will helpfully stamp packets with sequence numbers, in case the received order is different (and depending on routing, it may well be). The underlying IP layer also stamps each packet with a TTL (time to live), a hop limit that routers count down, so those failing to make delivery won't then become waywardly redundant and clog our arteries with a lot of negatively synergetic pseudo-information.
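
Here is a toy model of those two stamps, offered as a sketch rather than anything resembling a real network stack. The Packet type and the forward and reassemble functions are invented for this example, and the numbering is simplified: real tcp sequence numbers count bytes rather than packets.

```python
from collections import namedtuple

# Hypothetical, simplified packet: real headers carry many more fields.
Packet = namedtuple("Packet", ["seq", "ttl", "payload"])

def forward(packet):
    """Simulate one router hop: count the TTL down, drop the packet at zero."""
    ttl = packet.ttl - 1
    if ttl <= 0:
        return None  # expired in transit, so it stops circulating
    return packet._replace(ttl=ttl)

def reassemble(packets):
    """Put packets back in sender order using their sequence numbers."""
    return "".join(p.payload for p in sorted(packets, key=lambda p: p.seq))

sent = [Packet(0, 3, "Hel"), Packet(1, 3, "lo, "), Packet(2, 3, "world")]
arrived = [sent[2], sent[0], sent[1]]  # routing shuffled the order
message = reassemble(arrived)
print(message)

stale = Packet(seq=9, ttl=1, payload="lost")
print(forward(stale))
```

The receiver recovers the original text however the packets were shuffled, and the packet with only one hop left never makes it past the next router.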

tcp packets are return-addressed, and so arrive with all kinds of postmarks. Part of tcp's job is to acknowledge received data, via small ACK packets, some of which may likewise fail to return home (no, we don't ACK the ACK packets, although initial hand-shaking is sort of that way).

The packet itself contains payload. An originating service applies a return address, then passes the packet on through a next gateway (possibly a NATing router), which may relabel it (a NATing router rewrites the return address to its own) and forwards it further, and so on, until the intended addressee is reached (or not, as the case may be). At this point, if at last through the gauntlet, the packet yields its payload, its mission accomplished.

DNS lets us keep the addressing alpha-friendly. For example, by pinging www.python.org I learned that today a host publicly identified as 82.94.237.218 is answering the call, while http://82.94.237.218/ meanwhile returns the home page (I find that satisfyingly consistent).
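
The same name-to-number lookup can be done from Python with the standard socket module. In this small sketch, resolve and its lookup parameter are names invented for the example. The lookup hook is only there so the function can be tried without a network connection, and the fake_dns table simply replays the address quoted above.

```python
import socket

def resolve(hostname, lookup=socket.gethostbyname):
    """Return the IPv4 address for hostname as a dotted-quad string.

    lookup is a hypothetical hook for offline use; by default the
    system resolver (and therefore DNS) does the work.
    """
    return lookup(hostname)

# Offline stand-in for DNS, holding the address seen in 2006.
fake_dns = {"www.python.org": "82.94.237.218"}
address = resolve("www.python.org", lookup=fake_dns.__getitem__)
print(address)

# With a live connection, resolve("www.python.org") asks the real
# resolver instead; the answer will likely differ from the one above.
```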

Thanks to DNS, underlying IP numbers may change, even while domain names remain constant. Furthermore, IP numbers are not ultimately tied to specific hardware (identified by MAC address, with ARP providing the ethernet-to-IP coupling). When we retire a host, we needn't retire its IP number.

Some subnets are by definition local, meaning they have no unique meaning on the Internet. 192.168.x.x is one such intranet domain (.x.x marks the subnet). Per RFC 1918:
The Internet Assigned Numbers Authority (IANA) has
reserved the following three blocks of the IP address
space for private internets:

10.0.0.0 - 10.255.255.255 (10/8 prefix)
172.16.0.0 - 172.31.255.255 (172.16/12 prefix)
192.168.0.0 - 192.168.255.255 (192.168/16 prefix)
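
Checking whether an address falls in one of those three blocks takes only a few lines with Python's standard ipaddress module. This is a sketch; the names RFC1918_BLOCKS and is_rfc1918 are made up for the example.

```python
import ipaddress

# The three blocks quoted from RFC 1918 above.
RFC1918_BLOCKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]

def is_rfc1918(address):
    """Return True if the IPv4 address falls in one of the private blocks."""
    ip = ipaddress.ip_address(address)
    return any(ip in block for block in RFC1918_BLOCKS)

for addr in ["192.168.1.10", "172.31.255.255", "172.32.0.1", "82.94.237.218"]:
    print(addr, is_rfc1918(addr))
```

The first two addresses are private. 172.32.0.1 sits just past the end of the /12 block, and the python.org address is public.
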
The return address info, needed for acknowledging receipt, traces a sequence of public IP numbers down to some internal gateway, at which point port numbers may become more relevant, as a NATing router knows how to sort those ports to local IP number mail boxes.

Some port number like 1776, for example, might actually refer to port 80 on some box in some local "just us chickens" 192.168 village neighborhood. Such a box might serve HTTP requests without itself having a presence as a numbered host on the Internet (which is good, because we don't need to know about every Tom, Dick or Harry serving web pages, at least not in any way DNS cares about).
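
One way to picture what the NATing router is doing is as a lookup table from public port to private mailbox. The sketch below is only that, a picture: PORT_MAP, its addresses and translate_inbound are all hypothetical, and a real router keeps far more state per connection.

```python
# Hypothetical port-forwarding table on a NATing router:
# public port -> (private address, private port).
PORT_MAP = {
    1776: ("192.168.1.20", 80),  # web server in the local village
    2222: ("192.168.1.30", 22),  # ssh on another internal box
}

def translate_inbound(public_port, port_map=PORT_MAP):
    """Return the internal (address, port) to forward to,
    or None when the router has no mapping for that port."""
    return port_map.get(public_port)

print(translate_inbound(1776))
print(translate_inbound(8080))
```

A request arriving on public port 1776 is handed to port 80 on the internal box, while a port with no entry goes nowhere.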

Historically speaking, this ability to use port numbers as a part of a return address helped us grow the Internet to a size perhaps unanticipated by some of the original designers of IPv4. They imagined us running far lower on public IP numbers by this time, towards the start of the IPv6 phase-in.

Thanks to NATing routers, we've allowed large numbers of computers to remain blissfully anonymous, out of the global IP namespace, thereby freeing more of a limited resource to focus on what's truly "inter" network traffic, more so than the local "intra" stuff of small businesses and home networks.

Yes, some companies take disproportionately selfish advantage of this public resource, with their data-intensive private intranet VPNs, hogging bandwidth. The more responsible companies pay their own way, and then some, which helps our shared global infrastructure keep pace with increasing user demands.

Recommended reading:

Practical TCP/IP by Niall Mansfield, Addison-Wesley, 2003, ISBN 0-201-75078-3 (e.g. see section 23.4 regarding port translation and one-to-one NATing).