From jrex@research.att.com Thu Nov 13 14:30:52 2003
From: jrex@research.att.com (Jennifer Rexford)
Date: Thu, 13 Nov 2003 09:30:52 -0500 (EST)
Subject: [KP-seed] interesting discussion on Internet Measurement Research Group
Message-ID: <200311131430.JAA28407@chips.research.att.com>

KP folk,

Folks might be interested in the (very active) discussion taking place over the past few days on the IMRG mailing list. It has relevance to our discussions of automated network troubleshooting (e.g., the "why" and "fixit" problems). See the list archive at

https://www1.ietf.org/mail-archive/working-groups/imrg/current/maillist.html

and the note that started the thread, attached below.

-- Jen

------- Start of forwarded message -------
Date: Tue, 11 Nov 2003 16:28:39 -0800 (PST)
From: jon b
To: imrg@irtf.org
Subject: [IMRG] The case for an Internet Measurement Protocol

There have been a number of attempts to start work in the IETF on a protocol for performing network measurements, but despite the obvious need, it has been difficult to demonstrate interest in such a protocol. Partly this is due to the large number of different measurement problem areas: there are lots of different issues, but it is hard to get the people concerned with all of those issues in one room at the same time. After much discussion with Vern and Mark it was decided that this forum would be a good place to try to pull together such a consensus of interest, the result of which could then be taken to the IETF.
----------------------------------------------------------------------
The case for an Internet Measurement Protocol

Jon C. R. Bennett

Outline

0) Setting the stage
1) Current problems we have measuring the Internet.
   1.1) Who is 'we'?
2) Do we need to measure?
3) What do we need to measure?
   3.1) Operational Considerations to measurement.
   3.2) What do we want to measure in our wildest dreams?
4) Can we get what we need/want?
   4.1) Non-technical obstacles
   4.2) Technical obstacles
5) Straw man Protocol

----------------------------------------------------------------------
0) Setting the stage.

Take a look at many recent 'network measurement' papers and you will notice that many times they are more about 'measuring networks' than about 'network measurements'. They are studies in the difficulty of getting good network measurements as much as (or sometimes more than) they are studies of the actual measurements.

But this is not a bad thing. In order to do scientific/engineering research it is necessary to know the (in)accuracy of the collected data. As a result the authors of these papers are often forced to create new measurement tools or techniques so that they may obtain better measurements. But with few exceptions these tools or techniques must be operated from the edge of the network, since the core network equipment is not considered to be modifiable. The few exceptions to this tend to be at those ISPs with their own researchers, but those tools are unavailable to the 'rest of us', and in any case they can only answer questions within a single ISP's network, not across the whole Internet.

Many reasons are given why this state of affairs is to be expected or why a solution would never get deployed. Some even argue that it is philosophically a 'bad thing' to involve the network in the measurement of the network.
For purposes of this discussion I will assert that, as a result of the body of network measurement research results, we have a good understanding of what properties are so hard to measure that the need for networks to support the taking of such measurements can be well supported. And I will further assert that the issues of implementation and deployment can be addressed without significant real difficulty. So we should concern ourselves with answering the questions of

1) Is such a protocol needed?
2) If yes, what should it look like?

and save for later

3) Is this practical?

If you feel the need to consider issue #3 first you can skip to section 4, otherwise just continue below.

----------------------------------------------------------------------
1) Current problems we have measuring the Internet.
   or why ICMP just don't cut it anymore.

a) There are too many things we used to be able to measure that in practice we no longer can, either due to new deployments (NATs, etc) or new management practices (disable everything for security/privacy, etc).

b) We have learned over time many new things we want to measure, but we lack 'good' mechanisms for doing so, or, "why running pathchar from a GigE-attached host is not a good idea".

c) Many new technologies in the Internet have had or are in the process of having measurement protocols designed for them that are in many cases only available to network operators, not to end users, e.g. MPLS/LSP-Ping, pwe3/VCCV, IP Tunnels/GTTP. If we don't come up with a measurement protocol which can interact in some manner with these and other protocols we will lose useful visibility over more and more of the network.

d) Tools which take measurements by using 'brute force', while they may be just the thing to answer the question at hand, are inherently undeployable on a large scale.
   One grad student = network tomography research
   30 network testing companies = wide scale DDOS attack

   If the difference between 'collecting research measurements' and a 'crippling DDOS attack' is simply how many people are running the tool, then perhaps it is the wrong tool?

e) While many recent developments in network measurement such as network tomography are fascinating demonstrations of how much can be learned using only the protocols available to us, if the real goal is to measure the network itself, would it not be better to remove obstacles to making good measurements rather than always just working around them?

This is not to suggest that we will make measurement tools go away, but rather that we will make them better, more accurate, faster, more capable, etc.

1.1) Who is "we"?

At first glance, we = researchers, but we should remember why we do research, and consider that lots of other people would benefit from being 'we' as well:

a) network operators
b) 'normal' end users
c) 'programs'

The last item is often forgotten but, I think, very important. By 'programs' I don't mean 'apps' like ping, but applications which have their own needs for determining the state of the network, e.g. p2p, receiver multicast, CDN/streaming server selection, RONs, etc.

----------------------------------------------------------------------
2) Do we need to measure?

I'm going to be slightly philosophical for a moment; this will provide some basis for later discussion of non-technical issues of deployment.

Do we "need" to measure the network? After all, there is no point in creating a protocol to fill a need that does not exist. Well, what does "need" mean? If the network always "works" do we "need" to measure anything? Not really. Does the network always "work"? Well, that would depend on the definition of "works" and, as mentioned above, the choice of "we"; in general the answer is, hardly ever.
In the current commercial Internet, if "not working" means no connectivity at all, one might even argue that we (we = paying customers) have a "right" to measure. One might even go so far as to say we have a right to measure in a manner that tells us where the cause of the lack of connectivity is, i.e. who do we yell at for dropping the packets we paid to have transported. It doesn't matter that we didn't pay the network that caused the failure; remember we are being philosophical here. We paid, the packets should get through.

If you are a really big company paying lots of money for a connection with an SLA, or any service provided by more than one ISP, this should not be a philosophical issue, it should be a contractual one. If the ISP wants to be paid extra for better service, insist on your right to see that you actually got it.

Since the performance or efficiency of any given application may be affected by the behavior of the network, at some point "low performance" == "not working". This last point can be used to show a "need" to measure just about any property of a network.

Note that we are still talking here about measuring the network, not measuring the behavior of an application over the network. It is up to the user/application to infer from the network properties what its performance will be.

----------------------------------------------------------------------
3) What do we want to measure

Reachability           i.e. ping
Non-reachability       where is the path broken
Path taken             i.e. traceroute
Return Path            includes finding broken return paths
Ownership              i.e. AS number, Nanog Traceroute
Path Performance       Reordering/Jitter/Delay
Path Properties        MTU/Avail BW/Link Speed/AQM
Arbitrary Paths        the ping 'gateway' option
Paths taken by apps    proxies/redirectors etc
To/From ANY device     get through NATs/ALGs/Firewalls/etc
Non-IP paths/tunnels   provide end user access to LSP-Ping, GTTP, etc
Flow specific paths    work in the presence of flow specific filtering
Reveal Protocol        show the silent POP proxy, web cache
"Enhancing" Devices    redirectors, etc

Is that everything? Too many things? Not enough things?

3.1) Operational Considerations to Measurement

1) Negligible performance impact
2) No "control" functions possible (only present an "information risk" not a "control risk")
3) Provide way to restrict visibility of information
4) Authentication/verification and DOS prevention
5) Where possible, testing a flow specific path should not require any extra filters or flow state in a router/firewall/NAT/etc; where not possible, make the extra state needed as little as possible.

Item 1 : Make sure ISPs don't turn IMP off to protect their routers from being overloaded
Item 2 : Make sure ISPs/sys admins don't turn IMP off to prevent 'control' attacks on routers/hosts
Item 3 : Make sure ISPs/sys admins with a network behind their Firewall/NAT don't turn IMP off to prevent information about their network and/or hosts from leaking.
Item 4 : Don't provide a DDOS anonymiser and/or only allow use by approved persons.
Item 5 : Wherever possible make IMP work with existing equipment that is not IMP aware, and where it is IMP aware don't force it to have twice the flow state, i.e. the filter that matches a data flow should match the

3.2) What do we want to measure in our wildest dreams

This is not an invitation to include the "kitchen sink" but an invitation to think outside the box. For example;

a) Show Layer 2 hops.... somehow just because it's a layering violation doesn't mean we can't want it....
or a much more utilitarian example;

b) Automatic network error reporting.

   Right now if you are trying to connect to site "A" and you can't, what do you do?

   a) you run 'ping'... it fails
   b) you run 'traceroute'... if you are lucky it fails at a machine you own or one your ISP owns; if not, you are basically out of luck
   c) wait and hope it gets fixed quickly or 'fixes' itself, or
   c') if you are a really important customer, see if you can get your ISP to get the other ISP to fix it.
   c'') if you are not an important customer, you call the ISP and they wish you would stop calling and bothering them.

   Now if you had an Internet Measurement Protocol (IMP) path you could

   a) send an IMP 'ping' to A... fails
   b) do an IMP 'traceroute' (find the longest working path) to A
   c) find the Autonomous System (AS) num/contact info of the owner of the last working hop
   d) send an "I want to report a problem" message (comprised largely of the returned IMP packet) to the owner of the last reached hop
   e) since the IMP packet contains enough information to describe both what the user is trying to reach (e.g. src/dst addr/port, protocol number, DiffServ Code Point (DSCP)) and at least some components of the path, including at least one hop within the ISP (and the TTL value at that hop), it is possible for the ISP to attempt to exactly reproduce the customer's failed attempt to reach A by sending the same packet from any one of the hops indicated. The automated system could then perform the *identical* test that you performed, sending the same packet the user sent, and if it got the same result, i.e. can't reach destination, then it could flag it for a human to look at, or it could perform more tests to make sure the problem really is in its network.

   NOTE the packet would be identical except for the source address, but if the user were allowed to send an IMP packet redirected through the ISP router, then the ISP would be able to run the test with the same addresses.
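
   The report-and-reproduce loop above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names Probe, Hop, TroubleReport, SimulatedPath, build_report and reproduces are hypothetical, the "network" is a toy hop list rather than real IMP packets, the addresses and AS numbers are documentation-only values, and none of it is taken from the IPMP draft.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Probe:
    """What the user was trying to reach (the fields named in the text)."""
    src: str
    dst: str
    protocol: int
    src_port: int
    dst_port: int
    dscp: int


@dataclass(frozen=True)
class Hop:
    address: str
    asn: int  # AS number of the hop's owner
    ttl: int  # TTL value seen at this hop


@dataclass(frozen=True)
class TroubleReport:
    """Hypothetical report body: the probe plus the longest working path."""
    probe: Probe
    hops: Tuple[Hop, ...]

    @property
    def last_hop(self) -> Hop:
        return self.hops[-1]


class SimulatedPath:
    """Toy stand-in for the network: a fixed hop list with an optional break."""

    def __init__(self, hops: List[Hop], broken_after: Optional[int] = None):
        self.hops = hops
        # Index of the last hop that still answers; None means no break.
        self.broken_after = broken_after

    def trace(self, probe: Probe, start: int = 0) -> Tuple[List[Hop], bool]:
        """Return (hops reached from `start`, whether the destination was reached).

        The probe is ignored in this toy model; a real network would
        forward (and filter) according to the flow it describes.
        """
        if self.broken_after is None:
            return self.hops[start:], True
        return self.hops[start:self.broken_after + 1], False


def build_report(probe: Probe, net: SimulatedPath) -> Optional[TroubleReport]:
    """User side: trace, and build a report only if the destination is unreachable."""
    hops, delivered = net.trace(probe)
    if delivered or not hops:
        return None
    return TroubleReport(probe, tuple(hops))


def reproduces(report: TroubleReport, net: SimulatedPath) -> bool:
    """ISP side: replay the same probe from the last reached hop."""
    start = net.hops.index(report.last_hop)
    _, delivered = net.trace(report.probe, start=start)
    return not delivered


hops = [
    Hop("192.0.2.1", 64500, 63),
    Hop("198.51.100.1", 64501, 62),
    Hop("203.0.113.1", 64502, 61),
]
probe = Probe("192.0.2.10", "203.0.113.80", 6, 40000, 80, 0)
broken = SimulatedPath(hops, broken_after=1)

report = build_report(probe, broken)
print(report.last_hop.asn, reproduces(report, broken))
```

   The point of the sketch is that the report carries everything the receiver needs: reproduces() takes no input beyond the report itself, so the check can run before any human is involved, and a report against a path that has since been repaired simply fails to reproduce.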
One could also imagine a scheme where you had to make your report to your ISP (or to any other 'reputable' entity that you could authenticate yourself to) regardless of where the error occurs, and they would forward it to the appropriate ISP. This way ISPs would only take requests from their own customers or from other ISPs that they have authenticated.

There are many different ways the user (or application) could determine who to send the trouble report to, and there are many different actions the receiver of the trouble report can take with it. The key point is that the body of the IMP packet contains enough information for the user's failed attempt to reach A to be *automatically* reproduced. This allows automatic validation of the reported trouble before taking further action, which might include informing a human operator or any other corrective action.

----------------------------------------------------------------------
4) Can we get what we need/want?

4.1) Non-technical issues

a) This is a bad idea, it violates the end to end principle and/or we all know hop-by-hop anything is bad.
   Firstly, it doesn't violate end to end anything. We are conducting 'network' measurements, not 'host' measurements. The network is the 'end point' that we want to talk to. Secondly, it's not hop by hop; one of the ideas of using a new protocol is to allow it to be handled in the forwarding path rather than by the router CPU.

b) No ISP will ever run anything like this.
   It is not at all clear this is true. Given that major ISPs have brought in drafts such as draft-bhattacharyya-monitoring-deployment-00.txt, there is clearly interest by ISPs, both for their own operational benefits and to be able to demonstrate that they are in fact providing a high quality service.

c) It will take so long to deploy that it will never be adopted.
   The recent run of security problems in the Internet has brought us to the point where, if the device is software/firmware based, then deploying IMP into it is just a patch release away. So getting widespread deployment of new 'useful' networking stuff has suddenly become much easier.

   hosts           --> windowsupdate.microsoft.com
   host firewalls  --> update.symantec.com, etc
   home routers/NATs (most sw based) --> the sooner we get a protocol the sooner all the not yet sold boxes will have it.
   edge routers    --> sw/firmware update
   core routers    --> hope they can reprogram their Network Processors

   I am not saying it will be 'easy', but that with a bit of effort one could have fairly wide deployment in a relatively short time frame. Also, the devices on which it is hardest to get deployment (the really high speed routers) are also the fewest in number. They are also the devices most likely to be recently installed (for any definition of recent).

4.2) Technical considerations

a) Too complicated for hardware devices
   The router functionality required is not very involved, particularly compared to the capability of modern FPGAs/ASICs. Beyond that, I will just say that I've been involved with a hardware design for the version of IMP referenced below and really it's not that hard. (I used proof by assertion to save typing; I am happy to discuss the matter further.)

b) Performance impact on software devices
   i) For NATs, Firewalls, etc, handling IMP should be much easier than the requirements complex stateful protocols impose.
   ii) For routers, since there is no interaction with packet forwarding, the work could be done in 'parallel' while waiting on the memory accesses for the routing and filtering lookups.

Another point to consider: does the added 'per packet' load of supporting IMP reduce the overall packet load on the router? The answer should be yes. Consider a router in the middle of a path being tracerouted. With IMP the router would process one IMP packet.
With traceroute it would have to forward on average (path length/2) packets and send 1 TTL exceeded message. So the average load should be greatly reduced.

In the case of a DDOS attack, while there will be lots of packets to be dealt with, they will have the advantage that each and every one will carry the full path back to the attacker!

----------------------------------------------------------------------
5) Straw man Protocol

Current protocols and measurement tools provide only a limited subset of the requirements or restrictions stated above. What we want is an efficient mechanism with low processing overhead for getting as much information as possible in a manner usable by as many people/applications as possible.

As a starting point for the Internet Measurement Protocol (IMP) I will propose the version of IPMP found in draft-bennett-ippm-ipmp-01.txt

ftp.rutgers.edu/pub/internet-drafts/draft-bennett-ippm-ipmp-01.txt

This draft was also meant to provide a starting point for discussion of the protocol, so please ignore the typos. I have received a lot of commentary on it, and there are a number of changes that have not been incorporated into the last version.

NOTE somehow the 'Security Considerations' section went missing; it can be found in the 00 version of the draft at

ftp.rutgers.edu/pub/internet-drafts/draft-bennett-ippm-ipmp-01.txt

These drafts were based on the original IPMP work by Tony McGregor (http://www.watersprings.org/pub/id/draft-mcgregor-ipmp-00.txt)

5.1) What next?

Is this a good thing? Are there issues that have not been considered? Additional requirements? Security considerations? Is there functionality that should be removed? or added? Suggestions? Comments?
_______________________________________________
IMRG mailing list
IMRG@ietf.org
https://www1.ietf.org/mailman/listinfo/imrg
------- End of forwarded message -------

From peyman@MIT.EDU Wed Nov 19 16:49:32 2003
From: peyman@MIT.EDU (Peyman Faratin)
Date: Wed, 19 Nov 2003 11:49:32 -0500 (EST)
Subject: [KP-seed] Network Tomography Survey Paper
Message-ID:

Folks,

Here is a good survey paper on network tomography from a signal processing perspective (for localization of performance and not reachability failures):

Mark Coates, Alfred Hero, Robert Nowak, Bin Yu: Internet Tomography, IEEE Signal Processing Magazine 19,3 (2002), 47-65

located at: http://citeseer.nj.nec.com/coates02internet.html

I am also incrementally developing a network management resource site at: http://ana.lcs.mit.edu/peyman/nm.htm

Peyman
______________________________________________________________________________
Peyman Faratin
MIT, Computer Science and Artificial Intelligence Laboratory
200 Technology Square, NE43-534, Cambridge, MA, 02139
Tel: +1 (617) 258-0458, Fax: +1 (617) 253-2673
http://ana.mit.edu/peyman/
email: peyman@mit.edu
______________________________________________________________________________

From faber@ISI.EDU Mon Nov 24 22:48:30 2003
From: faber@ISI.EDU (Ted Faber)
Date: Mon, 24 Nov 2003 14:48:30 -0800
Subject: [KP-seed] Economics and the KP
Message-ID: <20031124224830.GH53381@pun.isi.edu>

Dave Clark sent out a paper on routing in the KP that mentioned using currency to scope the observations. I mentioned at the meeting that I disagreed with that, and I've prepared a short document outlining my position. You can find it at:

http://www.isi.edu/~faber/tmp/econ.ps
http://www.isi.edu/~faber/tmp/econ.pdf

Please let me know what you think.

--
Ted Faber
http://www.isi.edu/~faber
PGP: http://www.isi.edu/~faber/pubkeys.asc
Unexpected attachment on this mail? See http://www.isi.edu/~faber/FAQ.html#SIG