Transcript - Inter-Domain Routing Tutorial

Disclaimer

While every effort is made to capture a live speaker's words, it is possible at times that the transcript contains some errors or mistranslations. APNIC apologizes for any inconvenience, but accepts no liability for any event or action resulting from the transcripts.

Randy Bush: I'm Randy Bush. I work as an operator and a routing researcher at IIJ in Tokyo. Some of the routing research involves BGP security and some of this work has been going on for 10 years, and so I'm going to talk about that today. A lot of people have been working on it with me, these are two of the most notable ones, Rob Austein and Steve Bellovin. I'll do some technical background. I'll describe the first problem we are trying to attack, which is mis-origination. I hope most of you remember the YouTube incident. The infrastructure needed to support BGP security, and then two different flavours of BGP security; one of which is available today and the other of which will be in a few years.

As I said, this is not new. Steve Bellovin and Radia Perlman wrote papers identifying the problem in 1986. Essentially we designed an Internet without security. The Internet that I grew up in when I was a little younger, if you had a Unix system on the network, it normally had an account called "Guest" with no password. It was considered rude to not have an open system. This is radically different today. Now we are trying to paint security on afterwards. The two main areas that Radia and Steve identified in 1986 were DNS and routing, the two biggest vulnerabilities of the Internet. DNS is being patched with something known as DNSSEC, and that's being slowly and torturously rolled out in the Internet today. In 1999, the US National Academy of Science called these issues out as major needing to be fixed. In 2000, Steve Kent and some folk at BBN, Charlie Lynn, who died, did some early experimentation implementations of something they called S-BGP and it used an X.509 based PKI to support it, and much of that PKI is what we use today. So that's the roots of some of this technology.

In 2003, we tried to experiment at a NANOG workshop with BBN's technology, and it had some serious operational mis-designs. In 2006, two of the RIRs, APNIC and ARIN, began to work on providing the certificate RPKI infrastructure, and RIPE started in 2008. They have these infrastructures today. All the RIRs, except for ARIN, are deployed with the RPKI infrastructure, which I'll bore you to death with shortly. In 2009, we started an open testbed and code started running in real routers. It runs in real shipping routers today. So the first whole sections of this are talking about present day reality. For those who don't know, I'm going to go through some very boring, simple, about five or six slides in the beginning. Sorry for those who do know. What's an AS? An autonomous system. It's really an ISP, like Verizon or Sprint or IIJ, the company I work for. We connect to each other and we talk BGP between us.

Our customers, we sell transit to them and we exchange BGP routes with them. This routing protocol is what ties the ISPs of the Internet together. They do peering and sell transit and other more complex relationships. A prefix, an IP space, 147.28/16 is propagated by BGP from the originator, the person who owns that prefix, to their ISP, and from that ISP up to the core of the Internet, back down to you, so you can send packets to it. An IP prefix -- I am going to use the pointer over on this side -- if you look inside a router, it has a BGP table. We look at it for this prefix and we see that this (the 234) is where it originates, this is the AS number, this is the ISP or end user site that first announces it. It goes to the upstream ISP, to another ISP, in this case this is IIJ and Sprint, so they are peering, then down to the end user site. So this end user site gets the prefix along the path and can therefore send packets back to prefix 234 at this origin.

Inside a router, it looks a lot uglier, but really the same thing. This is the AS path, 234 to 2497 to 1239 to 16509. This is the prefix, here is a different path, which started here at 2497, but 2497 also peers with 701 as well as 1239 and gets there to 16509. So this AS is multi-homing. It is not important but it's just a little uglier in a router, but it's the same information. The origin AS and the path it took. Let's talk about the threat. In security, we like to think about what asset we are protecting and what is the threat against which we are protecting it. Pakistan saw a social threat in content of YouTube and wished to protect their country against that social threat. So they told Pakistan Telekom, please, to the internal Internet and all your customers and inside Pakistan, tell them that the route to YouTube is this way. In other words, they announced that YouTube's prefix is this way and they set the packets into the bit bucket to protect their citizens from a threat that they perceived.

This was the plan. This is what they wanted to do. YouTube is somewhere out here on the global Internet and its route went to PCCW and PCCW gave it to Pakistan Telekom and Pakistan Telekom didn't want to listen to it and instead announced the bit bucket. Unfortunately, somebody in Pakistan Telekom was not careful. So that was the plan. But what actually happened was that they announced the poisoned prefix to the entire Internet. They announced it from Pakistan Telekom to PCCW, who, if they were a conservative ISP, would have filtered the prefix because they know that YouTube is not in Pakistan, but they didn't. It went out to the world.

So a lot of the global Internet got told, "YouTube is this way." The traffic went there. Fortunately and unfortunately, what happened was what Jonathan Zittrain said, "The Internet survives on the kindness of strangers." Somebody noticed this within one minute, put a message on NANOG, the big operators' list, and operators all over the world started filtering the bad announcement. Within an hour or so, most things were good, some time six hours later, Pakistan Telekom fixed their problem, but this is not the way we want to run a global infrastructure, relying on the kindness of strangers. So, we call this mis-origination. In other words, the origin of YouTube was lied about and sent there. A prefix is originated by an AS which does not own it. Pakistan did not own YouTube's prefix. I do not call it hijacking. It is often called hijacking, but hijacking is negative and it presumes a negative intent. My best guess is what we call fat fingers -- somebody made a stupid mistake in Pakistan Telekom, they didn't mean to do this. But it makes no difference. Let's call it mis-origination. The problem is that it is not a unique incident, it happens every day. Most of the time it happens to small ISPs, a small end user that gets in trouble. Sometimes it is a very large one, like YouTube. We have had many of these, large and small. What is the plan? What do we do about it? We would like to fix this.

There are three pieces to the plan. As mentioned before, the RPKI, resource public key infrastructure, which is an X.509 certificate hierarchy that supports the other pieces. On top of that we build something called origin validation, using the RPKI infrastructure to detect and prevent mis-originations of someone else's prefix. The RPKI is deployed by RIRs, starting a year ago. Origin validation started shipping in Cisco and Juniper routers this year. Then some years down the pike we are going to have something called AS path validation, which is checking that the whole path is correct, not just the origin. But that's in the future. I'll tell you a little bit about that at the end if I have time. Why do we want origin validation? To prevent the YouTube incident. There have been worse incidents. In 1997 there was something we called the 7007 accident, in which the owner of AS 7007, a smallish ISP, announced the entire Internet back to the Internet. It attracted massive amounts of traffic but it also did worse because when Vinnie did this he sliced the entire Internet up into /24 prefixes. At that time the Internet was much smaller than it is today, so there were only about 60,000, 70,000 or 80,000 prefixes he announced. But routers were a lot smaller than they are today, and Sprint fell over. UUnet, what is today Verizon, fell over. Every time they fixed one router and turned it back up, the next router was poisonous. It went on for almost two days, cleaning up finally the edges of the network.

What we want to do is prevent all these accidental announcements. What origin validation does not prevent is malicious path attacks. Somebody in the middle, what we call a monkey in the middle, altering the routing announcement path and trying to attract traffic, intentionally. It's malicious, it's an attack. That requires path validation, which is the third step, which I will talk about at the end if I have time. That's some years away. Right now we need to be able to formally prove who owns an IP prefix and what autonomous system is authorised to announce it. So to be able to authoritatively prove who owns an IP prefix, we have to follow the allocation hierarchy from the IANA to the RIRs to the NIRs to the ISP and to the end sites. That is the administrative hierarchy. That's the only formal attestation of ownership of a prefix or an AS number. Then, once I can formally show -- by formally I mean I'm going to use cryptographic signatures and it's going to be signed, the IANA is going to sign, saying, "98/8 is given to ARIN", and ARIN is going to sign something, cryptographically, that says "Randy has 98.128/16", et cetera. Then we are going to be able to cryptographically sign what AS may announce it.

Prefix ownership follows the allocation hierarchy. So we build a public key infrastructure to represent that. It is being developed and deployed by the RIRs and operators and IANA does what IANA does, which is formally talk about it for five years, and maybe some day they are going to play their part in the game. This is based on an X.509 certificate that has been extended with an Internet engineering task force RFC3779 to have an extension that describes the IP resource. It describes IP blocks, ASNs. So it can say, this IP block is owned by whoever's public key is there. It is signed by the parent's private key. So IANA signs this, like 98.0.0.0/8, to ARIN. We are going to have some examples. Let's assume IANA, which is asleep right now, has somehow assigned this, this /8 to ARIN. ARIN is going to give this /16 pieces of it to these entities, this entity gets a /20, this entity gets a /20, this entity gets a /19. This one further gives it to these. It is a hierarchy following the administrative hierarchy of formally signed cryptographic hierarchy telling you who owns what IP address space.
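The containment rule behind this hierarchy can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `ResourceCert` class and the holder names are hypothetical, the IANA entry is simplified to hold all of IPv4, and the cryptographic signature check that a real RPKI validator performs at every level is left out. It shows only that each certificate's address space must sit inside its parent's.

```python
from dataclasses import dataclass
from ipaddress import ip_network
from typing import List, Optional


@dataclass
class ResourceCert:
    """Hypothetical stand-in for a resource certificate (no crypto here)."""
    holder: str
    prefixes: List[str]
    parent: Optional["ResourceCert"] = None

    def resources_within_parent(self) -> bool:
        # The trust anchor has no parent to be checked against.
        if self.parent is None:
            return True
        parent_nets = [ip_network(p) for p in self.parent.prefixes]
        # Every prefix held here must be inside some prefix held by the parent.
        return all(
            any(ip_network(p).subnet_of(n) for n in parent_nets)
            for p in self.prefixes
        )


def chain_is_consistent(cert: Optional[ResourceCert]) -> bool:
    """Walk from a certificate up to the trust anchor, checking containment."""
    while cert is not None:
        if not cert.resources_within_parent():
            return False
        cert = cert.parent
    return True


# Simplified hierarchy using the prefixes mentioned in the talk.
iana = ResourceCert("IANA", ["0.0.0.0/0"])
arin = ResourceCert("ARIN", ["98.0.0.0/8"], parent=iana)
randy = ResourceCert("Randy", ["98.128.0.0/16"], parent=arin)
# A certificate claiming space its parent never held should fail the check.
outside = ResourceCert("Someone else", ["147.28.0.0/16"], parent=arin)

print(chain_is_consistent(randy))
print(chain_is_consistent(outside))
```

A real validator would also verify each signature with the parent's public key and check expiry and revocation before trusting any of the listed resources.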

Similarly, ASNs can be represented, but we don't really care right now. So that's who owns it. Now we know that Randy got 98.128 blah, blah, blah. What AS can announce it? Randy owns it, and he can make something called a route origin authorisation that binds the address space to the ASN that can announce it. Randy wants to sign this but he doesn't want to say that 42 can announce both of the prefixes he owns. He only wants it to sign this one. So he makes an intermediate certificate that is called an end entity certificate, but we don't care, and he gives that certificate only the part he wants to give to the route origin authorisation. Now we have a record, which can be formally verified all the way up to the IANA root trust anchor, just like DNSSEC, saying that this prefix may be announced by this AS. You might say, does the AS sign saying that it agrees? The answer is no, it doesn't have to. It shows that it agrees by actually making the BGP announcement. That's sufficient.

I may want to have two ROAs out for two different ASs for the same prefix. The reason I might want that is because right now Sunny is my provider, so I'm going to tell Sunny, please announce my prefix, so there is a ROA there for him. But Sunny is not a very good provider or he charges me too much, so I'm going to switch. So I build the circuit to George and I issue a ROA for George. Both ROAs are now active, both circuits are active. I test that circuit. I pre-test it some more. I'm happy with the circuit. I disconnect from Sunny and remove the other ROA. We call this make before break. That and other reasons may mean there are multiple ROAs active for the same prefix at any time -- it's perfectly valid. I could want to say, hey, yes, there's that /16, but maybe I'm going to announce /20s or maybe I'm going to announce /18s or /19s. So there's a macro, a shorthand, using a field called maximum length, where you say this ROA covers this prefix, with this prefix length, but you can chop it up into pieces that small that announcements may happen up to /24s from AS 3130. It's kind of a macro. Now we have an authority structure that says who can announce a prefix. That authority structure is formally verifiable.

We have something we call a certificate engine, which makes these certificates, and you use a GUI and you tell it, hey, I want to make a route origin authorisation for the prefix I own to the following AS, et cetera. It talks to this engine and the engine makes the ROA and the certificates that are needed. But before it can do that it needs to have a little conversation with its parent, in other words if this is my ROA I need to talk to APNIC or JPNIC and ARIN and IANA to get the certificate structure all the way down. Then when I've got all that I'm going to publish the certificates. I'm going to publish the IP resource certificates, the ASN resource certificates and the route origin authorizations. We have this structure up/down that follows the allocation hierarchy. When I go to create that ROA, my software that I use says, OK, you want to create AS 3130, that prefix with the maximum length. By the way, before you do that, I've looked at the global routing table and I see two announcements out there for this prefix, one from 4128 and one from 3130, and if you punch this "Create" button, this one is going to be known as an attacker. Are you sure you want to do that?

We say yes, of course. So what this looks like in a little more complex blow-up is that the registry, like IIJ's registry or whatever, manipulates its resources, it has a database that keeps -- an Excel database or SQL database or whatever and it talks to the certification engine and makes certificates and the certification engine talks to its parents and children, the IIJ certification engine talks to JPNIC and APNIC and they talk to their upstreams, IANA, et cetera, and I publish my stuff. What's funny about this, as John Curran of ARIN pointed out the other year, is that APNIC, 98 per cent of APNIC's members do not want to run this software. They do not want to provide this service but they do want their certificates and ROAs. So APNIC is going to provide a web GUI or does provide a web GUI that you can go to and have your ROAs created for you. George will gladly run on the little treadmill and make little ROAs for you. Indeed, 98 per cent of APNIC's users probably want to operate that way. The 2 per cent of APNIC's users that don't want to operate that way are the big ISPs, Telstra, NTT, IIJ, et cetera. So they will run instances of this software on servers and participate in the up/down protocol.

What's amusing, as John Curran pointed out, is that even though that is only 2 per cent of the RIR's members, it probably represents 90 per cent of the RIR's address allocation, because those 10 big users control 90 per cent of the address space. It operates in both modes. Both modes are wonderful and valid. And you can migrate from one to the other. There you go. So, let's talk about what I call issuing parties -- it's not what I call, by the way, it's generally known in certificate language -- they issue a certificate, and this is the up/down protocol which each has its little GUI and they issue certificates and they each publish. When IANA gives APNIC a certificate, it puts the certificate here and it says APNIC publishes here and APNIC says IIJ publishes here. So these are issuing parties and they issue certificates and ROAs. Over here, a whole bunch of relying parties that we cannot see. We will see them later. IANA has pointers and says, hey, ARIN publishes here, APNIC publishes here, APNIC has IIJ, ARIN has UUnet, PSGnet, these are scattered all over the network. How do I get for my operational users and ISP a validated cache, a cache that I know is formally cryptographically correct, from those distributed data?

This data set is much smaller than the DNS but it's philosophically the same. You take the trust anchor, knowing where IANA publishes and what IANA's root key is, and you recursively descend through the data sets. But there's only a couple of million objects -- it's not like the DNS -- so you can descend and you can gather them all. A couple of hundred bytes each, there are a few million of them. What's that? A couple of hundred megabytes. Nothing. So you can gather them all into a cache. The way you gather them is very conservative. First, I take the trust anchor and I read the entire IANA database, then I take the trust anchor and go through it, validating it. Then I go down and follow the validated pointer to ARIN, and I do the same. This means, since I only follow validated data, that nobody can give me a bad pointer. That lets me assemble a validated cache of the entire certificate and ROA database. As I said, the relying parties will appear. The issuing parties publish that stuff. I use that gatherer to go around from the trust anchor and gather all the stuff. It's now in a validated cache. I can use different types of tools to use it. The one I'm personally in love with is the BGP decision process, in the lower right-hand corner, because that automates detection of mis-originations. But lots of people today use the Internet Routing Registry, which is part of the APNIC Whois service, and so from that you can produce fake IRR data, you can have NOC tools or you can put it in the BGP decision process.
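The conservative gathering strategy described here, following only pointers found in already validated objects, can be sketched roughly as below. Everything in this example is a hypothetical stand-in: the `REPOS` dictionary replaces real publication points, the `valid` flag replaces real signature and resource checks, and the object names (including the 192.0.2.0/24 documentation prefix) are invented for illustration.

```python
from typing import Callable, Dict, List

# Hypothetical publication points. "valid" stands in for the result of a
# real cryptographic check; "publishes_at" is the pointer to a child's
# repository carried inside a certificate.
REPOS: Dict[str, List[dict]] = {
    "iana": [
        {"name": "cert ARIN", "valid": True, "publishes_at": "arin"},
        {"name": "cert APNIC", "valid": True, "publishes_at": "apnic"},
        {"name": "cert forged", "valid": False, "publishes_at": "evil"},
    ],
    "arin": [{"name": "ROA 98.128.0.0/16 AS3130", "valid": True}],
    "apnic": [{"name": "cert IIJ", "valid": True, "publishes_at": "iij"}],
    "iij": [{"name": "ROA 192.0.2.0/24 AS2497", "valid": True}],
    "evil": [{"name": "ROA bogus", "valid": True}],
}


def gather(trust_anchor: str,
           fetch: Callable[[str], List[dict]],
           is_valid: Callable[[dict], bool]) -> List[str]:
    """Collect validated objects, descending only through validated pointers."""
    cache: List[str] = []
    todo = [trust_anchor]
    seen = set()
    while todo:
        point = todo.pop()
        if point in seen:
            continue
        seen.add(point)
        for obj in fetch(point):
            if not is_valid(obj):
                continue  # rejected objects never contribute a pointer
            cache.append(obj["name"])
            child = obj.get("publishes_at")
            if child:
                todo.append(child)
    return cache


cache = gather("iana", lambda p: REPOS.get(p, []), lambda o: o["valid"])
print(sorted(cache))
```

Because the forged certificate fails validation, its repository is never visited, so nothing published there can reach the cache even though the objects in it claim to be valid.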

Fake IRR data would look like this. It just says the ROA and the ASN, but what is interesting about it is this was produced by something which is cryptographically authenticated all the way down from the IANA. So this is far more trustworthy than the current IRR data, which is very untrustworthy, except for a few geographic areas in the world that really worked hard on it. You could make a CSV file of your work flow, being just prefix and AS. I don't know why these are called CSVs any more because they always seem to be tabs but I still go to record stores and dial a telephone. Stores don't have records and telephones don't have dials. Then you can run this into whatever your automated tool system might be. But this is the one I like. It is the one where you describe your ROAs, et cetera, to GUI and it puts it in the crypto engine and publishes it in repositories and it is gathered and given to a local cache that I like to have in my POP, and then given to the router.

I don't want my router to have to handle 2 million certificates, another cryptoblock. I want my router just to have to eat something like this: the prefix and where the AS originated. That's all the router needs to know. This RPKI router protocol is going to throw away all the stuff the router is not really interested in and just give the router what it wants. Now I am going to wander through some other stuff before I describe that. Let's remember, I am going to have the cache and I want it in my POP. The reason I want it in my POP is when I say this is just giving the router the IP prefix and the AS number, is that that no longer has the certificates and no longer has the crypto security. So up until now we have been using what we call object security, all the objects are signed, they are in a hierarchy, they are formally validated.

Now, we are going to transport security. In other words, this protocol is being carried over SSH or something, and does not have the crypto information in it and that means I'm trusting this transport, which means I kind of want it near my POP, or near my routers in my POP, which means I have a lot of POPs, which means I have a lot of these and I don't want to be doing the RCynic gatherer from the entire Internet in every one of my POPs. Aside from loading the central servers too much, it is a lot of time and money. So what I might do instead is have a cache or two in each continent, getting the data from the global RPKI, and then they all serve each other's caches. The trick is, these caches have all the crypto in them, they haven't given it to routers yet, so each one of these caches can and should validate the entire tree, given the trust anchor. Therefore, all of these are valid caches and they are talking to each other. As you notice, they use redundancy to get the stuff around. Very little load is placed on the global servers. That's nice. Now we have all these ROAs around. How do they affect BGP updates? They are published and in my NOC I publish my stuff, but in the POP I get the validated cache, I strip the crypto in the RPKI router protocol and I give it to the router to make decisions. This is the RPKI router protocol. The router says, give me some data. The cache says, this is a bunch of IPv4 and IPv6 prefixes. The router says, OK, I have all the ones up to a certain serial number, give me everything since then, et cetera.
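The serial-number exchange at the end of this passage can be illustrated with a toy model. The `ToyCache` and `ToyRouter` classes are hypothetical and exist only for this sketch; the real RPKI router protocol carries the same idea in binary PDUs over a transport such as SSH, and the prefixes other than 98.128.0.0/16 are made up.

```python
from typing import List, Set, Tuple

# (prefix, max length, origin AS) with the crypto already stripped.
Payload = Tuple[str, int, int]


class ToyCache:
    """Hypothetical cache that numbers every change with a serial."""

    def __init__(self) -> None:
        self.serial = 0
        self.log: List[Tuple[int, str, Payload]] = []

    def announce(self, item: Payload) -> None:
        self.serial += 1
        self.log.append((self.serial, "add", item))

    def withdraw(self, item: Payload) -> None:
        self.serial += 1
        self.log.append((self.serial, "del", item))

    def changes_since(self, router_serial: int):
        # Return the current serial and only the changes the router lacks.
        changes = [(op, item) for s, op, item in self.log if s > router_serial]
        return self.serial, changes


class ToyRouter:
    """Hypothetical router side: remembers the last serial it has seen."""

    def __init__(self) -> None:
        self.serial = 0
        self.table: Set[Payload] = set()

    def sync(self, cache: ToyCache) -> int:
        new_serial, changes = cache.changes_since(self.serial)
        for op, item in changes:
            if op == "add":
                self.table.add(item)
            else:
                self.table.discard(item)
        self.serial = new_serial
        return len(changes)


cache = ToyCache()
router = ToyRouter()
cache.announce(("98.128.0.0/16", 24, 3130))
cache.announce(("192.0.2.0/24", 24, 64500))
first = router.sync(cache)   # full set on first contact
cache.withdraw(("192.0.2.0/24", 24, 64500))
second = router.sync(cache)  # only the one change since the last serial
print(first, second, sorted(router.table))
```

The point of the design is that a router which is already up to date pays almost nothing to stay that way, which matters when many routers hang off one cache in a POP.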

An IPv4 prefix looks like this. It is essentially a bunch of gibberish, type 4 for an IPv4 prefix, the prefix length, the prefix, the maximum length and the autonomous system number. The two things you really care about are the prefix and the autonomous system number. In IPv6, 96 more bits, no magic, looks the same. The BGP updates that come into the router are compared with the ROA data we now have in the router. So the router got the ROA data from the RPKI protocol and has it in its tree. It gets BGP announcements and when a BGP announcement comes in, to mark that announcement, it looks in this cache of ROAs and it says, for that prefix, for that AS, is there a ROA that matches? If so, it is valid. If there's a ROA with a different origin and none for the right origin, it is invalid, otherwise there is no ROA for it. So it marks it as valid, invalid or not found. To be noted, what it does is mark it. The router makes no judgment. It marks it for you. What you do to make this all happen is really simple -- excuse the Cisco-ness of examples, you Juniper people -- you say, hey, I'm talking about BGP and you say, here is the RPKI server, here is the port, here is how often I want to refresh the data. I can have multiple servers for redundancy, and the router handles it all.
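For readers who want to see the "gibberish" concretely, here is a small Python sketch that packs and unpacks one IPv4 prefix record. The field order is an assumption taken from the IPv4 Prefix PDU as later published in RFC 6810 (version, type 4, reserved, length 20, flags, prefix length, maximum length, reserved, prefix, AS number); the slide shown in the talk may have differed in detail, and the function names are hypothetical.

```python
import struct
from ipaddress import IPv4Address, IPv4Network

# Network byte order: version, type, zero, length, flags,
# prefix length, max length, zero, 4-byte prefix, AS number.
PDU_FORMAT = "!BBHIBBBB4sI"
IPV4_PREFIX_TYPE = 4


def pack_ipv4_prefix(prefix: str, max_len: int, asn: int,
                     announce: bool = True, version: int = 0) -> bytes:
    """Build one IPv4 prefix PDU (layout assumed from RFC 6810)."""
    net = IPv4Network(prefix)
    flags = 1 if announce else 0  # 1 = announcement, 0 = withdrawal
    return struct.pack(
        PDU_FORMAT, version, IPV4_PREFIX_TYPE, 0,
        struct.calcsize(PDU_FORMAT), flags,
        net.prefixlen, max_len, 0,
        net.network_address.packed, asn,
    )


def unpack_ipv4_prefix(pdu: bytes):
    """Return (prefix, max_len, asn, announce) from a packed PDU."""
    (_version, pdu_type, _zero, _length, flags,
     prefix_len, max_len, _zero2, addr, asn) = struct.unpack(PDU_FORMAT, pdu)
    if pdu_type != IPV4_PREFIX_TYPE:
        raise ValueError("not an IPv4 prefix PDU")
    return f"{IPv4Address(addr)}/{prefix_len}", max_len, asn, bool(flags & 1)


pdu = pack_ipv4_prefix("98.128.0.0/16", 24, 3130)
print(len(pdu), unpack_ipv4_prefix(pdu))
```

Note how little is left once the certificates are gone: a prefix, a maximum length and an origin AS, which is why the transport between cache and router has to be trusted.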

Once I have configured it, I can say, hey, let me see the status of the server, and it gives you the normal nonsense, I have connected to it, port number and all that stuff, the typical things that routers throw at you. I can look at the table the router got. It shows prefix and maximum length and the AS and which server it got it from. The BGP dump table itself has been modified to show "valids", "invalids" and "not founds" in the dump table directly. If you show a prefix, remember way back when I created a ROA for 98.128, up to /24, for AS 3130, so it says, yes, that's valid. Here, look at that, that's the wrong AS originating it. It marked it as invalid. The router now knows and has told you there's an attack or there's an accident -- whatever we want to call it. The result of the check is one of three things: it is valid if a matching and covering ROA was found with the correct AS number; it is invalid where a matching or covering ROA was found and the AS number did not match and there was no other valid one. Remember, there can be two ROAs or five ROAs out there. If one of them is valid, it is valid. But if none of them are valid, invalid.

If there was no matching or covering ROA found, it is not found, which is the same as today. This is all designed for incremental deployment. Here is a valid announcement. 27318 is indeed authorised for 192.158.248.0/24, and I get it from two of my peers and they both tell me that 27318 is the origin AS, even though it has different AS paths. They are both valid. Here is an invalid one as we saw before. 3927 is not supposed to be announcing that prefix. I got this announcement, it is invalid. Here is one that didn't find a matching and covering ROA. So what are the matching rules? This gets a little boring and also the term "ROA" gets changed to "VRP" because we do not have the ROA in the router, we have a ROA where the crypto has been stripped, so it's now called a validated ROA payload instead, just to confuse us all. I will still accidentally use the term "ROA", so as not to confuse myself. First we are going to define when a prefix is covered by a ROA. That is when the prefix length of the ROA is less than or equal to the prefix length of the announcement. So this /12 ROA or VRP is less than -- is the same prefix of course, and is less than or equal to the prefix length. Here it also covers because a 16 is less than or equal to a 16. Here it is too long -- the 20 doesn't cover the 16. So this ROA is irrelevant to the decision if it's there, because it doesn't cover at all.

Now that we have defined covering we can define matching. A prefix is matched by a VRP/ROA when the prefix is covered, as defined on the previous slide, and the length is less than or equal to the maximum length -- in other words, it has to be between these two -- and the AS matches. So here we have a BGP announcement that the /16 that is between 12 and 16 and the AS matches. So this is matched. So if this ROA exists in the router, when this announcement comes in, this announcement will be marked as valid. Here, the ROA is for 16 through 24, which is in range but the AS is 666, which is not the same as 42. So if a ROA that does match is not hanging around -- and this is the only ROA -- it is a mismatch and will be marked invalid. Here we have a ROA for 20 through 24. Oops, not in range, so it makes no difference that the AS is the same, this will result in not found.
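The covering and matching rules above can be written down as a short Python function. This is a minimal sketch of the logic as described in the talk, not router code: the `VRP` tuple and the function names are hypothetical, and 10.0.0.0 is used as a placeholder because the slides' actual prefix is not in the transcript.

```python
from ipaddress import ip_network
from typing import Iterable, NamedTuple


class VRP(NamedTuple):
    """A ROA with the crypto stripped: prefix, maximum length, origin AS."""
    prefix: str
    max_len: int
    asn: int


def covers(vrp: VRP, announced: str) -> bool:
    """A VRP covers an announcement whose prefix lies inside the VRP prefix."""
    vrp_net = ip_network(vrp.prefix)
    ann_net = ip_network(announced)
    return vrp_net.version == ann_net.version and ann_net.subnet_of(vrp_net)


def validate(announced: str, origin_asn: int, vrps: Iterable[VRP]) -> str:
    """Mark an announcement as 'valid', 'invalid' or 'not found'."""
    ann_len = ip_network(announced).prefixlen
    covered = False
    for vrp in vrps:
        if not covers(vrp, announced):
            continue  # irrelevant to the decision
        covered = True
        # Matching: length within the maximum length and the same origin AS.
        if ann_len <= vrp.max_len and vrp.asn == origin_asn:
            return "valid"  # one matching VRP is enough
    return "invalid" if covered else "not found"


# The three cases from this slide, for an announcement of a /16 from AS 42.
print(validate("10.0.0.0/16", 42, [VRP("10.0.0.0/12", 16, 42)]))
print(validate("10.0.0.0/16", 42, [VRP("10.0.0.0/16", 24, 666)]))
print(validate("10.0.0.0/16", 42, [VRP("10.0.0.0/20", 24, 42)]))
```

The three calls correspond to the matched, mismatched and out-of-range cases in the paragraph above. Note that the function only returns a mark; what to do with an invalid route is left to policy, as the talk goes on to explain.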

Here are some examples, just to bore you completely. Here are two ROAs that exist in the router for the same prefix of 16 through 24 for AS6, 16 through 20 for AS42. We get this announcement of the /12. The /12 is not in that range and it is not in that range. So it is shorter than the VRPs so it is going to be not found. This BGP announcement has no covering VRP. Here is a /16. It matches both, covered by both of these. It is AS42. Yes, it matches this one, it is going to be marked valid. Here is a 20 for 42, yes, fits in there, AS42, matches, valid. Here is a 24 AS42. 24, oops, doesn't fit in there, fits in there -- AS6, not 42, wrong AS, invalid. Here is the same 24 with an AS6 announcement, matches, valid. That's the end of this boring little detail. There is a problem, though. Maybe I haven't upgraded the software on all my routers or maybe my Moscow POP does not have a connection to an RPKI server. So here is my network and I have iBGP full mesh in my network, and this router in New York gets an invalid announcement, this router in Seattle gets a valid announcement with the same prefix but probably a different AS. This router gets the same prefix, "unknown". Which does this router choose and why? We are not going to modify the BGP protocol to carry validity state. That would get ugly. Even if we did it, she would get three different validity states and not be able to choose.

The key here is I already have a whole lot of routing policy inside my ISP. What I want this stuff to do is blend in with it. I want to use my existing routing policy. I want to be able to test the mark that the router put on the route and then set local policy. And so, using Cisco syntax -- excuse me, Juniper people -- I can make a route map and I can match validity state and, for instance, I can set local preference. So here I take valid routes and make them 100, I take not found and make them 50 and invalid ones are dropped. I will not accept invalid announcements, I will accept valid and not found and I prefer valid. Here is a very paranoid policy. I match valid, I accept it, and everything else I throw away. I am running a secure installation and I only want to listen to validated stuff. Here I am going to set a community. 400 if it was valid, 200 if it was invalid, 300 if it was not found. And that community is known by everybody inside my autonomous system and they can all each have their policy based on that. Another example I like to use is I have worked with a friend of mine, Steve Bellovin, who is a well known security researcher and he only wants the invalid ones. He is studying attackers. So he is going to have a policy that says permit invalid and drop everything else. This is the strength of me being able to control my own policy. Again, we have marked them valid, invalid and not found, we applied policy, and there is a MIB for it and there are syslog entries, et cetera, so you can monitor it. And in your log you can have every time an invalid announcement comes through or you can query your router with SNMP, saying, show me all the invalid announcements that I have heard, et cetera. It is in the routers, it is in running code in the routers and it is integrated in the whole monitoring and measuring world. It is notable that this is a routing protocol for which the MIB was written early, usually it is five years later.
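The policies described here are all small functions from the validity mark to an action. The Python sketch below models them that way; it is not router configuration, the function names are hypothetical, and only the numbers (local preference 100 and 50, communities 400, 200 and 300) come from the talk.

```python
from typing import Optional

VALID, INVALID, NOT_FOUND = "valid", "invalid", "not found"


def prefer_valid(state: str) -> Optional[int]:
    """Local preference for a route, or None to drop it.

    Valid routes get 100, not-found routes get 50, invalid routes are dropped.
    """
    return {VALID: 100, NOT_FOUND: 50}.get(state)


def paranoid(state: str) -> Optional[int]:
    """Accept only validated routes and throw everything else away."""
    return 100 if state == VALID else None


def researcher(state: str) -> bool:
    """Keep only invalid routes, for someone studying mis-originations."""
    return state == INVALID


def tag_community(state: str) -> int:
    """Community value that tells the rest of the AS what the mark was."""
    return {VALID: 400, INVALID: 200, NOT_FOUND: 300}[state]


for state in (VALID, NOT_FOUND, INVALID):
    print(state, prefer_valid(state), paranoid(state),
          researcher(state), tag_community(state))
```

Tagging with a community is what answers the earlier iBGP question: the router that has RPKI data marks the route once, and routers elsewhere in the AS can apply their own policy to the community without needing validation data themselves.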

There is a problem. There are always problems, of course. But we have a potential social problem, in that the poor RIRs have always said, we don't do anything about routing, take that away, we don't want to talk -- we will talk about it because it is technically interesting but we don't control it, we don't want to control it. That's between ISPs. But all of a sudden they are in the certificate path. So we have what we call the Dutch court attack. Somebody can come to the court in Amsterdam and tell RIPE, withdraw that certificate or withdraw that ROA. Now, of course the people who have guns or who own the governments, like big media and the banks and everybody, who actually own our governments, already have control of all this. I'm sure you have all read of the Department of Homeland Security ICE, tearing down 120,000 domains of people that the media industry didn't like, et cetera. So what's really happening here is that right now they are using big hammers and bombs to attack people they don't like. This will give them a scalpel. They will be able to be much more precise. They are still going to be able to do it, but they are doing it today. When people with guns and money and lawyers want something, they usually get it, in my culture. What this does is it makes it a much more delicate tool and one that has significant benefits, ie routing security. So it is a trade I am willing to make. But we should be warned that from my ROA there's a certificate chain and it goes all the way up to IANA and it is vulnerable to either attack or stupidity by anybody in the chain.

For instance, we should note that these are not identity certs. I do not know who this certificate really is. I just know it is this prefix, it got it from there, allocated it here, et cetera. So that is not an identity cert. There could be sloppy admin here, there is a certificate that is soon to expire. All the software I have shown you, by the way, will warn you, in three weeks your great-grandparent certificate is going to expire if they don't renew it. It sends email and does all the normal things you want. But it could be, if they don't do something, it could make my ROA invalid. So the ROA would become invalid, it will become expired actually. My announcement would become "not found", unless there was a covering prefix. So who do I call? What do I do operationally? Who are you going to call? Ghostbusters! There's a record, just like a ROA, that a certificate can have, the Ghostbusters record, which has a human name and an organization. It is a stripped vCard, heavily stripped, it only allows names, addresses, telephone numbers and email addresses. So I know who to call. I call them up and I say, hey! If they still don't act there's also a draft in the IETF which shows how this parent could reach around the broken link and issue a certificate to cover the problem until it's fixed. It's called something or other grandparent. Let me end, you are controlling your policy, right? You, as the receiver, can say -- you could accept invalid announcements if you want. So if somebody does the Dutch court attack and makes access for all -- even more interesting, is Dmitry still here? -- makes a Russian prefix invalid, Dmitry can put out a message on the operators' list and the operators can easily override.

I personally believe, and the operational considerations recommend, reject invalid. Because if you don't, what's all this about? Like an engineer -- because I am one -- I have to tell you about all the possible downsides. All the software I was talking about is either shipping router code, and you know who to talk to, or it is open source, BSD licence, running code, blah, blah, blah, blah, which you can get there. That essentially gets you origin validation. As I said, the RIRs, except for ARIN, are supporting it today. And ARIN is supporting it at the end of the year, at the end of 2009, 2010, 2011, now they are saying the end of 2012. I'm not betting on it.

I think APNIC's training folk are going to teach you how to use their web interface to generate ROAs today. You don't happen to have slides on that, do you, George? Talk about putting a man on the spot.

George Michaelson: No.

Randy Bush: Future work now. There's a problem. Origin validation will stop accidents. It will not stop malicious attacks. Accidents and fat fingers are 99 per cent of the problems that we actually see today. But we are still paranoid. A malicious attacker could announce and could forge an origin of the correct AS and then pass it to the Internet, thereby attacking. It would pass origin validation but it would mean the packets went to the wrong place. The solution being proposed is per prefix AS path validation; in other words, not just the origin but the entire AS path is validatable, to protect against origin forgery and monkey in the middle attacks. It is not merely showing that the AS path is not impossible, it is showing that it is the AS path.

One little piece of religion, and to put this in perspective, is I don't know what Mary intended with her announcements, I only know what she did announce. So I can't know what Mary's business relationships are with her peers and other ISPs, I can only know what she announced in BGP and I can formally validate the chain. What Mary meant to announce and what her business relationships are are policy. I do not wish to create another policy database. We already have a way to distribute policy on the Internet. It is called BGP. Policy changes continuously, new customers are brought up, peers, circuits go up and down, business relationships change, et cetera. So I don't want to track policy. What I'm trying to protect is the protocol has not been violated. Nobody is cheating the protocol and using that to redirect traffic.

An example of the way things used to be in the innocent days when everything ran correctly. B announced his prefix to W, who announced the prefix to Z and X, who announced it to A. So A got told, hey, if you want to get there, the lowest cost route is XWB. So the money in the packets flowed this way. Z is the attacker. Z lies. Z forges an announcement that says B is the correct origin, so origin validation works, but Z says, hey, I can get you there without W in the middle. I can get you there cheap and the dollars flow this way and they kind of stop here and the rest of the packet goes there with the dollars removed. This is not good. This is a very simplistic attack but the important thing to note is it is a path attack. The origin was correct everywhere, the origin validation doesn't solve it, it's a monkey in the middle attack, it's a diverted path. So the way BGPsec solves this problem is through something called forward signing. When B hands the announcement to W, B signs the announcement. B says, I am signing the announcement saying I am B, sending it to W.

So the way it looks is B takes the prefix and so on and so forth and says, I am AS1 and I'm signing it to AS2 and puts the signature there. AS2, when signing it to AS3, signs over the signature and the rest of the block and puts the signature here. So what you have formed is a chain of signatures that is the AS path. That means you can formally cryptographically validate the AS path. So what will happen is B sends it to W, and B says, I am sending it to W and I sign it with my key and send it to W. W can then sign, B sent it to me signed, forwarding it to X, signing it with W's key, and so on and so forth. Z cannot sign a message saying that B gave it to him because B never signed a message saying B was sending to Z. Therefore, Z cannot tell X that he has a shortcut to B. This is a change to the BGP protocol. So it is going to be a negotiated capability. A says, I am capable of this. Are you capable of it? Yes. To be noted, origin validation I think of as a thousand points of light. Any router can turn it on, be enabled, detect, react, et cetera, anywhere in the Internet. What their connectivity is makes no difference.

This requires agreement with my neighbours, so it's going to deploy in expanding islands and then the islands will interconnect with each other. The capability is needed partly because signatures will make BGP fatter -- you probably don't know it, but BGP has a limit of 4096 bytes in a message. That's being removed now. If it's not agreed, BGPsec falls back to traditional BGP data, so if I'm neighbours with George and we speak BGPsec and I go and hand it to Sunny, who doesn't, I strip all the signature information and hand him a plain, normal, vanilla BGP announcement. It requires per router keys. You could do per AS if you want, and that works perfectly well with the protocol but the protocol allows and kind of assumes per router keys because I don't want compromise of my New York router to mean that somebody can therefore get at my Moscow router. It is a more complex certificate and key distribution mechanism and there is an Internet draft on whether the router generates a key pair or has it downloaded to it, et cetera. As I said, you can have one key per AS.
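The fallback behaviour can be sketched in a few lines. This is only a toy representation: the dictionary layout, the `"bgpsec"` capability label and the function names are assumptions made for illustration, not the wire format.

```python
def bgpsec_negotiated(local_caps, peer_caps):
    """BGPsec is used on a session only if both ends advertised the capability."""
    return "bgpsec" in local_caps and "bgpsec" in peer_caps


def update_for_peer(update, local_caps, peer_caps):
    """Return the announcement as it would be handed to one neighbour."""
    if bgpsec_negotiated(local_caps, peer_caps):
        return dict(update)  # keep the signature chain intact
    # Peer does not speak BGPsec: strip the signatures, keep the plain AS path.
    return {k: v for k, v in update.items() if k != "signatures"}
```

So an update handed to a BGPsec neighbour keeps its `signatures` entry, while the same update handed to a neighbour that did not advertise the capability carries only the prefix and AS path.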

The structure for this is very much the same as before, except instead of a ROA -- remember the certificates also had AS numbers -- the AS number certificate signs router certificates that have the router ID plus the AS number. Big deal. This only happens at provider edges. I do not need it on IGP or iBGP. It is inter-provider at the edges only. The iBGP is going to have to carry it, so it comes in signed on the left and goes out signed on the right, but nobody internally has to check it. There is a cool little feature, in that an end site, which is somebody who has two upstream providers and they are a university or a business or something, if I can trust those two upstream providers to already run BGPsec and validate, I don't have to have the whole database, I don't have to validate, they validate it for me. All I have to do is sign my own prefix and send it upstream. So I only sign and not validate, which means it can run on current hardware. You have got some junky little router at your edge, all it has to do is hold one key and sign your BGP announcements. What's interesting is of the 30,000 ASes on the Internet, 84 per cent of them -- and that's been a constant for over 10 years and we don't know why -- are what we call the stub ASes, because they don't offer transit to anybody below them. 84 per cent of the Internet is stub ASes, so most sites don't have to do anything except sign their own prefixes, no hardware upgrade required. It is meant to be incrementally deployable, it doesn't require a flag day, it doesn't ask you to publish who your peers or your customers or anything are, which operators generally do not want to do. That's the end of the story.
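One rough way to see what "stub AS" means is to count, over a set of observed AS paths, the ASes that only ever appear as the origin. This is a heuristic sketch under stated assumptions: paths are listed nearest hop first with the origin last, any AS seen in a non-origin position is treated as offering transit, and the function name is hypothetical. It is not how the 84 per cent figure was measured.

```python
def stub_ases(as_paths):
    """Return (set of stub ASes, fraction of all seen ASes that are stubs)."""
    seen, transit = set(), set()
    for path in as_paths:
        # Collapse AS-path prepending (the same AS repeated back to back).
        hops = [asn for i, asn in enumerate(path) if i == 0 or asn != path[i - 1]]
        seen.update(hops)
        # Every AS except the origin has passed on somebody else's route.
        transit.update(hops[:-1])
    stubs = seen - transit
    fraction = len(stubs) / len(seen) if seen else 0.0
    return stubs, fraction
```

Under this heuristic, the ASes in the returned set are the ones that would only need to sign their own prefixes.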

This work, by the way, has been supported by the people who take your scissors away and won't let you into my country but they are also the ones who funded a lot of the DNSSEC and it is all open source and they are doing it supposedly to make the Internet a more open and safer place. ARIN provided a bunch of initial funding; of course employers, et cetera, Cisco, Juniper, Google gives us racks, NTT transport to do the experiments, et cetera. And that is my story and I'm sticking to it. Are there questions? Come on, George, throw a rock. No? How disappointing. We go to lunch early.

APPLAUSE