Conference.apnic.net/36/program#session/61755 >>Mark Tinka: Good morning, everyone. We are just waiting for the slides to load up. Welcome to this RPKI operational panel. While the slides load up, I'll just ask each of the panelists to introduce themselves briefly and I'll give a brief overview of the panel, and we will then get going. We will start with you, Randy. >>Randy Bush: Hi, I'm Randy Bush from Internet Initiative Tokyo, and I have been working on this routing security stuff for a little over 10 years, and I work on the open source implementation of the RPKI part of the software. >>Geoff Huston: Hi, I'm Geoff Huston. I'm with APNIC. I happen to have made the terrible blunder of thinking that maybe we should complement our Whois registry publication service with certification products, back in about the year 2000, and I have never stopped regretting that thought ever since. >>Tomoya Yoshida: Good morning. My name is Tomoya Yoshida, with Internet Multifeed and JPNIC. In Japan we have been testing RPKI for two or three years now, so I would like to share with you some issues about BGP routers, et cetera. Thank you. >>Taiji Kimura: Good morning. I am Taiji Kimura, and I am a research engineer at JPNIC. JPNIC is doing testing, as Tomoya Yoshida mentioned, and I want to share with you the situation, what's happening and the questions from the operators. Thank you. >>Matsuzaki Yoshinobu: Good morning, Yoshinobu Matsuzaki from IIJ. Probably I'm here as an ISP, so as an RPKI user: issuing ROAs, doing verification and also running the routers. >>Mark Tinka: Thank you very much. My name is Mark Tinka, I'm with Seacom and I'll be hosting this panel. Unfortunately, we will be losing Randy in a few minutes, as his plane is currently loading up its catering, so we will try to get through as much as we can with him, then continue with the rest of the panel. 
We will look at a few operational concerns that are starting to come up as part of the RPKI implementation, that we have not much experience with, but starting to ask questions now, so we can understand what to expect and how to plan for that. For those who are unaware, RPKI, resource public key infrastructure, is all about securing BGP routing and asserting that an AS number is allowed to originate such and such prefixes. From a researcher's point of view, what does that mean as an operator to run RPKI? What should they consider? As we gain more and more experience, we hope these answers will come much easier to us. First, I have on the list about certificate authority models. A couple of models have come up, where we have hosted models by the RIR and we have models that are delegated to ISPs to run CAs for their customers. There is potential also for customers to have their own certificate authorities and upstream those through the chain. Perhaps starting from the extreme right, Randy, what do you think this will evolve into? >>Randy Bush: I think both are being used and I think both need to be used. 90 per cent of the customers just want to go to a website and register their ROAs. 10 per cent or less -- 2 per cent of the customers, they are all the big ISPs, right, but there's very few of them. They want to run the delegated model, where they are also running a server and you issue their resources to them. What's amusing, as John Curran pointed out some years ago, is those 2 per cent of the customers represent 90 per cent of the address space. They are the big ISPs. So 2 per cent of the customers will be delegated with 90 per cent of the address space, 89 per cent of the customers are going to go to the web GUI and register their ROAs. >>Geoff Huston: I can't help thinking there are some parallels between this and the route registry model. 
Originally the route registry model had 1, 2, 3, then 10 and then 20, because there was almost no limiting function and folk ran their own route registries. Then we started hearing presentations about how the contents in one route registry differed remarkably from another and that kind of didn't work. It scaled too well and everyone ran in their own direction. There is certainly a case that for many folk, as Randy said, they have the wherewithal, the procedures and the need to run their own security infrastructure, and quite frankly, that's a business decision that for them was imperative -- no choice. For others, however, it is just something that would be good to have but if someone else has all the machinery, pedals all the bicycles, makes it all work, just fine. The RPKI, however, does one thing that the route registries didn't do, it does bind it all together. The PKI is a hierarchy. You can't just run your own little island completely independently of anyone else. The addresses came from somewhere and the certification structure does reflect that. This does give folk variance and ability to do what they want but at the same time stops complete independent proliferation of conflicting information. You can't go all the way out there and do your own thing. The addresses that you're asserting things about and trying to make signed attestations about rely for their validity on the party you got those addresses from, et cetera, all the way back to the IANA. To some extent it is a better model than we saw in the route registries in terms of that flexibility, you can do it either way, but at the same time, as I said, you can't run all the way out the room and invent your own. >>Tomoya Yoshida: I don't have enough idea for those, but I think we need more scalability and -- how to say -- the latest registry for the data. 
>>Taiji Kimura: My aspect from the registry is how a registry can provide very useful and secure infrastructure or information service to members or the other routing operators. The PKI has its own very unique system that has a trust chain and trust point, and the registry maybe needs to provide trust anchors for users. I would like to mention about that later, how the registry can provide information service routing registry or the ROA interfaces is my question and aspect. >>Matsuzaki Yoshinobu: I think any model will be okay if it's reliable and useful. That's a point. So if we can trust the structure, okay, then we can trust. But if it's somehow we can't trust, like the current IRR model, then, hm, I feel ISPs will not use it. But if it can be trusted, then we will use it. That's all. >>Mark Tinka: Potentially we have an option for either model, but just as you said so far, it's quite likely that we shall need to see how this evolves, moving forward. The next one is on trust anchors. We have quite a few to choose from today. The question I have in my mind for the panel, and obviously for the floor, is will there be a single one we can refer to at some point? Is that a good or a bad thing? Will operators, like many of us here, prefer a distributed model and if there are lots and lots of trust anchors out there, where do we find the authoritative data about who they are? Randy? >>Randy Bush: The current situation is like DNSSEC before the root was signed. It's a mess. People won't choose it. It doesn't work, et cetera. IANA is where you get your addresses, IANA is the root; anything else is political games. >>Geoff Huston: Does anyone have a domain name certificate? Is everyone reading email? George has one. Thank you. How many folk issue domain name certificates? 150 or so. How many folk are registered certification authorities from those 150 points of trust? Well, about 1,500. 
I assume, George, you picked the most reputable one and paid the highest price because you valued your domain name, obviously. But the problem is that with so many to choose from, your browser or my browser doesn't know which one you picked. So over there is someone who hacked the cheapest, lousiest CA out there and went and minted a false certificate for your domain name, George. If my system asks for a certificate and the answer comes from the bad one, I've got a problem. So part of this issue is that if you have too many TAs, the entire system is only as secure as the worst, and there is no financial incentive to be any better because everyone is only as good as the worst. So I heard this week even more proposals to run trust anchors coming from some of the nations in this area. I believe there was mention from Korea and possibly some thoughts in Japan. But if you have a system with 190 trust points and you don't know which one to look at when you have an attestation, we're only as good as the worst. And in the Domain Name System, that's proved to be a less than good view. So Randy has been saying, maybe we should all go to the IANA and just use an IANA-based root certificate. That view is fine in all worlds except the political one, and unfortunately we live in the world where the IANA is a functional contract undertaken under the auspices of the Department of Commerce of the United States Government. While in some folks' minds it is okay for a single nation state to hold a privileged position in global communications, other countries view that with legitimate concern. If we had an IANA that did not have that functional contract overtone and had a broader view of its role, those concerns might be ameliorated, but we have to recognize that those concerns are real and that various other nation states view their national and the global infrastructure as being incredibly important -- so important that single nation states should not hold privileged positions. 
There is no clear answer here and there is no clear technical outcome. I fear that we will proliferate trust anchors. I fear we will go down the path of domain name certificates. And I fear that that will devalue the outcome. Maybe the answer is about -- he's going to steal it back as soon as I finish the sentence. Maybe the focus is about how we can make IANA a trustable entity that is capable of sustaining that trust, without the political overtones that it currently has. >>Randy Bush: Move the IANA the hell out of the United States. They are the only ones -- it doesn't do any good, by the way. The NSA spies on us, no matter where the hell we are. >>Geoff Huston: Spies and edits. >>Randy Bush: Yes. Okay. It's the editing that is the problem, well said. But the fact is IANA is the one allocating the addresses; they are the only ones who can say what address space APNIC has, therefore, they are the one who has to give the root certificate. I agree with the political problem, but there is not a technical solution of having 36 trust anchors. Currently, if you take a default Windows system, you have over 300 root certificates in your browser. I hope this doesn't make you feel safe. Okay? The solution to the political problem is a political solution -- get IANA the hell out of the United States. >>Taiji Kimura: About this issue, I have two aspects. One is the designing of the trust and the system that can express the trust, like a bank account, how we trust the bank, there are many banks and credit cards also, and the merchants at the stores, they trust the card that is shown by the customers. So they need to trust something or someone trustful. Then a web browser has a very huge number of trust anchors in a realistic way, so we download the browsers that have trust anchors inside them without our confirmation. But the new RPKI is at the point where we can design a new way of trusting the anchors. 
The second one is that the trust is made by users, not registries or some authorities or governments or other things, so I am hoping to build the trust mechanism which is selectable for users or understandable or acknowledgeable way of trust. >>Matsuzaki Yoshinobu: I always prefer the simple way. >>Mark Tinka: Get IANA out of the US, you say? >>Matsuzaki Yoshinobu: No, it's okay. One is okay. >>Mark Tinka: That's simple. >>Matsuzaki Yoshinobu: Simple. To avoid any misconfiguration, because so far we made a lot of misconfigurations in the Internet, and all we know about that, so in the future, to avoid such misconfiguration, I always prefer simple way, that's all. >>Mark Tinka: Thank you. >>Steve Kent (BBN): I want to comment on what Taiji said. This should not be about trust, it should be about authoritative. The problem we have in the web browser model is that we throw around the term "trust" and it is not a very good term. It's not transitive, it's not quantitative and we get into a lot of trouble. So fewer is better. One is ideal. Randy, are you starting the nonprofit fund to move the IANA out from under the DOC? Can we contribute? Is that a 403C corporation? Is it tax deductible? I will write a cheque, but I just wanted to find out. I think we should try to avoid using the term "trust". I disagree with the notion of having users trust it. Users are really bad at making value judgments in this area. We want to keep it simple, as Mats pointed out, and in this case simple is authoritative. If I looked at your credit card example, it is very different. Credit card issuers identify the users uniquely, and it is not the name, it is the 16-digit account number. They are authoritative for issuing those account numbers and the space is divided up so there are not collisions and there is a single authority that is actually responsible for doing all that, that coordinates between American Express, the international MasterCard, et cetera, to avoid that problem. 
They are a good model in certain ways but we should not be talking about trust here, in my opinion, if at all possible. Thank you. >>Randy Bush: When there are multiple trust anchors and I have to select from them, we are talking about trust, and that's what's broken. >>Mark Tinka: Okay, good. The next one on ROAs. >>Geoff Huston: I should add one more thing, and think it is part of the problem and why it is so sensitive. I am going to bring up the issue of the way in which we validate certificates in the RPKI. There is no doubt that given a spectrum from tolerant to extremely fragile, we went way out in the extreme fragility. So the margin for error here is zero and the margin for tolerance is zero. So let's say that I am certifying Randy for a bunch of addresses and I do not believe, in amongst the entirety of his v4, v6 and AS number holdings, I do not believe he should have AS number 53. But Randy thinks he does and issues certificates to his subordinates that includes that resource. He and I disagree about no other aspect of the holdings, except that single AS number. But because I am the issuer and Randy is the subject, all those certificates are not valid any more. It seems a fragile choice and it seems a choice made from the purity, and that world of "It's the certificate". It's not the certificate, it's the resources. That makes this discussion about TAs and the whole issue that if the TA or anyone in the hierarchy holds a differing view of the resources that are administered by the subordinate -- Steve will come and answer me in a second, I can see him warming up -- >>Randy Bush: Let's not have this discussion here, let's take it back to the IGF. >>Steve Kent (BBN): Who started it? >>Geoff Huston: This is an important discussion, because it is fragile. 
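Editorial sketch: the all-or-nothing behaviour Geoff describes above can be illustrated in a few lines of Python. This is a simplified model under stated assumptions, not the validation code of any real relying-party tool. The dictionaries, the helper names `covers` and `certificate_is_valid`, and the documentation prefixes and private-use AS numbers are all hypothetical. AS 53 is taken from his example.

```python
from ipaddress import ip_network


def covers(parent_prefixes, child_prefix):
    """Return True if child_prefix lies inside at least one parent prefix.

    Simplification: a real resource set could also cover a child through
    the union of several adjacent parent prefixes; that case is ignored here.
    """
    child = ip_network(child_prefix)
    return any(
        child.version == parent.version and child.subnet_of(parent)
        for parent in map(ip_network, parent_prefixes)
    )


def certificate_is_valid(issuer, subject):
    """Strict containment: every resource the subject claims must be held
    by the issuer, otherwise the whole certificate is rejected."""
    prefixes_ok = all(covers(issuer["prefixes"], p) for p in subject["prefixes"])
    asns_ok = set(subject["asns"]) <= set(issuer["asns"])
    return prefixes_ok and asns_ok


# Hypothetical holdings: issuer and subject agree on everything except AS 53.
issuer = {"prefixes": ["192.0.2.0/24", "2001:db8::/32"], "asns": {64500, 64501}}
agreed = {"prefixes": ["192.0.2.0/25", "2001:db8:1::/48"], "asns": {64500}}
disputed = {"prefixes": ["192.0.2.0/25", "2001:db8:1::/48"], "asns": {64500, 53}}

print(certificate_is_valid(issuer, agreed))
print(certificate_is_valid(issuer, disputed))
```

In this model a single disputed AS number is enough to reject the second certificate, including the prefixes on which both parties agree, which is the fragility under discussion.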
>>Steve Kent (BBN): A counterview of this is that a system in which it is okay to have conflicting assertions about the resources is a system which will grow to be sloppier and sloppier because there is nothing causing it to be fixed, and we have a lot of examples of those, I would rather not add to the collection. >>Mark Tinka: All right. It's looking good. But Randy has to catch his flight, so we have lost him. But that's fine. On ROA registration and validation, as operators, the question obviously now is will we issue and validate ROAs or will we just issue our ROAs? As an operator, do we care, or does one care about ROA or route validation, or do they just want to issue it, so that somebody else can validate their routes? With issuing of ROAs, if you use either a CA model or a registry model, a delegated model or a hosted model, you define how long you can register a ROA for. How long would you register a ROA for? A year, 10 years, 100 years? I do know of at least one registry at the moment who is fighting the year 2038 UNIX time issue, so that is as far as they can issue ROAs in terms of validity. Any thoughts? >>Geoff Huston: Is this a race for whoever can look up a certificate profile and find that the date format is not 32 bits long? >>Steve Kent (BBN): No, it is not. >>Geoff Huston: It is not 32 bits long, yes. That may be a front end problem, but inside the certificate instruments the date field does not have a UNIX time stamp. There is no fundamental zeroing problem in the validation instruments. That is not the issue. Whatever the issue is, that's not it. The whole issue on time stamping certification, how long, this system is built with explicit revocation included in the ability. 
So if you can issue with a date, there is also the ability to say, "I revoke that previous issuance" and there is an onus on folk who use this, relying parties, as part of the validation, checking this stuff is valid today, of pulling down the most current revocation set to see if that revocation still holds. One argument does say, "Make them long, revoke them when you need." There is another operational argument that says, "Don't let information sit out there and go rotten. You may not remember that you needed to revoke." This may be just sitting there until in 10 years' time, someone says, "But this is still valid. Are you sure?" That's the argument that says, make them short, renew them as a positive action, let them die, because renewing says it's still current and I know. Personally, for me, it's maybe a flavour thing that some operators will try one and some will try the other. I like refreshing but it's a taste thing. >>Mark Tinka: I suppose the risk obviously is that if you forget to refresh, effectively it means you are offline at some point in the future. >>Geoff Huston: Which way do you want to fail? >>Chris Chaundy (Nextgen Networks): We live in a commercial world. All of these things are based around contracts. Set your ROA expiry to contract plus a month or something like that, and when people renew their contracts it's a nice little jolt to remind them to sort out their ROAs. >>Geoff Huston: We had that debate in one RIR and I do not think APNIC was any exception, and one argument was that the RIR should not limit the validity times of the things it signs for its members, its resource holders, that nonpayment or a dispute over payment should not invalidate your routing. There was an argument from the community that said, disconnect the two and do not try to use the RPKI as an enforcing instrument to pay your bills. That is an incorrect tool. Paying your bills is one problem, paying your RPKI is entirely separate. 
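Editorial sketch: the two failure modes in the long-versus-short validity argument above can be shown with a small Python model. It assumes a relying party that checks a validity window plus a set of revoked serial numbers. The field names, serial numbers and dates are hypothetical and no real certificate library is used. It also shows that a date past 2038 is unremarkable when the date is not stored as a 32-bit UNIX timestamp.

```python
from datetime import datetime, timezone


def is_current(cert, now, revoked_serials):
    """A certificate is usable only if it has not been revoked and
    'now' falls inside its validity window."""
    if cert["serial"] in revoked_serials:
        return False
    return cert["not_before"] <= now <= cert["not_after"]


utc = timezone.utc

# "Make them long, revoke them when you need."
long_lived = {
    "serial": 1,
    "not_before": datetime(2013, 8, 1, tzinfo=utc),
    "not_after": datetime(2113, 8, 1, tzinfo=utc),
}

# "Make them short, renew them as a positive action."
short_lived = {
    "serial": 2,
    "not_before": datetime(2013, 8, 1, tzinfo=utc),
    "not_after": datetime(2014, 8, 1, tzinfo=utc),
}

later = datetime(2040, 1, 1, tzinfo=utc)

# Long-lived: still accepted decades later unless someone remembers to revoke.
print(is_current(long_lived, later, revoked_serials=set()))
print(is_current(long_lived, later, revoked_serials={1}))

# Short-lived: lapses on its own if nobody renews it.
print(is_current(short_lived, later, revoked_serials=set()))
```

The long-lived object fails open (stale data stays valid until revoked) while the short-lived one fails closed (forgetting to renew takes you offline), which is the choice behind "Which way do you want to fail?".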
There was a lot of community feedback that said keep the two separate, that there is good operating practice in certificate issuance and there are bookkeepers. That's where we ended up. >>Mark Tinka: Geoff, regarding the operational viewpoint of whether you simply issue or whether you issue and validate as well, do we foresee situations where perhaps some operators just want to issue, so they do not get cut off by the big players, or do they also want to have the ability to validate if they can? >>Geoff Huston: I'm getting very confused here, because in some ways I am making attestations about addresses and AS numbers as the holder, and I am using the RPKI to generate digital signatures across those attestations and putting them out to the world. As a publisher, that's my job. >>Mark Tinka: Correct. >>Geoff Huston: Someone who is seeing something in the routing system may wish to validate or otherwise understand that that routing update refers to information that originated with me, and the real question is: did Geoff have a part in this update or is someone pretending? So they validate these ROAs, they use validation as a relying party of the RPKI, to test whether what I said is real and valid. >>Mark Tinka: This question is from the point of view of just the single operator: do they want to have both functions or do they just want to issue a ROA so they can be validated by others or do they also want to validate other people? >>Geoff Huston: One is the export department and one is the import department. They are different departments, it is different roles. You can publish and that doesn't mean you have to validate and do the whole BGP thing, it does not mean that. Similarly, you can do the BGP validation thing, even if you don't publish, and what you will find is that stuff that refers to you goes into this big grey category of "I really can't tell whether it's good or bad because there are no credentials associated with that information." 
So you don't have to do the two, you can do one or the other or both; they are, in my head, independent things. >>Mark Tinka: True. >>Steve Kent (BBN): One observation I would make here is that even if you chose not to do ROA validation for all of the other Internet resource holders, you really do want to do it with regard to yourself, to see if what's being seen by everybody else still shows you as the holder of those resources. So it's a very simple thing to do and you really want to do it, to check how you appear to the rest of the world, to see if something has gone terribly wrong, so that you can try and have it fixed. >>Taiji Kimura: My comment for how long will you register ROAs for is that, when I mentioned the designing or the consideration on the people remembering about it. If we register the data in a registry, like a routing registry or the allocation or assignment of information, sometimes registries need to notify the people who have registered the data to update it. This is one aspect. Another is the registered people who registered may disappear in the future, so it is difficult to take so long duration for the validity. The identity certificate has a more simple case. If they lose the validity of the certificate, they lose their way to do something using the certificate, but RPKI has the digital signature system, so without the people who register the data, the sign of that is verified in other places, that people don't know the registered people are still valid or not. So this makes it complex, compared with the simple identification certificate, I think. >>Mark Tinka: Taiji, on your previous point, and, Geoff, I think you want to comment about the same, were you suggesting that the registry proactively notifies the relying party that there is a new ROA that you need to receive? >>Taiji Kimura: If we put the notification day by day -- the problem is that people often ignore the notification. 
So that would be the design of the time duration, I think. >>Mark Tinka: A quick comment, Geoff, perhaps? >>Geoff Huston: You know, I'm a washing machine. I don't tell you how often you should wash your clothes; that's your problem. Registries are passive things. They are places where you put data. To think, all of a sudden, they do this active, "Hello? Hello? You should put in more data," or they contact all the relying parties? No. This just doesn't work like that. Registries are where you make public attestations. That's what you're registering. You've put it there, it's data about you, it's your responsibility to look after it, nobody else's. If it goes stale or rotten, that's your problem. Set your own reminders, look after your data. Relying parties who are using the registry, go there as often as you operationally need to, to make sure you are getting the current data. The registry won't ring your bell either, it's not their job, it doesn't scale. Registries are almost like a big whiteboard where you pin notices to. The board is passive. It's you that goes and pins the notice and other people go up and read the notice. So this whole issue of how you maintain that, the registry is just the board. That's all it is. Everything else happens with other folk doing transactions. From that point of view, you don't really have to remind folk, you don't really have to go out and poll. None of that is a scalable operation for a registry. >>Chris Chaundy (Nextgen Networks): Geoff, to take a contrary position, sure, the registry has a responsibility to its individual members, but the registry, I believe, also has a responsibility to the community, and part of that goes against what you -- it doesn't fit in with what you were saying exactly, in my opinion. >>Mark Tinka: This obviously goes into the question of the scaling from either side, particularly if we are looking at one or a few trust anchors, where we can actually call for this information. 
On the trust anchors side, does it scale well to have a few, if you are pushing out this much data across the network, does it scale well to send reminders and "hellos" that we have ROAs that you need to pick up? On the operator side of things, we're using rsync today. Does that scale well with the number of trust anchors we have today, et cetera? It's the whole discussion around there. Perhaps, Mats, I think you want to say something about this? >>Matsuzaki Yoshinobu: Yes. As an ISP, probably we will issue ROAs, because still there are lots of mis-announcements in the Internet. This is a possibility to prevent such kind of mis-announcement at an early stage, so probably we will issue ROAs to protect our network. On the other hand, probably we will validate using the ROAs, because of course we would like to know the current situation of the Internet. If something wrong happens, we need to help each other. It's a community. It's an operational thing. So probably APOPS helps at least. It's not like a panel, but we can exchange some information on the mailing list about some invalid status. Should I talk about the scaling as well? >>Mark Tinka: Yes, sure. >>Tomoya Yoshida: May I say something? I just want to add to Mats' comments. I also think I want the issuing authority to use the ROA, but at the first stage, just simply whether it is invalid route, I would like to, because they have small ROAs at the first stage, so just simply the job of thinking about YouTube the accidents we can drop, it depends on how many ISPs implement the RPKI on the network. Also, the question is they are thinking about how long we register ROAs for. I think we need to refresh the data of the ROAs, but -- >>Mark Tinka: Like Geoff said -- I mean, Geoff, while you are thinking about answering, and I do have a quick question about what action to perform on ROAs in the router. But while you are thinking about your answer as well, you do not think rsync is a scalable protocol for this? 
>>Geoff Huston: No, rsync is not a scalable protocol for the way in which we are using it. It puts a high burden on the rsync server as well as the rsync client and if you get a massive number of clients hitting the server, you are creating a huge scaling and cost problem on the server, then you have to replicate the data and replicate the same sync as the primary and we run into real issues. I'm not sure whether rsync is the protocol we will be using in 10 years' time, put it like that. Going in small, if you want to only issue one or two, there is no evil bit. I can't certify what I'm saying is a lie, I can only certify what I'm saying is true. So how do you know -- not just guess -- that something is a lie? You can only know that something is a lie when everything else is provably true. So we make a number of assumptions in this system about ROAs that you have to be aware of. If you issue a ROA with an originating AS, does that mean you have issued all the ROAs that an originating AS could possibly have issued? So that if I see any other prefix from that originating AS, should I assume that it is bad? Now, you are the publisher, you can't speak for the relying party who is doing that assumption. But pretty typically, the only way you can find something is bad is the absence of good. So if you start issuing ROAs, it would be a very good operational practice to issue ROAs for everything you originate -- everything. So that if you see something originating from your AS that is not in a ROA, that's a very good assumption that it is wrong. Secondly, if you're multi-originating as a prefix holder, you have to think about getting all of those originating ASs to issue ROAs. Because again there is this assumption that the relying party is going to make, that if there is a ROA out there for your prefix for one originating AS, there will be ROAs for every originating AS of that prefix and everything else is bad. 
As a business, you can't start small, wet your toe, test the temperature and then see if you are going to jump in. Oddly enough, you have to jump in, either as an AS or a prefix holder, that first step is the extent. There is no middle ground, because everyone else is validly making assumptions about the entirety of the information you are publishing, and those are quite valid. As a publisher, you should be aware, it's not a case of do one and try it, you either do the lot or you don't. >>Matsuzaki Yoshinobu: A comment on the first point. As an ISP, we receive IP resources from different RIRs, like APNIC and JPNIC. Okay, we can issue a ROA from APNIC at this moment. Still we need to wait some time for the NIR to enable RPKI service. >>Geoff Huston: I can't deny that is an operational issue for you, because of the assumptions that the relying parties, the consumers, are making. They expect you to be able to do that, and whether you can do it or not, they are not thinking about that, they are making assumptions. So it is certainly an issue that you have but you have to fix it. What can I say? >>Chris Chaundy (Nextgen Networks): Well, yesterday at Randy's RPKI workshop, he was clearly stating that there are really three states associated with ROAs: it is either a valid state, an invalid state or a not known state. Certainly the invalid is an absolute, so is the valid. The not known, the relying party can decide what to do with those. The whole action on the response is really up to the relying party, how they want to treat it, whether to just consider it an aberration, drop the local pref, raise the MED, whatever. >>Geoff Huston: Chris, the definition of "invalid" is what this conversation is about. It is not about the stuff that isn't signed at all, the grey bits. It's actually because in certification environments, what's invalid? Those are the assumptions that folk are making. >>Mark Tinka: I have a good slide on those actions. 
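Editorial sketch: the three states Chris mentions, and Geoff's point that "invalid" is inferred from the existence of some other ROA covering the prefix, can be shown with a minimal Python model of route origin validation. It assumes ROAs reduced to (prefix, maximum length, origin AS) tuples. The function name `validate_origin`, the sample ROA and the local-preference values are hypothetical illustrations, not any vendor's implementation or a recommended policy.

```python
from ipaddress import ip_network


def validate_origin(prefix, origin_asn, roas):
    """Classify an announcement as 'valid', 'invalid' or 'not-found'.

    roas is an iterable of (roa_prefix, max_length, asn) tuples.
    """
    route = ip_network(prefix)
    covered = False
    for roa_prefix, max_length, roa_asn in roas:
        roa_net = ip_network(roa_prefix)
        if route.version != roa_net.version or not route.subnet_of(roa_net):
            continue  # this ROA says nothing about the announced prefix
        covered = True
        if roa_asn == origin_asn and route.prefixlen <= max_length:
            return "valid"
    # Covered by at least one ROA but matched by none: the absence of good.
    return "invalid" if covered else "not-found"


# Hypothetical relying-party policy: prefer valid, de-prefer invalid.
LOCAL_PREF = {"valid": 200, "not-found": 100, "invalid": 50}

# A publisher who has issued exactly one ROA.
roas = [("192.0.2.0/24", 24, 64500)]

for prefix, asn in [
    ("192.0.2.0/24", 64500),     # matches the ROA
    ("192.0.2.0/24", 64511),     # same prefix, different origin AS
    ("192.0.2.0/25", 64500),     # more specific than the ROA's maximum length
    ("198.51.100.0/24", 64500),  # no ROA covers this prefix at all
]:
    state = validate_origin(prefix, asn, roas)
    print(prefix, asn, state, LOCAL_PREF[state])
```

The second and third announcements come out as invalid purely because a ROA exists for the covering prefix, which is why a publisher who starts issuing ROAs is advised above to cover everything it originates, including every legitimate origin AS for a multi-originated prefix.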
I think we will go over that again. That should be nice. The next one, in terms of location, where do you see yourself running RPKI? Is it only at the peering edge, in the borders, at the customer edge, or a combination of all on all eBGP-speaking routers? >>Geoff Huston: You have to plan for the future. Origination won't help malice, it will only help accidents. So if I accidentally announce your prefix, wrong origin AS, you will catch it. If I want to get your traffic, I know your origin AS, I'm going to announce a bad announcement with you as an originating AS and put myself in the path badly. The full story about securing BGP is actually about securing path and origination. >>Mark Tinka: Right. >>Geoff Huston: So when you talk about where you are going to run routers that do all of the work, not just half of the work, that's what I'm saying, you've got to think about the entirety of the issue. Now, once you accept a route in eBGP, the path doesn't change in iBGP. So unless you are doing something really kinky in iBGP -- which is actually everyone -- you would normally not do this in iBGP and you rely on eBGP as sort of the skin of the immune system, that that's the barrier and inside you are not doing mutual suspicion. So that's why you would think of the edge, because you need path plus origination in the fullness of time. If you are just thinking, this is origination, this is easy, I can do this everywhere, the full picture of the load and the job hasn't quite been revealed to you yet. Path is part of the answer. >>Matsuzaki Yoshinobu: I think that probably running on every BGP router in the future, of course, but at the first stage, just as a trial, maybe we will run RPKI on our iBGP router to monitor, and then expand it to edge, I think. >>Mark Tinka: You are the operator. >>Tomoya Yoshida: Thinking about the monitor, we can monitor using the border edge, just simply marking, and not the routing. 
So I think first is the border edge, and thinking of the iBGP. So I don't like it, because we can use the BGP community inside iBGP, so we can check the result of the validation, I think. >>Geoff Huston: Again, it depends on how you want to treat a lie. Because if you want to stop the lie before it even enters your network, you need it on the eBGP speakers, but that's going to cost. It's going to cost in processing hardware, information dissemination. I have seen models where the route validating engine -- there is only one of them for your AS or maybe two and it sits deep inside your network and has a lot of grunt -- then it spreads back poisoned routes for the ones that didn't validate. So you accept an eBGP, promulgate an iBGP and then get a special iBGP speaker to spew out blackness. Various countries have implemented their "Do not go here, I hate that IP address" filters via BGP feeds and we have certainly seen black BGP feeds for all kinds of reasons. You could do this by basically refeeding your local prefs via iBGP back to speakers. The downside is you accept the lie as it comes in, as it promulgates and gets processed, then you have another update if you didn't like the lie. >>Mark Tinka: I suppose the issue is the routers don't have the grunt to do the validation process. >>Geoff Huston: It allows you to specialize and centralize your processing. I do remember, over the years, I have seen centralized architectures that put that into one processing box. Think of SDN and go about half a step to the left and you're there. This is SDN in another form, because it is. Centralized processing, the instructions go back out to the boxes that do the work. You can do the same here, if you so wished, or you can replicate that function on every box. The problem with replication, a lot of cost, if the cost is high; if it's cheap, so what. 
Also, the issue of synchronizing the information, making sure that every box is making the same decision, because if one box gets it wrong, it spreads, and all the other boxes are not doing any valuable work because the lie spreads, because the one wrong box spreads the lie anyway. >>Mark Tinka: You are proposing a digital peer-to-peer model, perhaps, between routers? >>Geoff Huston: No, it's not my money. I don't run the network. >>Mark Tinka: Suggesting, perhaps. >>Geoff Huston: I'm saying operators have choice in how they do this. There is no, "you must do it this way or it's a fail mark." There are a number of ways you could do this, and you have to think about your situation. How big is your AS? What do you want to do? The same as many other things about design, which iBGP you use, how you do metrics, what are your iBGP policies. This is another place of design and engineering. >>Taiji Kimura: JPNIC has a trial server and we have several workshops in Japan, and I and my colleague Okada-san, we talked about how we think the users will configure RPKI correctly. We tried to make a session at the ENOG. ENOG is not European, it is Echigo, it is an area of Japan, and they have an IX there. They may have their own RPKI in-house system in the IX, and in other places, in a workshop on domain housing we talked about the route server may have the RPKI verification functions to find invalid route or by using RPKI. So I would like to have the comments from Mats and Tomoya-San specifically about route servers or other places, if it is a good way to put the RPKI for recovering from the invalid route or finding the invalid route from the BGP routers. >>Matsuzaki Yoshinobu: Probably we still remember the DreamHost incident. An AS is supposed to announce or mis-announcement, but the neighbour didn't know about that, they are peering, but they didn't know about that. Why? They use route servers on IX. So it's very surprising for me, peering partners didn't know each other.
In this case, I think RPKI on the route server is a little bit help to prevent mis-announcements. Or do you have any other suggestions? >>Geoff Huston: I think it is a bad idea to outsource your security function. Truly. >>Matsuzaki Yoshinobu: But how do you think about outsourcing your peering? >>Geoff Huston: I also think that is a bad idea. That is outsourcing your policies and outsourcing your money. That is a less than informed way. I know that exchange operators desperately want to add value. You know, "Come to us and we will do more. We do your routing for you. We do your security for you." But I think it's business-wise a bad idea for your business. It's not a service you should take. Similarly, your security is your business, it's no one else's business. Route servers that are doing RPKI on the IX -- euggggh, you know. >>Tomoya Yoshida: On behalf of the IX, thinking of route servers, I think in some cases we can propagate, after the validation at the route server. In this case, we can prevent to propagate invalid route to the neighbour. That is one possible case, I think. I think the deployment of the RPKI, step by step, from neighbour to neighbour, you can describe it as a web face, I describe it as the regional model. For example, in Australia, in the ISP for the RPKI, so that the packets between those ISPs will be safe, so I think it is step by step from the regional to the global deployment. The route server, it is one possibility to add the RPKI function. >>Mark Tinka: That obviously brings us to RPKI in the iBGP. This is another operational question I have. Would you run it on iBGP-only routers or would you use the extended community that is coming from your eBGP-speaking routers? Potentially, if you try to validate your own routes internally, is there a risk that you could break your internal routing if things go wrong with the software? >>Geoff Huston: Again, you have to think about path before you really answer that question. 
When you think about path, there are changes to the community attributes used inside the BGP protocol. When you create isolated islands of folk who are doing this with, if you will, normal BGP in between, there is no security link. So when the full extent of doing path plus origination is there, all of the BGP speakers who are moving an update, iBGP and eBGP, have to recognize and transit those additional community attributes. So that's not an option. So the only issue you are really talking about is: would you turn on, if you will, prefix filtering, the results of validation, on those internal routers? I'm sitting there going, you know, if it's path plus origination and eBGP is doing just fine, I'm not exactly sure, unless you've got huge routers that are doing nothing else that month, why you would add the burden on to your iBGP speakers. That seems to be security without need, it seems over the top to me. I wouldn't do it personally. But they don't let me run networks any more, so why should you listen to me? >>Matsuzaki Yoshinobu: Yes, we will do that. >>Mark Tinka: You would run it on the iBGP speakers? >>Geoff Huston: And validate and throw out? >>Matsuzaki Yoshinobu: Just to monitor, because we are using multiple vendors' devices on our network and there could be bugs. If we validate on every eBGP session -- >>Tomoya Yoshida: I have a question. You would like to have an RTR session for all iBGP routers? >>Matsuzaki Yoshinobu: No, just particularly for monitoring in the iBGP router. >>Geoff Huston: I would love to be your vendor -- sorry, another vendor. >>Matsuzaki Yoshinobu: To prevent or to detect something going wrong, we like to know earlier, to avoid mis-routing on the network. >>Mark Tinka: You are saying on some but not all iBGP speaking routers? >>Matsuzaki Yoshinobu: Some. A few, actually, not all. >>Mark Tinka: Do you have a question?
>>Chris Chaundy (Nextgen Networks): If you have got bugs in some place, you have probably got bugs all over the place, so what's the value? >>Geoff Huston: There are a number of studies out there, and if you look you can see them yourself, that many larger operators use policy routing on iBGP. Not all of your network sees all the same thing, and certainly you don't announce the same things everywhere to all of your peers. As we see greater diversity -- and iBGP was not designed to do this, folk just do it -- there are these sort of boundaries of route sets inside your network, where you are transitioning from one universe to another. You might, as you are designing that topology and architecture, think about those boundary points and what you need to revalidate, which is different to monitoring. There are grounds for doing that, but it is really about whether you are applying relatively radical iBGP policies inside your network. >>Matsuzaki Yoshinobu: But we are simple. That's not our case. >>Mark Tinka: The really worst use case of RPKI -- and these were the actions we were referring to earlier, obviously we like to receive valid routes and it's obvious what you do with those. What about invalid routes, especially as an initially deploying entity, would you drop all the invalid routes, would you drop all unknown routes or would you have a roadmap to what type of action you perform on an attestation or lack thereof? >>Geoff Huston: You have to understand that pulling all the material needed to validate BGP updates requires you to go out on the net and use rsync and pull them. But that only works if you have routes. There is a certain amount of circularity here, that if you start in a state that says, I will only accept validated routes in my system and I haven't yet booted my local cache and for that I need to go out and use the routing system but I have no routes, you could be talking to yourself for a few years and not having a network. 
To some extent, that is a very interesting question, and whether you drop the lot or whether you de-pref them is left in many ways as, it's up to the operator. Certainly, if I can't validate or invalidate, there are no credentials around, operationally it makes a lot of sense to think, that's okay, I might de-pref it from something that is valid, but that's okay. Invalid: is it a mistake or is it attack? Penalizing mistakes, particularly if they are honest mistakes. Now we get down to the heart of the question, which is a great question. Are you trying to validate the protocol or the intent? I really meant to advertise this prefix but BGP stuffed up and someone helped me by faking it and sending it to you. I meant to do this, it is real, but the protocol had an air gap. I really want you to learn it, but it was a mistake, if you will, that got it to you. Those are the tricky bits where you are saying, I'm not sure I really wish to penalize invalid all the way. If I have to use it, maybe I should. >>Mark Tinka: But how do you flag honest mistakes? >>Geoff Huston: That's the same as the evil bit, there's the next bit, which is the honest mistake bit. >>Taiji Kimura: PKI-based system has similar questions. We have always, like S/MIME based email exchanging and HTTPS servers. Web browsers reset the connection between the servers that have no server certificates, and at the beginning of the server certificate, a commercial server certificate. Sometimes my colleagues or me have kind of a dream, all browsers can avoid the non-SSL server's connections. But that is only a dream and nothing happens or something happening in web browsers, like EV SSL, but in email, PGP/MIME or S/MIME, we can use and we can exchange each other. And like invalid routes, we always have this much bunch of spams in your email clients, so we choose the email clients that can handle so many spams, like invalid routes. 
So this is my imagination: the invalid routes and unknown routes will exist in the long term, maybe forever, and the amount of them makes us decide the reactions for the clients, like relying parties, to treat such routes or ROAs, I think. >>Mark Tinka: Well, I suppose the existence of invalid routes is accepted. The question then is: what will you do with them? If we are more gentle about the presence of invalid routes in our routing domains, then what was the point to begin with? >>Matsuzaki Yoshinobu: Yes. Probably it depends on the quality of ROAs. I say just 1 per cent of ROAs somehow is a mistake, 1 per cent is a figure, but just a few ROAs, just a mistake, then we can develop invalid prefix probably. But 30 per cent or 40 per cent ROAs somehow mistake, then we should accept. >>Geoff Huston: There are two things going on here and one is obsessive compulsive behaviour versus reality. We seem to think we have to validate every update -- valid, invalid and so on. But look at the way BGP actually behaves. How many updates do you get a day? 100,000 or 200,000, but it's pretty constant. If you move out your iBGP and look at eBGP, it's not a lot. Secondly, how many of them are repeats of the same information within the last even 30 seconds? BGP just simply goes again and again and again, and cycles around very, very small paths of difference. If you validate something and you cache that, you will be using your cache almost all the time. The next observation is: why don't you use validation as a random walk through your FIB and just forget about doing it on ingress, but just randomly, when you are doing nothing else, validate something, because most of the updates are just repeats of previous updates, and once you have invalidated, all the updates for that particular thing remove. When you think about doing this, there are many ways of doing it that lever upon the actual behaviour of the information in BGP, as distinct from the operation of the protocol.
If you go down that path, I actually think you get to answers which are pragmatically useful. Will I accept an invalid route? For about an hour or two, and then maybe I'll black-hole it. In other words, once I'm really sure and there are no further updates or whatever, then I might. If you combine knowledge of the behaviour of BGP, true knowledge, not the protocol but the information load, against what you are trying to validate, you might come up with answers that aren't horrific in the amount of processing load to do this validation. I think what we are seeing right now is almost like DNSSEC version 1, and we are at DNSSEC version 20 by now. I think there is a long way to go in trying to validate the operation of a protocol and the real intent, trying to stop people lying in routing, and understanding the information flow in routing I think can help us do validation in the protocol better. That's a fair deal away from where we are now. >>Chris Chaundy (Nextgen Networks): What you are saying, Geoff, is almost sounding like route damping. I don't know if that is a good or bad thing. >>Geoff Huston: It is a bit like route damping, that if you see the same behaviour again and again and again you get less tolerant. >>Maemura Akinori (JPNIC): In terms of the values of RPKI, it would be really helpful for the community to know how realistic BGP security with RPKI can be used. It used to be recognized that still the RPKI and route validation by the router is a really very heavy process for the router, but I don't think it is the case right now, so I would like to hear your experience and your perspective on the reality of using the BGP security with the RPKI right now. Thank you. >>Mark Tinka: As Geoff brought up, from my point of view, as somebody who is testing an implementation, we are not doing the validation on the router itself, we are centralizing that process and just feeding the results back to the router, which will then perform some kind of action. 
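The three states raised earlier in the panel (valid, invalid, not known) and the model Mark describes, where a central validator works out the result and the router only acts on it, can be illustrated with a minimal sketch. It assumes a hypothetical in-memory list of ROAs and follows the usual origin validation rule: a route is valid if some covering ROA matches its origin AS and allows its prefix length, invalid if ROAs cover it but none match, and not found otherwise. The names `Roa`, `validate_origin` and `POLICY`, the AS numbers and the local-pref values are illustrative assumptions, not anything the panelists specified.

```python
from dataclasses import dataclass
from ipaddress import ip_network


@dataclass(frozen=True)
class Roa:
    """One validated ROA entry (hypothetical in-memory form)."""
    prefix: str       # e.g. "192.0.2.0/24"
    max_length: int   # longest prefix length the ROA authorizes
    origin_as: int    # AS authorized to originate the prefix


def validate_origin(prefix, origin_as, roas):
    """Classify a route as "valid", "invalid" or "not-found"."""
    route = ip_network(prefix)
    covered = False
    for roa in roas:
        roa_net = ip_network(roa.prefix)
        # Skip ROAs of the other address family or that do not cover the route.
        if route.version != roa_net.version or not route.subnet_of(roa_net):
            continue
        covered = True
        if route.prefixlen <= roa.max_length and origin_as == roa.origin_as:
            return "valid"
    # Covered by at least one ROA but matched by none: invalid.
    return "invalid" if covered else "not-found"


# One possible operator policy (an assumption): prefer valid routes,
# keep not-found routes at a lower preference, drop invalid ones.
POLICY = {
    "valid": ("accept", 110),
    "not-found": ("accept", 100),
    "invalid": ("drop", None),
}

if __name__ == "__main__":
    roas = [Roa("192.0.2.0/24", 24, 64500)]
    for prefix, asn in [("192.0.2.0/24", 64500), ("192.0.2.0/24", 64501),
                        ("198.51.100.0/24", 64500)]:
        state = validate_origin(prefix, asn, roas)
        print(prefix, asn, state, POLICY[state])
```

Whether the invalid entry in `POLICY` says drop or merely de-pref is exactly the operator choice the panel is debating; the sketch only shows that the router-side action reduces to a table lookup once the state is known.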
>>Geoff Huston: This is just an automated prefix filter, and that's all origination is. >>Mark Tinka: Pretty much. >>Geoff Huston: The information load on the router is one cycle per second. It is basically just an automated managed filter list. Path is nothing like that. All of a sudden you have to do asymmetric keys on the machine that is doing the validation of that path. There are no quick fixes on that one, that's a heavy load, undeniably heavy. Some answers are to put graphics processor chips on every single router -- fine. We have it on every laptop now, why not. You put heavy duty processing and do it on every router or you offload it somewhere else into a route server model or similar and do all the processing elsewhere or you make it a background random selection. But it is, undeniably, a lot of work either way. That's the issue. It's not origination, it's path that from an operator's perspective, I think, has a much bigger impact than origination. >>Dean Pemberton (InternetNZ): Interesting point you make about having to put GPUs into all the routers. Presupposing that is the place that you would do this, what if there was some advanced technology out there that you may be able to pull the control plane back to somewhere where the amount of CPU wasn't the problem? That would be kind of cool. >>Geoff Huston: That is the route server idea, that you actually pull these things back in and then do an iBGP back out of poison; right? >>Dean Pemberton (InternetNZ): Not necessarily. >>Geoff Huston: There are all kinds of models that pull it back out, process it and send the results back out to the points. It's not a filter list because you're actually zotting one prefix and trying to cause a recomputation of best path to the FIB, whereas with route origination it's an incoming bar as a filter, which is why there is a subtly different flavour to path than just origination.
As I said before, if you are doing origination, you are stopping fat fingers but you are not stopping me because I'm evil. >>Dean Pemberton (InternetNZ): Yes. If the problem we are having with path is because it is very, very difficult to imagine that we are going to get enough compute into the routers or replace all the routers with something. If we can put the compute somewhere else, that compute is not a problem any more. >>Geoff Huston: Yes. >>Dean Pemberton (InternetNZ): I purposely did not say that and I'm not going to. >>Mark Tinka: Okay. >>Geoff Huston: Might I say, if you are thinking of not doing path, this is a huge amount of work and cost and effort for fat fingers. This is not the end point and it should not be mistaken for an end point. Where we are now is not viable to move on with. We have to do path if we are going down this track. >>Mark Tinka: Supporting RPKI in the core. We have a limited set of hardware today that provides support. As an operator, would you wait until support is much more widespread, i.e. test the waters, or would you deploy hardware today that does support RPKI? It's not so much about the hardware as it is about the software, but then vendors decide to make you buy new hardware by stopping development of software on hardware that can still do it. What about if other operators can support RPKI origination today, and path tomorrow, will that put pressure on you to consider your own implementation? It almost sounds like IPv6. If it is just the software that you need and you have the hardware already, would you feel the pressure to upgrade? >>Matsuzaki Yoshinobu: Is this question for the path validation or origination validation? >>Mark Tinka: Whatever we have today. >>Geoff Huston: Today. >>Matsuzaki Yoshinobu: Okay. So it's talking about origination validation. Hardware, I think it already supports RPKI/RTR protocol, so just on software, I think.
>>Mark Tinka: What I was saying before is that some vendors do not provide additional support in software for existing hardware, so you are forced to buy new platforms that you didn't really need. >>Matsuzaki Yoshinobu: But anyway we need to upgrade our hardware to deal with more bandwidth, more ports, more memory. So we are continuing to upgrade our hardware year by year or time by time, so I think that is not an issue. Then the next question is the software, when we will update to support RPKI. It is probably the same thing here: the latest software probably supports RPKI as well. >>Geoff Huston: Not path. >>Matsuzaki Yoshinobu: Not path. Just origination validation today. I think we can support RPKI automatically. >>Geoff Huston: I wouldn't do it. I would not do it today. The reason why I would not do it today is I think there are very small islands of use and it's only me and my neighbours. The level of ROA publication is low and the ability and understanding of the procedures to get us out of the problem of automatic filtering -- when route damping first came in and operators leapt into it, most of the operational calls were -- I've been route damped, for God sake, will you move that bloody filter off my announcement. It wasn't, you got me and I was flapping, it was, you got me and I wasn't flapping, will you stop doing this to me? These are very early days now and there is an impetus from the vendors and other folk to deploy the latest and greatest in your production environment today, but most production engineers would say, "My job is to limit the number of angry customers ringing up and saying I've failed them. My job is to create a service that just works. Mistakes are bad, but me causing mistakes is even worse." So if I'm running your network for you as your engineering manager, and I go, I've heard this RPKI stuff, Geoff, why haven't you got it in? I say, look at the calls on your helpdesk, I won't do it until I'm sure I won't cause problems. 
I don't think we are running right now a mature system that is capable of resolving this. This is like the very early days of route flap damping: there are too many unknowns in the environment. I have no expertise, I would be doing the stuff, but I would not be fielding the filters on my front-end eBGP routers. But I would run, and I think I did run, a relatively conservative network. It is not my job to have things crash. So, no, I would not do it today, but I am not saying to anyone else, you should do that. These things become style and approach and what kind of network are you running. In the lab, I'm going, fine, this is cool stuff, need to understand, need to know, expertise is good. Fielding it out there? Unless someone pays me bucketloads of money, no. But if they paid me bucketloads of money, I would have a hard time. >>Matsuzaki Yoshinobu: As I am an ISP or a particular ISP, we will do that. >>Tomoya Yoshida: I think the -- yes, I don't know whether we try the whole routing side of my AS, but we need an interoperability test, because last year in Japan we tested it using Cisco and Juniper, and at times some implementation of the extended community was wrong. At the test, we fixed the problem, so we need to test and test for the future. I also think the current implementation of the Cisco and Juniper and Alcatel-Lucent is coming, but CLI or some additional function is very poor, I think, currently. So we more have some serial command and statistics and information and it will be useful for later, so we need those kinds of comments in the log, I think. >>Taiji Kimura: This is my thinking, because I am not the operator for BGP routers. In the previous IETF the three events have taken in the RPKI workshop and we have a discussion about the deployment of RPKI, and someone said that the automated prefix filter is not only the value of RPKI but also the operators can see the invalid route easily, compared with previous.
Then updating the equipment is, as Matsui-san mentioned, the equipment is updated periodically, every three years or five years, but in the period when the people who operate the BGP routers find the value of RPKI. My guess is that when something happens, it shows the RPKI if the RPKI has taken in that routing operation, the operators should be helped. So that will be when something happens. >>Mark Tinka: We have just about four minutes until the break. I will try to aggregate the last slide with the previous question about securing the RPKI router protocol. Based on the validators that we have today, it appears that we might have to do security maybe at the network layer, not so much in the protocol itself at this time, or add functionality, like tunnelling through SSH and so forth. What do we think about that? While you think about answering that, the last slide here talks to what kind of support do we expect registries to offer, in terms of troubleshooting issues, particularly the hosted CA models and so on. From a customer perspective, if your BGP customers come to you and say, "I need a service," are you going to mandate that they have ROAs issued, like you do today have a route object in your favourite RIR? That is three questions. >>Geoff Huston: I have this front door on my house with five bolts and alarms back to base, is made out of solid steel, and the window beside it is open. You haven't really done anything. Why do I need to secure between the validated cache and the router when I can just break into your router and do the damage anyway? To some extent a certain amount of perspective is required, and understanding that security into your control plane is the big picture. You can do lots of little things but you have to prioritize them in terms of risk and return. You may want to do that, you may not. I don't think there's a rule that says, "Everybody should do it," because that's not true. 
You might protect your control plane so that the individual conversations in your control plane are okay because your control plane is well protected. So from that point of view, I don't think there is a golden rule. Should you be geared up to handle cases on automated filters? If you are applying automated filters, they are going to get it wrong at some point and customers will be denied service. If you are not able to respond to customers saying, "You have stuffed up," why are you in the business? >>Mark Tinka: This is from the registries now. >>Geoff Huston: The RIRs themselves? >>Mark Tinka: Yes, for hosted models and so forth. >>Geoff Huston: For hosted models, the issue is, who is driving the portal? If you are ringing the RIR and saying, "You own my keys, you own all of this, you do this for me", then if you're doing all of that, okay, you've got a problem. If I'm saying, "You've got all the pedals and all the keys and all the control, I am merely the gears", there is nothing I can do for you. You have got all the credentials; I'm the machinery. If my gears have fouled up, if the machinery is not working 24 hours a day, I need to know. As an RIR, that's my obligation. The machinery is running. Your profile is what I'm hosting for you; your keys, not my business, absolutely not. That's yours. And I can't fix that. I'm not you and I don't want the liability of being you, with all due respect. As RIRs, I certainly see from our perspective, in the RPKI hosted services, we run the machinery, you provide the profile. I can't fix your profile settings and your content, but I can and I want to know if the machinery isn't working at any point. >>Matsuzaki Yoshinobu: How about stable publication service? >>Geoff Huston: Stable publication service? Stability is such a strange thing, because sometimes stability is your problem, not mine. Right? Network faults occur everywhere. Am I responsible for all of the Internet as an RIR? I do not think so.
Can I make sure the services I put up are in places that are well replicated, well provisioned, adequately done, to what we understand is good or even best practice in engineering? Take that, do that, absolutely. Fix your faults? No, not my problem. Again, we will do what is appropriate, and I think matches expectations, but we are not going to guarantee 100 per cent Internet just to make this work. That is everybody's problem, not just the RIRs. >>Matsuzaki Yoshinobu: You can't fix my problem but we can fix your problem. >>Geoff Huston: Thank you. >>Matsuzaki Yoshinobu: So both of us need to do our best. >>Geoff Huston: Sure. >>Mark Tinka: At that point, unfortunately we are out of time, if only we had more. Unless there are any parting words, unfortunately we are out of time to take any questions from the floor. I would like to say thank you very much to the panel, and I hope we have all had some food for thought on the flight back home. Thank you. APPLAUSE >>Sunny Chendi: Just one housekeeping note: we have an election today on the Policy SIG. If you wish to vote in the election, you can collect the ballot papers just outside the room on the voting desk. Thank you.