Transcript - APOPS
While every effort is made to capture a live speaker's words, it is possible at times that the transcript contains some errors or mistranslations. APNIC apologizes for any inconvenience, but accepts no liability for any event or action resulting from the transcripts.
Philip Smith: Good afternoon, everybody, I think we should make a start.
This is the first of the three APOPs sessions. Just to give you a little bit of background about the Asia Pacific Operators Forum, APOPs has existed as a mailing list since probably about 1996 and it became a little bit more than a mailing list in the early 2000s; in 2002 we did the first of the APOPs BoFs at APRICOT. APOPs has now become the plenary part of the APRICOT Conference as well as the APNIC Conference. We have two APOPs meetings per year.
This is the operators' opportunity to talk about what's interesting and so forth for them within the Internet industry in the Asia Pacific region.
The three of us who chair this thing are myself, Tomoyo Yoshida, who was sitting at the back somewhere, and Matsuzaki Yoshinobu, from IIJ. So Tomoyo Yoshida from Multi-Feed and Matsuzaki Yoshinobu from IIJ are my co-chairs for this. I will be chairing this session, Tomoyo the next one and Matsuzaki the other session on Thursday.
There is a website, which probably needs a bit of work and there is also the mailing list which you are all welcome to join.
At APNIC 34, APOPs is part of the regular program.
We did the general call for contributions. If you are on the APOPs mailing list as well as apnic-talk mailing list, as well as expressing interest in attending APNIC 34, you would have received the mail shots from the program committee.
The program committee stayed on from APRICOT 2012.
We asked them if they would like to help develop the program for APNIC 34 and about half of the program committee stayed on to help put this program together.
So I would really like to show appreciation for their effort in joining conference calls, which for some of them were late in the evening or on weekends, to help put together the APOPs program.
I should advertise lightning talks. We have lightning talks again this week. If you don't know what they are, they are -- I was going to say they are very fast presentations, but they are not, they are 10-minute long presentations that hopefully are not people speaking twice the speed they would normally speak.
They are short presentations on a topical subject.
The program committee is generally working on a one to two months lead time, whereas lightning talks, if something happened last week or this week that you want to talk about or you think is interesting for the operational community, then the call for papers for the lightning talks is open now.
All we need from you is the title and a short abstract about what you are going to talk about. You don't need to prepare any slides if you don't want to.
You have 10 minutes. We run the lightning talks on Thursday from 4.00 pm to 5.00 pm. We will finish slightly earlier on Thursday because of a social event, we need to start heading off for that at about 5.30 or 5.45, so we will have one hour for lightning talks on Thursday, which after two sessions of Policy SIG might be quite an interesting alternative light relief after half a day of policy.
If you would like to submit, we have the submission at apnic.net, go there and select the lightning talk entry and put in your ideas. The program committee is looking forward to hearing from you.
This afternoon's agenda, we are running a little bit late after the extended opening plenary. We have three presentations, the first one is "The Internet in Cambodia" from Samol Khoeurn, the second one is "Analysing Dual Stack Behaviour", by Geoff Huston and the third one is "Looking at DNSSEC Where We Are (and How We Get To Where We Want To Be)", by Rick Lamb of ICANN.
First up, Samol.
Samol Khoeurn: (Khmer/Cambodian spoken) My name is Samol from eintellego. Today I would like to talk about the Internet in Cambodia. This is the agenda I am going to talk about.
The Internet users in the Kingdom, today I heard Professor Kanchana from Thailand, who mentioned about Internet users in Cambodia and I got some information from the MPTC, the Ministry of Post and Telecommunications, about the Internet users in Cambodia and also some local ISPs suggested about the number of users in Cambodia.
There are some disputed figures about this. The local ISPs suggest that there are 500,000 users of the Internet in Cambodia, in recent articles, and also the government, the MPTC, the Ministry of Post and Telecommunication, claims to have 1.7 million Internet users in Cambodia in 2012. So we have some conflicts regarding the number of Internet users.
I have also seen about the number of Facebook users in Cambodia, we have 630,000 users. I think most people in Cambodia surf the Internet for Facebook, everyone, when they meet each other, when they meet new friends, they say, what is your Facebook, they don't ask about phone numbers any more.
I think the number of Facebook users plus 10 to 15 per cent should be the accurate number of Internet users in the Kingdom.
I am not saying the reports from the government are wrong, it is just what I think.
Most Internet users in Cambodia are 35 years old or younger, because the old people don't really know how to use the Internet; they all lived through wars and the genocide regime, which was a bad time for them, so it is only for the young generation.
Most users of the Internet in Cambodia usually are students, most Internet users are students, some businesses, government officials, but as I have seen so far, students are the most Internet users in Cambodia.
So far, we have 27 companies registered for Internet operations in 2011. Here, I just listed some companies, some businesses well known in Cambodia, like Ezecom, AngkorNet aka MekongNet, Digi, it's a very good service I got from Digi, they have provided some content on local servers, like movies and songs you can download freely; and also Chaunwei, CityLink, Online, WiCam, EMAXX. EMAXX is an ISP that is going to provide 4G technology, but I'm not sure when they are going to finish, but I see the advertisement for 4G technology.
Here is the connectivity of ISPs in Cambodia. We have the link, if you go to click on this link you can see the animation of this picture. They provide the AS number, as well as the name of the company, and if you click on each name it will pop up the description and also the links to each upstream and the local connectivity with the ISPs within the Kingdom. It is very useful information on this website.
Talking about the fibre connectivity, so far we have a Chinese company, so-called Cambodia Fibre Optic Cable Network, which has a fibre length of 5,000km, and Telecom Cambodia, government owned, around 1,000km, and Viettel, a Vietnamese Government company, belonging to the Viet Nam military, it has 16,000km. Some ISPs in the country also have their own fibre optic links, so in total we have over 25,000km of fibre optic across the country.
Here is the national optical fibre backbone. As you can see, the green link belongs to Viettel Cambodia, which belongs to Viet Nam. You can see we have two links from Phnom Penh to Viet Nam and also from Phnom Penh to Siem Reap along the river, and also some of the new optical fibre network by GMS, which is the Greater Mekong Subregion, from Cambodia to Laos to China. This is run by Huawei company with Telecom Cambodia.
This is the link to international. You can see we have three links to international, one to Viet Nam, one to Thailand and another one to Laos, across to Laos and up to China, to Kunming.
I would like to talk about the Internet access tail types that we usually have for the home connections. We usually have coaxial cable and for the business users we usually have fibre optic. Wi-net is not so popular in Cambodia but the most popular one for the users, the mobile users, like mobile broadband that we have, so-called dongle, usually in Cambodia we call it 3G USB, so you have the SIM card in the USB and then plug into your computer and then you surf the Internet. This is very popular in Cambodia.
For the DSL dial-up, ISDN, we rarely see in Cambodia. Myself, I have never used dial-up at all in Cambodia.
Transit pricing is still expensive in Cambodia and most transit comes from neighbouring countries, mostly from Viet Nam, and Viet Nam from Hong Kong. Cambodia is considered to be the highest cost for telecommunications and Internet service in Asian countries, we are the 10th country in this region. The transit cost is around $100 per meg. It is dropping down but it is slowly dropping down.
This is just an example for the residential pricing. I just quoted from one of the ISPs in Cambodia, called Digi. I myself also use this company. You can see the price, it has two kinds of prices, home limited and unlimited -- limited is about the data usage, if you buy 3 MB for the data usage of 12 GB and it costs $12 per month and if you run out of the data usage, then you only have 256k. I do not know whether it is very expensive compared to Australia or to other countries, but this kind of speed with the data usage is still expensive for me. This is for business pricing. In my office, we have got 2 meg and it costs about $250, so the same -- this is the quote from Ezecom, which can be said to be the biggest Internet provider in the country. I just say that, it's not official that it's the biggest one.
Talking about services, IPv6 is not publicly available for any users in Cambodia. However, IPv6 transit is available to some ISPs. Static IP is for business users, it's not available for home users. So far, I know there is one ISP in Cambodia, Online, they can rent you the static IP address, even though you are the home user, you can call them and ask for a static IP address for the price of $8 per month. Actually, I just made a phone call this morning asking about the price again and they said, OK, we can discount for you, only $4 per month. I think I will consider.
Some ISPs also provide the local content -- gaming, TV, movies, shopping, like Sabay, sabay.com.kh, known as CIDC, very famous one for gaming; and also Digi provides song downloading, movies for watching online, which is free, and I like this kind of service because I spend a lot of my time watching movies.
The data centres. In Cambodia there is no carrier-neutral data centre. It's not like in Australia, where we have Global Switch, which is carrier neutral; in Cambodia there is no such kind of thing. However, co-location and hosting services are available from ISPs in Cambodia: Chaunwei, Ezecom, MekongNet, Online and WiCam all provide co-location and hosting.
We have a problem here regarding the Internet exchanges in Cambodia. Based on the website of HT Networks, which is the information I got from them, we have both licensed and unlicensed Internet exchanges in Cambodia. In here we only have two licensed by MPTC, which are HT Networks and Finder, and three others unofficial, as stated by HT Networks -- Telecom Cambodia, MekongNet and CIDC. However, my friend who works in MekongNet says, "No, we have the licence too." And the guy from CIDC says, "No, we also have the licence." So I think the law is being written.
So this information is probably completely wrong or completely right, in the next few weeks we don't know.
So this is Cambodia. Welcome to Cambodia! Based on the HTN-CIX, we got some information about the links to the HTN network; you can see most other ISPs in Cambodia already peer with the HTN network, we have Telecom Cambodia, Camintel, Online, blah, blah.
This is the peering summary. We have both IPv4 and IPv6 peering. We can see some prefixes with IPv6.
Next I would like to talk about social media in Cambodia. Facebook, very, very popular, even myself, I spend much time on Facebook. My boss said, "Do not spend time on Facebook!" But sometimes I still do.
According to the social data, the number of Facebook users in Cambodia right now has reached 600,000, very fast growth of Internet users in Cambodia; they are mostly students, company staff and government officials.
LinkedIn is a professional social media and many Cambodian companies start using LinkedIn, especially for HR, some HR training all have LinkedIn account.
I myself also have LinkedIn account.
Twitter is not so popular in Cambodia. I have an account on Twitter but not many followers. I got followers only from my colleagues based in Australia. My boss keeps watching me, every time, everything, everywhere, when I post something, he always knows.
I don't spend much time on Twitter because I am being watched.
Angkorone is the only local social network.
I think Angkorone was launched for the first time in 2008 or 2009 by a guy from the United States, but he is Cambodian, Steven Path.
So he came back from the United States and came to Cambodia with the concept of social networking. It was quite popular the first time, because Facebook was not really popular in Cambodia at that time, and I also got an account on that one. The number of users of Angkorone, right now reached 150,000, which is high enough, but it has stopped growing because of Facebook coming and getting over Angkorone. However, Angkorone is cool, I love it.
Google+, many users, I got many friends on Google+, we like the features of videoconferencing, my company also uses Google+ for conferencing, because we so-called hang out, and it's good. However there are not many activities. I got many friends there but no one posts about their status. I don't know where they are. If I want to know where they are, I go to Facebook and check their check-in.
Next I would like to talk about the organizations in the Kingdom. We have BarCamp, ISOC-KH, NiDA, KNIC, CamCERT. BarCamp was founded in 2008 by a group of IT specialists, also with a journalist and some locals.
They had events two times per year and mostly they talked about the technology and development stuff, and it's cool, many participants each event. I also joined this one, and I'm going to join the next event, which I think they have next month. I'm going to talk about Juniper. I think I'm going to talk about the Junos introduction to some students in Cambodia during this event.
ISOC-KH, officially we had the annual meeting last year. It was created or founded by Norbert Klein. He was in Cambodia for a long time, and nowadays we have around 100 members. NiDA is owned by the government, it is responsible for managing some policy, regulatory --
and KNIC is responsible for IP address managing and domain names, but we only have the website, there's no group of people working on this. It's like the team without operations, there's no operation, it's just the website. I'm not so wise about this, but my friend told me KNIC is not in operation right now.
CamCERT is also owned by the government and is under the control of NiDA. CamCERT stands for computer emergency response team.
I think this is the last slide. I want to spend a few minutes -- maybe a few seconds -- talking about the future Internet of Cambodia. The 4G technology is coming. This was announced by an operator called EMAXX, so the Internet in Cambodia is going to be faster with the devices that support 4G, and I'm looking forward to having that.
The growth of mobile and home Internet users, also based on the MPTC, they said that in 2015 the Internet users in Cambodia will reach 1.8 million. I don't know if that's true or just what they say. This year the number of mobiles reached 1.5 million. The Internet price is also dropping down slowly, e-commerce is coming, but just like my slide says, cash is still king.
Although some banks already got e-banking, like ANZRoyal, also FTB provides e-banking, which I'm using.
No PayPal in Cambodia. So sad.
That's all from me today. Connect with me at LinkedIn. No Twitter! APPLAUSE
Philip Smith: So questions for Samol?
Samol Khoeurn: I hope not.
Philip Smith: Thank you. Next up we have Geoff Huston, who will be talking about analysing dual stack behaviour.
Geoff Huston: Good afternoon, everyone. Hopefully someone has some slides -- either that or I'll make it up as I go along.
I have two legs. Oddly enough, it's better than having one leg, and generally I go faster with two legs than one leg.
Most of you are running either Macs or relatively recent versions of Windows. Most of you have two protocols running right now and most of you are probably finding that it's not faster to have two protocols, sometimes it's slower.
What's going on? Why exactly is this so weird? Why exactly do two protocols run worse than one? I would like to look at that question and try to analyse what the problem is.
Keyboard. What does a browser do when it is given the choice of v4 or v6? The real question is, going into this room, which for most of you is a special room, because most of you don't run dual stack, is this room different in terms of being better or worse than simply just running in the v4 world? I want to roll the world back just a couple of years and look at what happened in, say, Windows XP or Mac OS X 10.6. Interestingly, the systems had an unconditional preference that if you are in an environment that supported v6, if your machine managed to find a v6 router somewhere and got a v6 address, whenever you went to a site that was dual stacked, whenever you went to something that had both A and AAAA records, unconditionally it always tried v6 first. What do you reckon? Sensible? Or don't care? Or my mail is really interesting? Most of the developers at the time thought it was quite sensible. They wanted to encourage use of v6.
Having your browser unconditionally saying, cool, let's just do v6, is something that kind of worked. If it had v6, it did two DNS queries, waited for a while and if the AAAA query completed, it just did v6.
What happened when your browser thought it had v6 but something was broken? Which happens a lot of the time. What happened when you tried v6 but you really had to pull back to v4? How long did it take? Does anyone remember? If you were running Windows, it was kind of OK, it only took a mere 19 seconds. 19 seconds! That's abysmally slow. But if you were on a Mac that was lightning quick because a Mac took 1 minute and 15 seconds. Think of this: if a web page has 10 components on it and there are 10 dual stack fetches, you've just wasted a large segment of your life. But you are still better off than the folk running Linux because they were really in a world of suck. This really hurt. It waited for up to 3 minutes before falling back to v4.
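To put rough numbers on that, here is a minimal sketch of the worst case, assuming the 10 fetches happen one after another and each one waits out the full timeout before falling back. The timeouts are the figures quoted above; the serial-fetch assumption and the function name are illustrative only, not a description of any particular browser.

```python
# Worst-case delay when every dual-stack fetch waits out the full v6 timeout.
# Timeouts (in seconds) are the figures quoted in the talk.
# Assumption: fetches are serial, so the delays simply add up.
FALLBACK_TIMEOUT_S = {"Windows": 19, "Mac": 75, "Linux": 180}


def worst_case_delay(components, timeout_s):
    """Total seconds lost if `components` fetches each time out in turn."""
    return components * timeout_s


for system, timeout in FALLBACK_TIMEOUT_S.items():
    print(system, worst_case_delay(10, timeout), "seconds")
```

Under those assumptions a 10-component page costs a little over 3 minutes on Windows, 12 and a half minutes on a Mac and half an hour on Linux.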
So obviously that was crap, obviously that was never going to work. About a year or so ago we started to change this and one of the changes happened in Windows, with the release of Windows 7, or it might have been Vista, where they started doing preferences. Instead of going v6 first, we noticed some of the auto-tunnelling techniques that were so fashionable years ago were really quite shocking. Windows in particular started to pref them down. If you had that auto-tunnelling NAT traversal thing called Teredo, you would only use it when hell was freezing over. Slightly better than that, if you had 6to4, if you were one of the few with unicast native v4 addresses, you would try that but you would try v4 in preference to either.
Now you would only use v6 in preference if you were getting v6 from your local router. If you were tunnelling in any way you would probably not use it.
What would happen if it failed? Exactly the same as before. It still sucked. I'm amazed that we are sitting there doing millisecond based machinery, we're busy putting fibre across Sydney Harbour because it is 3 microseconds faster than its competition. Someone is laying a fibre cable across the Atlantic because it is 6 milliseconds faster than its competition, and someone is putting out an operating system that takes 180 seconds to fail over.
There is something deeply broken about this industry, that manages to do one and the other at the same time. Obviously this sucks and if you were behind the dual stack connection, your life was rotten, it was abject misery, something was completely broken about all this.
They tried a new idea, you modify things a bit and only use Teredo -- never. This idea that you won't even look up the DNS if you are using Teredo, so now you rub out the Teredo stuff and try again. Still broken because it's not the right way of thinking about things.
Trying one protocol and then trying the next is really, really stupid. And it just doesn't work. What we need inside a dual stack world is a better class of failure. We need to be able to break things with more style and panache than we are currently doing.
Part of it is that our operating systems were built about the same time we were chiselling letters out of granite. These things are only relics coming out of the trees, they are very old operating systems. One thing that they did was they never thought about parallelism.
Operating systems are amazingly persistent because the operating system thinks when you ask me to connect to an address, there is no plan B. So I'm going to try to connect to that address for as long as I possibly can because there's no plan B.
When the browser says, connect to this v6 address, it doesn't just try once, it doesn't just try twice, in the case of Linux, it tries up to 15 times over 3 minutes because there's no plan B. When Unix was originally written it was never written as dual stack, it was written as monostack. What's the alternative if you couldn't connect? There was none. So the operating system had one view but the browser had another. That was the real problem that was going on.
This is like a two-horse race. When you start with one horse, you send it down the track, and if it dies on the way you shoot it and send off the other. That's about as dumb as what we are doing. If you really want to conduct a two-horse race, what you should do -- because this is a computer and it can take it -- is start off both protocols at once and see which one wins.
It's the same as the two-legged thing, you use both at once.
Amazingly, it's only in the last few months that this modern way of thinking has caught up with the computer industry and finally after years of this, we have come up with operating systems that do slightly better.
The beauty about Macs is they declare themselves with that brightly lit Apple. There are a few of you here. Some of you might be running Safari. You are in a dual stack world now. What goes on? Interestingly, now, what Safari does is it tries to figure out how far everything is away by measuring the round trip time. There is a certain command that will display the cache of round trip times that are going on.
When you go to somewhere, it looks it up and says, I know the round trip time in v4, I know the round trip time in v6, I'm going to go with what's fastest.
What if you've never been there before? Well, I'll try the other one. How does failure work now? Now failure is really quick. If it doesn't work within one RTT time, you flick over to the other. Sounds stunningly brilliant.
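The selection rule described here can be sketched roughly as follows. This is not Safari's actual code: the cache layout, the function name and the defaults used for a host that has never been visited are all assumptions made for illustration.

```python
# Hypothetical sketch: try the protocol with the lowest cached round trip
# time first, and allow it one RTT before flicking over to the other one.
def pick_first_protocol(rtt_cache, host, default="v6", default_timer_s=0.3):
    """Return (protocol to try first, seconds to wait before trying the other)."""
    rtts = rtt_cache.get(host)
    if not rtts:
        # Never been there before: these defaults are assumptions, not
        # values taken from the talk.
        return default, default_timer_s
    first = min(rtts, key=rtts.get)  # protocol with the smallest cached RTT
    return first, rtts[first]


cache = {"www.example.net": {"v4": 0.120, "v6": 0.080}}
print(pick_first_protocol(cache, "www.example.net"))
```

With the sample cache above, v6 is tried first and gets 80 milliseconds before the client moves to v4.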
Who runs servers? Some of you? Good. How many IP addresses do you have on that server? I have noticed with, say, Google there are five addresses, Facebook there are a whole bunch, and there seems to be some of this mythology out there that if I really want resiliency in a server, I will have multiple A records.
This gives me better resiliency because users can pick and choose.
What about in v6? How many addresses should you have in v6 for a server? Is one enough if you are big and important? Maybe you should have two or maybe three. Think again. Because those poor users on Safari lose, because if there are multiple v6 addresses, all of a sudden that really fast fail-over doesn't work any more. Safari says, "I'm going to try every single v6 address in turn before failing over." That lightning quick, "Let's try one and the other one doesn't work", fails miserably.
This idea that multiple addresses is better than one and resiliency is all about stacking stuff up doesn't work with some of these protocols and multi-addressing makes some of these things fail miserably badly.
Maybe you should think about one v6 address.
Anyone running Chrome? Chrome is great. I love Chrome. The way it works is that the DNS is a performance indicator, that if you resolve the v4 address really quickly in DNS, that means v4 will be faster. That's bollocks. The whole idea that the DNS is the same as the round trip time for data is a fantasy that maybe only Google could ever think about. It's completely wrong.
In this case they start up a DNS race, they query both the AAAA and the A record, whichever answers first that is the one you lock into, which is bizarre. If it doesn't launch within a third of a second -- why a third of a second? Nobody knows, it's just one of those numbers -- you fail over.
Firefox, some of you are running Firefox. Run it, it's really good. Set up a thing called fast failover and all of a sudden you get a decent amount of performance because it is a race to complete the TCP handshake. You start off both connections, do the DNS independently and in parallel, it's a computer, it can take it, then start both TCP connections at once. It's a computer, it can take it. Whichever one answers first, that's the connection you use. The other one, send it a reset, send it a FIN.
Now it's a little better, the data connection speed tells you which protocol to use and it will try to use the fastest. All of a sudden, both protocols are working in your favour, all of a sudden you are really trying to figure out, I should use the quickest all the time.
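A minimal sketch of that kind of race, assuming each connection attempt is wrapped in a callable and run in its own thread. This is an illustration of the idea, not Firefox's implementation; `race` and `fake_connect` are hypothetical names, and `fake_connect` is only a stand-in for a real TCP connect so the example runs without a network.

```python
import concurrent.futures
import time


def race(attempts):
    """Start every attempt at once; return (label, result) of the first success.

    `attempts` maps a label such as "v4" or "v6" to a zero-argument callable
    that returns a connection or raises on failure.
    """
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=len(attempts))
    futures = {pool.submit(fn): label for label, fn in attempts.items()}
    try:
        for done in concurrent.futures.as_completed(futures):
            if done.exception() is None:
                return futures[done], done.result()
        raise ConnectionError("all attempts failed")
    finally:
        # Do not wait for the loser; a real client would also close its socket.
        pool.shutdown(wait=False)


def fake_connect(delay_s, ok=True):
    """Stand-in for a TCP connect that takes `delay_s` seconds (assumption)."""
    def attempt():
        time.sleep(delay_s)
        if not ok:
            raise OSError("no route to host")
        return "connected after %.2fs" % delay_s
    return attempt


print(race({"v6": fake_connect(0.30), "v4": fake_connect(0.01)}))
```

With real sockets, each callable could be something like `lambda: socket.create_connection((address, 80), timeout=5)`, one per address family.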
What about the bigger picture? I have given you a few instances of this. We can look at the combination of browsers and operating systems. If you are running Mac OS X 10.7 with standard Firefox, it took 75 seconds to fail over and it preferred 6. If 6 wasn't working, give up now, your life is misery.
If you turned on fast failover in Firefox, all of a sudden it was quick. In general, the only folk yet to get this, when I compiled this list a few months ago, was Opera. It was still taking an awfully long time to do the failover, but most of the others got it right, apart from Explorer. Nothing is right in Explorer.
That's the full table, go and look at it on the slide pack if you want to see the full dump, and down the bottom there is a URL which goes through this in exquisite and excruciating detail, but I won't now.
Why are we doing all this parallelism? What was wrong with running v6 first? Is there any difference between the two? We started testing a whole bunch of folk. You didn't know you were being tested, you just thought it was a Google pop-up ad; it wasn't; you were being tested. So we tested, and are still testing today, some 800,000 users every day. We send them a simple test which says, here is a URL in v4, here is a URL in v6, here is dual stack.
What do we find when we do the tests? The first thing I'm going to look for is connection failure.
Because the server is me, I can tell what you are doing.
When you have one of these tests you immediately try to set up a TCP connection. What does that mean? You send me a packet. It's the opening packet, it's called a SYN. I will send you back a SYN-ACK and if you got it you will send me an ACK. I can't tell if you can't send me a SYN -- I'm not worried about those. But if you send me a SYN and I send you back a SYN-ACK and I never get the ACK, let's call that a naked SYN, let's call that the busted return path.
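The measurement described here can be sketched as a simple count over what the server sees. The record format below, a pair of flags per connection attempt, is an assumption for illustration; the talk does not say how the server logs were actually stored.

```python
# Hypothetical sketch: each record is (saw_syn, saw_ack) as observed at the
# server. A SYN that is never followed by the closing ACK is a "naked SYN".
def handshake_failure_rate(connections):
    """Fraction of received SYNs that never completed the handshake."""
    syns = [record for record in connections if record[0]]
    if not syns:
        return 0.0
    naked = sum(1 for saw_syn, saw_ack in syns if not saw_ack)
    return naked / len(syns)


sample = [(True, True), (True, False), (True, True), (True, True)]
print(handshake_failure_rate(sample))
```

Attempts where the SYN never arrived do not appear in the server's view at all, which is why they are left out of the denominator.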
What percentage of all these connections fail on a broken handshake? Down the bottom, the green line is v4, the blue line is v6. Would you run a service that failed a third of the time? Would you actually jump in your car if one time out of three you didn't make it? Sorry. Would you jump on a plane with a 30 to 40 per cent failure rate? Obviously you are here, you don't do that kind of stuff. You have this idea that that kind of failure rate is unacceptable, and it's not acceptable.
That is amazing, that v6 failure rate is abysmally high. Interestingly, I see some spikes in the failure rate in v4 as well over time. We can look at these.
Here is the v4 failure rate, it's low fractions of a per cent. Do your v4 connections fail? Mine don't, they are pretty reliable. What am I seeing here? I looked at this and what I found was actually it wasn't what I thought. Some of the time, my server is subjected to SYN flood attacks and those peaks are folks sending me SYNs, trying to turn me off, and that low background is me, as you, being subjected to bot scanning, where they send you a SYN just to see if you are alive. It gets back the ACK, the SYN-ACK, and does nothing, so it never tries to complete the connection, it's just trying to see if something responds at that address.
It declines over time because the SYNBOT rate is constant over time. As I do more and more experiments the percentage failure rate comes down, which is what I expect, so I am comfortable with what I'm seeing with v4.
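The arithmetic behind that decline is a constant numerator over a growing denominator. A minimal sketch, with made-up numbers, since the talk gives no actual bot or test counts:

```python
# Hypothetical numbers only: a constant count of bot SYNs per day set against
# a growing count of genuine test connections per day.
def background_failure_rate(bot_syns, real_tests):
    """Share of all received SYNs that are bot probes that never complete."""
    return bot_syns / (bot_syns + real_tests)


for tests_per_day in (10_000, 100_000, 800_000):
    print(tests_per_day, background_failure_rate(500, tests_per_day))
```

The bot count stays at 500 in every row, so the measured failure rate falls simply because the number of experiments goes up.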
But v6, 30 per cent is abysmal. I started looking at this, looking hard at what they were. Teredo has a failure rate of up to 40 per cent. Teredo is a NAT traversal technology, it doesn't work. 6to4, when it behaves right, has a failure rate of around 10 per cent.
6to4 is a tunnelling technology. It doesn't work. LISP folk, think hard. Folk who think CGNs work, think hard.
Because what you are seeing here is data that disproves both of those myths. It doesn't work. Tunnelling is abysmal.
The bottom line is native unicast v4 failure rate, and it's there, it's not zero. Teredo -- NATs don't work. They just don't work. When you try to stress it, they break. Why? Because we never standardized NATs.
The bit we didn't standardize was multiparty UDP, and the bit every single NAT developer has to exercise creativity for is multiparty UDP rendezvous. Don't ever ask a software programmer to be creative because they will all do it differently, and the poor application that's trying to figure out what's going on dies.
That's the evidence.
Who is going to run CGNs? You are running out of v4 addresses. Are you going to run CGNs? Has your vendor shown you the failure rate if you are doing anything other than port 80? Look at this and cry. NATs don't work, we knew that; 35 per cent failure rates, unworkable, we knew that; that is why we are deploying CGNs everywhere, we knew that.
6to4 auto-tunnelling doesn't work. Most of the issues around tunnelling are you really, really need coherent signalling of our message transfer size, MTU, and that doesn't work. Interestingly, the failure rates are different on each continent. In the US the failure rate is higher than in Europe, and Asia oscillates between the two. Different equipment? Subtly different configurations? I don't know.
6to4 failure is obviously also local failure.
What's going on is there is an awful lot of protocol 41 filters.
I'm not that interested in those. Let's move on because what I'm really interested in is v6 unicast where I'm not tunnelling. Over the first eight months of this year I found 1 million connections using v6, which is a great number, of which 22,000 failed. That's a failure rate of 2.3 per cent. You are saying, obviously people can't type in their v6 address.
Obviously people can't: 13 people used a FE80 link local, 139 still used ULAs, and so on and so forth. But even if I get rid of all the dud addresses, the folk who just cannot figure it out, I'm left with 22,700 folk who have a good address and it still can't make a connection.
38 were using unadvertised v6. OK, we'll forgive them. Sorry, 38 were using unallocated, they just can't type in an address. 150 were using unadvertised addresses, they can type but can't type in their router, and 22,500 were using addresses that should work.
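Sorting source addresses into these buckets can be sketched with Python's standard `ipaddress` module. This is only an illustration of the prefix checks (FE80 link local, ULA, 6to4, Teredo); it is not the tooling used for the measurements, and spotting unallocated or unadvertised addresses would need registry and routing table data that this sketch does not have.

```python
import ipaddress

# Unique local addresses (ULAs) live in fc00::/7.
ULA = ipaddress.ip_network("fc00::/7")


def classify(address):
    """Put an IPv6 source address into a rough bucket by its prefix."""
    ip = ipaddress.IPv6Address(address)
    if ip.is_link_local:           # fe80::/10
        return "link-local"
    if ip in ULA:
        return "ULA"
    if ip.teredo is not None:      # 2001::/32
        return "Teredo"
    if ip.sixtofour is not None:   # 2002::/16
        return "6to4"
    return "other unicast"


for sample in ("fe80::1", "fd12:3456::1", "2002:c000:204::1", "2001:db8::1"):
    print(sample, classify(sample))
```

The sample addresses are documentation or made-up values, chosen only to hit each branch.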
I'm getting curious about this because that's a lot of folk. The first thing I did was take those addresses and allocate them by country.
Anyone from Pakistan here? What is going on? One-third of the connections coming out of Pakistan don't work. One-third.
Hong Kong, same problem, except now you are only 1 in 5. Cool. Isn't it amazing, France, 0.3 per cent, UK 0.3 per cent, Japan 1 per cent. Other countries like Viet Nam, 12 per cent. I have no idea why there are such massive differences. But there are. That's the sort of worst and the best of them all. New Zealand, clean up your act. That's just abysmal. Australia is doing a whole lot better. So there! However, countries is one thing. But we are in a geek audience, so let's name the good and the bad. Here is the failure rate per origin AS. It's in small type, so you may have to look online to find yourself. You know, you might be listed there.
The fascinating ones who are amazingly good, KDDI from Japan, really low. The RIPE NCC is there, with a relatively low failure rate. These guys are doing v6 brilliantly well. Users who come from those ASs and connect, in general, succeed. But you're all waiting for the next page, aren't you? You are all waiting for the "bad" page.
Leading the list, at the bottom, is unfortunately Asian, and it is the Malaysian Research and Education Network with a failure rate of a whopping 58 per cent.
This is not one or two tests, this is quite a few hundred tests over the eight months.
Interestingly, as well, the New Zealand .nz registry -- we are the names people, not the numbers people. Bloody hell, you are not the numbers people, you just can't get that right.
The University of Hong Kong, I'm afraid you really do have to do some work here. It does show that there are differences down at that network level, that it is possible to do a good job, and equally it's possible to do a pretty poor job.
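The per-origin-AS failure rates discussed above amount to a simple group-and-divide over connection records. A minimal sketch, assuming each record is an (origin, succeeded) pair; the record layout, the function name and the AS numbers (taken from the documentation range) are all hypothetical, and the sample values are not measurements from the talk.

```python
from collections import defaultdict


def failure_rate_by_origin(records):
    """Map each origin label to failed / total for its connection attempts."""
    counts = defaultdict(lambda: [0, 0])  # origin -> [attempts, failures]
    for origin, succeeded in records:
        counts[origin][0] += 1
        if not succeeded:
            counts[origin][1] += 1
    return {origin: failed / seen for origin, (seen, failed) in counts.items()}


# Made-up records using documentation AS numbers, for illustration only.
records = [
    ("AS64500", True), ("AS64500", True), ("AS64500", True), ("AS64500", False),
    ("AS64501", False), ("AS64501", True),
]
rates = failure_rate_by_origin(records)
worst_first = sorted(rates, key=rates.get, reverse=True)
```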
When you get this kind of failure rate, your users see delay and crap service, that as soon as they turn on v6, the experience is infinitely worse.
So this is not good. Let's try to get out a good message in the three minutes remaining. What I have tried to do is in capturing all these packets, I can actually do the round trip times, I can actually see if v4 or v6 is faster or slower. This is a typical kind of graph, the bottom line is seconds of difference. If things go out to the right, v6 is slower, if things go to the left, v6 is faster. Let's not worry too much about the green and red because that's the auto-tunnelling stuff.
Fascinatingly, 6to4 is faster than native mode v4 in some cases. Wow, weird. The blue stuff is the stuff to look at. Fascinatingly, in the transit world v6 is as fast as v4, down to fractions of a second, down to tens or even units of milliseconds. If you are down there, v6 is looking pretty good.
Over in Europe, Teredo has a weird kink on it, I have no idea why. It is unexpected that Teredo is that bad. It is slower, I think, because of some weirdness in Teredo, but let's not worry about that.
I also have some fast Teredo that is faster than v4, still haven't figured out why.
In Australia, by the way, it's really weird, Australia has a wide divergence in speed that is non-optimal and I suspect it's because our transit in v6 is not optimal.
All of this talks to a few observations I would like to leave you with. I hear a lot of folk saying, "I'm not dual stacking my web server because v6 is so much slower." In general, that's crap.
In general, right now in the world today the transit networks have got their jobs done but the paths of the world of v6 even to the millisecond level are just as fast as the paths in v4. But, if you are running a 6to4 tunnel, if you are running a Teredo tunnel -- indeed, I would actually venture to say if you are running any kind of tunnel, don't, because tunnels don't work and in general they are slower.
Is v6 as robust as v4? 2.3 per cent failure rate is not 0.23 per cent failure rate, it's 2.3 per cent, it's 1 out of 50. That base failure rate all over the world is not acceptable. Folk will not enjoy an experience where 1 out of 50 connection attempts fail badly.
But it's not in the core, it's at the edge. Some folk are managing to roll out v6 almost perfectly: AS2516, 0.2 per cent failure rate. It is possible to do an amazingly good job with your users. But some folk don't. Robustness is highly variable; some have got it, some haven't.
What should a browser do? Should we say it's a two-horse race? Set them off, whichever one is fastest wins. That's where Firefox went to. Or should we try to reduce the NAT server load by slightly handicapping the race by giving v6 a slight lead down the race-track, like Chrome with a 300-millisecond start, or should we try to be super-duper clever like the Mac and try to do RTT estimates? None of them are bad. If you have used any of these, you will find they are blindingly quick, certainly compared to a 3-second time-out, this is OK.
The trade-off is pretty good.
But there is another pressure going on. Most of you are thinking about buying CGNs, most of you are thinking about how big does it need to get? Interestingly, the size of the CGN you are buying is going to be determined by the behaviour of the browser in dual stack.
If the browser in dual stack does not bias towards v6, your CGN needs to be a whole lot bigger. How can we help the story along and give everyone a hand? Frankly, I have yet to see what I would regard as an optimal answer. We should fire off both DNS queries in parallel and if you get back both AAAA and A records, fire off v6 first, but only wait a short amount of time and fire off v4.
Rather than happy eyeballs, rather than any other kind of eyeballs, if we do biased but pleasantly amused eyeballs, everyone emerges a winner.
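The biased race described above can be sketched along these lines. This is a minimal illustration, not any browser's actual algorithm: it assumes both the AAAA and A lookups have already returned, gives v6 a head start (0.3 seconds by default, echoing the 300-millisecond figure mentioned for Chrome), and only then starts v4. The `fake_connect` helper is a hypothetical stand-in for real socket connects so that the sketch runs without a network.

```python
import asyncio


async def biased_connect(connect_v6, connect_v4, head_start=0.3):
    """Try v6 first; start v4 only if v6 has not succeeded within head_start.

    connect_v6 and connect_v4 are zero-argument coroutine functions that
    return a connection (here just a label) or raise OSError.
    """
    v6 = asyncio.ensure_future(connect_v6())
    done, _ = await asyncio.wait({v6}, timeout=head_start)
    if v6 in done and v6.exception() is None:
        return v6.result()  # v6 won inside its head start

    # v6 is slow or has already failed: start v4 and take the first success.
    v4 = asyncio.ensure_future(connect_v4())
    pending = {v4} if v6.done() else {v6, v4}
    winner = None
    while pending and winner is None:
        done, pending = await asyncio.wait(
            pending, return_when=asyncio.FIRST_COMPLETED)
        for task in done:
            if task.exception() is None:
                winner = task.result()
                break
    for task in pending:
        task.cancel()  # abandon the loser
    if winner is None:
        raise OSError("both address families failed")
    return winner


def fake_connect(label, delay, fail=False):
    """Hypothetical connect: sleeps for `delay`, then succeeds or fails."""
    async def _connect():
        await asyncio.sleep(delay)
        if fail:
            raise OSError(label + " failed")
        return label
    return _connect


healthy = asyncio.run(biased_connect(
    fake_connect("v6", 0.01), fake_connect("v4", 0.01)))
stalled = asyncio.run(biased_connect(
    fake_connect("v6", 5.0), fake_connect("v4", 0.01), head_start=0.05))
```

With a healthy v6 path the v4 connection is never opened, which is the property that would keep load off a CGN; with a stalled v6 path the user waits only the head start plus the v4 connect time.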
Thank you very much. If there are any questions, I'd be pleased to answer them.
Philip Smith: No questions for Geoff?
Geoff Huston: Thank you.
Philip Smith: Thank you very much.
Our final presenter is Rick Lamb from ICANN, who will be talking about DNSSEC, where we are and how we get to where we want to be.
Richard Lamb: It's great to be the last speaker between you guys and cocktails. It will be a little tough.
This talk is about giving you a quick update about where we are on DNSSEC deployment and maybe some things we can do.
The good news is we have made a lot of progress on DNSSEC, a lot from this community. We have really passed the point of no return, I think I can safely say; at least at TLD level, the core infrastructure of the Internet, at the root, software fully supports it, ISPs are starting to step up to the plate and providing either systems that pass all the packets unscathed or systems that actually, resolvers that do validation.
With the new gTLDs coming out, there is a requirement for them to support DNSSEC as well. So I think we are doing pretty good. That's good news.
For a moment, I'm going to quickly describe why we care about DNSSEC, the cache poisoning attack. For a lot of you guys this is the same, so maybe this simply serves as a slide deck that you could reuse for somewhere else.
The first step is, of course, someone requests a website, they go to your ISP's, your enterprise's or your end node's DNS resolver, and make their request. It goes out and goes out to the DNS, asks various questions and eventually ends up asking the DNS server that is operated by the enterprise that has majorbank.se in this case.
What happens in the cache poisoning attack is before the DNS server that the enterprise has, that has the correct IP address, has the opportunity to reply, an attacker replies before that, it does many replies and gets one the DNS resolver takes. DNS requests have IDs so you have to match the ID, but with today's CPU power and networking speeds it's been proven it's not that hard to do this.
The DNS resolver says, fine, I've got an answer to this, I'm going to pass it back. So it passes back this rogue IP attacker address and the resolver picks it up and remembers it and the true response comes back from the DNS server, the enterprise, and of course that's ignored because the resolver already has an answer. The user is totally unaware of what's going on, he goes through on his merry way, asks for a page from his bank, the bank comes back with a log-in page, well, the attacker's web server comes back with a page that looks very much like the bank's, the user puts in his user name and password -- usually what happens in this case, with these rogue pages, they send back an error, saying "bad page", but he's already got your user name and password, so he's already got your account information, he's golden.
It's even worse because the way resolvers work, they remember things and they cache things. The next request that comes in from a different user for the same bank is also responded to with the wrong IP address and of course that user also gets redirected as well to the wrong web page and user names and passwords continue to be collected in the password database.
That, in two slides, is cache poisoning.
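The remark that matching the request ID is not that hard can be made concrete with some rough arithmetic. The DNS header ID field is 16 bits wide, so there are 65,536 possible values. The sketch below assumes the ID is the only thing the attacker has to guess (for example, a fixed source port) and that each race against the real server is independent; the function names and the figure of 100 forged replies per race are illustrative assumptions, not numbers from the talk.

```python
TXID_SPACE = 2 ** 16  # the DNS header ID field is 16 bits wide


def spoof_success_probability(forged_replies, id_space=TXID_SPACE):
    """Chance that one of `forged_replies` distinct ID guesses matches."""
    guesses = min(forged_replies, id_space)
    return guesses / id_space


def races_for_even_odds(forged_replies, id_space=TXID_SPACE):
    """Smallest number of independent races giving at least 50% success."""
    p = spoof_success_probability(forged_replies, id_space)
    if p <= 0:
        raise ValueError("need at least one forged reply per race")
    races, miss = 0, 1.0
    while miss > 0.5:
        miss *= 1.0 - p  # probability that every race so far was lost
        races += 1
    return races


per_race = spoof_success_probability(100)  # 100 forged replies in one race
races_needed = races_for_even_odds(100)
```

Under these assumptions a single race succeeds only about 0.15 per cent of the time, but an attacker who can trigger a few hundred races reaches even odds.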
All the slides will be on the net. You are free to take them or reuse them any way you want. If they can help you explain this to some others, that would be very useful.
What does DNSSEC do? Why do we care about DNSSEC? DNSSEC solves this problem, essentially. Now when you get a request for the same website, the DNS resolver eventually passes that request to the server operated by the bank, the attacker comes back with its responses but we don't care, because now this is a resolver with validation turned on. The records are dropped because they don't validate. When the right response comes back, eventually it gets remembered by the DNS resolver and is returned and nice things happen, you get to the real web page for the bank and everything is hunky-dory.
Since the resolver is only caching the right responses, this continues and this is the way DNS should work.
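The difference between the two resolver behaviours can be illustrated with a toy model. This is only a sketch: real DNSSEC validates public-key signatures (RRSIG records checked against DNSKEYs, chained up to the root), whereas the code below uses an HMAC with a shared key purely as a stand-in so that it runs with the standard library alone. The key, the addresses and the function names are all hypothetical.

```python
import hashlib
import hmac

# Stand-in for the zone's signing key. Real DNSSEC uses public-key
# signatures, not a shared secret; this is an illustration only.
ZONE_KEY = b"illustrative-zone-key"


def sign(name, address, key=ZONE_KEY):
    """Produce a toy signature over a name-to-address mapping."""
    message = (name + "|" + address).encode()
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def resolve(name, replies, validate):
    """Return the address the resolver would cache, or None.

    replies is a list of (address, signature) pairs in arrival order.
    Without validation the first reply wins; with validation, replies
    whose signature does not check out are dropped.
    """
    for address, signature in replies:
        if not validate:
            return address
        if hmac.compare_digest(signature, sign(name, address)):
            return address
    return None


name = "majorbank.se"
replies = [
    ("203.0.113.66", "bogus"),                  # attacker's reply arrives first
    ("192.0.2.10", sign(name, "192.0.2.10")),   # genuine reply arrives later
]
poisoned = resolve(name, replies, validate=False)
protected = resolve(name, replies, validate=True)
```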
There's plenty of motivation for DNSSEC. There have been a couple of events that happened late last year, very public, one is called the DNS changer, and I'll talk about that for a moment. There have been calls by not only governments but certain groups to get DNSSEC deployed. There is a lot of good positive energy here.
There is a whole area called DANE, let's try to use DNSSEC for solving a whole different set of problems.
There are also efforts there in the IETF that have made a lot of progress. Then there is a whole bunch of other applications, killer apps, new ideas. There is existing work that has been out there for a while that has always tried to use the DNS as something to verify, authenticate various applications.
There is some digital identity work going on in a lot of countries and a lot of governments. Most recently I have run into some people and companies doing smart grid work, where they are trying to add some better technology to the power systems, so that they can optimise their use and face there -- but depend on the Internet for this, which is scary in some ways.
There are some companies out there that are trying to do this, and they have looked to DNSSEC as being a way to make sure that those critical operations actually are authenticated.
Many of you may realise that what happens with DNSSEC, at some point, is that we had a global PKI.
The DNS changer is something many of you may have heard about, but it is very valuable in the sense it is another event we can point to, to try to encourage DNSSEC adoption.
It was the biggest cyber criminal takedown in history, 4 million machines, hundreds of countries involved, and this is a case where users were redirected by modifying their resolver, the IP addresses on their computers for the resolvers it was querying. DNSSEC in an end-to-end environment, if this was used end-to-end, would have avoided many of these problems. This is an example that we can point to.
This was also around the same time, November last year, a large Brazilian ISP, an internal attack, where rogue employees had modified internally some of the records inside their name servers and were able to redirect people to a fake Google page to try to download an anti-virus program. Again, full DNSSEC deployment would have solved this problem as well.
There are a whole bunch of other things, I have put some links at the bottom that are very useful fodder for this. There was a presentation by someone at Google who described the brief history of hijacking, in general, using DNS and it was quite useful, a lot of references to some of their own pages that had been easily hijacked in various places.
I also mentioned earlier, there is support for DNSSEC from various governments. Sweden, Brazil, some of these countries have policies in place that strongly encourage financial institutions, for example, to deploy DNSSEC. So that's a great thing.
Most recently, March of this year, this is just in the US, our communications ministry, the FCC, came out and had a set of recommendations and of the three recommendations, one of them was DNSSEC. I was really surprised. But this is great, this shows that the governments are interested -- maybe these are just recommendations, but even with these recommendations the six or seven largest ISPs in the US stepped right up to the plate and said, "We are on board. We will support you on this." This is great.
Of course, back in 2008 there was a US Government mandate that said all government agencies in the US would deploy DNSSEC. That didn't happen as quickly as we expected. Back in 2008, DNSSEC was a hard nut to crack and it still kind of is, but at this point 60 per cent of US Government agencies have DNSSEC deployed in one form or another, so this is great.
Of course, the other thing that everyone in this room, we are all excited about DNSSEC, it's a global PKI. We never meant it to be that way, it never started that way, but somehow along the way, that's what we ended up with.
The next slide is -- the DNSSEC efforts, particularly in the Internet community, is a classic example of bottom-up approach that we've had. We did not go and ask communications ministers or countries first, "May we do this?" This developed from the bottom, from the engineers from various countries, and we have been able to implement this.
Lo and behold, we have ended up with something that no top-down intergovernmental approach would have ever yielded, no one would have agreed to have something like this, where we would have one hierarchy that would cross borders. So that's something we can all pat ourselves on the back for. There's a lot of motivation here.
Today, 92 out of 315 TLDs have DNSSEC deployed on them. I have some IDNs there as well, just for fun.
Most recently, Trinidad deployed it, .post, a new gTLD, also has it deployed on there, the root has been signed and audited, and, as I was saying, some of you were involved in that work and there are 21 people from various countries who support that process of managing the root key.
84 per cent of domain names could have DNSSEC deployed. That's because .com or .de are such large ones, and in some sense we could have DNSSEC deployed much further. Growing ISP support, et cetera, I've listed a lot of those things.
With all that positive news and all this progress of the infrastructure, why is it that less than 1 per cent of the domain names, the second level domain names, have DNSSEC deployed? DNSSEC is no good unless it is deployed from end-to-end. It is only useful if it is deployed at the top, at ibm.com, for example, and the ISPs or the end nodes have some validating mechanism in place. So the full path has to be covered.
Some have plans. Some people like Yandex.com and PayPal and a few others have deployed DNSSEC on their domain names, and that's great. I have noticed in this, I really applaud the people who put together the network here, I see the DNS resolver here actually has validation turned on, which is kind of cool.
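One common way to see whether a resolver has validation turned on is to look at the AD ("authentic data") bit in the header of its responses to queries for a signed name. A minimal sketch of reading that bit from a raw DNS message follows; the function names and the hand-built sample headers are illustrative. Note that the AD bit only reports what the resolver claims, and the hop between the client and the resolver is not itself protected by it.

```python
import struct

AD_FLAG = 0x0020  # "authentic data" bit in the 16-bit DNS header flags word


def header_flags(message):
    """Return the 16-bit flags word from a raw DNS message."""
    if len(message) < 12:
        raise ValueError("a DNS header is 12 bytes long")
    _query_id, flags = struct.unpack("!HH", message[:4])
    return flags


def answer_was_validated(message):
    """True if the responding resolver set the AD bit."""
    return bool(header_flags(message) & AD_FLAG)


# Hand-built 12-byte headers for illustration: ID, flags, then four counts.
# 0x81A0 = response + recursion desired + recursion available + AD.
validated = struct.pack("!HHHHHH", 0x1234, 0x81A0, 1, 1, 0, 0)
# 0x8180 = the same flags without AD.
unvalidated = struct.pack("!HHHHHH", 0x1234, 0x8180, 1, 1, 0, 0)
```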
We have innovative security solutions just waiting to happen, where we could not only distribute certificates, we could finally start having secure email and configurations. Wouldn't it be great to have Internet that we could trust again? That if we saw an executable file on the Net or saw something we wanted to download, we wouldn't be hesitant about it. It would be something the end user could start to trust again. That would be great, a lot of motivation.
What's the problem? I would love to hear your thoughts on this. It's part of my job to figure out what the deal is here and what the barriers are.
I spend a lot of time talking to CIOs in various enterprises and a lot of them know about DNSSEC but they have other fires they are putting out and I can't blame them for that, they have bigger fish to fry.
A lot of it is raising awareness with that group.
Large enterprises know about DNSSEC but they need help, which is why I'm providing some of these slides, and maybe some of them could be passed along to them to try to make a case.
When they look into trying to deploy DNSSEC, they hear fear, uncertainty and doubt, and say, there is no box I can buy, I can't just turn it on, oh my God, signatures expire, we will have all kinds of problems.
This is something that also stops them from trying to deploy this stuff.
Registrars, DNS providers, they don't see the demand. It's a chicken and egg problem.
Randy, do you have a question?
Randy Bush (IIJ): I think you just said, deploying this sucks and it takes a rocket scientist and the key algorithms, all this stuff is rocket science. Nobody sane deploys it. Yes, I've deployed it. Proof of point. Would I deploy it? I had to run 30,000 lines of fracking Perl code. It sucks. It's not simple. It needs to be made simple. Normal DNS can barely be run by most people. This stuff is hell.
Richard Lamb: I agree with Randy. This is where we can step in and try to make it simpler and this is where I would like to try to help.
Along the same line, barriers to success, it may be difficult to deploy but some of it is lack of awareness.
From the customer level there is a chicken and egg thing here. Because it's so hard to deploy, if the customer is not asking for it, why should an enterprise deploy it? So there is this hard slog, like everything else, just like with IPv6, to try to educate the customers, educate the people who would benefit from this.
There is lack of registrar support, it is chicken and egg. Every time someone finally says, "You have convinced me. I am going to go to the registrar and ask them to deploy DNSSEC." They will say, "We don't support it." "Why not?" "Because no one has asked for it." We have heard the story before. It's a very common thing. There is a little difficulty there.
Registrars are starting to support it, people like GoDaddy have turnkey solutions which you click and sign, which will be the vast majority of people, people want something they can say, fine, it's another feature.
Another thing that worries those of us who see the long-term picture of DNSSEC as being something that will help secure various things, trades application, opportunities, new products and new ideas for various people, is that if it is not deployed in a reasonable way, in a trustworthy way.
What you can do, a takeaway slide would be what you can do to raise awareness about DNSSEC, it is a feature and I have heard this from companies like Comcast who have said they have some people who have come to them and said, it used to be only about speed -- Comcast is a big ISP in the US -- now we are actually looking at security and if you have something that helps us with security, we care. That to me is a wonderful story, I was very happy they were willing to share that with us.
Starting early is a good thing, combining with other upgrades. At minimum, one of the things that used to be, at least from this audience, from the ISPs in this audience, it used to be we were looking for every single resolver to do full validation. From a security point of view, a pure end-to-end security point of view, it's more important just to be able to pass the packets unscathed, so the validation, full DNS, full packs of keys, full packets should be able to make it to the end node, so that if and when we have validation built into the end nodes, they can do this. This would be a critical thing.
Geoff Huston (APNIC): Two slides back, you could replace the word "DNSSEC", put in IPv6, and roll the clock back five years or maybe eight and have the same slide. As technologists, we assume that implementing this stuff is as easy for everyone as it was for us and the problem is a lack of awareness, we think.
But I think we were wrong then and are wrong now: that we should not underestimate the fact that behind this is a business, with real costs and real risks and real machinery running. The fact that something hasn't been adopted may not necessarily be because we are unaware, it may be because the case for business of making that investment doesn't look as good as other things I could do with my money, with my technical crew, with my available resources.
Trying to understand that and understand, is that perception of business priority one that we intrinsically agree with or do we think maybe there's a better case for DNSSEC but it hasn't been phrased in a way that relates to our customers? It's difficult, it's hard. It doesn't mean it is impossible, it just means it's really expensive and the industry is currently saying, I've got better things to do with my money.
I'm not sure the solution -- the solution set is a lot like the v6 solution set of five years ago too and I kind of wonder if some of the underlying issues aren't more about the business of this business as distinct from the technology of the business. Just a 2-cent observation.
Randy Bush (IIJ): We are not running out of insecure DNS, we are running out of IPv4 space, so v6 is increasing.
Peter Losher (ISC): First off, I want to agree with Randy in regard to how hard it has been to deploy DNSSEC.
Those of us at ISC who work on the BIND code, I'm not one of them, I work on the operations side, and we have been basically beating on the developers to say, give us an easier option, the easy button, as a commercial says in the US.
I know there have been incremental releases over time, like 9.7, 9.8 and 9.9, where we have tried to put in mechanisms in place to make it easier to do key management and key rollover, you don't have to remember the key ID when you sign a zone. Those of you who have tried it and went "aaaagh", I would suggest you look at 9.9 and the inline signing stuff. It is not quite there yet but we have made some significant progress.
Richard Lamb: Even at that point it needs to be productized. A lot of the people I talk to would really like a box, they are happy to pay for it, so maybe it's fine in a box. But they really want a single solution.
I know for the vast majority of people, the domain name holders, they want somebody else to take care of that for them, a registrar or what have you.
I'm running out of time so I'm going to move forward quickly. I was going to address somewhat -- try to defuse the difficulty or take away some of the fear and uncertainty in deploying something like this. The point has already been made here, there are various ways to do that, it's not just the engineering solution. If it's already not difficult enough, but it's not just the engineering solution but it's how you practise good security around the engineering solution, because this is something slightly different. If we are going to realise all the value and the potential for DNSSEC, the practices around the operation have to be done correctly. Luckily, I believe the vast majority of people are going to want a third party to take care of it and that third party may be us. But this is where we would target it, this group. If you were going to develop something like this, it's not that hard to develop something that has good security practices around it. We have learned a lot from the CAs, something as complicated as this, biometric eye scanners, ridiculously insane security, or a very simple smartcard with safes like this, I'm pulling out examples from some top level domain operators.
.cr, Costa Rica, recently did this, using the TPM chip that was built into the server platform. The TPM is a whole security chip that comes built into a lot of machinery now for free.
In fact, it is implemented that way and it is pretty simple. It doesn't have to be fancy hardware or complicated, it could all be done with software, again with the right processes, procedures and people in place, things we don't have to relearn, it has already been done by CA people and there are plenty of courses where we are happy to teach people to do that.
Things that I've seen. Lessons I have learned.
I have seen a lot of problems out there. For trustworthy deployment you need some sort of documentation, like a practice statement and random number generators, you would be amazed how many places I still see just using /dev/random off a machine that's been sitting there with very little entropy in it and turned on once or twice a year. Those are the kind of things that are important.
The summary: my biggest fear is that with all the progress we have made, and we are not going to put DNSSEC back in the starting gate, the signed zones will be signed and the signed TLDs will be signed and stay signed for a while, but there is a chance that if we don't take the next step, if enough people at second level domains don't start deploying this, DNSSEC is going to die on the vine. That will be a sad thing for this community, the Internet community and the engineering Internet community as a whole.
This is one of our babies, this is one of the rare times that, maybe in the past 20 years, where we have put something really major into the Internet. We should all be proud of that. If it doesn't finally get used and end up whatever, distributing certificates or secure configuration files to various systems, that will be a sad thing.
I'm hoping you guys will take slides like this, slide decks like this, or whatever, and use this to try to convince people that might come up to you and ask you what DNSSEC is, or if they are trying to implement it, offer them help or direct them to some of the various training options out there. I encourage you to do that.
Of course, I always put up this picture, this shocks some people. If you think about it and I'm sure some of you have, DNS is really the core of a lot of authentication mechanisms out there. Everyone has heard about domain validated certificates, SSL certificates, you get those things, but every time you create an account online, it's at the base of all these systems so it really does behoove people to deploy this.
That's it, thanks for hanging around a little late.
If you have any questions, please ask.
Siamak Hadinia: One comment from Jabber, from Joao Damas, which says education is fine but it is the wrong answer in the general case. What you need is simple automation of both DNS and DNSSEC.
Richard Lamb: I agree. You are hearing from a lot of real experts, and Joao is an expert in this field. My contribution to this is that it has to be done in a secure fashion as well. Any system can generate keys and completely automate the process but the keys need to be secure.
Paul Wilson (APNIC): Thanks for an interesting presentation. In my role, I'm spending quite a bit more time over recent years on governance, most recently APNIC was at the APEC TEL ministerial meeting in St Petersburg in Russia, and that is just the latest of many. The thing I'm noticing very much lately over the last year is a huge increase in interest in security, at government levels, in Internet security.
Security has become the number one issue, certainly the number one Internet issue, and in many cases the number one national security issue at a government level and I think that is a big change and it's a really serious development over a short space of time and it will only get more.
One statement I heard at a recent meeting is that cyber security is a war and it's a war that we are losing. That's serious stuff.
The point of what I'm saying is that unless the community of engineers, the people who are building the Internet, can get this stuff happening pretty quickly, and really get it sorted, then there's going to be a lot more attention and the sort of decisions we don't want governments to make about timelines and technologies and this must be done and God knows what, who knows what, how that would be foisted on the Internet Community.
So yes, this is a call on that basis to the engineering community to really get moving on this stuff. I think DNSSEC is really important, Internet security as a whole is a really important issue for all of us to be aware of.
Richard Lamb: May I ask you one question, what your experience has been. Is it an awareness issue or an engineering issue? I heard in this room we have to make it simpler to deploy. I am obviously seeing it slightly differently. I'm seeing a little bit more of the side where it's an awareness issue as well in governments.
Paul Wilson (APNIC): I think it is a time and resources issue as well. It is hard enough for people, particularly in this region, in any developing economy, it's hard enough for a company to have a decent team of engineers and hard enough for a decent engineer to keep up with the demands of running a network, a network that is often growing at an incredible rate, let alone to then have to look at IPv6, to look at security and DNSSEC; as Randy said, it is rocket science. In my experience it's a resources, time, hours in the day issue for engineers.
Richard Lamb: Thank you.
Philip Smith: Thank you very much for that, Rick.
Philip Smith: I will stand here to close off, rather than sitting in my chair. That brings us to the end of the first APOPs session. We have two more, one tomorrow, another one on Thursday.
Don't forget, the call for presentations for the lightning talks, which I introduced at the start, so we will remind you again tomorrow.
Also, don't forget, those of you -- Samol mentioned how he sits on Facebook all the time. Those of you who love Facebook, go to the program page and like this session if you liked it. With Facebook, I don't think there is any way of un-liking it. Hopefully you all enjoyed the session. The social event is next -- that starts about now.
You probably have time to rush back to the room, freshen up, change, whatever you want to do. The Catwalk Bar is downstairs, down one level, past the lifts, past the Italian restaurant, veer slightly to the right, a little bit past the spa, and you are there.
You'll find it, no problem. Hopefully you enjoy the social event tonight and see you all tomorrow.
Tomoya Yoshida: The time has come, so we will start the APOPs session. This is the second of the three APOPs sessions. My name is Tomoya Yoshida of Internet Multifeed, also known as JPNAP.
As you know, the APOPs is the Asia Pacific Cooperation Group for the Asia Pacific Internet. We have three chairs and we have the website, which is www.apops.net, so please visit this website so you can see the past APOPs meeting, materials, et cetera.
Before we start, I would like to give you some information about the lightning talks. This time we will have the lightning talks on Thursday from 4.00 to 5.00 pm, so if you would like to have a short presentation for lightning talks, please submit to this URL.
For the lightning talks, you do not need slides, but many people have already prepared short slides. If you would like to have some presentation, please submit with this URL.
This is today's agenda. We have three speakers, one is "Open Source Software for Routing" from Martin Winter and the second one is "Wherefor art thou CDN" from Dean Pemberton and the last one is "RPKI Propagation Emulation Measurement", from Randy Bush.
Martin Winter: Good afternoon, everyone. My name is Martin Winter. I want to talk a bit today about choices you have for using open source software in routing and what is the status today, to give you an idea, is that something you should look at, why you may want to look at it and how soon you may be able to do something.
First of all, who is Open Source Routing? Open Source Routing is part of ISC, the Internet Systems Consortium. We were approached mid last year when a few companies were really interested to have the open source movement better established in the routing space, so they wanted us specifically to improve on Quagga, which is one of the possible choices.
It was mid last year when we started that. We are funded by companies sponsoring us in that direction, so one of the big funders, like Google is in there, and we are a nonprofit organization.
Before I start talking about different choices on
open source routing, I want to give an important reminder. I always get approached and people say, "I can't use Quagga, I have too much traffic, you can't handle that much traffic with the software." You have to remember, when I talk about Quagga, Bird, Xorp and all these solutions, these are only the route engine solutions part. The forwarding part -- if you look at a router you always have the route engine doing the routing protocols and you have the forwarding part. The forwarding part could be done by hardware or if you have low bandwidth, you may be able to use a Linux PC and do the forwarding with Linux. But that's not part of the routing daemon, that's part of Linux or the BSD, or theoretically you can do something like Open Flow below for hardware forwarding.
Keep in mind, the part I'm talking about is the routing protocol, which is what runs on the route engine.
Let's first look at why you might want to consider Open Source.
There are a few reasons it is really time for you, if you haven't started to look at it. The big reason, especially in Asia, is money -- probably everywhere else in the world too. So there might be a much cheaper solution. Yes, if you buy something from Cisco, you may
have all the gazillion features on it and all these other things, but you may not need all that, so maybe today you already have enough from the basic routing what you want, what you need inside the open source solution.
Also, if you launch into the SDN cloud and all these buzzwords, this all started out from some Open Source movement, most of the Open Source software really fits very well in there. So if you need like a virtual router, because you have multiple virtual machines, you don't need to spend money for physical hardware forwarding, you could basically run one of these choices as a virtual router on a virtual machine inside your large cloud.
The features. You probably have heard about the features over and over again with vendors, "I need this and this feature," and the vendor asks, "How many million are you willing to spend if we build that for you?" With Open Source you may have a much simpler choice, you may be able to sponsor somebody or even write the code yourself and add that missing feature.
If you are just missing one essential feature, you can add it. You may have a special feature which is the thing that distinguishes your network from another one.
The support, obviously, now if you are here in Asia
you may not get great support, like people in the US are used to. Maybe you get support in Hong Kong and Singapore, but in other countries you may be better off to find local resources, people who understand and can read the source code and look at it, or you can go and outsource it to someone.
There are also a few reasons why you may want to hold off a bit longer. First of all, it's really early adoption. Today, if you start using open source in routing, you are one of the early persons there. It may be a risk, you may not really know what you are getting there, you may not be really sure on the quality, so you really probably want to do your own testing.
Support. You may have to either get your own people, like being an expert in it, or look for a very few companies who may be able to sell you support for it.
The missing features are always a big thing, there may be too many things you are missing, so it's too much work and you can't really go there.
Risk. Obviously, if your business depends on it, choose wisely.
I want to give you a quick overview on the popular open source software, what's out there. Basically I'm looking at Bird, Quagga, Open BGPd and Xorp, which are
the four famous ones, to give you an idea of what's there, where they are used and what their strengths and weaknesses are.
Let's start with Bird. Quite a few of you are familiar with Bird. It's a software which is very famous, especially as a route server. It started in 1999, it's maintained mainly by the Czech NIC, the labs run the whole thing and deploy it and they do a very good job.
It started as an alternative to Quagga and Zebra.
Mainly people wanted to do route servers, and Quagga and Zebra didn't perform the way they wanted to do it, the higher scalability, so that was the main focus. It's well known and quite fast and efficient, at least as a route server.
From the protocols, it basically supports all the standards, like the very basic, RIP, different versions of RIP, OSPF and BGP. The key thing is that nobody is really using OSPF. We had discussions in the past at RIPE meetings and we asked around if anyone is using OSPF with Bird and from what we have seen, Bird is only used for BGP. It runs on the standard Linux, the different BSD flavours, it has a lot of features, very powerful configuration, filtering language and multiple routing tables. Its main development focus is towards a
route server, that's their focus and I think they do a very good job.
The missing limitations, which you hear sometimes, the IPv4 and IPv6 on Bird is something which is separated, so they are not like one daemon running both, they are separate parts because that's the way the code was built and had to be compiled. That is basically a challenge if you need that.
OSPF is there but I don't know anyone using it so we don't know how good the quality is for production. ISIS, as an example, is not there at all, if that's something you need.
If you look at who is using it, Bird is the most popular choice and probably the best choice today for route servers. If you do a route server multiple use, I think Bird is basically the choice.
Everything else may be not that much. If you look at the sheet there, like all who are using it, we notice that basically they are only exchange points. So it is really used as a route server and also a little bit BGP processing, not really for forwarding.
Let's look at the next one, Open BGPd. That is the project which I'm least familiar with so I will be a bit more brief. Open BGPd is part of the whole OpenBSD community, so it started there as a separate project,
they do Open BGPd and also Open OSPFd, which is the OSPF part. One of the key things is this is the only project which is BSD licensed, all the other projects are GPL licensed. If you are building something commercial and GPL is a problem for you, Open BGPd may be the thing to look at.
From the protocols, it is limited to the BGP.
OSPFd, as I say, is available as a separate project.
That's Open OSPFd. The software is focused on running on OpenBSD but runs on all the different flavours of BSD and on Linux. The BSD licence is completely free, which means you can take the source code, modify it and incorporate it in a commercial product without any problems.
Limitations. I have to say, the main thing is the limited deployment. I haven't really heard anyone in the ISP community using Open BGPd. If you are using it, I would love to hear from you, so feel free to approach me afterwards.
The next product, Quagga, is the one which Open Source Routing works on. A fun thing, Quagga is the only product which doesn't have a logo. Just in case, everyone thinks what kind of funny zebra it is, a quagga is an extinct animal, related to a zebra which is missing part of its stripes.
Anyway, Quagga started as a fork from Zebra, which was an early Open Source project, which came even earlier from HD.
It is like an open source community project, but there's not really one maintainer or one company maintaining it. On quagga.net, which is the public place, there is a community, a mailing list and it runs there. It is maintained by the community.
Open Source Routing does not own Quagga. The only thing we are doing is we try to help out the community by doing testing and doing additional development.
One of the key things on Quagga, Quagga is focused on full routing, so Quagga is the idea not just to do BGP but really do full routing so you can use Quagga on a real router or as a real router with the hardware platform below them.
Protocols. We have all the protocols in there, like standard RIP, OSPF versions, ISIS just came in, version 4 only, IPv4 only, not yet IPv6, that will be a bit later. The key thing on ISIS, the IPv4 version, we just merged it over from changes which Google did and we are still trying to get it in good shape before we go to IPv6, but IPv6 should be done soon afterwards.
It runs on all the different operating systems, Linux, BSD, all the different Unix flavours out there,
usually have a port out of the Quagga. The CLI is Cisco-like. If you are familiar with Cisco, CLI, you may be feeling quite at home there.
Key limitations today. As I mentioned before, route servers, if you have many BGPs, four or five full BGP tables, Quagga is currently not efficient at all, it has major issues in the BGP part.
Another limitation, there are quite a few different branches of Quagga: the Quagga.net official "Master" branch, there's Euro-IX, there's Quagga RE, and quite a few more, which was from past years when a few people from the community got frustrated and basically built their own branch.
Users. There are still a few route servers out there, small ones, not that many. Major users, quite a few people, and Open Flow, SDNs and small router appliances, smaller ISPs, which are basically Linux based, smaller embedded computers, which act as routers, some smaller ISPs, I know there are quite a few in Europe. If you today want to use it in Linux and have 2 or 3Gbps of traffic, that's not a big problem. Also very large data centres, CDNs, sometimes use custom modified versions of it. A classic example is the main sponsor, Google, which uses in their data centres Quagga everywhere. I hope quite a few of you use Quagga as
Xorp is another not so well known project. Xorp was started, again, as an educational project and it was mainly a few people who were more Juniper fans, they liked the Juniper CLI and they were thinking that Quagga code is in a horrible shape and tried to do it better, so they tried to do an extensible open platform for the routing and they are focusing on good documentation and clean code.
Protocols. We have all the different protocols, the same things, they are quite strong on multicast. One of the fun things, it's the only open source which has a Windows port. Again, strong point, CLI is like Juniper, if that's what you like. What they like to highlight is C++ and it is good code, the code quality is quite nice and well written and very well documented. If you are looking for something you need to completely take apart, modify it, Xorp might be something interesting.
Limitations. No ISIS today and the performance, well, there are not that many people using it for routing, so performance, they are not really sure where they are standing.
Looking at the users, when I contacted them, this is what I got. There are very few users out there.
I haven't found any ISPs using it. There are some
schools basically using it. There is the Pica8, which is a commercial stack, which has some routers sold in it, but it is split off from the Xorp and leased back and they thought they will merge it back. There is some test technology in there.
Just a quick overview. I will try to do the highlights of what I think is the coolest thing on each of the different choices. Bird, if you do a route server, large scale, lots and lots of different BGP feed, I think that's your preferred solution today.
Open BGPd, I think the coolest feature is the BSD licence, so you don't have any limitations from the GPL code. All the other projects are GPL.
Quagga is the main software for people today who use it for actually forwarding, routing. OSPF is the key thing there, so you have the OSPF routing, OSPF is very stable, and also the BGP path.
And Xorp, basically if you are looking for nice C++ code, well developed documentation, that might be something to look at.
I would really suggest you spend some time, look at the choices, see about how in your business which one of these may be like a close fit, which one may even solve it today, something you need, or maybe very close to have a solution in them.
Next I want to give you more details on Quagga, as we are working on that one.
Just looking at the routing protocols, as I mentioned before, BGP, IPv4 and IPv6, the challenge there, the performance today is bad for large multiple tables. There are two approaches, trying to fix that.
If you are looking on route servers in the past, a lot of ISPs which basically used Quagga as a route server all moved to Bird because of this issue.
Euro-IX is a branch sponsored by Euro-IX, so it is basically AMS-IX and LINX sponsoring that, and they have a branch which they are trying to fix and doing work with multithreading. It's a work in progress, there's a prototype out there now. It has quite a few bugs but it may get usable soon.
There is something middle, internally, we tried to fix a few things in data structure, so we have a few things where we changed a few lists that were sequential lists into a tree structure, to improve the performance.
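To illustrate why replacing a sequential list with a tree helps, here is a minimal sketch in Python. It is not Quagga code; the function and class names are hypothetical and it handles IPv4 only. It compares a longest-prefix lookup that scans every route with one that walks a binary trie, where the work is bounded by the address length (at most 32 steps) rather than by the number of routes.

```python
import ipaddress


def lpm_linear(routes, addr):
    """Longest-prefix match by scanning every route; cost grows with table size."""
    ip = ipaddress.ip_address(addr)
    best = None
    for prefix, nexthop in routes:
        net = ipaddress.ip_network(prefix)
        if ip in net and (best is None or net.prefixlen > best[0]):
            best = (net.prefixlen, nexthop)
    return best[1] if best else None


class PrefixTrie:
    """Binary trie keyed on prefix bits (illustrative, IPv4 only)."""

    def __init__(self):
        self.root = {}

    def insert(self, prefix, nexthop):
        net = ipaddress.ip_network(prefix)
        bits = format(int(net.network_address), "032b")[:net.prefixlen]
        node = self.root
        for bit in bits:
            node = node.setdefault(bit, {})
        node["nh"] = nexthop  # next hop stored at the node where the prefix ends

    def lookup(self, addr):
        bits = format(int(ipaddress.ip_address(addr)), "032b")
        node = self.root
        best = node.get("nh")  # default route, if one was inserted
        for bit in bits:
            node = node.get(bit)
            if node is None:
                break
            best = node.get("nh", best)  # remember the deepest match so far
        return best
```

Both lookups return the same answer for the same table; only the amount of work per lookup differs as the table grows.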
OSPFv2, from all the feedback I get, it works very reliably, there are no problems at all, there are quite a few users out there. As I say, that's what the main users, for people who use Quagga as a full router in their smaller ISPs, especially in Eastern Europe, they have no problem.
In my testing, if I go on a really large scale OSPF, huge mergers and convergers on network topology, I found a few issues I identified and I am working on getting them fixed.
OSPFv3, I know quite a few people like IPv6 in here, and they should like it. Unfortunately, I have to say, that's kind of a separate cloned version of the v2, a lot of the fixes didn't make it into OSPFv3 so the OSPFv3 is missing a lot of bug fixes that were fixed in another part of OSPF and should have gone in both of them. I hope that gets fixed soon.
ISIS, as I mentioned, for IPv4 I would assume one or two further releases out there that may be good, current release is 0.99.21, so may be 22 or 23, maybe at that stage it may be usable, the ISIS, at least for IPv4.
IPv6 will take a bit longer. Our approach is let's first fix it for IPv4 and then look at IPv6 afterwards.
RIP, if you are still doing that, that works with no issues.
There is a URL which documents some of our testing efforts and results.
I want to talk more about what we are doing mainly on Quagga as Open Source Routing. We are not really focused that much on the BGP part there. We see Quagga as a full routing platform and I want Quagga to be a
solution for building a real router, either a pure software router on a PC or as a hardware router where Quagga is the route engine and then have something like Open Flow as the forwarding engine or something else.
From that point of view we spent quite a bit of time working on the IGP routing protocols, so we are looking at ISIS and OSPF and trying to get it into a usable shape, working there, fixing it and testing that part, recently we had IPv4 ISIS in a halfway decent shape, and OSPF unnumbered interfaces is just about in the progress of coming in and there are a few stability issues we are working on.
Another thing is data structure changes, which I mentioned on BGP. It's not just impacting BGP, it impacts quite a few other protocols as well. We assume that should speed it up quite a bit in the performance.
It also should help do certain things in the paths, which are locked and non-interruptible, basically to allow it to interrupt certain parts of code, so we hope to improve and fix most of the performance issues in BGP, just by these data structure changes.
Another thing we are working intensively on is an API, mainly looking at companies who want to use, like Open Flow, so they have a direct way from Zebra to get the RIB table downloaded, so Zebra, which maintains the
RIB, it needs a way to forward the forwarding table down to the hardware, so to Open Flow, so we are trying to make a nice, clean API so Open Flow can connect in there and all the other vendors who want to build something, their own hardware, their own forwarding, they can connect into that API. It will basically help us to disconnect from the Linux kernel below, and not having to keep all the forwarding in Linux.
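As a rough illustration of that separation, the sketch below keeps a routing table in one place and pushes only the best route per prefix to whatever forwarding backend is plugged in. This is not the Quagga or Zebra API; every class and method name here is hypothetical, and the "lowest distance wins" rule is an assumption made for the example.

```python
from abc import ABC, abstractmethod


class ForwardingBackend(ABC):
    """Hypothetical interface a forwarding plane would implement."""

    @abstractmethod
    def install(self, prefix, nexthop): ...

    @abstractmethod
    def withdraw(self, prefix): ...


class RecordingBackend(ForwardingBackend):
    """Stand-in for a kernel, Open Flow or hardware driver: it only records calls."""

    def __init__(self):
        self.fib = {}

    def install(self, prefix, nexthop):
        self.fib[prefix] = nexthop

    def withdraw(self, prefix):
        self.fib.pop(prefix, None)


class Rib:
    """Holds candidate routes per prefix and pushes the best one to each backend."""

    def __init__(self, backends):
        self.backends = list(backends)
        self.routes = {}  # prefix -> {protocol: (distance, nexthop)}

    def _sync(self, prefix):
        candidates = self.routes.get(prefix)
        for backend in self.backends:
            if candidates:
                # Assumed selection rule: lowest distance wins.
                _, nexthop = min(candidates.values())
                backend.install(prefix, nexthop)
            else:
                backend.withdraw(prefix)

    def add(self, prefix, protocol, distance, nexthop):
        self.routes.setdefault(prefix, {})[protocol] = (distance, nexthop)
        self._sync(prefix)

    def remove(self, prefix, protocol):
        self.routes.get(prefix, {}).pop(protocol, None)
        self._sync(prefix)
```

The routing side never needs to know whether the backend is Linux, Open Flow or a vendor's own hardware; swapping the backend does not touch the protocol code.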
Now I want to talk for the last two minutes about how or why you should look at it. I want to try to get you thinking more about Open Source Routing. The ones who are long time in the business, if you remember the early days when you were playing with Linux and everyone jokes, this is a hobby, and nobody took it seriously, and then suddenly companies started taking it up and it took off, because companies started supporting it, other ones were starting to use it, and it is very common today, if you look on a high end server, you may pick a Linux or BSD server, that's very common.
I believe today, if you look at how much money you spend on existing hardware, if you just take a small percentage, just maybe 1 per cent of what you spend, and spend it either by supporting one of your favourite projects, one which is close to what you do or if you have good engineers and support it more by having
somebody working on it, like contributing to the parts, so it doesn't need to be money, it could be time too.
In other words, just do a little bit.
If you do that, the whole Open Source Routing line will really start taking off, all the fixes in the past which were very slow should now go much faster, things will speed up. You most likely get all the missing features which today stop you from using it added into the code and you can soon use it yourself.
From that point, it becomes a choice for you in the network. You basically have all the features, then when you need it there, it gets there, you also end up the additional vendors start to have to acknowledge it, they can't do a mark-up of 50 to 70 per cent any more, the margin on their routers, they may have to go down, so it saves you money, you have lower operation costs and you can do more support, you have cheaper costs, because you can do that, and it goes back that all the other vendors are forced down again to lower their prices, and it should really kick off and get a really good environment, where companies start using it, it's a good solution and it will be a viable solution in the future.
In the future, you may not buy the router from one vendor any more, maybe you buy the hardware, with forwarding from one vendor and you buy the software for
the routing from a different vendor or distributor or download it from somewhere, like you do today with a Linux platform, where you buy the server somewhere and then pick your Linux distribution.
That's basically it from my side. We have a few minutes left for questions.
I am very interested who in Asia Pacific, especially, uses Open Source, like software for routing or if you are not using it, why you are not using it and what's stopping you, do you have any experiences there, or if you are interested in helping out, feel free to approach me afterwards, I can give you some hints, connect you with the correct communities from your favourite project.
Tomoya Yoshida: Any questions from the audience?
Sunny Chendi: I have a question from the remote participant, Emba: which network layer are the companies using the system, PE, Transit or OR? I tried to understand the question before I relayed it.
Martin Winter: I'm not sure I understand the question.
If that's the question, provider edge or core, I think today, the main, if you look it, especially Quagga, people actually using it for routing, it's probably more on the edge, like data centres, it's a very common thing, but it's also used a bit on the core, for small
ISPs. Today mostly in the core there are people just using a PC as a router because they have only 1 or 2 or 3Gbps of traffic, which is OK for PC hardware to maintain, but basically not that much that they would have real hardware. The other, the edge, on the data centres, where companies like Google also, which are customised, use deployment, but not that many features.
The key thing is, if you need all these extra features, like ACLs and all that, that is part of the forwarding plane and you may have a hard time to find that part today for the open source solution.
Tomoya Yoshida: Some ISPs use the forwarding router on their network, so they do not need the forwarding plane, just the control plane.
Martin Winter: For routers, like Bird is very common and a little bit Quagga.
Vidol Leung: I have not actually deployed any production Quagga but I have a bit of experience. I want to know from your experience, is the performance equally as good as the actual router? Suppose you have a hardware server of 2GB RAM and dual CPU, 3GHz, would it perform equally well as a router, like a Cisco router 7206 NPE-G2, like that?
Martin Winter: To answer the question, there are two parts to the question. One is the forwarding part.
Forwarding, if you are using a PC and you have a decent gigabit card, total bandwidth, 3Gbps should not be an issue. If you have better hardware, it could be more than that, it depends a bit on the hardware. If you compare to the classic old version of 7206, I would say you would have the same forwarding performance.
The routing part, the separate part is OSPF, I don't see an issue performance wise from what you mentioned, even a dual core CPU, normally not a problem. If you talk specific Quagga, if you do lots of BGP feeds, you may have performance issue on BGP. If you do more than about 4 full BGPs then BGP as of today in Quagga will have issues, but we are working on fixing that. Small BGP table, no problem, OSPF I don't see an issue, RIP I don't see a problem.
Randy Bush: There is another use for these, and that is the research community uses them heavily to record network routing traffic. I specifically want to ask about ISIS, which is the major IGP used by the big ISPs, and we researchers want to be able to record that ISIS traffic, both in v4 and v6, and we have been dying for years. We would love to see the ISIS support work but also we want to be able to record it, just like we can BGP. That's both dump and updates.
Martin Winter: ISIS today, I think the only choice is
only Quagga. Bird doesn't sound like they are working on ISIS. Xorp discussed it a few times.
Randy Bush: Quagga is the only one we care about.
Martin Winter: I know, because you do a lot of IPv6, so that may be a quite a bit out too. Obviously, the first priority is getting ISIS working, IPv4, then IPv6.
Recording, I haven't heard many requests but it's probably -- I'm sure there is something like enough interest, the way you describe it, probably someone will write it up and add things. The whole thing is a community project. I assume ISIS is good quality --
Randy Bush: I just heard you are not going to support it.
Martin Winter: I would love to support it.
Randy Bush: But you won't.
Martin Winter: Especially the research group has very good and intelligent people too.
Dean Pemberton: My name is Dean Pemberton and I want to go through today some work I have been doing recently around looking at CDNs and how we in New Zealand interact with them, looking a little bit as well about how the New Zealand Internet hangs together in terms of its global partners and then looking a little more around some of the exchanges within New Zealand and how they could better attract CDNs.
As I mentioned, this predominantly shows examples for New Zealand but it should translate quite well for lots of other small Asia Pacific economies, especially Pacific. Just as we heard the other day, the Greater Mekong Subregion contains a lot of countries that I think this work could be very pertinent for as well.
Look out for the similarities when I'm going through this and see if any of these apply to your countries.
At the end, if we have time for questions, throw up your hand and let me know, because I would be really interested to see whether the sorts of issues we are seeing in New Zealand are also being experienced across the rest of the region.
Speaking of New Zealand, here it is. We have two islands, named North Island and South Island. We didn't get very creative. One of them is north of the other one and the other one is south of the one I just mentioned.
Auckland is the largest city. Wellington is the capital city and where I lived, so the best city, and Christchurch is where the big earthquake happened. That gives you a bit of an idea about New Zealand.
One day back in the dim dark mists of time the islands got the Internet, and they got it by going all the way back to the US over a very slow modem link, so
they did not really connect to anything locally, it was a single modem link all the way back to the States.
Some things haven't changed. We are a little bit slow getting to places on the net. These are some round trip numbers I took from Verizon Business's global latency service level agreements, if you put things on the net, I don't see why I shouldn't be able to steal them and put them in slides.
You can see that we are 60 milliseconds away from Australia, which is not too bad, but if you look at how far away we are from the rest of the world, the US, India, Hong Kong, in the mid to high 2s, the minute you want to get into the UK and Europe we are getting a little bit far away. Not much we can do about that, they haven't worked out how to speed up the speed of light and that's predominantly what's causing the problem.
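The speed-of-light limit can be put into rough numbers. The sketch below assumes light travels through fibre at about 200,000 km per second (roughly two thirds of its speed in a vacuum) and uses approximate great-circle distances of about 2,200 km from Auckland to Sydney and about 10,500 km from Auckland to Los Angeles. Both distances are assumptions for illustration; real cable routes are longer and equipment adds delay, so measured round trips sit well above these floors.

```python
# Assumption: light in fibre covers roughly 200 km per millisecond.
FIBRE_KM_PER_MS = 200.0


def min_rtt_ms(path_km):
    """Lower bound on round-trip time over a fibre path of the given length."""
    return 2 * path_km / FIBRE_KM_PER_MS


# Approximate great-circle distances (assumed, not cable lengths).
print(min_rtt_ms(2200))   # Auckland to Sydney: 22.0 ms at best
print(min_rtt_ms(10500))  # Auckland to Los Angeles: 105.0 ms at best
```

No amount of tuning gets below those floors, which is why moving the content closer matters more than making the link faster.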
Here is a look at what the submarine cable network in and out of New Zealand looks like. Now you can see why we have the problem. We are hanging down here in the middle bottom, the two little islands, and there's one cable that goes up to the States and another one that comes to Australia and they form part of the Southern Cross Cable Network. That's predominantly the major way in and out of the country.
All I want for Christmas is a shorter wet piece of glass, really. New Zealand is at the bottom of the world. If we are looking to go back to the States or the US for all our content, we are always really going to be at a bit of a disadvantage when it comes to that.
Is there a way of getting a little bit shorter piece of glass for us to get this Internet thing over? In the short term, it doesn't seem all that likely.
Plate tectonics being what they are, it will take a little bit of time for the continents to move closer to us. I am not really willing to wait that long for my downloads -- I don't know about you guys.
There are some cable providers looking at new cables but they are a few years away, and this just got worse.
Recently -- I am talking the 1st of this month -- Pacific Fibre, who were the major player in laying a new cable network between Australia, New Zealand and the States, pulled out. They said, "We have given this a good try, but we couldn't get the funding, that's it, it's done." Now we are pretty much back to that one network and their ability to charge whatever they like.
There may be hope for me. I may be able to pull the rest of the continents closer to New Zealand in a little bit shorter timescale than something geological.
If you look at the map, this is very much how the
Internet used to work. You can see here on the west coast of the US, I have my server and that's where my Facebook hangs out. All around the world are all my users and they are all accessing this one server.
That's certainly how things used to be.
Now they look a lot more like this. Enter the Content Delivery Network, either run by a service provider or contracted from. The content has moved out closer to the users, and this is really going to be a bit of my saviour. I can't move my islands closer to the west coast of the States, but maybe I can move the content closer to me.
That's awesome. How do I get one? I want one of these. That sounds great. How do I convince all the CDN providers in the world to put all their nodes in New Zealand and bring all that content to me and that would well and truly solve my wet glass problem. Turns out that's a lot harder than you think. Not a lot of them were interested in my passionate pleas.
What it comes down to is how the content distribution networks see New Zealand. What is New Zealand to them? The realisation is we are certainly not a country. You would like to think we were, but to content delivery networks we are just not. We are a little bit too small to qualify as a country. We only
have 4.4 million people, a whole lot more sheep, but not a lot of them have Facebook accounts. We sit in the middle below Georgia and above Costa Rica and a little bit above the Palestinian territory and Lebanon.
We hit hard in terms of tech and people doing the right stuff but we just haven't got that bulk. Not a problem for me, but for the country as a whole, not quite there.
We are somebody else's suburb, we are not a country.
The big question is whose suburb are we? This is where things get interesting.
Are we a suburb of the US? Well, possibly. We certainly used to be over our very slow modem link. Are we a suburb of Singapore, quite a large regional player? Are we a suburb of Australia? That would be good, they are only 60 milliseconds away, but it's not immediately obvious where the Content Delivery Network providers see us.
I decided I would have a look. I will leave up the slide, because I then want to talk a bit about it.
I went home and decided, I am going to look at whereabouts I go to get to these Content Delivery Networks. If I'm trying to access stuff from Akamai, where do I access it from? From my house, which is labelled "NZ Residential", I go to New Zealand, so
there's an Akamai node in New Zealand, that's great, everything from there comes down very quickly, life is good.
From a colobox I had access to in the second column, it also comes from New Zealand. That's great.
I want to go a step further and see what the experience looked like for someone in Australia.
I contacted a friend and got them to do essentially the same set of tests, one from whatever their ISP was at home and one from a colobox they had access to. We can see that the residential in Australia comes from Brisbane and the colobox comes from Sydney, so all doing pretty much the same sorts of things. But you can't say that for the whole chart.
Look at Amazon cloud services, both of us heading towards the US, 250 milliseconds time, dragging the content all the way over, none of it comes locally.
Amazon is about to deploy into Australia, hopefully that will make it a little bit better.
There are some interesting things on the graph.
Look at Cloudfront. I get it from Sydney, both from home and for the colo, so that content is sitting in Sydney. Why does my friend's ISP in Australia go to Japan for it? That's just crazy. There's obviously some peering arrangements going on that are a little bit
whacky. Similarly, down with SoftLayer Internet at the bottom, I know that content is in Sydney, but again it's being accessed from Japan.
This idea of bringing the content out towards the users is helping people make the Internet a little bit smaller, but you have to be doing the right things and it's not immediately obvious how you can know that.
I would advise a lot of people here to go and do a similar sort of test, and I will show you how to do one very easily, later on.
I am trying to see if there's anything else interesting on there.
New Zealand is a suburb of -- well, it looks like, for a lot of CDN, New Zealand is still a suburb of Los Angeles. Hopefully we are one of the good ones, rather than one of the bad suburbs of Los Angeles. I don't want to be right next to Compton or something. I would much rather be a suburb of Sydney, again, one of the good ones. We are trying to populate Bondi enough so that we are a de facto suburb there anyway. If I can't be a suburb of Sydney, I would at least want to be a suburb of Singapore. This stuff is important.
We need better measurement tools. My experiments were very ad hoc, and by very ad hoc I mean a couple of TCP traceroutes, to get the load balancing to work properly. We need better measurement tools for finding out where your CDNs are coming from, to look at what your connectivity into the Internet looks like today.
CloudHarmony is a company that is doing some good work here. I think it's something that the RIPE NCC Atlas project could look at. You have all the probes out there, getting them to look at where the closest CDN nodes are would certainly give us a bit more of a picture. It would be great to get a global view of CDN deployments and what they look like. The CDN providers don't really like giving this, because it's proprietary information for some of them.
It would be good for you on your network to be able to see how close content is, how close is Apple or Windows updates. That matters. If they are halfway around the world, 400 milliseconds away, that's bad, if they are 200 milliseconds away in a cache, that's good.
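As a rough illustration of the kind of ad hoc check described here, the following Python sketch times TCP handshakes to a set of hostnames and ranks them nearest first. It is not the speaker's tooling: the function names are invented for this example, and handshake time is only an approximation of round-trip time to whichever node the hostname resolves to from your network.

```python
import socket
import statistics
import time


def tcp_connect_ms(host, port=80, timeout=3.0):
    """Time one TCP handshake to host:port, in milliseconds."""
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout):
        pass
    return (time.perf_counter() - start) * 1000.0


def median_rtt_ms(host, port=80, samples=5):
    """Median of several handshake timings; None if every attempt fails."""
    timings = []
    for _ in range(samples):
        try:
            timings.append(tcp_connect_ms(host, port))
        except OSError:
            # DNS failure, refusal or timeout: skip this sample
            continue
    return statistics.median(timings) if timings else None


def rank_by_rtt(results):
    """Sort {name: rtt_ms or None} nearest first, dropping failures."""
    reachable = {name: rtt for name, rtt in results.items() if rtt is not None}
    return sorted(reachable.items(), key=lambda item: item[1])
```

Calling `median_rtt_ms` for each CDN-served hostname you care about and passing the resulting dictionary to `rank_by_rtt` gives a quick nearest-first list; running the same script from a neighbour's network gives the comparison the talk describes.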
This is an example of the CloudHarmony service I was talking about. There is a URL at the top. These guys in a browser will show you how far away your local CDN nodes are for a subset of a CDN. They will do the same for cloud providers, cloud storage providers, et cetera; a very easy way of doing this and you don't have to traceroute and arm-twist your friends to traceroute on your behalf.
Don't everyone run it at once in this room, otherwise the wireless will break.
Now we know there is potentially an issue and my entire country is a suburb of a not very nice part of Los Angeles, what do I do to change that? I can change where I appear -- really, I mean peer. Or I can change my CDN gravity. I will explain what that means in a minute.
At the last APRICOT meeting there was a good presentation by Willy of Matrix Networks, in Indonesia, where they looked at building out their network and where they appeared on a regional level, rather than globally. What they looked at was they looked very much at their geography and looked at where their local exchange points were. They didn't do what New Zealand did all those many years ago and just go to the US, because nowadays things have changed, the Internet is not just in the States, it is everywhere, you should look close to you, look regionally and start making sensible decisions.
They built out into Singapore and Hong Kong and it was only years down the track that they looked at building out further into the US and EU. So it was almost the opposite of what New Zealand did but I think in the current state of affairs that's a very smart
thing to do because it means you get your local content first, before becoming a suburb of somewhere you really don't want to be.
CDN gravity. This was a concept that I thought made sense. It will be interesting to hear from the CDN providers whether I'm being sane here or not. It's the concept that if you have a certain mass you attract certain things. Massive things tend to suck other things in. Having a mass on the Internet attracts CDNs to you. If you have a small mass, they don't care, at all. No one is building a CDN node in your basement because you surf so much Facebook. You are relatively small and have a relatively small mass.
If you have a large mass, they fall over themselves to get close by. If you are a large exchange point or a large part of the Internet, CDN providers want to go there because it enhances their brand.
What can we do to make this idea of increasing your CDN gravity work? You need to make sure that you have a suitable place for CDN providers to build to, and then you really have to make yourself too large to ignore.
Let's look at some of these.
What do CDN providers need? They need demand in a single well defined location. The question I have been asked by a couple of them in the New Zealand context is
the following question: where is the one place in New Zealand I can go to pick up all the New Zealand demand for the content I hold? I don't want to go to four places and get some of it from here and some of it from there, I want to go to one place and get all the demand.
They need power, so you have to be able to provide it. They need access to the network, they need redundant access to international links to get the stuff into your country and out of your country, and at a half decent location.
In terms of New Zealand, let's look at how we measure up, then you guys can start thinking about how your countries might measure up, in terms of this as well.
In terms of a sensible place for CDNs to come to get New Zealand content, we have a couple of exchange points, the largest in terms of volume is the Auckland Peering Exchange, an Ethernet-based exchange initially built in the Sky Tower. Don't do this. It is the most stupid place ever to build an exchange in the history of mankind.
It doesn't come up very well on this slide, but this is the Sky Tower. It is massive, it's the tallest free-standing structure in the southern hemisphere. The little black band just below the observation deck is where the exchange was initially built. For bragging rights, there's none better. We have the tallest exchange for miles around. For getting access in and out and convincing people this was a sane place to go, it was stupid. So don't do that.
We have a little bit better now, and I will go into that a little bit later.
What do you get if you do peer? You get most tier 2 providers in the country, you get some tier 1 providers, you don't get the two major tier 1 providers, and this is the best case scenario, I'd say you miss out on 80 per cent of the demand for the country.
There is not a single place within New Zealand at the moment to get all of the New Zealand demand. That's a problem. How can we attract CDN providers to New Zealand if there's nowhere to attract them to? In terms of networking, the APE is not really near the Southern Cross cable landing stations. There are other cable providers who can bridge the gap, but having to employ more and more people is not necessarily the right way to go about things, so it would be good if we could solve that problem.
The other problem is it's not particularly near the Auckland data centres either. Again, we went for cool rather than functional, and it's a great view, but it's not necessarily a great view of the racks worth of data centres next door. Most data centres tend to long line back to the CBD where the Sky Tower is. I have a map here, so for all the map freaks, this is cool. I am a map freak.
The Auckland CBD is here, this huge cyan mass of points at the bottom. Each of the points is an exchange switch. I said we got it a little bit better than just the Sky Tower. The company that runs the exchange, they make the exchange VLAN available on any of their other switches in the CBD, so you can go to any of those points and be able to access the exchange. The problem is it's only in the CBD.
The blue points at the top, almost right up the top, are where all the data centres in Auckland are. What do you notice? Not the same place. There's also a big bridge in the middle, the Auckland Harbour Bridge is there, they are on the wrong side of the bridge.
The Southern Cross landing points, one of them is here in Whenuapai and the other one is snuggling under a blue dot in Takapuna, so they are not particularly near that either. Geographically, we have got some work to sort this out before the CDN providers even have a place where they would tick their boxes to come.
There are obviously some areas for improvement, but it is a work in progress. As a community, we are actively trying to fix this. Watch this space.
Is that all I need to do? If I tick the boxes, people will flood to my door and bring the content for me to be able to access quickly? Well, no. If only.
If you build it, they may not come. I heard over lunch a story about someone extending out an Internet exchange, and for whatever reason, no one came. You can't guarantee that just because you build it they will come. What I can guarantee is that if you don't build it, they definitely will not come.
If you don't have a place that is suitable for them to build out to, they won't.
The next thing, make yourself too large to ignore.
This is not just a case of signing up for lots and lots of Facebook accounts. You actually need real people with real demand, and the answer here may be cooperation.
What I want to start New Zealand having a bit of a think about is -- and it's the only Oceania regional network. Remember those numbers I had before, 4.4 million people, just a smudge ahead of the Palestinian territory. Let's look at where we come in if we do it on an Oceania basis, Australia, New Zealand and all the Pacific islands. We kick quite hard, 35 million people, we come in ahead of Canada. Canada has a little bit of a unique position in that they are right by the US, but I'm not letting it stop me, 35 million people and all their Facebook accounts, that's CDN gravity. There are a lot more links out. Thinking back to the submarine cable map, New Zealand only has two links out. New Zealand plus Australia plus every Pacific island, that has a lot more scale, a lot more carriers, so the markets work a lot better, there are a lot more exchange points, there is more sensible CDN gravity and we may be even able to get Netflix, and that would be awesome.
What does it mean, what are the takeaways? Measure where your CDNs are, make sure you know. We didn't really know before I started doing this on a country basis how well off or badly off we were. But the old adage, you can't manage what you can't measure really holds here. I would really like to be able to see better tools for being able to do this.
Measure where your neighbours' CDNs are and if they are better than yours, go and get some. If I have a country just across the ditch, the West Island of New Zealand -- which the locals like to call Australia -- if they have better CDNs than I do, then I want some.
I want to start peering more and more with them to make their content closer to me, none of this silly going off to the States for it.
Assess if you have the infrastructure that big CDNs need and if not work out how to work towards it. You may not be able to get there today. What they really need, what they care about more than anything else is the demand. You may not have that and there may be nothing you can do to get that, but that will come over time. Make sure everything else is ready so when you get the demand, you don't have to also find the network and the location and work out why you are on level 48 of the big Sky Tower.
Cooperate. If you are a small country, maybe you have small neighbours, but together you may be big. We saw a really good presentation the other day about the Greater Mekong Subregion and it started to resonate with me along these lines: if you are Myanmar or Laos or those smaller countries, do you really need to go it alone? Do you really need to define this in terms of just your country or do you do the trick of the Greater Mekong Subregion Area Network and look at it in terms of bringing CDNs to the region, rather than trying to get them into the one country and really reinforcing those links within the country.
Above all, do not assume that all content is US-based, it's just not.
I will pass back.
Tomoya Yoshida: Any questions or comments? I didn't know you had access to Japan also.
Emile Aben (RIPE NCC): I have just one comment. I'm also involved in the RIPE Atlas project. If there is anything you need from us that you don't already have, please come talk, because this sounds like a really interesting experiment and we are always wanting to support these experiments.
Dean Pemberton: Thank you.
Tom Paseka (CloudFlare): I'm curious, how long ago did you do the measurements?
Dean Pemberton: It was just before the Auckland Intech Conference, six to eight weeks ago.
Tom Paseka (CloudFlare): We launched in Sydney three or four weeks ago and I did a check and all of New Zealand is coming to Australia.
Dean Pemberton: Cool. Outstanding. What did I say, just as a matter of interest? CloudFlare, so that line four down, we were going off to the States and Singapore and Japan, now we are going to Sydney. That's cool.
Tomoya Yoshida: The last speaker is Randy Bush.
Randy Bush: I am wearing my researcher hat today. We were interested in looking at RPKI deployment. In other words, to deploy the RPKI we need to be able to get it from the servers at the RIRs and the NIRs and the big ISPs to the routers and the operators in the networks.
So we wanted to know about the propagation of the RPKI data within the relying party infrastructure.
I will get into what a relying party is in a second.
We wanted to know how sensitive it was to latency between caches, how sensitive is it to the timers in the caches and how often they fetch, and how much of the delay is propagation and how much is validation.
I am not going to completely answer all these questions today. This is ongoing research and I think, as the title should show, it's an early report. In fact, it's a very early report.
Just to remind ourselves, this is the publication hierarchy for the RPKI, there is the IANA and there's the RIRs and the NIRs and the ISPs and the poor little routers are way down at the bottom, and they want to get that data that's way up at the top.
How long does it take? That's what we're going to try to answer.
There are four major players in this game. The first are the publication points, APNIC, ARIN, et cetera. Then, if I have a large ISP structure, at the top of the ISP might be two or three gatherers that gather the data from up the top and feed it to caches throughout the ISP, one or two in every POP, so that the data are very local to the routers, and then finally there's the poor little router that wants the data. So we have the publication, we have gatherers, we have caches and we have routers.
Propagation is the time from when a certificate authority -- RIRs, et cetera -- publishes a certificate or ROA to when the relying party receives it. The relying parties in the game we're going to talk about today are the validating caches and the routers who receive it from the caches. We measure it by having all the players in the game log each object and timestamp when they received it. Then we can gather all those timestamps from all the objects and do major arithmetic, like subtraction, and we will have results.
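The subtraction described here can be sketched in a few lines of Python. This is an illustration only: the log shapes (a publish-time dictionary and a list of `(node, object, time)` records) and the function name are assumptions made for the example, not the format used in the experiment, and all timestamps are assumed to share one clock.

```python
from collections import defaultdict


def propagation_delays(publish_log, receive_log):
    """Return {object_id: {node: seconds from publication to receipt}}.

    publish_log: {object_id: publish_time}
    receive_log: iterable of (node, object_id, receive_time)
    Times are seconds on a shared clock (an assumption of this sketch).
    """
    delays = defaultdict(dict)
    for node, obj, received in receive_log:
        if obj not in publish_log:
            continue  # no publication record for this object; skip it
        delay = received - publish_log[obj]
        # keep the earliest receipt if a node logs the same object twice
        if node not in delays[obj] or delay < delays[obj][node]:
            delays[obj][node] = delay
    return dict(delays)
```

Keeping only the earliest receipt per node reflects the point that a relying party may see the same object by more than one path.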
In the experimental architecture I'm going to show you, we don't really care about routers and routing in BGP, because they are not part of the measurement that we're interested in. What we use instead is a pseudo-router, a fake router, that is a client of the RPKI protocol and just logs everything it sees. It doesn't have a routing table, we are not interested in all that. We have separate measurements for that kind of thing.
The caches also log when they receive objects, and we have a dirty trick when we want to induce delay, when we want to create fake delay in our experiment, we actually send the data -- we insert a router into the model, the routers are all in Texas and the servers are all in Japan. So we are introducing an enormous delay whenever we want to, and we will measure with delay and without delay.
The caches sync the data from their parents or from the gatherers. Each cache has a root trust anchor and validates all the data. That's the basic rule.
This is one slice of the model we are going to run.
So there is the root publication, the RIRs and so forth, IANA, here is a tier 1 provider and they have some gatherers and they feed a bunch of tier 2s and tier 2s feed tier 3s, et cetera, on down.
Let me do a little detail on this to give you an idea of the scale. The model I'm going to show you, we have three tier 1 providers, each one has three gatherers, each one has six tier 2 providers and they have two gatherers, each has 20 tier 3 providers and of those 12 have gatherers and 8 trust their parents. That ends up with numbers kind of like this, for how many there are and how many gatherers there are and how many caches there are and how many certification authorities there are, and it ends up to be a lot of them, about 1.5 lots.
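To give a feel for how those per-tier figures multiply out, here is a small Python sketch. It reads "each one has six tier 2 providers" as per tier 1 and "each has 20 tier 3 providers" as per tier 2, and it assumes one gatherer for each tier 3 provider that runs its own; the talk does not give that last number, so the totals are indicative only and do not attempt to count caches or certification authorities.

```python
def model_scale(tier1=3, t1_gatherers=3,
                tier2_per_tier1=6, t2_gatherers=2,
                tier3_per_tier2=20, tier3_with_gatherers=12,
                t3_gatherers=1):
    """Count ISPs and gatherers in the tiered model.

    t3_gatherers (gatherers per tier 3 ISP that runs its own) is an
    assumption of this sketch; the talk does not state it.
    """
    n_tier2 = tier1 * tier2_per_tier1
    n_tier3 = n_tier2 * tier3_per_tier2
    gatherers = (tier1 * t1_gatherers
                 + n_tier2 * t2_gatherers
                 + n_tier2 * tier3_with_gatherers * t3_gatherers)
    return {
        "tier1": tier1,
        "tier2": n_tier2,
        "tier3": n_tier3,
        "isps": tier1 + n_tier2 + n_tier3,
        "gatherers": gatherers,
        "tier3_trusting_parent": n_tier2 * (tier3_per_tier2 - tier3_with_gatherers),
    }
```

Under those assumptions the model has 18 tier 2 and 360 tier 3 providers, of which 144 tier 3 providers rely on their parent's gatherers.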
How do you deploy a testbed of about 1,000 machines? That's how. In JIST, the Japan Institute of Science and Technology, up north from Tokyo, there's a large cluster of 1,000 machines and we used 50 of those machines and deployed 20 KVMs on each one, the machines have 12 processors and so much memory and all that. It's a big cluster and we just used a small part of it. This experiment is ongoing.
So you don't configure 1,000 servers by hand. Maybe some people do, I don't. What we used is a special tool called AutoNetKit. What you really do is you draw this on your Macintosh, using a free program called yEd and there is the top tier and the gatherers and the routers, and you draw this. Yes, I'm really serious. AutoNetKit reads the GraphML file produced by yEd and generates and deploys the servers for the RPKI, for the caches, the pseudo-routers, the real routers, et cetera, and deploys them on StarBed and in Dallas where we have the pseudo-routers.
So we draw pretty pictures and it generates it and blows it down to the servers. No dirty hands, no ink on my fingers.
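For readers unfamiliar with GraphML, the following Python sketch shows how the nodes and links of a drawn topology can be pulled out of such a file using only the standard library. It is not AutoNetKit's code: the function name and the tiny sample document are invented for illustration, and real yEd output carries additional layout and label data that this sketch ignores.

```python
import xml.etree.ElementTree as ET

# Standard GraphML XML namespace
GRAPHML_NS = {"g": "http://graphml.graphdrawing.org/xmlns"}


def read_topology(graphml_text):
    """Return (node_ids, edges) from a GraphML document string."""
    root = ET.fromstring(graphml_text)
    nodes = [n.get("id") for n in root.iterfind(".//g:node", GRAPHML_NS)]
    edges = [(e.get("source"), e.get("target"))
             for e in root.iterfind(".//g:edge", GRAPHML_NS)]
    return nodes, edges


# Hypothetical three-node slice: publication point, gatherer, cache
SAMPLE = """<graphml xmlns="http://graphml.graphdrawing.org/xmlns">
  <graph id="G" edgedefault="undirected">
    <node id="rir"/>
    <node id="gatherer1"/>
    <node id="cache1"/>
    <edge source="rir" target="gatherer1"/>
    <edge source="gatherer1" target="cache1"/>
  </graph>
</graphml>
"""
```

A generator tool would then walk the returned node and edge lists to emit per-host configuration, which is the step the talk attributes to AutoNetKit.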
So AutoNetKit is really a cool thing, it originally came from Roma Tre University, people like Andrea Cecchetti, Lorenzo Colitti, who we all know, Stefano, et cetera.
That was NetKit and it configured a router. Then it migrated to Australia, to Adelaide University, Matt Roughan and one of his grad students, Simon Knight.
They did AutoNetKit and it can do multiple routers.
Then a gang at Loughborough University in the UK, Iain, Debbie and Olaf, migrated it, so it knows about the RPKI and servers and caches and you can add new services, like if you want it to model DNS or something like that. They extended it significantly at Loughborough. They redid address assignment, it understands the RPKI and it creates things, so you can tell it, hey, this tier 1 ISP decides to add 1,000 prefixes, et cetera.
To induce the delays, we send the packets from StarBed in Japan to Dallas, Texas, where we have something called Junosphere, which is 75 virtual Junos routers. What really happens is there is an Open vSwitch in each of the configurations, one in StarBed and one in Dallas, so they share a VLAN tunnel and we can tunnel as many VLANs over that as we want.
We induce delay by putting a router in the picture.
This is not a measurement router, it just says, hey, for cache 1 to get to cache 2 it has to go through these two routers. Since both of these are in Dallas, there's not much additional latency induced by this link, but that link has a round trip time of hundreds of milliseconds.
As I said, we have Open vSwitch, and we tunnel the Open vSwitch through GRE, so anybody can reach anybody else. The configuration is generated, it generates the Open vSwitch bridging, everything with AutoNetKit.
As I said, if you have to go through a router and it's delayed, if there are multiple router hops, it really stays in Dallas, so it's not a lot more delay.
I have a challenge. When we wanted to create the objects for this project, the certificates and the ROAs, we said, wow, it would be really nice to use the real routing table. We can easily get a BGP dump from any router, so given that BGP dump, how do we generate the certificate hierarchy which would have created this if the RPKI was fully deployed? This turns out to be a very hard problem. So I'm offering anybody a 2 or 3 star Michelin dinner for the code to do this. There are some very smart people working on it and they have no results yet. But we would love to have a result. It would be worth the dinner easily. Of course, I get to have the dinner with you!
For the initial little test, some of the numbers I show you are doing it with something 10 times this size, we started out with 1,500 ROAs, we allocated them, since we couldn't do the bottom up from the routing data, we just made some guesses and we did kind of a Pareto distribution, which we all hate on the Internet but keep using.
So every RIR had about 250 of them, for every ISP who used their web interface, what they call hosted certification, they had 45 prefixes for a tier 1 ISP, and during the run an equal number were created dynamically, about one a second -- something like that -- using the same distribution across the tiers.
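One way to picture a Pareto-style allocation is the Python sketch below, which splits a fixed number of objects across holders using heavy-tailed random weights. The shape parameter, the seed, the holder count and the function name are all assumptions chosen for illustration; the talk does not say how the experiment's allocation was actually generated.

```python
import random


def pareto_allocation(total, holders, alpha=1.2, seed=0):
    """Split `total` objects across `holders` with heavy-tailed weights.

    alpha and seed are illustrative assumptions, not values from the talk.
    """
    rng = random.Random(seed)
    weights = [rng.paretovariate(alpha) for _ in range(holders)]
    scale = total / sum(weights)
    counts = [int(w * scale) for w in weights]
    # hand the rounding remainder to the largest holders first
    remainder = total - sum(counts)
    by_weight = sorted(range(holders), key=lambda i: weights[i], reverse=True)
    for i in by_weight[:remainder]:
        counts[i] += 1
    return counts
```

For example, `pareto_allocation(1500, 30)` returns 30 counts that sum to 1,500, with a few holders taking a large share and most taking little.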
Here is a two tier-1 model. There are two tier 1s here. This one, you see all the little black dots, they have routers in them, to induce delay. This tier 1 does not have routers in it, so no delay is induced. These guys should be much faster, these guys much slower. In fact, we found it made no difference, the whole thing is latency insensitive.
We ran a three-tier model, it took about an hour to compile and upload to StarBed. It was about 150 megabytes for the small model. We ran it at a 1:1 time ratio, so we ran it for a full day. It produced between 1 and 3 GB of log files, depending on which model we ran, and we have to bring them down to a compute server and the analysis takes about 42 minutes. That's a joke.
Some of it we are still analysing two weeks later. We have a whole large run, a two-week run scheduled just before RIPE so we can get much better results and much better data. As I said, this is an early report. This was just the first trial run.
Getting from the publication points, the RIR, et cetera, to the first gatherer, the gatherers were running -- we set their timers to go once an hour. So, no surprise, we see a mean, 50 per cent, of 30 minutes.
The distribution is dead flat. Couldn't be simpler, couldn't be stupider. No surprise.
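The flat distribution with a 30-minute mean follows directly from polling on a fixed timer: an object published at a random moment waits anywhere from zero to one full interval for the next poll. The Python sketch below simulates that, assuming publication times are uniformly random over a day and fetches are instantaneous; both are simplifications for illustration, and the function names are invented.

```python
import random
import statistics


def poll_wait_minutes(publish_time, interval=60.0, phase=0.0):
    """Minutes from publication until the next poll at phase + k*interval."""
    since_last_poll = (publish_time - phase) % interval
    return (interval - since_last_poll) % interval


def simulate(n=100_000, interval=60.0, seed=1):
    """Waits for n objects published at uniformly random times in one day."""
    rng = random.Random(seed)
    day = 24 * 60.0
    return [poll_wait_minutes(rng.uniform(0, day), interval) for _ in range(n)]
```

Taking `statistics.mean(simulate())` gives a value close to 30 minutes, with waits spread evenly between 0 and 60.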
This one is fun. This is from the gatherers all the way down to the routers, and what's fun about this is there are points to the left of zero. Why are there points to the left of zero? Because for any particular router it could have gotten it from multiple caches, which could have gotten it from different gatherers, so these are the ones that arrive sooner from another gatherer than the mean.
So we have this distribution, and these are level 1s, the ones right near the gatherers, these gather from those, these gather from those, they are all fairly well clustered, fairly close, fairly steep.
Everybody is taking -- caches are firing off every 10 minutes in this particular experiment. So, boom, you've got the fast one, then the delays, and it goes out to almost, as you'd expect, since we have three levels -- level 1 to 2, level 2 to 3 and level 3 to 4, 30 minutes is the max.
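The 30-minute ceiling for three chained levels can be checked with a short Python sketch. It assumes each level polls on its own fixed 10-minute timer with an arbitrary phase and that fetching takes no time; the function names and phases are illustrative, not taken from the experiment.

```python
import random


def next_poll(t, interval, phase):
    """First poll time at or after t for a timer firing at phase + k*interval."""
    k = -(-(t - phase) // interval)  # ceiling division
    return phase + k * interval


def chain_delay(arrival, phases, interval=10.0):
    """Minutes for an object arriving at `arrival` to pass each level in turn."""
    t = arrival
    for phase in phases:
        t = next_poll(t, interval, phase)
    return t - arrival
```

For instance, an object arriving at minute 3 with level phases of 0, 5 and 2 minutes is picked up at minutes 10, 15 and 22, a total delay of 19 minutes. Each level adds less than one interval, so three levels can never exceed 30.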
From the publication points all the way down to the routers, we see just about what we would expect. Here is where the data go bad. So don't take the next foil seriously. We misconfigured, so the hierarchic organization of the publication points versus the flat organization of the publication points look very similar. They shouldn't and will be better on the next run.
This is even worse. We are experimenting with protocols other than rsync because we are interested in seeing what kind of distribution mechanisms would be interesting, so we tried BitTorrent. These measurements are ridiculous and show we don't know how to configure BitTorrent very well.
Don't believe the last two slides, we will have much better results the next time you see the program.
I believe that's about all we have. I want to thank the StarBed folk and Juniper and Cisco and people who give us money and people who did the work in Adelaide and of course the folk from Loughborough and Purdue.
And that's it. Are there any questions?
Martin Levy (Hurricane Electric): Routing is distributed by BGP, we have a pretty good handle on that and the RPKI data is distributed, as you said, by rsync with the present code, you are playing around with BitTorrent, we could replace that with something else, that's independent. But it seems to have an effect on the graphs, the one hour poll time, for example show --
Randy Bush: All driven by the poll time.
Martin Levy (Hurricane Electric): So, scratching my head, if I'm a new network and I throw up a new network, I go to my RIR and put my RPKI information in and I'm ready to go -- oh, no, is the new rule on the Internet I have to wait an hour before -- it's like going swimming after you eat.
The question is: we have the separation between distribution of RPKI information, take it up a level, and this is the question, we have a separation between distribution of RPKI information and distribution of routing information. Are you giving us a clue that maybe those two should be a little closer to each other versus --
Randy Bush: You want to slow down BGP?
Martin Levy (Hurricane Electric): It's a leading question to that answer.
Randy Bush: The one hour, in this case, was chosen fairly arbitrarily. There is consideration that if 1,000 ISPs are hitting the publication hierarchy, that's a little rude, to go so much faster. That's why we are looking at other distribution mechanisms, such as BitTorrent. But I think APNIC has stated their policy, and they may have changed it recently, I could be wrong -- that when you enter their data they commit to publishing it within 24 hours. So we didn't feel too badly about the one hour modelling.
The point is if I plan on doing something in the network two weeks from now that will have me announcing my prefix from a different AS, I'm going to publish that ROA now. I'm not going to wait. I'm not going to do it after the BGP announcement.
Martin Levy (Hurricane Electric): I think the question is not whether Randy Bush does that but the many other users of the routing system globally.
Randy Bush: They decide to wait two days before they plug their router into the mains current? That's their decision.
Martin Levy (Hurricane Electric): We have examples of this in many different forms of life in many different ways.
I'm just wondering whether we have introduced this new paradigm, this new issue, and, for want of a better saying, there's an education needed or more importantly maybe we have to be very careful about sample code that goes out, to make sure people know that life is not as simple as it is.
Randy Bush: I agree with you 200 per cent, but the education problem is much worse than even that. I go around and try to explain the RPKI to people, and I realise that I need a whole presentation on public key cryptography, that I haven't got. This roll-out is going to require serious education, you are correct, and that's one component of it.
Martin Levy (Hurricane Electric): OK. My point. Thank you.
Tomoya Yoshida: Any other questions?
Richard Barnes (BBN): Why BitTorrent?
Randy Bush: Because we wanted something that was a different paradigm. We wanted something that was a totally different paradigm. We think talking about rsync versus HTTP versus any other flat fetch is going to be about the same, it's all garbage, it's all the same flavour. We wanted something that was a different flavour, a common different flavour for which we could get a lot of open source implementation, something that was fairly well defined, and BitTorrent was our first experiment.
If you have got some other suggestions that are interesting, glad to hear them, glad to play with it.
This is research. This is not my ops hat, this is my research hat. So we gladly play with other things.
Tomoya Yoshida: I have some information, tomorrow we have the RPKI BoF from 5.00 pm, so we have another update for that, so please, if you would like to know the topics for RPKI, please join the RPKI BoF tomorrow.
It is time to close. Again, thank you very much to the three speakers.
Sunny Chendi: Good morning, everyone, welcome to day 4 of APNIC Conference here in Phnom Penh, Cambodia. Today's first session is APOPs 3. Before that, we started issuing the ballot papers at 8:30 am this morning outside ballroom 1, for the NRO NC elections that will happen in the Policy SIG after the tea break.
If you wish to collect, you can go to the voting desk and collect the ballot papers. Those who are registered for the Conference are eligible to take part in the elections. If you have a nametag, you can show the nametag and collect your ballot papers.
Also, the voting will start at 11 o'clock and closes at 2.00 pm, so you have 11.00 to 2.00 pm duration to cast your vote.
We received a statement from MekongNet in dispute of a presentation that was presented in APOPs 1. They
requested the Chair of the APOPs for this morning to read it out. So I would like to request Maz to read out the statement as it is, with no modifications or anything, please. Thank you.
Matsuzaki Yoshinobu: We received a statement about a presentation named "Internet in Cambodia" during APOPs 1 on 27 August 2012.
Sunny Chendi: The Chair passed the statement to me to read out. This is dated on 29 August 2012: "Re: Clarification statement.
"Ref: Presentation titled 'Internet in Cambodia' during APOPs 1 on 27th August 2012.
"In an earlier presentation titled 'Internet in Cambodia' in APOPs 1 session, regrettably there were some errors on the information about MekongNet which we would need to address and elucidate.
"First of all, there are neutral data centre services in Cambodia. MekongNet, for example, has already provided neutral co-location services to other ISPs and this is, as we knew of, on top of at least two other companies that are currently offering the similar solutions now.
"More importantly, regardless whatever insinuations from the presentation or other sources, the ISP/IXP licence awarded to MekongNet clearly stated that
MekongNet is fully entitled to operate DIX and IIG services in Cambodia.
"It was also wrongly stated in the presentation slide that AngkorNet is also known as MekongNet. These are essentially two different brands with different focuses and certainly they are not to be labelled/named alternatively.
"Last but not least, MekongNet also own thousand kilometres of fibre cables across Cambodia and international transmission network with neighbouring countries, Thailand, Laos and Viet Nam, as an ISP focuses on corporate and wholesale customers, which are ready to offer IPv6 to customers upon request. The updated reliable peering records of MekongNet and in fact the entire Cambodia ISP records can be accessed by general public from the renowned website HE.NET at http://bgp.he.net/country/KH, rather than some unofficial visualising AS path sites. Thank you." It is from MekongNet management.
Matsuzaki Yoshinobu: Thank you, Sunny. Let's get started with APOPs 3 session.
This is the third APOPs and the last one. Today we have three speakers, so you have 30 minutes each, including Q&A. First up is Geoff, talking about the BGP report, please.
Geoff Huston: Good morning. My name is Geoff Huston and I'm with APNIC. This is a progress report on BGP.
Looking around the room, I'm of the view that most of you have seen a fair deal of this material before. So if there is some email you should be getting on with, get on with it. Meanwhile, I'll just drone on here on the stage as quickly as possible.
What I'll talk about this morning is actually the BGP routing protocol and looking at this from the external BGP world, in other words the routing table over time. I will do some projections for future growth.
If you own routing equipment and you are going to buy a new router, the real question is how big should you buy it if you want it to last for, say, the next five years? What are the growth projections in routing and how much CPU power do you need? What kind of update rates can we expect over the coming years and the pressures on that? This is the really big picture. The scale goes from zero to just under half a million entries and the timescale goes from way back in the dim dark mists of time of 1988, all the way to yesterday. This is a picture of the size of the routing table, taken approximately every day for a few years and then every
Oddly enough, what you see is a mixture of both technology and economics. That early part of the growth of the Internet, from really 1985 or earlier, up through until -- well, the great boom and bust of the year 2000, is one of those first phases of growth of the Internet, where we were having those exponential up and to the right, and across the period from 1998 until 2001, which some of you may have lived through, was this sort of Internet to the masses, where all of a sudden, instead of just a few geeks using it, we started to use large-scale deployments and headed into the Internet as a consumer commodity.
Like everything that becomes a boom, the bust is inevitable, and the bust happened across 2001 and 2002, money got pulled out, grand plans were rewritten, the Internet stopped growing for about a year. But we are just evolved apes, optimism is eternal, sooner or later we get to the bullish state of mind and say, "We can do this." From 2002 until 2007, this was just, let's roll stuff out, and we did all over the world and the Internet BGP table grew from 100,000 entries to 250,000 across that period, with a lot of broadband based things.
The growth curve kind of went funny in 2007,
corresponding to Lehman Brothers going funny as well and Goldman Sachs was busily cleaning up all the residual bits of money and what happened was a lot of those growth trajectories got shortened out. Between 2007 and 2009, there was a certain amount of correction and a new dynamic started to happen. The old dynamic was an exponential, the new dynamic from 2009 is roughly linear, which reflects a more sober view of the Internet's growth.
Address exhaustion happened somewhere in between there, starting in 2011, so let's blow up the last few months from January 2011 until now. You see the point at which APNIC runs out, just that point in April 2011, the growth curve of BGP goes down a bit. The dynamic, though, is actually still approximately linear. So this idea that the Internet grows at a frightening pace and that everything is exponential may not necessarily be the case in terms of routing.
We seem to be roughly doing around 5,000 prefixes per month.
Let's look at another view of this. Let's not look at the number of entries in the routing table but the amount of address space being routed. Address exhaustion is now clearly evident. What's going on when you look at it is that around exhaustion we were
actually advertising around 145 /8s in the routing table and things were growing very, very quickly. After APNIC ran out, there was still a certain amount of growth, up at around 155 /8s being advertised, but the growth curve is a lot, lot lower. The strange spikes, roughly me and a few others, we were testing dark /8s and advertising them, and you can see the result.
How did the population of the routing space look in terms of autonomous system numbers? This is weird because I can't see evidence of address run-out or a hiatus in the last 18 months or so; it is almost constant. This industry doesn't even take a break for Christmas, it just pounds out new ASs into the routing table every single day. I'm not sure why we are that good and that orderly at the job, but that's what happens.
Overall, how much are we growing? Do you remember the early days when we were doubling every year and Internet growth was 100 per cent or 200 per cent? That's all history. These days the growth is more moderate. World population grows between 2 and 3 per cent, the Internet population grows at around 14 per cent a year at the moment, so it is still growing faster than population, but not by much. The prefix count is right up there at 14 per cent. The things that
relate to addresses, all because we have run out a bit, the address growth is a lot slower. It was around 14 per cent. It's been like that for a while, not much to say.
How about IPv6? We have been told it's very important, it's the future. Wow, it's weird. Firstly, the numbers are so small -- 4,000, 5,000, up to 10,000.
This is a very small bunch of folk, this is not half a million. Secondly, the two World IPv6 Day thingies are strongly evident. So in 2011, the growth up to World IPv6 Day is clearly there. Even this year, the World IPv6 Launch Day, you see the growth up to it, and then after it, everyone goes, "Oh, well, we did that," and then stops doing much. And both of those are pretty evident in the data.
IPv6 addresses are weird. It's like counting elephants and mice, the elephants dominate. When folk announce and withdraw /20s and so on, clearly they disturb the overall picture, so it's hard to see what is going on in address space.
If you look at autonomous systems, we see a different view because it's now not the growth, it's the enrolment of existing players into IPv6. As you see, World IPv6 Day, even in the days leading up to World IPv6 Day, we saw around 200 folk join in v6 just in the
few days before the event.
Equally, this year, we see around 100 or so in the leading week or so before World IPv6 Launch Day, so it's more clear in the AS numbers.
The statistics are sort of all over the place. The prefix count grew by about 48 per cent per year in 2012, interestingly, the "more specifics", the folk who are de-aggregating are far more enthusiastic this year. So that was a clear growth of 1,500 or around 130 per cent, so the de-aggregators have figured out what works in v4 works in v6 and off we go.
Growth rate overall, around about 50 per cent. If we keep on doing this and looking at the ASs, the v6 network will be about the same size as the v4 network in terms of autonomous system numbers, in about six years from now, if we get to live that long.
What does the future look like for the BGP routing table? I'm going to use the awesome power of mathematics and try to figure out what it looks like in the future by using polynomial based projections over historic data, if you are interested. However, like all forms of prediction, it's probably complete bullshit and the reason is that the environment we are in now is not the environment we have ever experienced in the past.
The address run-out events, APNIC last year, RIPE any
time soon and ARIN certainly some time in 2013, make these projections pretty rubbery, so highly speculative.
Let's take the last eight years, 2004 to now and draw a table, let's smooth out the table, let's do the first order differential -- ask your children, if you don't understand what a first order differential is.
Fascinatingly, the one thing that really stands out like a sore thumb is the global financial crisis, Lehman Brothers in action. The trajectory of growth, which from 2004 to 2008 was going from 50 new entries a day up to 120 kind of reset -- kaboing -- back down to 50 a day. The overall trend is then back again, so the second order differential across those two periods is about the same, which is kind of weird.
That's the daily growth rates. Can we do a projection? Yes, we can do a projection. Remember Newton and the least squares method? Let's apply that.
We get that equation, isn't it cool? We project it forward and it goes like that, and we get this table.
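The steps just described (take the daily growth as a first-order difference, then fit a curve by least squares and project it forward) can be sketched as follows. This is a minimal illustration, not the speaker's actual model: the quadratic form, the function names and the sample series are all assumptions, and the series is synthetic rather than measured BGP data.

```python
def daily_growth(sizes):
    """First-order difference: entries added between consecutive samples."""
    return [later - earlier for earlier, later in zip(sizes, sizes[1:])]


def fit_quadratic(xs, ys):
    """Least-squares fit of y = a + b*x + c*x^2 via the normal equations."""
    # Sums of powers of x and of y*x^k make up the 3x3 normal-equation system.
    s = [sum(x ** k for x in xs) for k in range(5)]
    t = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]
    m = [[s[i + j] for j in range(3)] + [t[i]] for i in range(3)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [u - f * v for u, v in zip(m[r], m[col])]
    return [m[i][3] / m[i][i] for i in range(3)]


def project(coef, x):
    """Evaluate the fitted curve at a (possibly future) point x."""
    a, b, c = coef
    return a + b * x + c * x * x


# Synthetic placeholder series (NOT measured BGP data): 100 + 10x + x^2.
days = list(range(9))
sizes = [100 + 10 * x + x * x for x in days]
coef = fit_quadratic(days, sizes)
print(daily_growth(sizes))
print(project(coef, 12))
```

With real data, `sizes` would be the smoothed table size per sample, and the projection is only as good as the assumption that the future resembles the fitted period.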
Magic, isn't it? What does it tell us? If you've got a FIB and you are running eBGP and you can handle a couple of million entries, read your mail; there's nothing exciting going on here. On the other hand, if your FIB has only half a million entries, you're toast, so maybe you should do
something about it pretty soon.
The projections are not exciting or dramatically uncertain. It seems to be pretty conservative, but it's not dramatic news.
What about v6? Here is that table for v6 now going back to 2007, because before that the numbers were just rubbish. Again, same old technique, daily growth rate, wow, don't the v6 days make a difference? It may have seemed like some sort of pantomime show about "Isn't v6 wonderful", but oddly enough the industry seems to listen to these things and takes them seriously. In terms of the growth rate in the routing table, these events are dramatic peaks in growth. It worked. This is cool.
Let's do the same thing, bring in some maths, the awesome power of exponentials has been deployed as well, because I wouldn't have a clue what the growth model looks like, and out come these numbers. They are growing like crazy, this is cool, but the numbers are tiny, 28,000 entries, ho hum, boring. They are up and to the right but not exactly exciting.
In technology terms, what makes a curve exciting, what makes it frightening? What makes it frightening is if the costs grow. What makes the costs grow or not grow is pretty much this basic technology equation of
Moore's Law that, as a rough rule of thumb, if you have got a technology thing that's growing at the same rate or less than Moore's Law, the unit cost is either steady or coming down. That's cool.
There are a lot of exceptions to that but frankly it has been a consistent thing in this industry since even the early 1960s. Here is something from Wikipedia that dates back to 1971 and, damn it, it kind of worked: the number of transistor counts on a chip has doubled approximately every two years since that time.
It looks like continuing, it's truly prodigious.
Let's take this curve and apply it to the routing table.
Firstly, let's look at v4. The thin line heading way up there is Moore's Law applied to the curve, and the science projection is well under. If you think routing will become a really expensive problem and you desperately need things like LISP to make things work again, you are wrong, that's a myth. Routing is not growing at that kind of rate, nothing like it.
Even in v6 the growth rate is still roughly within the parameters of Moore's Law, both the exponential and polynomial curves don't really differ that much, but they are tiny numbers, 20,000, 30,000, 40,000 and the current technology level for FIBs is around the small number of millions, 1 to 10 million, so although it's
growing it's not at anywhere near a rate that will alter the unit costs of routing.
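The comparison with Moore's Law can be sketched numerically. This is a simplified illustration under stated assumptions: it uses the two-year doubling period and the roughly 5,000 prefixes per month quoted earlier in the talk, a starting table of 400,000 entries, and a plain linear model instead of the polynomial projection shown on the slides. The function names are illustrative.

```python
def moores_law(base, years, doubling_period=2.0):
    """Size after `years` if it doubled every `doubling_period` years."""
    return base * 2 ** (years / doubling_period)


def linear_growth(base, years, per_month):
    """Size after `years` with a constant number of new entries per month."""
    return base + per_month * 12 * years


start = 400_000  # approximate IPv4 table size mentioned in the talk
for years in (1, 3, 5):
    table = linear_growth(start, years, per_month=5_000)
    moore = moores_law(start, years)
    print(f"{years} yr: table {table:,.0f} vs Moore's Law {moore:,.0f}")
```

As long as the table line stays under the Moore's Law line, the argument in the talk is that the unit cost of holding the table should be steady or falling.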
I don't understand why anyone thinks routing is a problem. The overall growth rates are modest, nothing much is happening. As long as the current BGP keeps working, there is no reason to think it won't keep working. Do you need to panic, do you need to run LISP tomorrow, go into alternative routing technologies? You can do that if you have nothing else to do on a Tuesday afternoon, but if you think you need to do it because you have got to save routing, nah, it's not happening.
These are more figures which are interesting. And I like this one, because the net we are building is not the net we see. The net we see grew by 109 million addresses in the previous 12 months. The routing table we see grew by 120 million addresses. The ISC host survey, which is an approximation, says we grew at around 90 million visible ping-able style of hosts. So you would think from those numbers the Internet is kind of growing at kind of 100 million a year, kind of. It's not 1 million, it's not 1 billion, it's about 100 million.
Then you go and look at the vendors. Apple sold 100 million of these little black things in the last 12
months. You love them. You love them in extraordinary quantity. Apple have 24 per cent market share, so the entire market grew by around 400 million just on these things. Who has the iPad, who has the tablets, the Kindles? They all add.
So we can say roughly the number of devices that have things that connect to the Internet, whether by SIM card or jack in the wall or whatever, grew by around 600 million.
So the NATed net grew at a rate of around six times the visible net, as far as we can see. NATing is the Internet, no matter how you look at it. I'll leave you with that thought and you can figure out what it means, because I'm going back to routing.
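The back-of-envelope arithmetic above can be checked in a few lines. The figures are the ones quoted in the talk; the variable names are illustrative, and the 600 million device total is the speaker's rounded estimate, not something derived here.

```python
apple_units = 100e6       # devices Apple sold in 12 months (as quoted)
apple_share = 0.24        # Apple's quoted market share

# Implied size of the whole market for these devices.
market_units = apple_units / apple_share

visible_growth = 100e6    # rough growth of the visible, routed net per year
device_growth = 600e6     # speaker's estimate for all connected devices

# How many times faster the device count grew than the visible net.
nat_ratio = device_growth / visible_growth

print(f"implied market: {market_units / 1e6:.0f} million units")
print(f"devices vs visible net: {nat_ratio:.0f}x")
```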
People say aggregation is a really good thing and you should aggregate like crazy. Is anybody listening when we say that or are you just reading your email? Are we really getting better or worse at aggregation? What's going on? Here is a list of the folk who don't aggregate, Bell South in America, Net Servicios de Communicacao SA are in South America, Telkomnet AS2 PT Telekomunikasi Indonesia -- hello. 2,342 nets advertised. If I were to aggregate them using a relatively simple aggregation algorithm, I would be down to 477. You would be too.
If you can find yourself there, KISX AS Korea, TW Telecom Holdings, and so on. You can read this as well as I can. Look at what you're doing. Frankly, if you do clean up your act a bit, the rest of us will indeed thank you. It's not an entirely pointless exercise, it's worth doing.
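A minimal sketch of what a "relatively simple aggregation algorithm" could look like, using `collapse_addresses` from Python's standard `ipaddress` module. This is an assumption about the general idea, not the algorithm behind the report: it merges purely on address adjacency, whereas a real aggregation check would also need to confirm that the merged routes share the same origin AS and path. The prefixes are illustrative.

```python
import ipaddress


def aggregate(prefixes):
    """Merge adjacent and overlapping prefixes into the fewest covering ones."""
    nets = [ipaddress.ip_network(p) for p in prefixes]
    return list(ipaddress.collapse_addresses(nets))


advertised = [
    "10.0.0.0/24",
    "10.0.1.0/24",
    "10.0.2.0/23",
    "192.0.2.0/25",
    "192.0.2.128/25",
    "198.51.100.0/24",
]
aggregated = aggregate(advertised)
print(len(advertised), "->", len(aggregated))
for net in aggregated:
    print(net)
```

The ratio of advertised to aggregated prefixes is the kind of figure quoted above for each network.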
Interestingly, the entire bigger picture has been rather bizarre. For the last 10 years, the number of specifics in the routing table is half of the routing table. Somehow, these days all 45,000 of us seem to aggregate at the same level as if there's some controlling factor. So 50 per cent of the routing table. Even when you take a very big route collection, like route views and look at 30, 40, 60 peers, every single peer has the same view, around half the table is more specific. Amazing.
The only thing that has changed is that the amount of address space that is being sliced and diced as a percentage of the whole is slowly growing. 10 years ago, around 5 per cent of the address space was more specifics, everything else was an aggregate. These days, around 25 per cent of the address space is chopped up into more specifics. Interesting.
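The "half the table is more specifics" measure can be sketched as the share of prefixes that sit inside another, shorter prefix in the same table. This is a hedged illustration: the function name and sample table are assumptions, the quadratic scan is only suitable for small inputs, and it handles a single address family (IPv4 here).

```python
import ipaddress


def more_specific_share(prefixes):
    """Fraction of prefixes covered by a different, less specific prefix."""
    nets = [ipaddress.ip_network(p) for p in prefixes]
    covered = sum(
        1
        for net in nets
        if any(other != net and net.subnet_of(other) for other in nets)
    )
    return covered / len(nets)


table = ["10.0.0.0/8", "10.1.0.0/16", "10.1.1.0/24", "192.0.2.0/24"]
share = more_specific_share(table)
print(f"{share:.0%} of the table is more specifics")
```

A full-size routing table would need a prefix trie rather than the pairwise comparison used here.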
Is it the same people? Is learning happening?
I like this graph because, A, it's really noisy; B, it's got lots of colours; it doesn't move, so it's not that exciting, but what's amazing about this is that if you look at individual lines, some folk are down to the right, they are cleaning up their act. They were largely disaggregated and they have been slowly getting better over time.
Other folk are up and to the right, ignorance is being passed over from one to the other. The amount of aggregation clue in the Internet appears to be constant and as some folk learn, other folk un-learn. Those are the numbers, you can figure it out. There are more reports on cyberreport.net, if you want to look for yourself, but it is interesting that learning does and does not happen all at the same time.
Does it matter? This is interesting because when events happen, when you change over from one provider to another when you multihome, when you have got a web service that is going through two providers and you are looking for 24x7 reliability, when you change your routing, how long does it take for the world to work out your new route? Is routing scaling? This is a different view, this is not looking at the size of the table but the number of messages, how many BGP updates per day. Since 2007 the network grew from
100,000 to 400,000 entries. Any decent law of physics would say, noise is proportionate to size, the number of updates would have grown by 4. They haven't, they have been flat, flat, flat, flat, flat, flat. Weird.
It's almost as if you are driving to work every day, every day it takes 10 minutes to drive to work. Over that period, the number of other cars on the road grows by a factor of 10, but it still only takes 10 minutes to drive to work. Why? How are we managing to cram so much more into the routing table yet still keep the dynamic levels totally flat? I tried to see this. How many instability events per day? 50,000, has been for years. How long does it take to converge? About 60 seconds, has been for years, in fact it's getting slightly better. This is weird.
Even when I look at the number of updates to converge, it's taking fewer. Routing is getting better, not worse.
So your trip to work, even though there are more cars, is actually taking you less time. It's the same car, the same protocol, it's the same engine, nothing is changing but the world is improving. This is prodigious magic. Why is it prodigious magic? Now you start to look at the macro world of the Internet. There is a view of the universe as a whole,
the cosmological universe, that it is expanding. In many ways, as humans, when we think of a system that grows bigger, like a city, we generally think the growth is at the edge, the suburbs move onward and outward, sooner or later, if Phnom Penh keeps growing it will extend automatically all the way out to the borders because more people means more size. Is that true of the Internet? As we connect up these 600 million new things, does that make the Internet bigger? Fascinatingly it doesn't. That's not the model of growth we are seeing in the cosmological view of the Internet. It's not the case that as we get bigger the network itself grows, it's not growing.
The other way of getting bigger is by compression.
The gravitational force takes over and is higher than the radiation force. What actually goes on as we grow the Internet is actually we are cramming more and more things into the same diameter. That's why we are seeing all the flat curves. It's not that you attach to the edge of the network, you attach within the existing network. The network is clustering.
Why? There's one interesting graph here, which is AS connectivity. How many other ASs does each AS connect to? Apart from some experimental noise in my
collection, since around 2004, I can see a trend there that I have drawn across the numbers, that shows inexorably that connectivity is improving.
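One way to count how many other ASs each AS connects to is to walk the AS paths seen in BGP and record every adjacent pair. The sketch below assumes that approach; the talk does not say how the graph was built, and the AS numbers are from the documentation range.

```python
from collections import defaultdict


def as_degrees(as_paths):
    """Map each AS to the number of distinct ASs seen next to it in any path."""
    neighbours = defaultdict(set)
    for path in as_paths:
        for left, right in zip(path, path[1:]):
            if left != right:  # ignore AS-path prepending
                neighbours[left].add(right)
                neighbours[right].add(left)
    return {asn: len(peers) for asn, peers in neighbours.items()}


paths = [
    [64496, 64497, 64498],
    [64499, 64497, 64497, 64498],
]
degrees = as_degrees(paths)
print(degrees)
```

Averaging these degrees per day and plotting them over time would give a trend line of the kind described.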
What is this saying to me? I suspect the Internet has massively changed shape over the last 10 years, the entrepreneurs have cashed out and left. We are now big business, we are now enormous business. Globally, I suspect there are five major transit operators.
Nationally, I expect there are at most three major national aggregators, and that's it.
The world of transit is small. The number of players is small. They are just growing internally, there are not more of them. The number of access providers is now almost a constant across the world, they are growing internally, there are not more access providers.
There is a third group of ASs you see, the content distribution networks, and they are themselves fixed and growing internally. Interestingly, you are now seeing a different kind of tension in the Internet. It's not large versus small, that's bullshit, that's old stuff.
It's different activity sectors that are trying to figure out who gets the money.
I suspect what you are seeing on the Internet in terms of routing is what you are seeing on the Internet
in terms of business and policy. I suspect the underlying tensions that are driving the growth are the tensions between access networks with direct relationships with users, carriage networks with massive capital investment in undersea and transcontinental infrastructure and content networks with massive investment in data storage and reticulation, and it's that bulking up and specialisation which is occurring.
What does that mean for routing? It's not a problem. What does that mean for the Internet? That's a bit more interesting.
What this means now is what we are seeing is role specialisation and the tension between the roles. Who has the money? Who wants the money? How is that want and need expressed in terms of the tensions that are happening between users, carriage and content? You thought you were going to hear a routing talk, what you actually heard was a policy talk in disguise.
I hope you found it interesting. If you didn't, I hope your email was fascinating. Thank you.
Matsuzaki Yoshinobu: Questions? Thank you.
If not, next up is Emile from RIPE NCC, with a presentation titled "Update on RIPE NCC R&D activities".
Emile Aben: Good morning, I'm Emile from RIPE NCC. My
apologies for this rather nondescript title. I have three things I want to cram into one presentation. It's normally a bad idea, but it gives you three attempts to look up from your email, if you haven't already read it.
The first subject is RIPEstat, which we like to call "Anything you ever wanted to know about an Internet resource." The URL is here, if you go there, if you have an AS you want to know more about or a prefix you want to know more about, v4 or v6, just type it in and you will get a whole bunch of information in one single place, so you don't have to go to the database or the RIRs or to RIS to figure out about routing, or to a geolocation provider; you will get it all in one single page.
One interesting thing I want to point out that we did recently was we have tighter integration with the APNIC database, in cooperation with APNIC, and we are talking with APNIC about more cooperation here.
What we plan for members, so it is also possible for APNIC of course, is that we can provide a little bit of database history specifically for the members.
An example here, this is 2400::/12 in the RIPEstat object browser. As you can see, this provides a view of what are the less specific objects, so in this case it's the whole IPv6 Internet; it shows you what the /12 is
and also shows you the more specifics, so you can see how address space was delegated out of this block. It also has a little side navigation thingy, the five-corner shaped thing -- I don't know what they call it in English. So you can actually go up, so go to /11, go to the adjacent prefixes or go to the two more specifics. I think that's a really cool way to visualize address space, and it's now available for the RIPE database as well as the APNIC database.
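The navigation described here (up to the less specific, sideways to the adjacent prefixes, down to the two more specifics) is plain prefix arithmetic, which can be sketched with Python's standard `ipaddress` module. This is not the RIPEstat API; the function name and the returned dictionary are assumptions for illustration.

```python
import ipaddress


def navigate(prefix):
    """Return the prefixes one step up, sideways and down from `prefix`."""
    net = ipaddress.ip_network(prefix)
    step = net.num_addresses
    start = int(net.network_address)
    limit = 1 << net.max_prefixlen  # size of the whole address space
    adjacent = [
        str(type(net)((s, net.prefixlen)))
        for s in (start - step, start + step)
        if 0 <= s < limit
    ]
    return {
        "less_specific": str(net.supernet()),
        "adjacent": adjacent,
        "more_specifics": [str(s) for s in net.subnets(prefixlen_diff=1)],
    }


view = navigate("2400::/12")
for direction, value in view.items():
    print(direction, value)
```

The sketch assumes the prefix is neither the whole address space nor a single host address, since those have no parent or no children respectively.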
This is your second chance to look up from your email, World IPv6 Launch. Before 6 June this year, we still had this IPv6 problem which we still have, which is that if you look at the ASs, about 13 per cent of ASs have IPv6 or at least announced something, but if you look at what they are doing with it, so eyeballs, 0.6 per cent, or content, 1.3 per cent, it is somewhere in the Internet but it is not getting to the edges of the networks.
World IPv6 Day last year, 2011, helped, and the launch helped a little bit more, and we wanted to do some measurements to provide some more insight on this.
What we did was we made a dashboard, so we took 50 vantage points and picked 50 of the participants of World IPv6 Launch that we thought were interesting, and wanted to provide some more measurements than were
already there. So we were doing DNS A/AAAA fetches, round trip times with ping and ping 6, forward path traceroute and HTTP fetches. The URL shows you all the data we collected. We are also making the data available to anybody who wants to research it any further, as raw data.
This is our measurement network. Because we are RIPE NCC it is centred around our service region, but we have coverage in the other service regions as well. We are missing a little bit of coverage in South Asia, as you can see, but this is because our vantage points needed v4 and v6 and we had the v4 vantage points in this area but not v6 vantage points, unfortunately.
What did we see? The image shows when people announced their AAAAs in DNS, so at that point their web services became accessible over IPv6, or should be accessible over IPv6, of course -- the red arrow is 6 June, midnight UTC, and the green is when all of the vantage points saw AAAA records for a specific website, so all the websites are listed as rows here.
What you see is at 6 June, people turned it up, but the more interesting thing is that before 6 June, the day before, a couple of days before, people already put it up, so people didn't wait for that exact date, and I think that shows confidence that this thing is just
going to work and the other thing is, of course, people left it up.
For the people that didn't turn it up, we tried to contact a lot of them, out of our list, and I knew that the information in our databases may be sometimes a little bit outdated, but this turned out to be surprisingly hard. So it is very handy for people to actually keep your RIR database contact information up to date because sometimes people will want to contact you with interesting stuff, and the contact email addresses in there are not only spam traps.
Going back from our measurements to what we see in the Alexa 1 million, and this is data from Dan Wing from Cisco, that he puts on a web page, and I made a graph out of that. What you see in 2011, that's the first spike. It's got a good uptake up to about 4 per cent, then everything went down again, as everybody knows, and the second, the big thing there around 6 June, 2012 was World IPv6 Launch. As you can see, people kept it up.
The most promising thing I find from this one is that it is still growing. If you look at content, apparently if you look at Alexa, it didn't stop after World IPv6 Launch, there are still people putting AAAA records up for their content.
We also looked at the relative performance of v4 and
v6 during World IPv6 Launch, and this is only for the 50 vantage points and the 50 websites we tracked. So what we did was for every 10-minute interval we looked at the minimum RTT in IPv4, what's the minimum in IPv6, we look at the difference, and divide that by the fastest protocol, so you get a percentage of -- so is IPv6 10 per cent faster, is IPv4 10 per cent faster? For each combination of source destination, that gives us a single value and then we add it up over the whole day and come to this histogram.
What you see here, the big thing is there's a big spike in the middle, meaning in the majority of cases v4 and v6 had equal performance.
If you have to pick a winner, if you really have to, it's going to be v4 because you see the left side of the histogram is slightly bigger and bulkier. Because this distribution had very long tails, I binned up everything that was over 250 per cent performance difference, and there you see quite a difference, in 8 per cent of the cases we saw v4 being over 250 per cent faster, in 2 per cent of the cases we saw v6 being over 250 per cent faster.
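The metric described above (difference of the minimum RTTs divided by the faster protocol's RTT, with everything beyond 250 per cent binned together) can be sketched as follows. The sign convention, with positive values meaning IPv6 was faster, and the function names are assumptions; the talk does not specify them.

```python
def relative_difference(min_rtt_v4, min_rtt_v6):
    """Signed percentage difference relative to the faster protocol.

    Positive means IPv6 was faster, negative means IPv4 was faster
    (an assumed convention).
    """
    fastest = min(min_rtt_v4, min_rtt_v6)
    return 100.0 * (min_rtt_v4 - min_rtt_v6) / fastest


def clamp(value, limit=250.0):
    """Fold the long tails into the outermost bins at +/- `limit` per cent."""
    return max(-limit, min(limit, value))


# Illustrative RTT pairs in milliseconds (v4, v6), not measured data.
samples = [(110, 100), (100, 110), (20, 100), (100, 100)]
histogram_input = [clamp(relative_difference(v4, v6)) for v4, v6 in samples]
print(histogram_input)
```

Each value would be computed per 10-minute interval and per source-destination pair, then accumulated over the day into the histogram.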
Looking into it a little bit, this turns out to be caused mainly by big content being in lots of different data centres and some data centres being v6 enabled and others not.
So as an example, I'm in South America and there's a big data centre right next to me that's v4 only, and then the same content provider has a data centre in North America that is v6 enabled. So that content provider gives me an A and a AAAA, for v4 it's next to me but for v6 it is in North America, so of course you cannot change the laws of physics, so the speed of light, of course, my v4 is going to perform faster. I expect this to get better if the bigger content providers turn up v6 on all their data centres.
We were also looking at it a little bit from are things that are close together different from things that are far away from each other? What we did was we took the same data and split it out by, you have IPv4 and IPv6 under 50 milliseconds or everything else, so at least one was over 50 milliseconds.
What you see in the over 50 milliseconds case is that more of the body of the distribution is in the middle, so things become more similar, where in the under 50 milliseconds you see the huge differences being more pronounced, so it's up to 12 per cent v4 faster, 4 per cent v6 over 250 per cent faster. It's kind of logical because if you are over 50 milliseconds you fill the network for a longer time and it's more likely that things like, taking a submarine cable or path, it's more
likely that things are more equal.
For this performance part, if you look at it numerically, if you define equal performance as being within 20 per cent, it's 10 per cent v6 faster, 62 per cent equal, 28 per cent v4 faster. So there is some difference, not a lot. Happy eyeballs can, of course, take care of the extremes. But it's also true that dual stack means that it does actually give you two chances for the best performance, so to look at it very positively, in 10 per cent of the cases the hosts are significantly better off if they use IPv6, based on these source destination pairs that we have, of course.
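The three-way split quoted above, with "equal" defined as within 20 per cent, can be sketched as a simple classification over source-destination pairs. The function names and the sample RTTs are illustrative assumptions, not the measured data.

```python
from collections import Counter


def classify(rtt_v4, rtt_v6, tolerance=0.20):
    """Label a pair as equal if the slower RTT is within `tolerance` of the faster."""
    faster, slower = sorted((rtt_v4, rtt_v6))
    if slower <= faster * (1 + tolerance):
        return "equal"
    return "v6 faster" if rtt_v6 < rtt_v4 else "v4 faster"


def shares(pairs):
    """Percentage of pairs falling into each class."""
    counts = Counter(classify(v4, v6) for v4, v6 in pairs)
    return {label: 100.0 * n / len(pairs) for label, n in counts.items()}


# Illustrative (v4, v6) RTT pairs in milliseconds.
pairs = [(100, 110), (100, 150), (150, 100), (80, 85)]
print(shares(pairs))
```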
We also did HTTP measurements. This is a snapshot from World IPv6 Day. What you see in the rows is the participants, in the columns you see the vantage points.
Green means HTTP fetch was okay, red means it failed, blue, we didn't have IPv6 so we didn't try to fetch over IPv6, orange is DNS error. So the blue lines are basically people that didn't turn up, didn't have AAAA record, so we couldn't fetch.
You see the vertical pattern, that meant one of the vantage points in this particular timeframe had limited IPv6, and if we compare this to IPv4, this just showed slightly more failures than IPv4.
If you are interested in this, there is a YouTube
URL there. That shows all of this for World IPv6 Launch in 165 seconds, I think.
What we got from this is that IPv6 has slightly more reachability problems than IPv4. So what could cause this? I was wondering, how does the outside world see you as a network? Do you have any idea of that? In IPv4 you have a pretty good idea, because if something breaks you get a lot of complaints. In IPv6, there's not a lot of users and we are all being happy eyeballed at the moment, so will you get complaints if IPv6 breaks? It's just going to be the geeks that complain about IPv6. So we thought that RIPE Atlas can help there.
I think we have had presentations about RIPE Atlas in previous APNIC meetings and I don't have time to go into the details, but this is our measurement network that has the probes, they are very small, so very easy to deploy. We currently have 1,800-plus installed worldwide and they do pings, traceroutes, all kinds of measurements, to help see the state of the network.
If you want to know more, there is the URL.
That brings me to the third part, which is traceroutes with RIPE Atlas. What we did in preparation for World IPv6 Launch is we allowed the RIPE NCC members to traceroute6 to their websites from all IPv6-enabled
Atlas probes. We are, again, talking with APNIC about more cooperation and possibly making these available to APNIC members too.
If you do a traceroute6 from 600+ vantage points, you get lots and lots of traceroutes. The unprocessed output is available after roughly an hour, because we do the measurement for 30 minutes and it needs some processing time. If you want to work on the raw output and see what you get from that, great, you can get the data from RIPE Atlas.
We also wanted to provide a little more easily digestible information about the connectivity into your network, so we wanted to make it a little easier to navigate, because 600 traceroutes is quite a lot to analyse by hand.
This is a picture of the IPv6 enabled RIPE Atlas nodes currently. It is quite a nice worldwide spread, of course centred on the RIPE region again.
How do you look at 600 traceroutes? Of course, we have the raw data output and I just took an example of that, so this is a traceroute from a probe in Wellington to ns.ripe.net. We tried to infer the AS, and that is not an exact business, but it gives you a pretty good idea of what path these packets traversed.
One way to do your data reduction here is to
summarise to this probable AS path, and we tried to account for things that can get in the way, like Internet Exchange Points, so what you see is -- that upper thing, we summarised that to just this row of numbers.
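The reduction from a hop-by-hop traceroute to a probable AS path can be sketched as: map each hop to an origin AS by longest prefix match, skip hops that did not answer or that sit on an Internet Exchange Point LAN, and collapse consecutive repeats. This is an assumed outline, not the RIPE NCC implementation; the function name, the lookup table format and all addresses and AS numbers (documentation ranges) are illustrative.

```python
import ipaddress


def summarise_as_path(hops, prefix_to_asn, ixp_prefixes=()):
    """Collapse a list of hop addresses into a probable AS path."""
    # Sort longest prefix first so the first match is the most specific one.
    table = sorted(
        ((ipaddress.ip_network(p), asn) for p, asn in prefix_to_asn.items()),
        key=lambda item: item[0].prefixlen,
        reverse=True,
    )
    ixps = [ipaddress.ip_network(p) for p in ixp_prefixes]
    path = []
    for hop in hops:
        if hop is None:  # hop did not reply
            continue
        addr = ipaddress.ip_address(hop)
        if any(addr in ixp for ixp in ixps):  # exchange LAN, not an AS of its own
            continue
        asn = next((a for net, a in table if addr in net), None)
        if asn is not None and (not path or path[-1] != asn):
            path.append(asn)
    return path


hops = ["2001:db8:1::1", "2001:db8:1::2", None, "2001:db8:ff::1", "2001:db8:2::1"]
origins = {"2001:db8:1::/48": 64496, "2001:db8:2::/48": 64497}
as_path = summarise_as_path(hops, origins, ixp_prefixes=["2001:db8:ff::/48"])
print(as_path)
```

As the talk notes, this inference is not exact: addresses borrowed from a neighbour or missing from the routing table will produce a wrong or incomplete path.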
From that, we tried to make a pretty picture. There is an article in RIPE Labs that describes it a little bit, it's got far more detail than I can explain here.
The second URL there is the live demo, so you can see what we produced there. The live demo might be slightly more interesting than my slides, because it is interactive.
This is an example of what we did to ns.ripe.net.
In the centre, the red dot is the AS where we saw the destination or multiple ASs where we saw the destination if the destination was in multiple ASs.
From the outside is where the RIPE Atlas nodes are, so you travel the AS path from the outside to the inside, and you can actually see if the inferences are correct of course, what connects to what, either directly or indirectly. In the left bottom corner you can see a well known IPv6 network being in the path for lots of connectivity to ns.ripe.net.
Of course, the inferences can be wrong, so what we want to do is make this interactive, so you can see for yourself, go back to the original traceroute data, and see what the IPs were, what the host names were, so you can see if this was correctly inferred. So in a sense this becomes a way of browsing through this whole repository of traceroute data.
You can do the same for failed traceroutes, so this can inform you about IPv6 reachability problems. One little problem we have here is false positives, because a traceroute is really this tool that was meant to debug connectivity issues, so what you see is -- for instance, this is a trace to ns.ripe.net again and you see the node I pointed out is RIPE NCC, so we have a router there that is probably rate limiting or something.
Again, by this interactivity, you can go in and see the traceroutes that went through there and you can determine yourself if these are ASs that have problems with your network or not. Or if you want to contact somebody about this, you have the traceroute data and the guy on the other side can tell you, no, that's not my AS, that's my neighbour.
What we are trying is to provide useful tools, measurement analysis for the Internet community in our region and beyond. We are very interested in your feedback, let us know what you think, I'll be here the rest of the week. If you are interested in stuff like this, keep an eye out on RIPE Labs. That's it from me.
Matsuzaki Yoshinobu: Thank you.
Matsuzaki Yoshinobu: I have one question. Do you have an Atlas probe to distribute?
Emile Aben: I don't have probes with me, unfortunately.
I was in a hurry when I took the plane.
Matsuzaki Yoshinobu: Any questions? If not, let's move on. Thank you.
Next up is George from APNIC, talking about "Measuring of IPv6 with advertisements for fun and profit".
George Michaelson: Hello, everyone. I'm from the APNIC Research Group and I want to give you a background on the IPv6 measurement activity we have been doing.
There is this question in our minds: how do we measure the end user? There are a lot of different kinds of measurements we can do, there are people measuring BGP, people measuring website hits, people measuring traffic at the exchange point, to try to get a handle on the relativities of IPv6 uptake in traffic.
We thought it would be interesting to find a way to measure end users. Let's look at the problem of getting v6 to the last mile.
The real question is: how do you measure a lot of end users? What you really want to do is not to measure one, but to measure as many as possible and get a statistically valid sense of what's really going on in the wider community at the end user level.
One answer is to be someone big. If you are as big as Google, then you have a mechanism that will allow you to tell something about every corner of the network, because everybody goes there. Unfortunately, APNIC doesn't have the luxury of being Google.
The alternative is to find a way to get the tests that you want to perform, the measurement to be run by the end users themselves, get your code to run on their machines. If you can do this then you have a way of measuring the behaviour of the end users anywhere.
We wanted to do this measurement, we wanted to get a sense of what all end users could do and we had initially been looking at the behaviour on our own website, and the observation we were making is that we predicted massive IPv6 uptake when we looked at our own website. The reason is all of you. You are all really good at running v6, you are running v6 in your core network and in your back office, so when you come to a website like APNIC to look at resources, of course you have v6, which means we saw an inflated count.
We very quickly realised we are a corner case that is not the right basis for measuring activity, so we needed a way to do measurements of end user capability, we were thinking about websites but we wanted it to be a huge random and statistically valid count. We stumbled across a feature of the advertising network framework.
The idea is that if you are prepared to give an advertising network money, they are prepared to present fresh eyeballs to you. The whole point here is that they have a method of pricing, the cost of the advert that's called CPM, clicks per million.
The mechanism we are looking at is that if you pay a high click per million, you are saying that you want your advert to be seen by people who are prepared to actually come and visit your website. This is the classic advertising paradigm that you have achieved a sale and it is a high value eyeball. But the advertising network really badly wants your money. So if you bid a low CPM, they have to flip over to another model -- and in that model they say, we will show you different people. There's a chance one of them will like you enough to click on your ad and we will show you as many people as possible, and if we can get you above a threshold, we will take some of that advertising money that was going on a clicks per million bid.
A good network actually provides you with lots and lots and lots of unique clients if you bid low in this mechanism.
The second thing is that advertising networks are using Flash, it's the primary vehicle for delivering advertising content. You tell a website owner, I'll give you a revenue stream if you allow me to place Flash adverts on your site. Then the Flash wrapper goes back to the advertising network and fetches a random ad.
Flash code is just being used as a vehicle to drive this, but because it's a full programming language and has the ability to go out on the network, we were able to use Flash to conduct these tests, to do the tests that fetch objects from the web using unique addresses that are either on dual stack or v4 or v6, and get a sense of what was going on because we could collate all the measurements.
The result of this is that by using Flash and by using advertising network placement and bidding low, we got lots and lots and lots of unique IPs.
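One way the collated fetch results might be turned into per-client figures is sketched below. The talk does not describe the actual classification, so everything here is an assumption made for illustration: the record layout, the field names, and the rules that a client counts as v6 capable if it fetched the v6-only object and as preferring v6 if it fetched the dual-stack object over v6.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ClientResult:
    """Hypothetical per-client record collated from the test fetches."""
    fetched_v4_only: bool              # retrieved the object reachable only over v4
    fetched_v6_only: bool              # retrieved the object reachable only over v6
    dual_stack_family: Optional[str]   # "v4" or "v6" for the dual-stack object, None if not fetched


def v6_ratios(results: list[ClientResult]) -> tuple[float, float]:
    """Return (capable_ratio, preferred_ratio) over all clients seen."""
    if not results:
        return (0.0, 0.0)
    capable = sum(1 for r in results if r.fetched_v6_only)
    preferred = sum(1 for r in results if r.dual_stack_family == "v6")
    return (capable / len(results), preferred / len(results))


# Four made-up clients: one prefers v6, one can do v6 but chose v4, two are v4 only.
sample = [
    ClientResult(True, True, "v6"),
    ClientResult(True, True, "v4"),
    ClientResult(True, False, "v4"),
    ClientResult(True, False, "v4"),
]
print(v6_ratios(sample))  # prints (0.5, 0.25)
```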
A reasonable question is just how unique are the IPs? What we have done is an analysis where for every day of sending the advert out we collate how many unique addresses have seen it in that day. We plot that both per day and since inception.
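The per-day and since-inception counts described above can be sketched as follows. This is a minimal illustration rather than the APNIC analysis itself: the input format, a list of (day, address) pairs with ISO-formatted dates, is an assumption, and the addresses are from the documentation range.

```python
from collections import defaultdict


def unique_counts(hits: list[tuple[str, str]]) -> list[tuple[str, int, int]]:
    """For each day return (day, unique IPs that day, unique IPs since inception)."""
    by_day: dict[str, set[str]] = defaultdict(set)
    for day, ip in hits:
        by_day[day].add(ip)

    seen: set[str] = set()
    rows = []
    for day in sorted(by_day):  # ISO dates sort chronologically as strings
        seen |= by_day[day]
        rows.append((day, len(by_day[day]), len(seen)))
    return rows


# Made-up log: one address repeats within a day, one repeats across days.
log = [
    ("2012-08-01", "192.0.2.1"),
    ("2012-08-01", "192.0.2.1"),
    ("2012-08-01", "192.0.2.2"),
    ("2012-08-02", "192.0.2.2"),
    ("2012-08-02", "192.0.2.3"),
]
print(unique_counts(log))
# prints [('2012-08-01', 2, 2), ('2012-08-02', 2, 3)]
```

If the since-inception count keeps climbing at close to the daily count, most addresses on each day are new ones.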
The reason is we are receiving somewhere around 1 million hits a day but there are 2 billion people in the Internet and they are all looking at websites that need advertising revenue. So the gap between the total population and our sample size is so big that they can guarantee to give us an IP address we have never seen before. There are billions of them.
What are we finding? We have a website we published under the labs URL and we are providing information broken down by the ASN, the origin AS number of the IP address, by the economy, by UN region, by organizations.
We thought we would deliberately design this to present information in a way that would be useful for both network engineers and network planning but also for strategic planning and economic planning.
The economic and regional and organizational breakdowns are the beginnings of some information that we think is much more useful for long-term trend analysis. If you look at the OECD and similar bodies, they are quite interested in publishing monthly and quarterly reports on the rate of change in technology uptake, so we feel by mimicking that style of information we can actually provide something useful to that kind of planning.
We are getting hits from around 125 of the 200 to 250 economies in the world, so somewhere around half of all economies are being seen.
We are seeing about 2,400 AS numbers, which can give us data we can graph. We have seen far, far more ASN.
If you look at the bottom, we have seen over the life of the experiment 35,000, somewhere around 75 to 80 per cent of the entire AS set that is currently routed, we have had some measurement on.
I like to tell people we've seen Vatican City, we have seen around 16 measures from Vatican City. The last time I gave this talk, I said, "I'm not sure whose eyes were on the keyboard when it happened and I couldn't confirm they did v6." The next speaker said, "I helped network the Vatican City and I can assure you they do have v6." Maybe we need more cardinals to go on the web and see if they can see our advert, so we can measure them better.
We are also using a visualisation technique that Google has done, which provides a really nice simple mapping on to the UN region model we are using for the regional breakdowns. This is an example of the visualisation, showing the South East Asia UN economic region. You can see I have highlighted Thailand, which is showing 0.2 per cent IPv6 preference. Indonesia also shows up as one of the stronger economies in this region, but the general trend in South East Asia is that there is quite a low uptake of IPv6 compared to the world.
If we look at Asia as a whole, you can get this enormously strong signal that China is actually providing 0.4, so that is markedly higher uptake of IPv6. Japan is currently around 2 per cent, one of the world's best IPv6 uptake nations.
We also have basic charting and you can see here an example of the monthly totals for Japan. This shows quite nicely how there was a relatively flat period across 2011 and early 2012, where the rate of change in Japan was low, but around the time of the World IPv6 Day event there was quite a marked increase in penetration of IPv6 in the Japanese community.
This is an example of a 30-day moving average graph taken from the same input data, and it is for the United States. This shows a really, really strong signal for the impact of World IPv6 Day. We have actually got this down to quite fine grain detail for the individual providers in America, and it shows -- there was a presentation yesterday about v6 deployment and this shows quite clearly Verizon, for instance, modelling this kind of rate of uptake.
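A 30-day moving average of this kind can be computed along the following lines. The talk does not say how the graph was produced, so this sketch makes its own assumptions: the window is trailing (each point averages that day and the days before it), and the first points use however many days are available so far.

```python
from collections import deque


def moving_average(values: list[float], window: int = 30) -> list[float]:
    """Trailing moving average; early points average over the days seen so far."""
    if window <= 0:
        raise ValueError("window must be positive")
    recent: deque[float] = deque(maxlen=window)
    total = 0.0
    averages = []
    for value in values:
        if len(recent) == window:
            total -= recent[0]  # this value is about to fall out of the window
        recent.append(value)
        total += value
        averages.append(total / len(recent))
    return averages


# Small made-up daily series with a window of 2 so the result is easy to check.
print(moving_average([1, 2, 3, 4], window=2))
# prints [1.0, 1.5, 2.5, 3.5]
```

Smoothing like this trades day-to-day noise for a lag of up to the window length, so a sudden step in the daily data shows up as a ramp over about 30 days.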
Some of the other things we have been doing is taking this information and cross-correlating it with the published information on population, GDP, and population of IP users by economy. This is an example of a table Geoff has been publishing which is ranking the economies in terms of their v6 use ratio. You can see that Romania is at the top of the list at 9 per cent, with France at 4 per cent. This is essentially because in each case two ISPs have done a phenomenal job of v6 deployment. In Romania a company has taken v6 to every single CPE in their deployment and they are sitting at 20 per cent penetration but they only have about 50 per cent market share so the national total has been dragged back.
In the case of France, Freenet, ProXad is using the 6rd tunnelling technique, which is a beautiful example that if an ISP deploys that class of CPE overreach technology it can work very well. When we say tunnels are bad, we are talking about wide area tunnels going outside your own locus of control. But if you are the ISP and if you have a problem with upgrading your CPE but you can deploy a technology like 6rd, I can absolutely tell you it works and they have achieved 17 per cent penetration of v6 into their customer base, using 6rd.
The thing here that is really interesting is this is a sortable table. If we re-sort this to the estimate of v6 users, by taking the v6 ratio and applying it to the population of the Internet, you get a really interesting ranking of how many people are currently probably using v6. That first five, the ones at the top, that is 11 million people. 3.8 million from the USA, 2.2 million in France, 2 million in China, 2 million in Japan and just under 1 million in Romania. In relative terms that is small, set against the total Internet population, but in absolute terms, 11 million people are demonstrably out there v6 enabled. That's amazing.
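The re-sort described above, applying the v6 ratio to each economy's Internet population and ranking by the result, can be sketched as below. The economy names and figures are placeholders invented for the example and are not taken from the published table; the function name is likewise hypothetical.

```python
def estimate_v6_users(rows: list[tuple[str, float, int]]) -> list[tuple[str, int]]:
    """rows holds (economy, v6 ratio, Internet users); returns a ranking by estimated v6 users."""
    estimates = [(name, round(ratio * users)) for name, ratio, users in rows]
    estimates.sort(key=lambda row: row[1], reverse=True)
    return estimates


# Placeholder economies: A has the higher ratio, B has far more Internet users.
table = [
    ("Economy A", 0.09, 10_000_000),
    ("Economy B", 0.015, 200_000_000),
]
print(estimate_v6_users(table))
# prints [('Economy B', 3000000), ('Economy A', 900000)]
```

The example shows why the ranking changes on re-sorting: a large economy with a modest ratio can still outrank a small economy with a high one.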
The second thing is if you look at the penetration rates, America has a penetration rate of 1.5 or 1.6 per cent so it is quite a lot ahead of the 0.8 world average but China at 0.4 per cent is very close and it is quite clear, given the continuing deployment of v6 in centrally managed networks in the Chinese research academic telco community, there are going to be a lot of Chinese v6 users very, very quickly. This is going to be a very significant population.
The second observation I would make is if you look down the list, you will see there are a lot of Asia Pacific economies represented in the top 25. Thailand is there, Indonesia is there, Japan is there, China is there. The economic powerhouse is Asia. We are actually participating in something that is going to consistently move up through the food chain. This stuff is fascinating.
We are also providing breakdowns by AS number. This is the information for the World IPv6 Day launch participants. It is lovely that on the day I took this graph a university in Thailand is showing as the current top instance, they had 27 per cent v6 preference. There is some change that happens on a day-by-day basis, but you can also see from this that a lot of the participants in v6 Day are research and academic networks, but also there are real providers.
KDDI, for instance, is in fourth position, has achieved a very significant penetration rate in the Japanese market. XS4ALL has achieved significant penetration in the Netherlands. Verizon is showing up.
It's an interesting view of things.
In summary, I want to try to give you a sense that when I'm optimistic about v6 and when I say it's happened I think there's good data to back that assertion. I understand this is a logistic social supply curve and we have to be wary of too much optimism. It is possible we might reach peak early but all the signs are that we are seeing continued growth in deployment of v6 technology. I think things are looking pretty good.
The second point on the bullet list is that Global Unicast will soon overtake Teredo. That is an interesting inflexion point. Teredo is a phenomenally popular tunnelling technology that is built into every desktop and we know it doesn't work. So if the Global Unicast v6 address usage is going to rise to a point where it exceeds the broken tunnelling mechanism, that's really good.
There is the CPE problem, and we have to recognise that the investment issues around upgrading the CPE are a big problem, but the 6rd story can be seen to be beneficial: Free have successfully deployed 6rd and achieved a penetration rate of between 17 and 20 per cent of their customer base, right now.
There is a lot more information there and we want to go on collating and collecting this and looking at the different aspects of behaviour.
If you see the advert, please do not click on it because it winds up costing us more and reduces the number of people we can show it to.
I would like to thank the Internet Society and Google who very generously provided sponsorship towards displaying the ad, and ISC and the RIPE NCC who both very generously provided host to terminate the service load of a million hits a day that we have been seeing.
It's been quite an active collaboration with the RIPE and it's been a very fruitful activity which worked very well, so I would like to thank them.
Matsuzaki Yoshinobu: Thank you. Any questions? APPLAUSE
Richard Barnes (BBN): George, you said don't click the ad, but what does the ad look like?
George Michaelson: If I told you what the ad looked like, you would click on it. It says, "Thank you for helping us measure IPv6."
Randy Bush (IIJ): I'll make a bot to do it.
George Michaelson: Randy, you would never distort a statistical measure like that.
Matsuzaki Yoshinobu: Thank you. Any other questions? If not, that finishes this APOPs session. This evening we will have a lightning talks session, and thank you for submitting your interest to share your experience or thoughts. We will have eight speakers there.
Sunny Chendi: Please upload the slides so we can put them up on the screen.
Just a reminder, the voting desk is open, if you haven't collected your ballot papers for the NRO NC elections, please collect your ballot papers. The voting starts at 11.00 am and finishes at 2.00 pm.
Would you like to break for tea? Let's go for morning tea and come back at 11.00 am. Thank you.