Transcript: AMM - Session 2
Disclaimer
Due to the difficulties capturing a live speaker's words, it is possible this transcript may contain errors and mistranslations. APNIC accepts no liability for any event or action resulting from the transcripts.
Wednesday, 25 August 2010, 14:00-15:30 (UTC +10)
TOMOYA YOSHIDA: Yes, so, it's time to start. This is the afternoon APOPS session. My name is Tomoya Yoshida and I will be chairing this session.
And we have three presentations. The first is on the latest operators' tools from the RIPE NCC, from Mark. The second one is practical DKIM deployment from Daniel. And the last one is the Underground Economy from Marcel. So, shall we start?
MARK DRANSE: Good afternoon. Hello, everybody, after lunch. I hope you're all well fed. My name is Mark Dranse and I'm from the RIPE NCC. For anybody that doesn't know what that is, we are the RIR for Europe, the Middle East and south-west Asia. So we perform the same function as APNIC, but in a different part of the world. What I'm going to talk to you about today is a collaborative tool that we have, RIPE Labs. This has been mentioned at APNIC before, I think at APRICOT back in Kuala Lumpur, but there have been changes. I'll run through some of the content that you can find there and then invite some questions. There's a big URL there. I know that normally in these sessions, people like to sit and read their e-mail and browse the web. Perhaps in this session, you would like to browse that website.
Hello George, everything OK? Thank you.
OK, so what is RIPE Labs? Well, it's a website, fairly obviously; if you've been to the address, you'll have seen. But more importantly, it is intended to be a platform and a tool for the community. You might ask what the community is. In the RIPE region we call that our members, our friends, everybody - basically the community extends to everybody, including you. The community is global, everybody can participate and it is open to everybody. It is called RIPE Labs, but it is not just for RIPE region things.
What can you do at RIPE Labs? You can preview new tools and prototypes. We develop a whole suite of things within the RIPE NCC: we run a Whois server and data collection and analysis tools. There are also contributions from external people, not RIPE NCC related. You can expose your own ideas and research. If you're looking for a platform on which to talk about some crazy idea, or you have a software tool to play with and get feedback on, you can put that up there as well. And then you can also come along and contribute your own views on the stuff that's been posted, so there are forums and blogs where you can come and say this is rubbish, this is great, this is really good but make it a bit more like this, tweak it like that and change it like that.
It's completely open, so anybody can come and say anything that they want.
We've had a little bit of a re-design of the site since it was first launched. This is what it looks like at the moment. It is all nicely web 2.0-ified. There's Twitter in there and a tag cloud, and you can see how many people have tweeted and the articles which have been posted in there. As for the improvements that we've put in place, there's obviously a whole load of look-and-feel stuff that's been improved. We had quite a lot of feedback on the first site and realised there were places to make it better. We have the tag cloud and the live search, which guesses what you want to look for and shows that to you. And there are project pages where we can take different postings and articles about a similar sort of topic and merge them all together so you can find what you're looking for.
We made it easier to participate. Previously you had to register if you wanted to say anything, but now you can comment without registering. You only need to register if you actually want to post something.
So what is on RIPE Labs at the moment? Just to give you a taster of what it is being used for - I won't read the whole list, but you can see some of the tags: DNS, DNSSEC. Just to bring back the point that this isn't just about RIPE and the RIPE community, one of the biggest words at the top there is APNIC. We're very pleased to have some content from some of our colleagues in this region. I'll run through some examples of things that are up there at the moment. Obviously there's IPv6 - unless you've been living under a rock for the last decade, you've probably heard of this. We did some measurements of v4 and v6 capability across different browsers and resolvers, and looked at the percentage of clients that have v6 capability. There's a table there. I won't go into it too much, because I think you should go to RIPE Labs and read about it there; there's a lot more data. We've aggregated a lot of the global work that people have been doing on IPv6 uptake and penetration, and compiled that into a number of articles so you don't need to go hunting around all of the different websites to pull it together yourself - we've compiled it in one place. There's stuff there from APNIC, from Hurricane Electric and Google, but obviously we might miss something. So if you've done work, or you know of work that's interesting, feel free to comment or e-mail us and we can add that into the next article.
There's also a very comprehensive IPv6 CPE support survey contributed by one of the RIPE community members, Marco from XS4ALL, which is one of the oldest ISPs in the Netherlands and one of the earliest to support IPv6. He's done a hell of a lot of work drawing together information about end-user equipment, and that's updated regularly; the current version is available up on the site.
Another thing we came up with within the RIPE NCC is this notion of IPv6 RIPEness. That's basically a measure of how ripe, or how mature, our different LIR members are in terms of their v6 uptake. We came up with a star rating, and there's a chart here showing the stars. There are four categories at the moment - for example, having a v6 allocation, actually being visible in BGP, and having reverse DNS configured - and for each of those points you get a star, and we rate you based on that. You can see here that different people have different numbers of stars in each of these columns. It is not easy to see here, but if you look at RIPE Labs, you can get a better view of it. Each of the columns is for a different country within our service region.
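Just as a rough sketch of the scoring idea (not the RIPE NCC's actual code), each criterion an LIR meets earns one star; the talk names three of the four criteria, so the fourth is left as a generic placeholder in this small Python illustration.

    # Toy version of the IPv6 RIPEness star rating: one star per criterion met.
    # "fourth_criterion" stands in for the criterion not captured in the transcript.
    def ripeness_stars(has_v6_allocation: bool, visible_in_bgp: bool,
                       reverse_dns_configured: bool, fourth_criterion: bool) -> int:
        return sum([has_v6_allocation, visible_in_bgp,
                    reverse_dns_configured, fourth_criterion])

    print(ripeness_stars(True, True, False, False))  # -> 2 stars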
They're moving very quickly ahead in terms of v6 deployment and uptake in that region, in that country, sorry.
If you actually go to RIPE Labs, we've generated these graphs going back over about six years, and there's actually a little video animation that you can look at and you can see the charts moving up and down as people get the allocations and deploy them.
There's a lot of DNS-related content. Again, if you've been living under the same rock as the people who don't know about v6, you may not know what DNSSEC is, but it has now appeared in the root zone. The RIPE NCC operates K-root, and we have analysis on there and some perspective on what deploying DNSSEC has meant to us as a root server operator, looking at things like priming queries and large TCP queries. That brings us on to the DNS reply size tester, which we wrote in-house and made available there. This is a tool to determine resolver capabilities. I think George showed this morning whether some of these things actually talk v4 or v6; we've done something similar, so there are about 685,000 measurements in there from 45,000 different sources, and I'm not going to tell you the results, but they're on the website. We also operate a rather lovely DNS monitoring service, which we develop and maintain constantly. There's a new interface which is in the beta-testing phase at the moment, so if you're interested in the quality of root server DNS operation, you can come along and have a look at that.
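As a hedged illustration of the kind of capability check involved (this is not the RIPE NCC tool itself), here is a small Python sketch, assuming the dnspython package and a placeholder server address, that asks whether a name server you query advertises EDNS0 and what payload size it offers:

    # Check whether a server advertises EDNS0 and its buffer size,
    # assuming dnspython; 192.0.2.53 is a placeholder address.
    import dns.message
    import dns.query

    query = dns.message.make_query("ripe.net", "SOA", use_edns=0, payload=4096)
    response = dns.query.udp(query, "192.0.2.53", timeout=3)
    if response.edns >= 0:
        print("EDNS0 advertised, payload size", response.payload)
    else:
        print("No EDNS0 support advertised")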
So, we have a database, obviously, for all of our IP allocation data. We have a database team that maintains that, and they come up with new tools every so often. We're now providing our Whois data through a programmatic interface so you can write your own software to interface with it. There's also a new interface which sits on top of that to do queries against the RIPE database - a full search. You can see, I hope, in the picture there, the different options available to you. And another tool that they've generated in collaboration with our anti-abuse working group is the abuse finder, a tool to find abuse-related contact information from the database. You just feed an IP address into the tool and it will go through the database and try to work out which e-mail address to contact, whereas previously people would go through chains and chains of referrals and find the wrong e-mail addresses, or too many, or not enough.
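As a hedged sketch of the kind of scripting those interfaces enable, here is a minimal Python query against the RIPE database using the classic whois protocol on port 43. This is not the new interface mentioned above (the details of that are on RIPE Labs), just an illustration of programmatic access to the Whois data.

    # Minimal programmatic whois query against the RIPE database.
    import socket

    def ripe_whois(query: str, server: str = "whois.ripe.net") -> str:
        with socket.create_connection((server, 43), timeout=10) as sock:
            sock.sendall((query + "\r\n").encode())
            chunks = []
            while True:
                data = sock.recv(4096)
                if not data:
                    break
                chunks.append(data)
        return b"".join(chunks).decode(errors="replace")

    # Example query for an address in RIPE-managed space.
    print(ripe_whois("193.0.6.139"))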
So this should simplify that act.
BGP routing and other topics - there's a lot of content in here, too much to list, but I'll run through some of it. There's an external contribution from Exa Networks, a tool which interacts with BGP and helps with the automation of router configuration, so that's up there. There's also work on BGP route origin validation, looking at mis-announcements and hijackings; I think if we finish the slides, there might be a lightning talk about that this afternoon.
There's a look at the effects of World Cup traffic, contributed by Serge from Euro-IX. That's a lot of analysis of traffic across different IXPs, primarily across Europe but also around the world, looking at the effects of the World Cup - what happens when matches finish and people go on Facebook to complain about the opposition team!
We have another tool, the RIPE NCC data repository, which we've created as an enormous data store for people to put different data sets on. We have a lot of data sets internally, so we can now provide access to those via a single unified data repository, and we invite other people who maybe don't have the bandwidth or the storage space to put their data sets online to contribute to it. There's a university in New Zealand - I won't try to pronounce the name - which has a lot of data, and that's one of the data sets sitting in there now. And something else that George went over this morning: obviously there are several RIRs doing dark-space measurements of bogon space, and we've got some analysis similar to the work that George and Geoff did, so that's available via Labs at the moment.
One question that comes up sometimes is: why RIPE Labs? Why did we do this? The intention behind it was to have a faster and tighter innovation cycle for the stuff that we develop internally. Historically, we would go and talk to people and find out what they wanted, then go away and work on it and work on it, and when it was finished we would put it out and people would say, that's not what we wanted. So this gives us the opportunity to release prototypes earlier so that people can comment and say what they think. We can also now react rapidly to events of interest. Sometimes things happen - there are cable cuts, there are volcanoes, there are earthquakes, and all sorts of things happen in the world. In the olden days, we would spend some time looking at it, thinking about it, writing an article, having it checked for commas and spellings and apostrophes. Labs has a faster editorial cycle, so we can put data up almost in real time as things are happening. And you can now actually hear from the individuals involved - the engineers in our departments who actually work on the tools that you use - so you can hear directly from the people who are developing this stuff.
The other benefit we desire, and which we're seeing as well, is tighter community interaction. We can be more transparent in what we're doing and put stuff out there straight away. We can collaborate: you can come and tell us what you want and give us feedback, and if you don't like what we're doing, at the prototype stage, we can stop what we're doing. It amplifies a more effective feedback loop. We have meetings, like APNIC has meetings; the members come along twice a year and that's traditionally the point in time when they would talk to us and tell us what they think, but this gives us a simpler way to interact with people throughout the year.
So that was just a taster. There's a whole lot of content up there. It would be nice if you go and take a look and participate. If everybody here looked at an article and made a comment or asked a question today, that would be great for us. If you have any questions or comments now, I'm happy to take them. If not, there's an enormous copy of the URL there. Thank you.
TOMOYA YOSHIDA: Any questions or suggestions for Mark?
TOMOYA YOSHIDA: The second one is Practical DKIM Deployment.
DANIEL BLACK: From the beginning, for a change! Welcome everyone.
What I've heard so far - and I only got here at 10 o'clock this morning - is a lot of talk about the Ethernet layer and going up to the IP layer. What I want to do in this talk is move up a layer in the protocol stack and talk about how mail service providers can use something like DKIM.
So a lot of this will apply to large-scale organisations. But DKIM is actually generic enough to apply to a smaller organisation, and how you may do that will differ slightly. Work it out for yourself or ask questions at the end.
So whether you're in the mail industry or not, most of you will be familiar with this picture: you get a lot of unwanted email and a little bit of desired email. How you filter it determines whether your customers are happy or less than happy.
What I've typically seen done in email filtering is an initial cut of IP-based reputation filtering, in the form of blacklists or other kinds of reputation-based services. And some, as we have seen on this list, take a much more brutal approach to IP address filtering.
Now that IPv6 is coming along, what happens to things like DNS blacklists? Honestly, I don't know the answer. However, one alternative that has been proposed is a domain-based reputation scheme.
What has hindered this in the past is a lack of integrity associated with the domain in a message. This is where DKIM comes in.
DKIM stands for DomainKeys Identified Mail. 'Domain' is your normal Internet domain - it is not associated with specific users under those domains; it focuses on domains at the top level. 'Keys' means digital signatures. 'Identified' is part of the nature of the cryptographic process, and ties into the identity of the domain's managers. 'Mail' is your normal RFC Internet message format.
The objective of DKIM has changed a bit from the wording of the RFC, although the technical aspects are the same. DKIM is an assertion of responsibility for the message you put your signature on.
Just because you put a DKIM signature on a message, you are not saying that it is spam or that it is not spam; it just proves that you have had some role in the delivery of the message.
Now for the fun, gritty stuff: how it works.
In this diagram, an email is sent from within a sender organisation. It is sent through a gateway. On that gateway is a bit of software that adds a digital signature.
The message gets sent all the way to the recipient, exactly how it has been done since, like, 1984.
What happens on the recipient side? If the recipient is DKIM aware, it will do a public key look-up using DNS.
With that, it can verify whether the message is valid or has been modified in transit.
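Just as a minimal sketch of the signing and verification steps described here - assuming the Python dkimpy package and purely illustrative selector, domain and key file names - it might look something like this; a real deployment would sign in the gateway MTA or a milter rather than in application code:

    # Sign on the sender's gateway, verify on the recipient's side.
    # Assumes the dkimpy package (pip install dkimpy); selector, domain
    # and key file are placeholders.
    import dkim

    message = (b"From: alice@example.org\r\n"
               b"To: bob@example.net\r\n"
               b"Subject: Hello\r\n"
               b"\r\n"
               b"Test body\r\n")

    with open("dkim-private.pem", "rb") as f:
        private_key = f.read()

    # Sender side: the gateway prepends a DKIM-Signature header field.
    sig = dkim.sign(message, selector=b"mail", domain=b"example.org",
                    privkey=private_key)
    signed_message = sig + message

    # Recipient side: dkim.verify() looks up mail._domainkey.example.org
    # in DNS and checks the signature; True = valid, False = broken/forged.
    print(dkim.verify(signed_message))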
It's dependent on the content of the message; it is not related to the path of the message, like SPF or Sender ID are. It can travel a number of hops, provided they don't break the signature. What it looks like is a header field in the message, like this. If you look at any email from Yahoo!, you will see a signature like this, and there has been one for many years.
From the signature, the verifier has enough information to look up the public key. The p= at the end is the beginning of a base64-encoded RSA key.
DKIM signs a number of headers in the message, and it signs all of the body. There are a number of headers that are recommended to be signed, or not, as per the standard.
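To make the public key look-up concrete, here is a hedged Python sketch using the dnspython package, with 'mail' and 'example.org' as placeholder selector and domain. The key lives in a TXT record at <selector>._domainkey.<domain>, whose p= tag carries the base64 RSA key:

    # Fetch a DKIM public key record, assuming dnspython 2.x
    # (use dns.resolver.query on older versions). Names are placeholders.
    import dns.resolver

    answers = dns.resolver.resolve("mail._domainkey.example.org", "TXT")
    for rdata in answers:
        record = b"".join(rdata.strings).decode()
        print(record)   # e.g. "v=DKIM1; k=rsa; p=<base64 key>"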
So what happens with forgery? A message comes in to the recipient with a signature that could not have been made with the right private key, because the forger doesn't have it. The recipient can do a DNS look-up and say, "Well, that is not a valid signature on the message." We have also got the case where a message comes in without a signature at all. According to the DKIM RFC, you should treat these two cases exactly the same.
And the interesting case is when a message is sent through a mailing list. The message gets signed on the way out, and the mailing list can verify it OK, but when the mailing list modifies the subject line, the recipient at the other end gets a broken signature, because the content got changed by the mailing list. This is one of the challenges with DKIM.
So if we go back to how we do mail filtering, we previously had a whole heap of mail coming from a domain, some of it legitimate email and some of it spoofed email. We had to expend a degree of resources to process that email and work out whether we were going to let it through or not, or to what extent.
What DKIM does is to separate these mail streams into one mail stream that has a valid DKIM signature, and another stream that doesn't. And hopefully the former, if DKIM is deployed, is going to be the majority of the email.
So in the simple case, we can look at the majority of the email stream and say, "Well, it had a valid DKIM signature, I can apply a lighter set of filters and rules to that." For the harder cases, when the DKIM signature was broken or missing, we can apply a tougher set of rules to work out whether it was spoofed, whether it came through a mailing list, or whether it was just sent from somewhere that doesn't do DKIM signing.
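A rough Python sketch of that split, again assuming the dkimpy package; the two rule functions are placeholders for whatever policy you already run:

    # Route incoming mail into two streams based on DKIM verification.
    import dkim

    def apply_light_rules(msg: bytes) -> str:
        return "accept"          # placeholder policy for the valid-signature stream

    def apply_strict_rules(msg: bytes) -> str:
        return "quarantine"      # placeholder policy for the missing/broken stream

    def filter_message(raw_message: bytes) -> str:
        if dkim.verify(raw_message):
            return apply_light_rules(raw_message)
        return apply_strict_rules(raw_message)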
Just because something has a DKIM signature doesn't mean it is good. However, if spammers and fraudsters start signing email, they leave a forensic trail and give an ISP and an email recipient an easy way to say, "Just discard anything with this DKIM signature."
So, as I've illustrated before, there is value in mail streams and in identifying them. If we look at, say, this ISP case, there are a number of different types of email that come out of an ISP.
What we're trying to do with DKIM is give each of these different mail streams a different signing domain. What will happen then is that a recipient who doesn't like our marketing email for some reason won't end up blocking the really important stream, which is billing.
Yes, the finance person told me to say that!
After we sort of sign these various streams, and each of them starts to get a different reputation, we can go down the line and say, "Well, you know, an ISP has customers, who knows what they are sending, let's sign that with their own DKIM signature."
If we've got a customer that's sending a bit too much email, we sign them with a high-rate signature, and just assume they got infected by a bit of malware and are part of a botnet until we get around to investigating that.
This helps to protect the integrity of the other customers' emails.
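One hedged way to express that per-stream signing in code - all selectors and domains below are illustrative, and in practice this mapping usually lives in the signing MTA or milter configuration rather than in application code:

    # Map each mail stream to its own signing selector/domain so each
    # stream builds its own reputation. Names are placeholders.
    import dkim

    STREAM_KEYS = {
        "billing":   (b"billing", b"billing.isp.example"),
        "marketing": (b"news",    b"marketing.isp.example"),
        "customer":  (b"cust",    b"customers.isp.example"),
        "high-rate": (b"bulk",    b"bulk.isp.example"),
    }

    def sign_for_stream(stream: str, message: bytes, privkey: bytes) -> bytes:
        selector, domain = STREAM_KEYS[stream]
        return dkim.sign(message, selector=selector, domain=domain,
                         privkey=privkey) + message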
Now that we have a series of mail streams, we can start entering into bilateral agreements. This is an old press release that Yahoo!, eBay and PayPal put out. eBay and PayPal have historically been targeted by phishing schemes. This agreement says that Yahoo! will filter any message that comes in from those domains with a missing or broken DKIM signature. That benefits Yahoo!'s customers who are also eBay and PayPal customers, and having a bit more integrity in email than there previously was has served both of these companies well.
Since we can enter these kinds of arrangements directly, we can also enter them programmatically, through a protocol. What happens with Author Domain Signing Practices (ADSP) is that a message comes in, and the recipient says, "There is no signature, but I will look at the From address of the email and ask: does that domain have a policy about DKIM?" The policy will be one of three values - unknown, all or discardable.
Unknown is effectively the same as, "I'm not going to tell you whether or not I sign all my email."
All means "I sign all email with DKIM signatures." It is not necessarily the same as, "You will always receive my email with a valid DKIM signature," because there are things like mailing lists that break signatures along the way. If you are doing filtering based on a DKIM 'all' assertion, be careful: you will probably get false positives in your filtering process.
And then there's discardable, which is what is useful for your PayPals and your eBays, or for something like billing.isp.com. These are the emails that should arrive at the end user with a valid signature, or you may as well drop them; in fact, to prevent some kinds of fraud, the recommendation is that you drop them. It doesn't mean that recipients have to drop them, but it gives you an idea of the importance of the email to the sender.
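A hedged sketch of that ADSP look-up, assuming the dnspython package: per RFC 5617 the policy lives in a TXT record at _adsp._domainkey.<from-domain>, with a dkim= tag set to unknown, all or discardable; the domain below is a placeholder.

    # Look up the ADSP policy for a From: domain; treat a missing record
    # as "unknown", per RFC 5617. Assumes dnspython 2.x.
    import dns.resolver

    def adsp_policy(from_domain: str) -> str:
        try:
            answers = dns.resolver.resolve(
                "_adsp._domainkey." + from_domain, "TXT")
        except (dns.resolver.NXDOMAIN, dns.resolver.NoAnswer):
            return "unknown"
        for rdata in answers:
            record = b"".join(rdata.strings).decode()
            if record.startswith("dkim="):
                return record.split("=", 1)[1].strip()
        return "unknown"

    print(adsp_policy("example.org"))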
What's happened very recently is that there is a reporting standard for DKIM, which will probably become an RFC by the end of the year. It provides a feedback loop for those who publish DKIM signatures and ADSP policies, to give them feedback when signatures break or when verifiers decide to filter their messages.
This serves two major purposes.
One is that if there is a problem with the signing on the sender's side, they have the opportunity to fix it. If the receivers are doing something overly aggressive as far as filtering goes, perhaps the sender can alter their practices in some way. But the importance of feedback, from where I stand, is that the senders of email are going to get some indication as to who is actually running phishing schemes on their domains. Giving them that kind of feedback is going to be vital.
Feedback has historically played an important role in SMTP. It was SMTP rejections and non-delivery receipts that discouraged the existence of open relays and made sure that people running mail servers had reverse DNS. In the future, something like this standard will help make sure that DKIM is applied in a uniform way.
An accompanying standard is Authentication-Results. What this means is that at the edge MTA into an organisation, a DKIM check can be applied, and this Authentication-Results header is there for long-term forensic purposes. Obviously, a DKIM key in DNS may not be there indefinitely; however, this provides a record of what the verification outcome was.
It provides a useful tool for mail clients, and for, say, webmail to display authentication information about the email, and it can also be used in filters.
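As a rough sketch of what that border check might record - assuming the dkimpy package again, with a placeholder authserv-id, file name and domain:

    # Add an Authentication-Results header field after a DKIM check so
    # mail clients and later filters can use the outcome. Names are placeholders.
    import email
    import dkim

    with open("incoming.eml", "rb") as f:
        raw = f.read()

    result = "pass" if dkim.verify(raw) else "fail"
    msg = email.message_from_bytes(raw)
    msg["Authentication-Results"] = (
        "mx.example.net; dkim=" + result + " header.d=example.org")
    print(msg["Authentication-Results"])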
Reputation: once we have domain information that has some integrity, we can actually associate a reputation with it. If you listen to a lot of the people in the IETF working group, they will say that this is the only purpose, which may or may not be related to their companies selling that kind of information. But who am I to judge?
So what happens when we have a domain reputation is we can sort of query our reputation provider and say, "Well, I received this valid DKIM signature, but so what? Should I drop it? Should I not?" And the idea of the protocol that is being developed by this working group is to provide a uniform way to deliver that information to a recipient.
This mailing list has only been running from the beginning of August, so they are in the early days of working out what they are doing.
If you run mailing lists, there is also a draft IETF document on how to actually run mailing lists in a way that is useful to everyone. It leverages the benefits of DKIM and ADSP and provides guidance to mailing list operators to avoid the pitfalls.
And there's also a bit of guidance in that about what to do as a recipient when mailing lists are involved. So if you are interested, have a read.
So now we come to the important part of the talk: you. If you are running a mail server or an ISP, hopefully from this talk you will see the importance of segregating your outbound email into stream-based sending.
If you want to deploy verification and filtering based on DKIM, you can enter your own arrangements, and you now have some idea of what breaks and what doesn't. However, an important use of DKIM verification and filtering can be to protect your own business relationships.
If you as a business have a strong business relationship with another business, then by doing DKIM filtering you are protecting your staff from being socially engineered by people who may not be that other business but who are trying to influence you in a particular way.
Feedback loops are important for getting a message back to DKIM signers who are not signing things right. So if you are deploying, look at that. And there are a couple of other references there.
If you are interested in DKIM, participation is welcome. My feeling is that there are far too many vendors on the list, and getting some network operators on the list would be a welcome change. At the end of the day, the IETF is there to develop standards to help you - hopefully more so than the vendors, but potentially both at the same time.
The IETF developed the DKIM standard back in about 2007, and they are now at the stage where they are trying to validate what is actually used - which DKIM signatures are useful, which features are useful. So if you want to provide statistical feedback on a DKIM deployment, it would be much appreciated.
And also, any other operational experience that you have had in deploying it.
So if you are interested, you can read those URLs too.
So I want to say thanks to my employer for getting me here, and thanks to the OpenDKIM project for keeping me interested. Thank you for your time.
Does anyone have any questions?
YOSHINOBU MATSUZAKI: Any questions?
Do you have any questions for the audience?
DANIEL BLACK: Who has deployed DKIM or heard of it before?
Deployed it? OK. What were your experiences?
OK, talk to me later. Thank you.
YOSHINOBU MATSUZAKI: And the last one is via video conference.
*DUE TO THE SENSITIVE NATURE OF SOME OF THIS CONTENT, WE WILL NOT BE PROVIDING WEB-CASTING OR TRANSCRIPTION OF THIS PRESENTATION. WE DO APOLOGIZE FOR ANY INCONVENIENCE THIS MAY CAUSE.
TOMOYA YOSHIDA: OK, we'll go to the break now.