Monday, January 28, 2013

Designing & Building Internet6 Applications


This post explains a new opportunity that is emerging for Internet products: the IPv6-only way. My goal is to focus on the advantages of such an approach, from both the technical and the business perspective. There is also a long explanation of why P2P and interactive products underperform in today's Internet and why v6-only ones wouldn't.


Many app developers today are not fully aware of the underlying complexity and inefficiency of today's Internet with regard to bidirectional interactive, peer-to-peer and rich-media applications.

Libraries now exist that hide this complexity; however, the inefficiency remains untouched, when it is not made worse by generic approaches.

Creating Internet6 apps gives developers a totally new path, truly free from those constraints. Additionally, it opens a wide new space of opportunities by pioneering the actual exploitation of IPv6.

Yes, I am talking about IPv6-only applications. Am I crazy? Not at all!

Plenty of experts are already spreading the word on the benefits of IPv6-only environments (check the links in Appendix A at the end of this article).
T-Mobile USA has also made IPv6-only mobile users a reality in the US. How long will other app developers take to exploit these new scenarios? Tick-tock, tick-tock...

If you are a smart developer with new ideas to develop, you share the vision that efficiency and differentiation really matter, and you are forward-looking enough to explore (and perhaps rule) the forthcoming new Internet, I hope this article helps pave your initial way.

Why do I talk about v6-only apps and not just v6-enabled ones?
Because the latter would make you focus on the nightmare of transition and old IPv4 constraints, instead of exploring true differentiation from the state of the art.
Also, the compatibility gap will soon be closed anyway by current successful apps adapting to IPv6, so it's not really an easy opportunity.


Understanding today's Internet problem

Today's Internet architecture is highly fragmented and is convenient only for client-server interactions. There is one public domain and millions of private ones attached to it by means of intermediate nodes (NATs, network address translators) that break the plain IP connectivity expected among all nodes.



Private domains were created because there were not enough IPv4 addresses for all terminals, so only servers live in the public space, while regular clients (home devices, smartphones, etc.) stay hidden behind private v4 addresses. In contrast, Internet6 is not fragmented: all nodes are connected to the public domain, though they might be partially or totally protected by firewalls (secured domains).
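To make the public/private split concrete, here is a minimal sketch using Python's standard `ipaddress` module. The function name `reachability` and the sample addresses are my own choices for illustration; the classification itself comes from the module's `is_global` and `is_private` properties.

```python
import ipaddress


def reachability(addr: str) -> str:
    """Classify an address as 'public', 'private' or 'other'."""
    ip = ipaddress.ip_address(addr)
    if ip.is_global:
        return "public"      # routable on the public Internet
    if ip.is_private:
        return "private"     # only meaningful inside a private domain
    return "other"           # e.g. shared CGN space 100.64.0.0/10


if __name__ == "__main__":
    for sample in ("192.168.1.10", "10.0.0.5", "8.8.8.8", "2001:4860:4860::8888"):
        print(sample, "->", reachability(sample))
```

A home device typically reports an address of the first kind, while an Internet6 node reports one of the last kind.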

The main consequences are:
  1. First, a private domain means that all incoming and outgoing packets have to be heavily processed at NATs (IP headers recalculated and replaced), instead of just being routed. Apps like Google Maps have reported performance issues, mainly in corporate environments, due to port-allocation speed and scalability.
  2. Second, a terminal in a private domain is hidden and unreachable, so apps on it cannot be directly located and contacted by other Internet nodes. As evolution cannot be stopped, several proposals attempt to reduce complexity but still fall short on performance.
  3. Finally, private addressing does not stop users from unintentionally installing malware that opens the door wide to malicious activity. Security risks have quite often been underestimated, with the wishful thinking that private domains are secure.
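The first two consequences can be illustrated with a toy model of a NAT. This is a deliberately simplified sketch, not how any real NAT is implemented: the class `ToyNat`, its port-allocation policy and the addresses are assumptions made for illustration only.

```python
from typing import Dict, Optional, Tuple

Endpoint = Tuple[str, int]  # (IP address, port)


class ToyNat:
    """Toy address translator with one public IP (illustrative only)."""

    def __init__(self, public_ip: str, first_port: int = 40000) -> None:
        self.public_ip = public_ip
        self.next_port = first_port
        self.out_map: Dict[Endpoint, int] = {}   # private endpoint -> public port
        self.in_map: Dict[int, Endpoint] = {}    # public port -> private endpoint

    def outbound(self, src: Endpoint) -> Endpoint:
        """Rewrite the source of an outgoing packet, allocating a port if needed."""
        if src not in self.out_map:
            self.out_map[src] = self.next_port
            self.in_map[self.next_port] = src
            self.next_port += 1
        return (self.public_ip, self.out_map[src])

    def inbound(self, dst_port: int) -> Optional[Endpoint]:
        """Map an incoming packet back to a private endpoint, or drop it (None)."""
        return self.in_map.get(dst_port)


if __name__ == "__main__":
    nat = ToyNat("203.0.113.7")
    print(nat.inbound(40000))                       # unsolicited packet: dropped
    print(nat.outbound(("192.168.1.10", 5060)))     # header rewritten on the way out
    print(nat.inbound(40000))                       # now the mapping exists
```

Every packet needs a table lookup and a header rewrite, and nothing can reach the private terminal until it has spoken first.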


As the Internet keeps growing fast, due to emerging countries and new devices getting connected (machine-to-machine, Internet of Things), the problem is getting worse too.
For instance, so far one IPv4 address has normally been shared among all the devices in a home, but some ISPs will soon have no option other than installing carrier-grade NATs (CGN) to make several homes share the same IP address.


Which barriers are we really sorting out?

As said before, a number of technologies and mechanisms have appeared in order to enable interactive and peer-to-peer (P2P) applications.

In the following paragraphs I briefly summarize some of these techniques and some of their current issues. This should help us understand the benefits of developing Internet6 apps and provide a basis for differentiation ideas.

Barriers will be removed as long as IPv6 network designs do not imitate IPv4 networks. It sounds stupid to replicate techniques and elements that are not really needed, but this is what network admins normally do unless app and service developers tell them how they want things done. If developers wait for v6 networks to be fully deployed, it might be too late.


1) NAT Traversal, STUN, TURN, ICE, etc. (P2P Personal Communications)

These technologies were mainly defined to enable peer-to-peer personal communications such as the Skype voice-over-the-Internet application.

Personal communications are inherently peer-to-peer (P2P) services, as two agents want to talk to each other in real time. Of course, call control (signalling) can be kept centralized, as its traffic is low and thus cheap and scalable anyway. The SIP protocol was designed as a standard to handle signalling.

Skype was the very first massively successful voice-over-Internet application. Many agree its success actually relies on its ability to implement a P2P service, while its competitors were mainly costly and inefficient centralized solutions.
It wasn't really a surprise, as Skype's Estonian developers had also created the Kazaa P2P file-sharing application some time before, so they knew the basics of P2P quite well.

Basically, the Skype app analyses the underlying network possibilities of a user's client so it can determine an appropriate method to work with. This layer-violating procedure also means that the app developers needed to understand the complexity of network topologies and of selecting communication mechanisms.
Most of the time, Skype concludes that a user and the remote party are behind NAT boxes and therefore NAT traversal techniques need to be used. A second step is to analyse whether the NATs involved in the communication allow UDP hole punching or any other method.

If the NATs allow this technique, each user sends packets towards the other party's NAT. These outbound packets create mappings in the sender's own NAT that then let the other party's packets in, so both ends become virtually connected in a single pipe.

If UDP hole punching is not feasible, TCP hole punching is considered too, even though it is much worse for transporting media streams.

If all of the above fails, Skype uses a supernode (a node known to work and to provide capacity) to centralize the traffic of several problematic users, working as a relay for the communication.
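The fallback order just described can be sketched as a small decision function. This is not Skype's actual code, which is not public; the names `Path` and `choose_path` and the three boolean inputs are hypothetical and only encode the order of attempts described above.

```python
from enum import Enum


class Path(Enum):
    DIRECT = "direct"
    UDP_HOLE_PUNCH = "udp-hole-punch"
    TCP_HOLE_PUNCH = "tcp-hole-punch"
    RELAY = "relay"


def choose_path(behind_nat: bool, udp_punch_ok: bool, tcp_punch_ok: bool) -> Path:
    """Pick the cheapest working path, falling back step by step."""
    if not behind_nat:
        return Path.DIRECT            # plain end-to-end connectivity
    if udp_punch_ok:
        return Path.UDP_HOLE_PUNCH    # preferred for media streams
    if tcp_punch_ok:
        return Path.TCP_HOLE_PUNCH    # works, but worse for media
    return Path.RELAY                 # last resort: traffic goes via a supernode


if __name__ == "__main__":
    print(choose_path(behind_nat=True, udp_punch_ok=False, tcp_punch_ok=True))
```

In an Internet6 network without NATs, only the first branch would ever be taken.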

The standardization of the techniques to traverse NATs has resulted in:
  • ICE (RFC5245): allows two agents that wish to communicate to discover enough information about their topologies (including the levels and details of NATs) to potentially find one or more paths by which they can communicate. It is mainly designed for UDP communications.
  • STUN (RFC3489): a standardized set of methods and a network protocol that allow an end host to discover its public IP address if it is located behind a NAT. It also helps the ICE protocol discover candidate addresses to try when establishing communications.
  • TURN (RFC5766): allows a host behind a NAT (called the TURN client) to request that another host (called the TURN server) act as a relay for the communications.
  • UDP/TCP hole punching (RFC5128): the two communicating agents send outbound packets to each other's public address, so that each NAT creates a mapping through which the other party's messages can pass.
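As a small illustration of how ICE ranks the paths it discovers, the sketch below computes candidate priorities with the formula and the recommended type preferences given in RFC5245 (section 4.1.2). The function name `ice_priority` and the default local preference of 65535 are my own choices for the example.

```python
# Recommended type preferences from RFC5245: direct (host) candidates
# rank highest and relayed (TURN) candidates rank lowest.
TYPE_PREFERENCE = {"host": 126, "prflx": 110, "srflx": 100, "relay": 0}


def ice_priority(cand_type: str, local_pref: int = 65535, component_id: int = 1) -> int:
    """priority = 2^24 * type_pref + 2^8 * local_pref + (256 - component_id)"""
    return (TYPE_PREFERENCE[cand_type] << 24) + (local_pref << 8) + (256 - component_id)


if __name__ == "__main__":
    for kind in ("host", "srflx", "relay"):
        print(kind, ice_priority(kind))
    # host -> 2130706431
```

The ordering makes the cost of NATs visible: the more indirection a path needs, the lower it ranks.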


In the Internet6 there are no NATs on the communication path, and therefore app developers are totally freed from discovering the network topology and using the techniques presented above.
However, the NAT traversal document (RFC5128) explicitly mentions that there might be firewalls in the "pure IPv6 world" that filter traffic much like NATs do, only without the address translation (V6-CPE-SEC), and that may interfere with the functioning of P2P apps.
This risk is real, but app developers should demand that firewall developers avoid these complex techniques in the Internet6 and provide easier interaction alternatives (such as UPnP firewall configuration). In the end, firewalls exist to provide security, and hole-punching techniques that are totally transparent to users and network admins do not seem to serve that purpose well...
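For reference, this is roughly what the starting point of a v6-only app looks like with Python's standard `socket` module: a listener that accepts IPv6 connections only and needs no traversal logic. It is a minimal sketch; the function name `make_v6only_listener` is hypothetical, and it assumes the host has an IPv6 stack and that any firewall on the path allows the inbound connection.

```python
import socket


def make_v6only_listener(port: int = 0) -> socket.socket:
    """Open a TCP listener that accepts IPv6 connections only."""
    sock = socket.socket(socket.AF_INET6, socket.SOCK_STREAM)
    # Refuse IPv4-mapped addresses: this socket is strictly IPv6.
    sock.setsockopt(socket.IPPROTO_IPV6, socket.IPV6_V6ONLY, 1)
    sock.bind(("::", port))   # all IPv6 interfaces; port 0 lets the OS choose
    sock.listen()
    return sock


if __name__ == "__main__":
    listener = make_v6only_listener()
    print("listening on", listener.getsockname()[:2])
    listener.close()
```

Any peer that knows the node's global address and port can connect directly; there is no STUN, TURN or ICE step in between.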


2) AJAX server polling, COMET & Websockets (The Interactive WEB)

To create highly interactive web applications, which need bidirectional and asynchronous communications, developers have abused the HTTP protocol as a transport layer by means of server polling techniques.

Good examples are gaming apps, instant messaging, stock tickers, multiuser document editing, real time user interfaces, etc.

Since servers cannot open sockets towards clients (hidden by NATs), the clients keep long-lasting sockets open, which servers use to push messages through.

In the beginning, clients opened a permanent HTTP connection (socket) for each dynamic element in a web page (AJAX + server polling), leading to a massive consumption of server resources that depends on the number of sockets open at once.

This technique was significantly improved with COMET, where clients open one single HTTP connection multiplexing all incoming messages for all dynamic elements at the client side (normally a dynamic web page). One more improvement was to open a single long-lasting HTTP connection instead of a permanent one.

However, COMET provided a non-standardized solution with some known issues:
  • High overhead: using HTTP as a transport layer means adding an HTTP header to every single client-to-server message.
  • Client-side complexity: the client needs to keep a mapping of outgoing connections to incoming messages in order to demultiplex reply messages.
  • Server-side scalability: servers are forced to keep several underlying TCP connections for each client: one for sending information to the client and a new one for each incoming message.
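To get a feeling for the first issue, the sketch below computes which fraction of the bytes on the wire is HTTP header rather than application data. The 400-byte header and 20-byte payload are assumed figures chosen for illustration, not measurements.

```python
def http_overhead(payload_bytes: int, header_bytes: int) -> float:
    """Fraction of each message that is HTTP header rather than payload."""
    return header_bytes / (header_bytes + payload_bytes)


if __name__ == "__main__":
    # Assumed sizes: a short chat-like message behind a typical-looking header.
    share = http_overhead(payload_bytes=20, header_bytes=400)
    print(f"{share:.1%} of the bytes are header")
```

With small interactive messages, nearly everything sent is header, which is why this matters most for exactly the applications listed above.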

The WebSocket protocol (RFC6455) provides a standardized solution that addresses the previous issues by establishing a single TCP connection for both traffic directions (client-to-server and vice versa). Its only relationship to HTTP is that its handshake is interpreted by HTTP servers as an Upgrade request (URI schemes: ws or wss).
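The handshake itself is small. The sketch below builds the client's Upgrade request and computes the `Sec-WebSocket-Accept` value the server must answer with, as defined in RFC6455. The helper names `upgrade_request` and `websocket_accept` are mine; the GUID constant and the sample key come from the RFC.

```python
import base64
import hashlib

# Fixed GUID defined by RFC6455 for computing Sec-WebSocket-Accept.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def upgrade_request(host: str, path: str, key: str) -> str:
    """Build the HTTP Upgrade request that opens a WebSocket connection."""
    return (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Upgrade: websocket\r\n"
        "Connection: Upgrade\r\n"
        f"Sec-WebSocket-Key: {key}\r\n"
        "Sec-WebSocket-Version: 13\r\n"
        "\r\n"
    )


def websocket_accept(key: str) -> str:
    """Server side: base64(SHA-1(key + GUID)) proves the request was understood."""
    digest = hashlib.sha1((key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


if __name__ == "__main__":
    sample_key = "dGhlIHNhbXBsZSBub25jZQ=="   # sample key used in RFC6455
    print(upgrade_request("server.example.com", "/chat", sample_key))
    print(websocket_accept(sample_key))        # s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```

After this single exchange, the same TCP connection carries messages in both directions.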

The RFC says: "The WebSocket Protocol attempts to address the goals of existing bidirectional HTTP technologies in the context of the existing HTTP infrastructure; as such, it is designed to work over HTTP ports 80 and 443 as well as to support HTTP proxies and intermediaries, even if this implies some complexity specific to the current environment".

It also adds: "However, the design does not limit WebSocket to HTTP, and future implementations could use a simpler handshake over a dedicated port without reinventing the entire protocol. This last point is important because the traffic patterns of interactive messaging do not closely match standard HTTP traffic and can induce unusual loads on some components."

For Internet6 developers, P2P and bidirectional communications are straightforward, though some negotiation with intermediate firewalls (UPnP) might be desirable for security reasons. However, if WebSockets are still to be used, the last quoted paragraph points to the benefit of having communications naturally available beyond HTTP on ports 80/443.


3) WebRTC (Video Interactive WEB-Applications)

WebRTC is said to be a kind of revolution, enabling not only real-time but also peer-to-peer communications between browsers, and thus between web applications.

It belongs to the HTML5 family of technologies: it was initially implemented by Google, is being further developed within the W3C and IETF, and is currently supported by several browsers such as Chrome, Firefox and Opera.

The concept behind WebRTC is quite powerful, as it provides an easy abstraction of video/voice sessions, communications and other subsystems for the large base of HTML5 developers who aren't normally aware of those complex details.

However, if we take a closer look at the WebRTC proposal, specifically at the P2P communication details, we can see there is a call-establishment component that uses the STUN and ICE mechanisms to set up connections across various types of networks.




As a consequence, complexity is effectively hidden from developers, which is a great step, but inefficiency is still high, due to the mechanisms described before.

A more optimized solution would be to use WebRTC in an Internet6 environment, where the P2P workaround is not needed while all the other components help the developer with the other complex tasks.
Therefore, developing WebRTC-based apps in IPv6-only environments should improve interactive performance and therefore user experience. For mobile apps, less communication overhead should also mean battery savings.



Appendix A

References about "IPv6-only" paradigm


- IPv4-only and IPv6-only applications How-to (Eva Castro)
http://gsyc.escet.urjc.es/~eva/IPv6-web/ipv6_only.html

- Experiences from an IPv6-only Network (Jari Arkko)
https://tools.ietf.org/html/rfc6586


- IPv6-only is becoming viable
http://tech.slashdot.org/story/12/01/13/2348206/ipv6-only-is-becoming-viable

- Why Your Network Should Go IPv6 Only
http://packetpushers.net/why-your-network-should-go-ipv6-only/

- Android and IPv6-only
http://www.gossamer-threads.com/lists/nsp/ipv6/32908

- Nexus S Android ICS Top Free Apps on T-Mobile USA IPv6-only Network
https://docs.google.com/spreadsheet/ccc?key=0AnVbRg3DotzFdGVwZWlWeG5wXzVMcG5qczZEZloxWGc&pli=1#gid=0

Other useful references:
- Top 10 tasks for IPv6 developers
http://www.networkworld.com/community/blog/top-10-tasks-ipv6-application-developers

- Tutorial for IPv6 developers:
http://www.6deploy.eu/tutorials/210-6deploy_devel_v0_4.pdf

Tuesday, January 8, 2013

How big is the Internet6 today?

Let's start 2013 with a brief snapshot of the size of the Internet6 today.

Continuously tracking the size of the Internet6 is a key input for architects, developers and marketers of Internet products, in order to better foresee the date of a possible exponential growth. By that time, v6 developments should have made it into their commercial portfolio, if possible before the competitors' do.

Basically, we pay attention to three KPIs (Key Performance Indicators): natively connected users, contents and services. In this post I analyze just the first two, that is, contents and users. In forthcoming posts I will try to analyze specific widely used and/or emerging services, as they normally need a stand-alone study.

1. CONTENTS

This KPI is pretty straightforward: it is calculated by checking DNS AAAA records, which are expected to hold the site's IPv6 address, and making an IPv6 probe connection if one is found.

Many sites measure it this way and build nice graphs. I particularly like this one:
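The check can be sketched with Python's standard `socket` module as follows. This is my own minimal version, not the code used by any of the measurement sites; the function names, the default port 80 and the injectable `resolver` parameter (handy for testing without network access) are assumptions.

```python
import socket
from typing import Callable, List


def ipv6_addresses(host: str, port: int = 80,
                   resolver: Callable = socket.getaddrinfo) -> List[str]:
    """Return the IPv6 addresses (AAAA records) published for a host."""
    try:
        infos = resolver(host, port, socket.AF_INET6, socket.SOCK_STREAM)
    except socket.gaierror:
        return []                       # no AAAA record (or no resolution at all)
    return sorted({info[4][0] for info in infos})


def probe_ipv6(host: str, port: int = 80, timeout: float = 5.0) -> bool:
    """Try an actual TCP connection over IPv6 to each published address."""
    for addr in ipv6_addresses(host, port):
        try:
            with socket.create_connection((addr, port), timeout=timeout):
                return True
        except OSError:
            continue                    # address published but not reachable
    return False


if __name__ == "__main__":
    for site in ("www.google.com", "www.wikipedia.org"):
        print(site, ipv6_addresses(site), probe_ipv6(site))
```

Note that the probe only succeeds from a host that itself has native or tunnelled IPv6 connectivity.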


It shows that globally, for the 500 most accessed web sites (the prestigious Alexa ranking), almost 25% of the sites are hosted in the Internet6 (in addition to the v4 one, of course).

Besides the numbers above, it's relevant to track the Top-10 sites, which normally concentrate most of the users' traffic. I ran the test myself today with the Alexa global Top-10.
(Green-bold means YES, red = NO)
1  - Facebook                                     
2  - Google
3  - YouTube
4  - Yahoo!
5  - Baidu.com     (But they test at http://ipv6.baidu.com/)
6  - Wikipedia
7  - Windows live
8  - Amazon.com
9  - QQ.com
10 - Twitter

Not bad, is it? Let's see if 2013 improves the list anyway :-)

Beyond the Top-10, we find some key sites that are also reachable over v6:
12 - Blogspot.com
13 - Google India
21 - Bing
83 - Netflix

Checking sites yourself is easy: just try this IPv6 test site.
For sites we should be tracking during 2013, check the appendix at the end of this post.

2. USERS

Besides actual numbers provided by ISPs, the best inputs for this KPI are provided indirectly by measurements at content sites, as they are regularly accessed by almost all users. Traffic measurements at key intermediate points and networks of the Internet are valuable too. My list of favorite inputs follows:

2.1 Google Stats

Are there human users who don't access Google almost every day? OK, those are not included in these stats. However, I believe the picture provided is still the most accurate one. ;-)

Good news: in 2013 we've crossed the 1% threshold, stated as a turning point by several key Internet players.



2.2 Cisco Stats

The Cisco Labs site offers some nice graphs too, which are actually based on multiple sources, including Google stats (as described in the previous point), Alexa measurements, whois databases from the RIRs (RIPE, ARIN, APNIC, AFRINIC, LACNIC) and several other sources, as detailed on their site.

Let's see the global interactive map (with data for some countries that I highlighted on the right side):




Clicking on the map above shows the recent evolution for each country.
Growth in Germany and the US appears to be relevant, and that's key, as both nations have a significant influence on technology adoption worldwide and at EU scale.



2.3 Akamai IPv6 hits

A lot of web pages, video and other content is delivered over Akamai's worldwide Content Delivery Network (CDN). Therefore, the number and evolution of hits to IPv6 content give us a good picture of actual Internet6 usage.

Akamai offers nice IPv6 stats. I compiled the evolution so far in the following picture.
Quite impressive results from the 2011 and 2012 ISOC-organized World IPv6 Days, aren't they?




2.4 Traffic at main Internet eXchange points (IX)

Normally, I track AMS-IX (Amsterdam) and DE-CIX in Germany. Both show a good upward trend. At AMS-IX, for instance, today's IPv6 peak is 9.5 Gb/s, while regular IPv4 traffic peaks at 2 Tb/s.
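To put the two AMS-IX peaks in proportion, here is the arithmetic as a tiny sketch. It assumes both figures are peak rates and treats 2 Tb/s as 2000 Gb/s; the function name `v6_to_v4_ratio` is mine.

```python
def v6_to_v4_ratio(v6_gbps: float, v4_gbps: float) -> float:
    """Ratio of IPv6 peak traffic to IPv4 peak traffic."""
    return v6_gbps / v4_gbps


if __name__ == "__main__":
    ratio = v6_to_v4_ratio(9.5, 2000.0)   # 9.5 Gb/s vs 2 Tb/s
    print(f"IPv6 peak is about {ratio:.3%} of the IPv4 peak")
```

That is slightly under half a percent, the same order of magnitude as the user share reported by Google.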



APPENDIX

Some Alexa Top-500 web sites that are not in the Internet6, as tested today, and that we should keep tracking in 2013 are:
5  - Baidu.com
7  - Windows Live
8  - Amazon.com
10 - Twitter
11 - Taobao
14 - LinkedIn
15 - Yahoo! Japan
19 - eBay
23 - Wordpress.com
36 - Pinterest.com
37 - Apple Inc.
38 - Paypal
46 - The Internet Movie Database
56 - BBC on-line
65 - Flickr
73 - instagram.com
74 - Thepiratebay.se
76 - CNN Interactive
114 - Wal-Mart Online
118 - The Weather Channel
124 - IndiaTimes