The fallacy of Android fragmentation – a statistical analysis

Lately, there has been a spurt of reports and surveys detailing the developer community’s concern about Android fragmentation. But is Android fragmentation as big a problem as it’s made out to be? Let’s take a look.

Fragmentation Basics

When we talk about fragmentation, we’re essentially looking at two types – hardware fragmentation and software fragmentation.
Hardware fragmentation describes the fact that, at any given point in time, devices based on the same software platform are running on different types of hardware (processors, graphics chips, screen sizes, etc.). This should be less of a worry for any developer, as it is inevitable if you want to target the majority of the market.
Apple’s ecosystem has minimal hardware fragmentation (although it has increased with Retina and non-Retina display devices, and could get worse if the screen sizes of future iDevices change). Every other software platform, from Windows to Android, faces some level of hardware fragmentation, as vendors target the market at large. If Windows Phone or Windows 8 on ARM manage to take off in any meaningful way, they will face the same problem. As an industry matures, hardware vendors tend to consolidate their position (as in the case of Intel) and the problem takes care of itself.
So let’s take a look at the elephant in the room, which is software fragmentation. Software fragmentation describes the fact that, at any given point in time, devices on a software platform are running different versions of the operating system. In the case of Android, competitors have long harped on the claim that software fragmentation wastes developers’ time, because they must spend resources making apps compatible with every available version of Android. These version differences are driven by two factors – (1) customizing every new Android version for manufacturer-specific hardware and UI changes and (2) customizing this manufacturer build to incorporate carrier customizations.
Having understood this, what does Android fragmentation look like?
[Chart: Android Version Distribution]
Most reactions would be something along these lines:
“This is so messy! Why can’t manufacturers & carriers roll out updates faster? This looks like a really big problem.”
The problem is, it’s really difficult to draw conclusions of any sort directly from raw data. So what can we do to make our life easier? To this end, Chris Sauve posted his analysis of Android fragmentation, based on a formula that he crafted, which seems to show an interesting pattern. However, custom-made formulas tend to be subjective. Let’s see what we come up with if we run some common statistical measures on Android’s historical version distribution data (from the chart above).
The nature of Android fragmentation
Since we’re essentially going to be analyzing the distribution of Android versions across various devices, we need to look at the properties of that distribution and how it’s been changing over time. One statistical property we can look at is kurtosis. Strictly speaking, kurtosis describes how peaked or heavy-tailed a distribution is rather than its dispersion, but here it serves as a proxy for how the installed base is spread across versions. In this analysis, a higher kurtosis is read as a higher number of active Android versions, and hence higher fragmentation.
So what does Android fragmentation really look like?
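As a rough illustration of the measure, the sketch below computes moment-based kurtosis for a list of version shares. The article does not say exactly how kurtosis was applied to the version data, so treating each version's share as one data point is an assumption, and the snapshot values are hypothetical rather than taken from the chart.

```python
def kurtosis(values):
    """Population kurtosis: fourth central moment divided by squared variance."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two data points")
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # variance
    m4 = sum((v - mean) ** 4 for v in values) / n  # fourth central moment
    if m2 == 0:
        raise ValueError("kurtosis is undefined when all values are equal")
    return m4 / m2 ** 2


# Hypothetical share snapshots (fraction of devices per Android version).
# Assumption: each version's share is treated as one observation.
snapshots = {
    "one dominant version": [0.80, 0.10, 0.05, 0.03, 0.02],
    "shares spread out": [0.35, 0.30, 0.20, 0.10, 0.05],
}

for label, shares in snapshots.items():
    print(f"{label}: kurtosis = {kurtosis(shares):.2f}")
```

Running the same function on each month's snapshot would give a time series that can be plotted to look for the cyclical pattern discussed here.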
Well, that’s much better, isn’t it? Kurtosis seems to have smoothed out Chris Sauve’s formula, by incorporating the entire data set at any given point. Even with all the manufacturer and carrier noise, this essentially shows that Android fragmentation is not a problem that’s getting worse.
In fact, fragmentation is getting more and more cyclical as Google is moving to an annual release cycle for Android. This chart shows that fragmentation peaks a few months after the release of a new Android version and falls to a low just before the next release. This is very similar to what happened with previous Windows releases, except that, in the PC market, a newly released version took years to overtake the previous version.
Aside from the annoyance this causes to us tech geeks, this fragmentation cycle actually has considerable benefits for developers. Since it currently takes about six months for a new Android version to start making inroads into the market, which is midway through the current release cycle, developers can use this time to optimize their apps for the new version. This actually avoids problems like app crashes after updates as we’ve seen on iOS. Of course, we early adopters still face this issue, but luckily, thanks to the “fragmentation” of the Android ecosystem, the larger market does not.
The recent surge in reports focusing on Android fragmentation can be attributed to the fact that we are currently at the peak of the fragmentation cycle. Android 4.0 Ice Cream Sandwich (ICS) has been the most awaited version of Android from the developers’ perspective, because it unifies the smartphone and tablet versions. As a result, there is a vocal minority of developers who are disappointed that ICS still has very low market penetration. Once the cycle moves ahead, with ICS going mainstream, quite a few of the dissenting developers will be back in the fold, before the cycle starts again.
Concentration of Android versions
But what about app development? Don’t developers still need to take into account all the earlier Android versions when designing an app? To analyze this, let’s look at the Herfindahl Index or H-Index, which is a measure of concentration, usually applied to measure competition. It is computed as the sum of the squared shares of each version, so compared to kurtosis, the H-Index places greater emphasis on larger data points. The value of the H-Index ranges from close to 0 (strictly, 1/N when N versions have equal shares) to a maximum of 1, with a higher value indicating a higher degree of concentration into a few Android versions. For example, the H-Index would show a maximum value of 1.0 for a single Android version across all devices, a value nearing 0.50 for two dominant Android versions, a value nearing 0.33 for three dominant Android versions, and so on.
[Chart: Concentration of Android Versions]
Now, this is enlightening! This shows that just as fragmentation hits its highest (a larger spread of active versions), as indicated by kurtosis, the majority of Android devices are still concentrated into two versions, as indicated by the H-Index value nearing 0.50. As the new Android version reaches higher market penetration, it increases the number of high-use Android versions, and the H-Index goes down to a value of 0.30-0.40. After this, the new version replaces the oldest major active version in the release cycle, and hence the H-Index follows the same cyclical pattern.
This means that at any given point in time, developers only need to focus on supporting the last two Android versions, apart from the latest Android version or upcoming release.
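The example values above can be checked with a few lines of code. This is a minimal sketch of the standard Herfindahl calculation (sum of squared shares); the function name and the share snapshots are hypothetical and are not data from the article's chart. The reciprocal 1/H is also printed, since it can be read as the equivalent number of equally sized versions.

```python
def herfindahl_index(shares):
    """Sum of squared shares, after normalising the shares to sum to 1."""
    total = sum(shares)
    if total <= 0:
        raise ValueError("shares must sum to a positive number")
    return sum((s / total) ** 2 for s in shares)


# Hypothetical version-share snapshots (not taken from the article's data).
examples = {
    "single version": [1.0],
    "two equal versions": [0.5, 0.5],
    "three equal versions": [1, 1, 1],
    "two dominant versions": [0.60, 0.30, 0.05, 0.05],
}

for label, shares in examples.items():
    h = herfindahl_index(shares)
    # 1 / h is the "equivalent number" of equally sized versions.
    print(f"{label}: H = {h:.3f}, equivalent versions = {1 / h:.2f}")
```

Normalising inside the function means the input can be given as fractions, percentages, or raw device counts.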
Conclusion
As ICS updates reach flagship handsets, and new ICS handsets and tablets are released to the market over the next couple of months, ICS will start replacing Gingerbread and FroYo. Due to this, the fragmentation level should reach another low, before the release of Android 5.0 Jelly Bean.
Developer interest tends to peak and wane based on their individual capacities and experiences. But, at the end of the day, as developers begin to understand these patterns, fragmentation will not deter those looking to target the majority of the market.