EMC Corporation has announced the results of the EMC-sponsored IDC Digital Universe study, “Big Data, Bigger Digital Shadows, and Biggest Growth in the Far East”. The study found that, despite the unprecedented expansion of the digital universe driven by the massive amounts of data generated daily by people and machines, only an estimated 0.5% of the world’s data is being analysed.
The proliferation of devices such as PCs and smartphones worldwide, increased Internet access within emerging markets and the boost in data from machines such as surveillance cameras and smart meters have contributed to the doubling of the digital universe within the past two years alone – to a mammoth 2.8 ZB. IDC projects that the digital universe will reach 40 ZB by 2020, an amount that exceeds previous forecasts by 14%.
This year’s study marks the first time IDC was able to capture where the information in the digital universe either originated or was first captured or consumed, revealing some dramatic shifts currently underway. Now in its sixth year, the study – measuring and forecasting the amount of digital information created and copied annually – includes findings around the “Big Data Gap”, which is the gap between the amount of data with hidden value and the amount of value that is actually being extracted; the level of data protection required versus what is being delivered; and the geographic implications of the world’s data.
From now until 2020, the digital universe is expected to double every two years, reaching approximately 5,247 GB of data for every man, woman and child on earth in 2020.
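The per-person figure can be sanity-checked with a little arithmetic. The sketch below is only a back-of-the-envelope illustration, not part of the IDC study: it assumes decimal units (1 ZB = 10^12 GB), which the release does not specify, and the function names are hypothetical.

```python
# Back-of-the-envelope check of the figures quoted in the release.
# Assumption: decimal units, i.e. 1 ZB = 10**12 GB (not stated by IDC).
GB_PER_ZB = 10**12


def implied_population(total_zb: float, gb_per_person: float) -> float:
    """People implied by a total data volume and a per-person share."""
    return total_zb * GB_PER_ZB / gb_per_person


def projected_size(start_zb: float, years: float, doubling_years: float = 2.0) -> float:
    """Size after `years` if the volume doubles every `doubling_years`."""
    return start_zb * 2 ** (years / doubling_years)


if __name__ == "__main__":
    people = implied_population(40, 5247)
    print(f"Implied 2020 population: {people / 1e9:.2f} billion")

    # Strict doubling every two years from 2.8 ZB over the eight years to 2020.
    print(f"Strict doubling from 2.8 ZB: {projected_size(2.8, 8):.1f} ZB")
```

Under these assumptions, 40 ZB shared out at 5,247 GB per person implies a world population of roughly 7.6 billion in 2020. Strict doubling from 2.8 ZB would give about 44.8 ZB, so "double every two years" is best read as an approximation of the 40 ZB projection.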
A major factor behind the expansion of the digital universe is the growth of machine-generated data, which is projected to increase from 11% of the digital universe in 2005 to over 40% in 2020.
The promise of Big Data lies in the extraction of value from large, untapped pools of data. However, the majority of new data is untagged, file-based and unstructured, which means little is known about it. In 2012, 23% (643 exabytes) of the digital universe would be useful for Big Data if tagged and analysed. Currently, however, only 3% of the potentially useful data is tagged, and even less is analysed. The amount of useful data is expanding with the growth of the digital universe. By 2020, 33% of the digital universe (13,000+ exabytes) will have Big Data value if it is tagged and analysed.
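The exabyte figures above follow from applying the quoted percentages to the total size of the digital universe. The following is a minimal sketch of that conversion, assuming decimal units (1 ZB = 1,000 EB) and using the rounded totals of 2.8 ZB and 40 ZB from the release; the helper name is hypothetical.

```python
# Convert a percentage share of the digital universe into exabytes.
# Assumption: decimal units, i.e. 1 ZB = 1,000 EB (not stated by IDC).
EB_PER_ZB = 1000


def share_in_exabytes(total_zb: float, share: float) -> float:
    """Exabytes represented by `share` (a fraction, 0..1) of `total_zb`."""
    return total_zb * share * EB_PER_ZB


if __name__ == "__main__":
    useful_2012 = share_in_exabytes(2.8, 0.23)   # potentially useful data, 2012
    useful_2020 = share_in_exabytes(40, 0.33)    # potentially useful data, 2020
    tagged_2012 = 643 * 0.03                     # 3% of the quoted 643 EB

    print(f"Useful in 2012: about {useful_2012:.0f} EB")
    print(f"Useful in 2020: about {useful_2020:.0f} EB")
    print(f"Tagged in 2012: about {tagged_2012:.0f} EB")
```

With the rounded 2.8 ZB total, the 2012 figure comes out near 644 EB rather than the quoted 643 EB, which suggests IDC worked from an unrounded total. The 2020 figure of about 13,200 EB is consistent with the "13,000+" in the release, and 3% of 643 EB is roughly 19 EB of tagged data.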
Although the digital universe was a developed-world phenomenon in the early days, that is about to change as the population of the emerging markets begins to cast a longer shadow. While emerging markets accounted for 23% of the digital universe as recently as 2010, their share is already up to 36% in 2012. By 2020, IDC predicts that 62% of the digital universe will be attributable to emerging markets.
The current global breakdown of the digital universe is: U.S. – 32%, Western Europe – 19%, China – 13%, India – 4%, rest of the world – 32%. By 2020, China alone is expected to generate 22% of the world’s data.
As cloud computing plays an even more important role in the management of Big Data, the number of servers worldwide is expected to grow tenfold and the amount of information managed directly by enterprise data centres will grow by a factor of 14.