IBM has prototyped a new software architecture for automating data management, potentially making it easier for researchers to collect usable information from mega-scale data collection projects such as the Square Kilometre Array (SKA), a global telescope project that aims to address unanswered questions about our universe.
Scheduled to begin construction in 2016 in either Australia/New Zealand or Southern Africa, the SKA will be the world’s largest and most sensitive radio telescope. It has been estimated that the SKA will generate more than one exabyte of raw data in a single day, which exceeds the entire daily traffic of the internet. One of the central design challenges of the SKA project is how to process this huge volume of astronomical data and enable insights to be drawn from it.
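To put that estimate in perspective, the short calculation below converts one exabyte per day into a sustained data rate. It is a back-of-the-envelope sketch only: it assumes the SI definition of an exabyte (10^18 bytes) and a perfectly even flow over 24 hours, neither of which is stated in the SKA estimate itself.

```python
# Back-of-the-envelope conversion of "one exabyte per day" to a sustained rate.
# Assumptions (not from the SKA estimate): SI exabyte, evenly spread over a day.
EXABYTE = 10**18                 # bytes
SECONDS_PER_DAY = 24 * 60 * 60   # 86,400 seconds


def sustained_rate_bytes_per_second(bytes_per_day: int) -> float:
    """Average number of bytes per second needed to move bytes_per_day in a day."""
    return bytes_per_day / SECONDS_PER_DAY


rate = sustained_rate_bytes_per_second(EXABYTE)
print(f"{rate / 10**12:.2f} terabytes per second")
print(f"{rate * 8 / 10**12:.2f} terabits per second")
```

Under these assumptions the average works out to roughly 11.6 terabytes every second, around the clock.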
The current, largely manual approaches used by the astronomy community are unlikely to scale to the volume of data the SKA will generate, which could leave much of the collected data unused.
Working with Dr Melanie Johnston-Hollitt, a radio astronomer from Victoria University of Wellington, IBM constructed the Information Intensive Framework (IIF) prototype to automate key elements of the work currently undertaken manually by scientists. The software uses the International Virtual Observatory Alliance (IVOA) ontology to classify collected data into concepts understood by astronomers, and then provides intelligent 'guided search' functionality. Guided search constrains queries to the information actually available within the system, resulting in faster access and fewer errors.
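The idea behind guided search can be illustrated with a minimal sketch. This is not IBM's implementation and does not use the real IVOA ontology: the concept names, the `GuidedCatalogue` class and its methods are all hypothetical, chosen only to show how classifying records against an ontology lets a system offer only those search terms that will actually return data.

```python
# Illustrative sketch only: a toy ontology and catalogue, not the IIF prototype.
# Hypothetical mini-ontology: concept -> parent concept (None marks the root).
ONTOLOGY = {
    "radio source": None,
    "galaxy": "radio source",
    "radio galaxy": "galaxy",
    "pulsar": "radio source",
    "supernova remnant": "radio source",
}


class GuidedCatalogue:
    """Stores records by concept and only permits searches that can succeed."""

    def __init__(self, ontology):
        self.ontology = ontology
        self.records = {}  # concept -> list of record ids

    def add(self, record_id, concept):
        # Classification step: every record must map to a known concept.
        if concept not in self.ontology:
            raise ValueError(f"unknown concept: {concept!r}")
        self.records.setdefault(concept, []).append(record_id)

    def _descendants(self, concept):
        """Return the concept together with every concept beneath it."""
        found = {concept}
        changed = True
        while changed:
            changed = False
            for child, parent in self.ontology.items():
                if parent in found and child not in found:
                    found.add(child)
                    changed = True
        return found

    def _matches(self, concept):
        ids = []
        for name in self._descendants(concept):
            ids.extend(self.records.get(name, []))
        return sorted(ids)

    def available_terms(self):
        """Guided-search menu: only concepts that have data behind them."""
        return sorted(c for c in self.ontology if self._matches(c))

    def search(self, concept):
        # Reject terms with no data instead of returning an empty result.
        if concept not in self.available_terms():
            raise ValueError(f"no data available for {concept!r}")
        return self._matches(concept)


catalogue = GuidedCatalogue(ONTOLOGY)
catalogue.add("src-001", "radio galaxy")
catalogue.add("src-002", "pulsar")

print(catalogue.available_terms())
print(catalogue.search("galaxy"))
```

In this sketch a search for "galaxy" also finds the record filed under the narrower concept "radio galaxy", while "supernova remnant" is never offered as a search term because no records have been classified under it.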
IBM’s prototype met its design goals in September, demonstrating productivity improvements in data collection and easier access to that information for researchers with varying skill levels. Analysis of the results of the prototype project also suggested several ways to extend the prototype to meet the extreme performance levels required by SKA.
Jonathan Kings, New Zealand SKA project leader at the Ministry of Economic Development, said: “The prototype is a good example of the innovative thinking needed to meet the challenges of building the world's biggest telescope, and shows how IBM and others in the New Zealand SKA Industry Consortium can contribute significant value to the SKA project.”
“The Information Intensive Framework prototype tested several new concepts and is IBM’s first attempt to tackle the data-intensive challenge faced by astronomy,” said Dougal Watt, Chief Technology Officer, IBM New Zealand, and Chair of the NZ SKA Industry Consortium. “While developed with SKA in mind, the results are also applicable to other organisations faced with a ‘data deluge’. We have identified several local scenarios which would benefit from automated analysis of performance data to uncover trends, identify anomalies and improve decisions. These range from individual manufacturing plants and telecommunications companies to whole transport networks and healthcare systems.”
Dr Johnston-Hollitt said: "Undertaking research on exa-scale datasets will force radio astronomers into a new, as yet unexplored regime of automated processing, imaging and analysis. Surveys on even SKA precursor telescopes such as ASKAP and MWA are expected to produce catalogues of tens of millions of radio sources. How we organise and classify these data, which we will have in the next three years, is a significant challenge. We will need new solutions to fully realise the vast scientific potential of these datasets and it's fantastic that organisations like IBM are prepared to take up that challenge."
This project represents continued investment by IBM in developing the technologies needed for SKA. In April this year, IBM announced a Shared University Research Award to Victoria University of Wellington to support SKA-related research, following a similar grant to Auckland University of Technology in 2009. This complements other IBM exploratory research activities conducted in Australia over the past two years: with CSIRO on digital processing for the Australian Square Kilometre Array Pathfinder (ASKAP), and with the International Centre for Radio Astronomy Research (ICRAR) on Data Intensive Research for the SKA.
The decision on where to locate SKA is expected to be announced in early 2012.