Promotion of data sharing within the SAEON domain

October 31st, 2011, Published in Articles: PositionIT

by Johan Pauw, SAEON

Long-term data is irreplaceable, and if monitoring continues uninterrupted and the data remains accessible, its value is compounded. Long-term data is becoming increasingly valuable to society, which is facing mounting environmental challenges. This holds true everywhere on planet Earth. For this discussion I will exclude information products derived from basic data sets.

The global trend in environmental data is rapidly moving towards free-flow online data systems. Examples are the Global Biodiversity Information Facility (GBIF) and the Global Earth Observation System of Systems (GEOSS). It is pertinent to note that influential intergovernmental global science-policy instruments such as the Intergovernmental Panel on Climate Change (IPCC) have data sharing as a cornerstone principle. The same applies to the Millennium Assessment (MA), GEOSS and the future Intergovernmental Platform for Biodiversity and Ecosystem Services (IPBES).

Data sharing

There are many other relevant global initiatives, and local environmental science is being pushed towards data sharing by local variants such as the South African Biodiversity Information Facility (SABIF) and the South African Earth Observation System (SAEOS).

SAEON recognises its reliance on, and responsibility towards, data sharing, and for several years now has taken the lead in developing the culture, technology and infrastructure required to advance data sharing across the wider network of SAEON participants. The technology and infrastructure have been shared with various organisations and initiatives, among others the CSIR, the South African Risk and Vulnerability Atlas (SARVA), the World Data System (WDS), GEOSS and SAEOS.

More work will be done in partnership with the Water Research Commission (WRC) and the Agricultural Research Council (ARC). An exciting development is SAEON’s key role in the joint research programme with the Meraka Institute, ARC and the South African Data Centre for Oceanography (SADCO) on enabling sensor webs, with a view to connecting various environmental sensors to an array of databases, including real-time data delivery and visualisation of data.

The worldwide move towards data sharing has, in part, resulted from the development of the internet and information technology, which removed barriers to information sharing. Besides opening up new possibilities, this contributed to a new culture of open sharing. Much has been published in support of data sharing, and many documents have been produced by organisations such as the International Council for Science (ICSU), the Organisation for Economic Co-operation and Development (OECD), the International Polar Year (IPY) and our own Department of Science and Technology (DST).

Public access to data from publicly funded research is becoming a condition for funding by many international funding agencies. In the global change arena, inasmuch as environmental issues are not private concerns, it is imperative that data should not be kept within the walls of offices, institutes and countries – the very progress of global change research depends on data being exchanged openly but securely.

Why then is data sharing seemingly still an issue in South Africa?

Several reasons why this may be so are outlined below:

  • Sense of ownership – field researchers are often inclined to consider even their publicly funded data as private property because of the physical and mental effort required to collect data in the field. They feel that if the data is served in the public domain it might be harvested and used by consultants for commercial activities without any financial benefit to the data owner. Generally speaking, data from automated sensors is more accessible than data manually collected by researchers. This is clearly a matter of economics, i.e. the more personal effort that has gone into data collection, the greater the sense of personal entitlement to the datasets, irrespective of the public sponsorship for the work. The matter of other users “selling” data originated by someone else as part of commercial work is covered by copyright protection once the data has been published in the public domain and the data owner is clearly identified.

  • Myopia – some individuals believe that their datasets are of such high value that they will one day be able either to sell them for personal gain or to publish an article in the most prestigious science journal. Others fear that they would not be acknowledged for providing the data, or that the data will be misinterpreted or misrepresented by other authors. Such researchers fail to see that they may be able to generate more and better publications by collaborating and sharing data with others, and that misrepresentation can be answered with counter-publications.

  • Lack of confidence – some individuals do not trust the quality of their data and thus would not want it out in the public domain, since that may expose their work to criticism.

  • Time and funding constraints – it takes time and money to prepare data sets and to ensure quality control and updating if applicable. Data management and serving have become a growing cost for organisations whose budgets cannot meet the challenge.

  • Lack of skills – some individuals may not have been trained in data management and their lack of necessary skills bars them from sharing their data.

  • Lack of interest – some individuals lose interest in their data after publishing one or more papers based on it, because they do not see its further value.

  • Ignorance – some individuals might simply be unaware of public data systems and how they might deposit their data securely.

  • Funding models – the funding models of many research organisations, including science councils, make it an organisational imperative that funds must be generated from selling research outputs, including data. This is often caused either by insufficient government funding or by unsustainable organisational growth.

  • Lack of enforcement – some organisations may actually have a policy of data sharing but internal organisational controls are lacking, resulting in non-compliance by some researchers.

South African Earth Observation System (SAEOS)

The promise of a single IT platform (SAEOS) with the capability to connect, integrate and serve spatial data sets, inclusive of satellite imagery, is substantial, but is totally dependent on quality data being openly available to SAEOS. Currently, SAEON is assisting the DST to drive the establishment of a space secretariat, a primary function of which will be to support the South African Group on Earth Observation (SA-GEO), a forum representing the Earth observation user community’s interests in SAEOS and the South African National Space Agency (SANSA).

The promotion of data sharing within the SAEON domain is an ongoing effort. A number of interventions are required to help establish a culture of data sharing and an open-access data system for SAEOS and global change research:

  • The existing publicly shared but secure IT platform (SAEOS), which is interoperable among different data systems and has integrative capabilities, has been advanced considerably by SAEON and partners. Its continued development should be strongly supported and not duplicated by parallel developments, to avoid “reinventing the wheel”.

  • There is an urgent need to provide for the establishment of a core group of systems engineers and data managers within the SAEON/SAEOS group, not only to continuously design, improve and populate the system, but also to provide training to fieldworkers and users of the system. Funding for this intervention is the responsibility of government because it is a not-for-profit endeavour benefitting all levels of society, in true democratic spirit.

  • Data management, as an essential supporting discipline in global change research, should receive appropriate attention as part of university curricula.

  • A local system for the peer reviewing of valuable and quality datasets that will incentivise the publication thereof should be developed – similar to, and possibly in parallel with, existing systems for journal publications.

  • Leadership, direction and appropriate data management policies by the South African government should be cascaded down to the various environmental science-related organisations, possibly via the SA-GEO.

Contact Johan Pauw, SAEON, Tel 012 349-7722