SA needs more training in handling of big data

April 29th, 2016, Published in Articles: EE Publishers, Articles: EngineerIT


For many people, big data is a buzzword, just as “in the cloud” was not so long ago. Dr. Quentin Williams, strategic research manager at the Council for Scientific and Industrial Research’s (CSIR’s) Meraka Institute, believes big data is an important research and development area. If South Africa indeed wants to transform itself into a knowledge economy, one of the main requirements will be developing people and equipping them with the skills to meet the demands that new technology continually imposes.

Dr. Quentin Williams (centre) discussing a big data project with Johnathan Gerrard. Looking on are Persverance Mbewe, Pelonomi Moiloa and Antonius Mamphiswana.

“Skills are needed to understand and work with big data,” he said. To meet these requirements, Meraka has launched the Data Intensive Research Initiative South Africa (DIRISA) as part of the National Integrated Cyber Infrastructure system, which includes the Centre for High Performance Computing in Cape Town and the South African National Research and Education Network.

“We have three initiatives,” said Dr. Williams. “We have a research and development group that looks at new and novel research such as distributed and streaming machine intelligence and convex optimisation, which can be described as deep artificial intelligence research. The second initiative is a training programme, and the third is DIRISA. The objectives of DIRISA are to advocate data-intensive research, promote sound data stewardship practices, develop relevant capacity and expertise, and coordinate data-intensive research activities and initiatives.

“Our interaction with the business sector and industry has clearly shown that people qualified in handling big data are very much in demand but in short supply. This prompted us, with the support of the Department of Science and Technology (DST), to introduce the training programme and the Data Science Solutions Factory, aimed at assisting industry in developing systems that will help make South Africa more competitive.

“Last year Meraka started a project called Data Science for Impact and Decision Enablement (DSIDE). The aim of the programme is to support capacity building in the rapidly growing field of data science by placing recruits in mentor-guided, learn-by-doing problem solving of real-world needs presented by different stakeholders.

“We started in 2014 with ten students, who all came from third- and fourth-year engineering training. Last year we increased to 40 students from twelve universities. We have included engineering students, mathematicians, students studying statistics, and students of computer science and engineering. We have realised that to properly address the question of data science, you need all these disciplines.”

It is a twelve-week programme, divided into four weeks during the June to July university holidays and eight weeks during the end-of-year university holidays. The students go through an intern programme and are trained by mentors working on a specific product type. It is a very hands-on, industry-focussed programme.

Experienced mentors from the CSIR data science community introduced machine learning topics, tools and theories, and guided trainees in this project-driven environment. Given that this is a learn-by-doing initiative, the stakeholders did not expect the delivery of market-ready output at the end of the programme.

Projects the students worked on include the Servix Extension, a text analytics platform for extracting actionable insights from customer service complaints and compliments sourced from the web. There are over 350 000 samples of customer service reports for various industries in South Africa. Another project, SerViz, is being developed as the front end of an infographic platform where consumers can extract insights powered by research in machine learning and visual analytics.

The success of a programme like DSIDE can only be measured by feedback from industry. “The feedback we received was overwhelmingly positive, indicating that industry was impressed by the projects the students delivered, and several said that they were implementing the systems developed,” said Dr. Williams. Seven students have been employed by the CSIR in a master’s student programme, two students were employed by industry and the other 31 have returned to university to complete their studies.

This year Meraka is increasing the programme to 50 students. This is, however, very dependent on the number of mentors available. The programme is funded by the DST, and expansion to include other centres in the country is being considered for the future.
