CODATA/RDA Schools of Research Data Science

Coming Soon!

A bootcamp in the fundamentals of data science, aimed primarily at early career researchers in low- and middle-income countries.

What is the School of Research Data Science?

A programme of short courses equipping young researchers globally with core data skills.

Our schools provide young researchers with the core skills that will
allow them to be more effective and efficient in the work they do, to
analyse the data at their disposal and to take advantage of the emerging
culture of Open Science.

The primary goal of the Research Data Science School is to teach
foundational Data Science skills to Early Career Researchers in low- and
middle-income countries (LMICs), with a related goal of regionalizing
training capacity by growing a community of alumni, instructors,
classroom helpers, and hosting institutions. This train-the-trainer
approach and the growing community of regional trainers reduce the cost
of events for local hosting institutions and increase the number of
events that can be run.


The ever-increasing volume and variety of data being generated
affects both academia and the private sector. Contemporary research and
evidence-based decision making cannot be done effectively without a
range of data-related skills, including, but not limited to, the
principles and practice of Open Science, research data management and
curation, the implementation of data platforms and infrastructures,
data analysis, statistics, visualisation and modelling techniques,
software development, and annotation. We define ‘Research Data Science’
as the ensemble of these skills.

While the international, collective ability to create, share, and
analyse vast quantities of data is profound, there remains a worldwide
shortage of individuals skilled in Research Data Science, which limits
this transformative effect. With appropriate data training, however,
the ‘Data Revolution’ offers great opportunities for students and
professionals with modern data skills, whether they are conducting
research or entering a job market where these skills are in demand.

The CODATA-RDA School for Research Data Science brings these
concepts and tools to communities that may not have been introduced to
the wide range of open resources currently available.

Open Science

Open and Responsible Research

  • Understand responsible conduct of research as it pertains to data science,
  • Have a broad understanding of the Open Science movement,
  • Have reflected on the impact of Open Science on their own research and future career.

Research Data Management

  • Understand the data curation lifecycle,
  • Appreciate the practical advantages of good RDM and open research/science,
  • Understand how to add value and longevity to their data,
  • Understand the principles and importance of standardisation,
  • Understand how to publish data.

Software Carpentry

  • Have an introductory understanding of the Unix shell,
  • Be able to execute simple commands in R,
  • Be able to use Git.

Data Visualisation

  • Understand how to use R to perform visualisation,
  • Be able to perform a critical assessment of effective visualisation techniques.

Machine learning and Neural Networks

  • Understand the basic principles of machine learning,
  • Apply pipelines to build recommender systems,
  • Understand how to use Artificial Neural Networks, with hands-on experience,
  • Understand the principles of Boosted Decision Trees and Support Vector Machines.

Information security

  • Understand that their online activity, and any systems they create or use for Data Science, will be subject to online attack,
  • Understand security design principles that they can apply to their work,
  • Understand the basics of cryptography and encryption, and their importance.

A Curriculum for Foundational Research Data Science Skills for Early Career Researchers

The curriculum for the Data Schools is an RDA-endorsed Recommendation.
It aims to give Early Career Researchers (ECRs) the foundational skills
in Data Science they need to work with their data, and encourages them
to consider the importance of open and responsible research. The
curriculum covers:

  • Software Carpentry
  • Research Data Management
  • Open and Responsible Research
  • Data Visualisation
  • Machine learning and Neural Networks
  • Information security

The next course is not yet scheduled