The workshop aims to pass software writing skills on to young scientists, the next generation of researchers in the Earth, planetary and space sciences. Writing code in science according to minimal but vital software engineering rules, best practices and processes will be taught as a fundamental skill. The workshop therefore addresses young scientists with some experience in writing software for their research work. The lessons will cover:
Helmholtz Centre Potsdam - GFZ German Research Centre for Geosciences, Telegrafenberg, 14473 Potsdam, Germany
No workshop fees; basic catering is included. Travel and accommodation expenses are borne by the workshop participants.
Intermediate: participants with some programming experience who still use spreadsheets or similar tools for data analysis rather than writing code on their own
First hands-on experience with programming, and a need to program and develop software in your own work. You should at least know the material from the previous Workshop for Novices.
25 workstations with computers (plus a few places for participants who bring their own laptops).
Hussein El-Sayed - Hussein is an Egyptian software engineer who works at GoEuro Travel GmbH and graduated from the Faculty of Computers and Information Technology. He focuses on spreading awareness of computer science tools and technologies among researchers, as these can increase productivity and yield accurate and efficient results if used correctly. He also runs a blog covering many useful computer science topics.
Peter Steinbach - Peter is a high-performance computing developer at the Max Planck Institute of Molecular Cell Biology and Genetics in Dresden. He has a PhD in Particle Physics and is now responsible for accelerating scientific applications in a myriad of languages as well as on a zoo of hardware platforms.
Marta Hoffman-Sommer - Marta is a member of the Open Science Platform team at the Interdisciplinary Centre for Mathematical and Computational Modelling at the University of Warsaw. She participates in projects concerned with open research data and with data management. She is coordinating the launching of a new Repository for Open Data. By training and work experience, Marta is a molecular biologist.
Martin Hammitzsch - Martin is involved in developing software systems at the Centre for GeoInformation Technology (CeGIT) of the GFZ German Research Centre for Geosciences. The need to open up scientific software in his work, so that the scientific contribution is acknowledged, led Martin to a project that promotes software publications in the sciences, especially the geosciences. Since then he has been active in the European Geosciences Union (EGU) as Young Scientists (YS) Representative in the division on Earth and Space Science Informatics (ESSI). In addition, Martin is active in the German-speaking Open Science working group of the Open Knowledge Foundation (OK) and in FOSSGIS, the German chapter of the Open Source Geospatial Foundation (OSGeo), which promotes free and open source software for geospatial purposes.
Facilitate Open Science Training for European Research (FOSTER, https://www.fosteropenscience.eu/) is a two-year, EU-funded (FP7) project, carried out by 13 partners across eight countries. The primary aim is to produce a Europe-wide training programme that will help researchers, postgraduate students, librarians and other stakeholders to incorporate Open Access approaches into their existing research methodologies. FOSTER is designed to equip young researchers and other stakeholders with the skills to function effectively as the range of Open Access policies is refined and aligned across the EU. FOSTER supports the workshop through its call for Open Science Training 2015, which aims to integrate Open Access and Open Science principles and practice into the current research workflow by targeting the training environment of young researchers.
Software Carpentry (SWC) is a volunteer organization whose goal is to make scientists more productive, and their work more reliable, by teaching them basic computing skills. Founded in 1998, it runs short, intensive workshops that cover program design, version control, testing, and task automation. In October 2014, the Software Carpentry Foundation (SCF) was announced to act as the governing body. The SCF is a non-profit membership organization devoted to improving basic computing skills among researchers in science, engineering, medicine, and other disciplines. SCF’s long-term goal is to ensure that every researcher learns the skills that Software Carpentry teaches early in their career. Activities in the UK are led by the Software Sustainability Institute (SSI), which will help to run the intended series of workshops. SWC supports this workshop with trained volunteer instructors and successfully tested teaching materials.
The GFZ German Research Centre for Geosciences is Germany’s premier institute for the geosciences, with strong links to leading institutes across Europe. Its research ranges across the full breadth of the Earth sciences, from the dynamics of Earth’s deep interior to remote observation of its active surface. GFZ combines all solid Earth sciences, encompassing the fields of geodesy, geophysics, geology, mineralogy, geochemistry, geochronology, geomorphology, physics, mathematics, biology and engineering, in a multidisciplinary scientific and technical environment. GFZ is part of the Helmholtz Association of German Research Centres, whose official mission is to solve the grand challenges of science, society and industry. The GFZ organizes, hosts and accompanies the workshop run by SWC instructors and FOSTER speakers.
Software has become an integral part of science, yet software is not properly integrated into the scientific discourse. Findings presented in papers are based on data and are occasionally published together with that data, but rarely with the software. However, the software used to obtain findings plays a crucial role in scientific work. Nevertheless, software is rarely seen as publishable. Thus researchers may be unable to reproduce the findings without the software, which conflicts with the principle of reproducibility in science. For both the writing of publishable software and the reproducibility issue, the quality of software is of utmost importance. For many programming scientists, the treatment of source code, e.g. with code design, version control, documentation, and testing, is associated with additional work that is not covered by the primary research task. This includes the adoption of processes following the software development life cycle. However, the adoption of software engineering rules and best practices has to be recognized and accepted as part of scientific performance.
Most scientists have little incentive to improve code and do not publish code, either with their papers or as self-contained publications, because software engineering habits are rarely practised by researchers or students. Software engineering skills are not passed on to successors the way paper-writing skills are. Thus it is often felt that the software or code produced is not publishable. The quality of software and its source code has a decisive influence on the quality of the research results obtained and on their traceability. So establishing software engineering best practices that are not only adopted but also adapted to serve scientific needs is crucial for the success of software publications.
Even though scientists use existing software and code, e.g. from open source software repositories, only a few contribute their code back into the repositories. Writing and opening code for Open Science means that subsequent users are able to run the code, e.g. through the provision of sufficient documentation, sample data sets, tests and comments, which in turn can be verified by adequate and qualified reviews. This assumes that scientists learn to write and release code and software just as they learn to write and publish papers.
With this in mind, software, which occupies an increasingly prominent place in research and has become an indispensable part of science, could be valued and assessed as a contribution to science. But this requires the relevant skills, which must be passed on to colleagues and successors.
The material for the different lessons is available online at the following locations:
● Databases and SQL
● Version Control with Git and GitHub
● Defensive Programming and Testing
● Best practices for scientific computing
● Open science, open data
● Software publication and licensing
Additionally, a snapshot of the material is available in PDF format.
The lessons of the workshop were recorded and are available on YouTube.