The workshop addresses the passing of software writing skills to young scientists, the next generation of researchers in the Earth, planetary and space sciences. Writing scientific code according to minimal but vital software engineering rules, best practices and processes will be taught as a fundamental skill. The workshop is therefore aimed at young scientists with little or no experience in writing software for their research. The lessons will cover:
- The Unix Shell - Get used to Bash and learn how to automate repetitive tasks
- Distributed revision / version control - Get used to Git and GitHub and learn how to track and share work efficiently
- Programming with Python - Get used to Python with first hands-on experiences
- Looking beyond the basics - Using databases, automation and Make
- Open Access, Open Data, and Open Science - Why bother in research
- Publishing scientific software - Release code as Open Source in the context of Open Science
September 23 - 25, 2015
Helmholtz Centre Potsdam - GFZ German Research Centre for Geosciences, Telegrafenberg, 14473 Potsdam, Germany
No workshop fees; basic catering is included. Participants cover their own travel and accommodation expenses.
Novices who use spreadsheets or similar tools for data analysis rather than writing their own code, and novices with only a little programming experience of their own
None, other than a need to program and develop software in your own work
- 24 workplaces with computers
- 12 additional workplaces for participants with their own laptops
Instructors and Speakers
Olav Vahtras - Olav received an MS degree in engineering physics and a PhD in quantum chemistry, both at Uppsala University. His work as a post-doc at the Minnesota Supercomputer Institute and as a research associate at Linköping University included research on molecular properties and electronic structure theory. He then worked at PDC, the Center for High Performance Computing at the KTH Royal Institute of Technology, Stockholm, as an application specialist in the field of computational chemistry. He is now a professor of theoretical chemistry at KTH. His research involves the development of quantum chemical methods for the prediction of molecular properties, and he teaches Python in a national program for computational sciences.
Malvika Sharan - Malvika is a PhD student at the University of Würzburg, Germany, where she carries out projects on the characterization of RNA-binding proteins by means of bio-computational techniques and high-throughput sequence analysis. As a member of the Doctoral Researchers' council of her graduate school, she promotes the use of bioinformatics tools and programming skills among experimental scientists in order to make the exchange of data more efficient. She has recently completed the Software Carpentry (SWC) instructor course.
Krzysztof Siewicz - Krzysztof is a Polish legal counsel specialising in legal issues of information processing with a particular focus on copyright and other rights in intangibles in the context of Open Science. Krzysztof is a member of the Open Science Platform at the Interdisciplinary Centre for Mathematical and Computational Modelling at the University of Warsaw.
Martin Hammitzsch - Martin is involved in developing software systems at the Centre for GeoInformation Technology (CeGIT) of the GFZ German Research Centre for Geosciences. The need to open up the scientific software in his own work, so that the scientific contribution is acknowledged, led Martin to a project that is pushing software publications in the sciences, especially the geosciences. Since then he has been active in the European Geosciences Union (EGU) as Young Scientists (YS) Representative in the division on Earth and Space Science Informatics (ESSI). In addition, Martin is active in the German-speaking Open Science working group of the Open Knowledge Foundation (OK) and in FOSSGIS, the German chapter of the Open Source Geospatial Foundation (OSGeo), which promotes free and open source software for geospatial purposes.
Support and Funding
Facilitate Open Science Training for European Research (FOSTER, https://www.fosteropenscience.eu/) is a two-year, EU-funded (FP7) project carried out by 13 partners across eight countries. The primary aim is to produce a Europe-wide training programme that will help researchers, postgraduate students, librarians and other stakeholders to incorporate Open Access approaches into their existing research methodologies. FOSTER is designed to equip young researchers and other stakeholders with the skills to function effectively as the range of Open Access policies is refined and aligned across the EU. FOSTER supports the workshop through its call for Open Science Training 2015, which aims to integrate Open Access and Open Science principles and practice into the current research workflow by targeting the young researcher training environment.
Software Carpentry (SWC) is a volunteer organization whose goal is to make scientists more productive, and their work more reliable, by teaching them basic computing skills. Founded in 1998, it runs short, intensive workshops that cover program design, version control, testing, and task automation. In October 2014, the Software Carpentry Foundation (SCF) was announced to act as the governing body. The SCF is a non-profit membership organization devoted to improving basic computing skills among researchers in science, engineering, medicine, and other disciplines. The SCF's long-term goal is to ensure that every researcher learns the skills that Software Carpentry teaches early in their career. Activities in the UK are led by the Software Sustainability Institute (SSI), which will help to run the intended series of workshops. SWC supports this workshop with trained volunteer instructors and successfully tested teaching materials.
The GFZ German Research Centre for Geosciences is Germany's premier institute for the geosciences, with strong links to leading institutes across Europe. Its research ranges across the full breadth of the Earth sciences, from the dynamics of Earth's deep interior to remote observation of its active surface. GFZ combines all solid Earth sciences, encompassing the fields of geodesy, geophysics, geology, mineralogy, geochemistry, geochronology, geomorphology, physics, mathematics, biology and engineering, in a multidisciplinary scientific and technical environment. GFZ is part of the Helmholtz Association of German Research Centres, the official mission of which is to solve the grand challenges of science, society and industry. GFZ organizes, hosts and accompanies the workshop, which is run by SWC instructors and FOSTER speakers.
Software has become an integral part of science, yet it is not properly integrated into the scientific discourse. Findings presented in papers are based on data, and once in a while they are accompanied by that data, but rarely by software. Yet the software used to obtain the findings plays a crucial role in the scientific work. Nevertheless, software is rarely seen as publishable. Without the software, researchers may be unable to reproduce the findings, which conflicts with the principle of reproducibility in science. For both the writing of publishable software and the reproducibility issue, the quality of the software is of utmost importance. For many programming scientists, the careful treatment of source code, e.g. through code design, version control, documentation, and testing, is associated with additional work that is not covered by the primary research task. This includes the adoption of processes following the software development life cycle. However, the adoption of software engineering rules and best practices has to be recognized and accepted as part of the scientific performance.
Most scientists have little incentive to improve their code and do not publish it, either with their papers or as a self-contained publication, because software engineering habits are rarely practised by researchers or students. Software engineering skills are not passed on to successors in the way that paper-writing skills are. As a result, the software or code produced is often felt not to be publishable. The quality of software and its source code has a decisive influence on the quality of the research results obtained and on their traceability. Establishing best practices from software engineering, not only adopted but also adapted to serve scientific needs, is therefore crucial for the success of software publications.
Even though scientists use existing software and code, e.g. from open source software repositories, only a few contribute their code back to the repositories. Writing and opening code for Open Science means that subsequent users are able to run the code, e.g. through the provision of sufficient documentation, sample data sets, tests and comments, which in turn can be verified by adequate and qualified reviews. This assumes that scientists learn to write and release code and software just as they learn to write and publish papers.
With this in mind, software, which occupies an increasingly prominent place in research and has become an indispensable part of science, could be valued and assessed as a contribution to science. But this requires relevant skills that can be passed on to colleagues and successors.
The material for the different lessons is available online:
● The Unix Shell
● Version Control with Git and GitHub
● Programming with Python
● Automation and Make
● Using Databases
● From Open Access and H2020 to Open Data
● Software Publication and Licensing
Additionally, a snapshot of the material is available in PDF format.