Software Writing Skills for Your Research - Workshop for Proficient

The workshop passes on software writing skills to both experienced researchers and young scientists, the next generation of researchers in the Earth, planetary and space sciences. It is aimed at researchers who already have hands-on experience in programming for their research work. Writing scientific code according to essential software engineering rules, best practices and processes is taught as a fundamental skill.

The lessons cover:

  • Recap on shell scripting and advanced techniques
  • Profiling and debugging your Python codes
  • Unit testing

and optional topics, of which the 4 most popular will be picked:

  • Making your code run faster (NumPy, PyPy and Numba)
  • Making your code run faster (Multi-core and HPC)
  • Managing documentation, Git and GitHub
  • Making your code installable with make and Makefiles
  • Creating and managing packages and modules
  • Managing data on disk, file formats and compression
  • Advanced data visualisation
  • The power of object oriented programming

Date

tba

Location

Helmholtz Centre Potsdam - GFZ German Research Centre for Geosciences, Telegrafenberg, 14473 Potsdam, Germany

Costs

No workshop fees. Basic catering is included. Travel and accommodation expenses are borne by the participants.

Target Group and Prerequisites

Proficient programmers with hands-on experience who need to move on to the next stage in programming and developing software for their own work.

Seats

10-20 workplaces for participants who bring their own laptop.

Language

English

Instructors and Speakers

Martin Callaghan - Martin is a Research Computing Consultant and software developer in the Advanced Research Computing Service at the University of Leeds. He helps researchers use HPC and Cloud resources to solve ever bigger and more complex problems. In addition, Martin co-ordinates training within the ARC service, designing and delivering courses across a range of computational topics to help research staff get the most from their computational tools.

Edwin van der Helm - Edwin is doing a PhD in computational astrophysics at the Leiden Observatory and, alongside that, works as a software developer at Nomizo, a California-based company specializing in training neural networks on a GPU cluster.

Lidia Stępińska-Ustasiak - Lidia works for the Open Science Platform, an initiative of the Interdisciplinary Centre for Mathematical and Computational Modelling at the University of Warsaw. She is responsible for advocating openness in science, education and communication, is a member of the presidium of the Coalition for Open Education, and contributed to the report "Open Science in Poland 2014. A Diagnosis".

Organizers

Martin Hammitzsch - Martin is involved in developing software systems at the Centre for GeoInformation Technology (CeGIT) of the GFZ German Research Centre for Geosciences. The need to open up scientific software in his work, so that the scientific contribution is acknowledged, led Martin to a project that promotes software publications in the sciences, especially the geosciences. Since then he has been active in the European Geosciences Union (EGU) as Young Scientists (YS) Representative in the division on Earth and Space Science Informatics (ESSI). In addition, Martin is active in the German-speaking Open Science working group of the Open Knowledge Foundation (OK) and in FOSSGIS, the German chapter of the Open Source Geospatial Foundation (OSGeo), which promotes free and open source software for geospatial purposes.

Support and Funding

Facilitate Open Science Training for European Research (FOSTER, https://www.fosteropenscience.eu/) is a two-year, EU-funded (FP7) project carried out by 13 partners across eight countries. The primary aim is to produce a Europe-wide training programme that will help researchers, postgraduate students, librarians and other stakeholders to incorporate Open Access approaches into their existing research methodologies. FOSTER is designed to equip young researchers and other stakeholders with the skills to function effectively as the range of Open Access policies is refined and aligned across the EU. FOSTER supports the workshop through its call for Open Science Training 2015, which aims to integrate Open Access and Open Science principles and practice into the current research workflow by targeting the young researcher training environment.

Software Carpentry (SWC) is a volunteer organization whose goal is to make scientists more productive, and their work more reliable, by teaching them basic computing skills. Founded in 1998, it runs short, intensive workshops that cover program design, version control, testing, and task automation. In October 2014, the Software Carpentry Foundation (SCF) was announced to act as the governing body. The SCF is a non-profit membership organization devoted to improving basic computing skills among researchers in science, engineering, medicine, and other disciplines. The SCF's long-term goal is to ensure that every researcher learns the skills that Software Carpentry teaches early in their career. Activities in the UK are led by the Software Sustainability Institute (SSI), which will provide the help needed to run the intended series of workshops. SWC supports this workshop with trained volunteer instructors and successfully tested teaching materials.

The GFZ German Research Centre for Geosciences is Germany’s premier institute for the geosciences, with strong links to leading institutes across Europe. Its research ranges across the full breadth of the Earth sciences, from the dynamics of Earth’s deep interior to remote observation of its active surface. GFZ combines all solid Earth sciences, encompassing the fields of geodesy, geophysics, geology, mineralogy, geochemistry, geochronology, geomorphology, physics, mathematics, biology and engineering, in a multidisciplinary scientific and technical environment. GFZ is part of the Helmholtz Association of German Research Centres, the official mission of which is to solve the grand challenges of science, society and industry. GFZ organizes, hosts and accompanies the workshop, which is run by SWC instructors and FOSTER speakers.

Background

Software has become an integral part of science, yet it is not properly integrated into the scientific discourse. Findings presented in papers are based on data, and once in a while they are accompanied by that data, but not commonly by software. However, the software used to obtain findings plays a crucial role in scientific work. Nevertheless, software is rarely seen as publishable. Without the software, researchers may be unable to reproduce the findings, which conflicts with the principle of reproducibility in science. For both the writing of publishable software and the reproducibility issue, the quality of software is of utmost importance. For many programming scientists, the treatment of source code, e.g. code design, version control, documentation and testing, is associated with additional work that is not covered by the primary research task. This includes the adoption of processes following the software development life cycle. However, the adoption of software engineering rules and best practices has to be recognized and accepted as part of scientific performance.

Most scientists have little incentive to improve code and do not publish code, either with their papers or as stand-alone publications, because software engineering habits are rarely practised by researchers or students. Software engineering skills are not passed on to successors the way paper-writing skills are. Thus it is often felt that the software or code produced is not publishable. The quality of software and its source code has a decisive influence on the quality of the research results obtained and on their traceability. Establishing software engineering best practices that are not only adopted but also adapted to serve scientific needs is therefore crucial for the success of software publications.

Even though scientists use existing software and code, e.g. from open source software repositories, only a few contribute their code back into the repositories. Writing and opening code for Open Science means that subsequent users are able to run the code, e.g. through the provision of sufficient documentation, sample data sets, tests and comments, which in turn can be verified by adequate and qualified reviews. This assumes that scientists learn to write and release code and software as they learn to write and publish papers.

With this in mind, software, which occupies an increasingly prominent place in research and has become an indispensable part of science, could be valued and assessed as a contribution to science. But this requires the relevant skills, which can be passed on to colleagues and successors.

Material

The material for the different lessons is available online at the following locations:
  • Recap on shell scripting and advanced techniques
  • Profiling and debugging your Python codes
  • Unit testing
  • Making your code run faster (two parts)
  • Creating and managing packages and modules
  • Advanced data visualisation
  • The power of object oriented programming
  • Automation and Make

Recordings

The lessons of the workshop were recorded and are available on YouTube.

Contact

Martin Hammitzsch
Section Head
Centre for GeoInformation Technology
Telegrafenberg
Building A 70, Room 320
14473 Potsdam
+49 331 288-1717