Data science for software engineering

Tim Menzies, Ekrem Kocaguneli, Fayola Peters, Burak Turhan, Leandro L. Minku

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

3 Citations (Scopus)


Target audience: Software practitioners and researchers wanting to understand the state of the art in using data science for software engineering (SE).

Content: In the age of big data, data science (the knowledge of deriving meaningful outcomes from data) is an essential skill for software engineers. It can be used to make useful predictions about new projects based on data from completed projects. This tutorial offers core insights into the state of the art in this important field.

What participants will learn: Before data science: this tutorial discusses the tasks needed to deploy machine-learning algorithms to organizations (Part 1: Organization Issues). During data science: from discretization to clustering to dichotomization and statistical analysis. And the rest: When local data is scarce, we show how to adapt data from other organizations to local problems. When privacy concerns block access, we show how to privatize data while still being able to mine it. When working with data of dubious quality, we show how to prune spurious information. When data or models seem too complex, we show how to simplify data mining results. When data is too scarce to support intricate models, we show methods for generating predictions. When the world changes and old models need to be updated, we show how to handle those updates. When the effect is too complex for one model, we show how to reason across ensembles of models.

Pre-requisites: This tutorial makes minimal use of maths or advanced algorithms and should be understandable to developers and technical managers.

Original language: English
Title of host publication: 2013 35th International Conference on Software Engineering, ICSE 2013 - Proceedings
Number of pages: 3
Publication status: Published - 30 Oct 2013
Event: 2013 35th International Conference on Software Engineering, ICSE 2013 - San Francisco, CA, United States
Duration: 18 May 2013 to 26 May 2013


Conference: 2013 35th International Conference on Software Engineering, ICSE 2013
Country/Territory: United States
City: San Francisco, CA

ASJC Scopus subject areas

  • Software

