Size and quality between software development approaches

Vision and Mission

While many companies build their software with different technologies and programming languages (heterogeneous systems), today’s size and complexity metrics are not robust with regard to these differences. This puts the correctness of productivity measures, cost prediction, and quality assessment at risk.
The goal of this research is to help companies reduce the error that stems from using classical metrics on heterogeneous systems, and to empower them to judge the correctness of productivity and quality measures.

Therefore, this project aims to understand when and why metrics are not robust to a change of language. Furthermore, we aim to enable companies to calibrate their use of metrics and to create metrics that are robust with regard to the use of heterogeneous languages, but also with regard to other factors such as programming styles.

In addition, we aim to understand the role of code generation, which translates between languages of different abstraction levels and can thus lead to an abstraction gain for developers.

The research questions that we aim to address are:

  1. How to create size and complexity metrics that are robust with regard to the use of heterogeneous languages?
  2. How to estimate the abstraction gain that code generation offers with regard to size and complexity?
  3. How to create reliable measures for the productivity and quality of code that is developed using different languages and technologies?


Productivity measures, cost prediction, and quality assessment play an important role in decision making in software-producing companies.
The measures used are provided by consulting companies (such as McKinsey’s Numetrics), are standard models (such as COCOMO), or are simple quality assessment metrics (such as defect density). All of them have in common that they rely on some form of size or complexity measure. For example, Numetrics takes complexity measures and the number of lines of code as input. The latter is also used in COCOMO and in the calculation of defect density.
At the same time, software is built using diverse languages and technologies, such as code generators. However, benchmarking and cost models such as Numetrics and COCOMO do not compensate for these differences.
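To illustrate how a size measure feeds into such a model, the following minimal sketch uses the Basic COCOMO effort equation in organic mode (effort = 2.4 × KLOC^1.05, in person-months). The two line counts are hypothetical and only stand in for two ways of counting the same system; they are not measurements from this project, and the function name is our own.

```python
# Basic COCOMO, organic mode (Boehm, 1981): effort = a * KLOC ** b
ORGANIC_A = 2.4
ORGANIC_B = 1.05


def basic_cocomo_effort(lines_of_code: float) -> float:
    """Return the estimated effort in person-months for a given line count."""
    if lines_of_code <= 0:
        raise ValueError("lines_of_code must be positive")
    kloc = lines_of_code / 1000.0
    return ORGANIC_A * kloc ** ORGANIC_B


# Hypothetical line counts for the same system, e.g. from two counting tools.
count_tool_x = 10_000
count_tool_y = 30_000

effort_x = basic_cocomo_effort(count_tool_x)
effort_y = basic_cocomo_effort(count_tool_y)

# Because the exponent is larger than 1, a size discrepancy is slightly
# amplified rather than dampened in the effort estimate.
effort_ratio = effort_y / effort_x
print(f"effort (tool X): {effort_x:.1f} person-months")
print(f"effort (tool Y): {effort_y:.1f} person-months")
print(f"ratio: {effort_ratio:.2f}")
```

Under these assumptions, a factor of 3 in the size measure carries through to a factor of slightly more than 3 in the estimated effort, since the model has no term that corrects for how the size was obtained.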

In fact, we were able to show for two systems written in different programming languages that even the simple quality metric “defect density” can lead to an erroneous comparison, to the degree that the outcome is reversed. In the studied case, the difference between the measured defect densities of the two systems varied by a factor of 3 to 4, depending on the size metric used. By choosing the wrong size metric (or even just the wrong metric tool) as the basis for the calculation, we could make the system with slightly lower quality look like the system with two times better quality.
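The mechanism behind such a reversal can be sketched in a few lines. Defect density is computed here as defects per thousand size units; all numbers, system names, and metric names below are hypothetical and chosen only to show the effect. They are not the data from the studied case.

```python
def defect_density(defects: int, size: float) -> float:
    """Return defects per thousand size units (e.g. per KLOC)."""
    if size <= 0:
        raise ValueError("size must be positive")
    return defects / (size / 1000.0)


# Hypothetical measurements of two systems written in different languages.
# Each system is sized with two different size metrics.
SYSTEMS = {
    "A": {"defects": 30, "size": {"physical_lines": 10_000, "logical_statements": 4_000}},
    "B": {"defects": 40, "size": {"physical_lines": 20_000, "logical_statements": 3_500}},
}


def lower_density_system(systems: dict, metric: str) -> str:
    """Return the name of the system with the lower defect density for a metric."""
    return min(
        systems,
        key=lambda name: defect_density(
            systems[name]["defects"], systems[name]["size"][metric]
        ),
    )


for metric in ("physical_lines", "logical_statements"):
    for name, data in SYSTEMS.items():
        density = defect_density(data["defects"], data["size"][metric])
        print(f"{metric:20s} system {name}: {density:.2f} defects per 1000 units")
    print(f"{metric:20s} apparently better: {lower_density_system(SYSTEMS, metric)}")
```

With these assumed numbers, system B appears better when size is counted in physical lines, while system A appears better when size is counted in logical statements, although the defect counts are unchanged.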

Similar impacts of these errors can also be expected for productivity measures and cost prediction when multiple programming languages are used. As a consequence, as soon as a system consists of multiple languages, an error is introduced into productivity and quality assessment, as well as into cost prediction, that we cannot yet predict.

Company partners

  • Ericsson
  • Volvo Car Group
  • Axis