Our work on using clone detection methods to discover which parts of the source code of a software system developers frequently discuss has been accepted for publication at the 2012 European Conference on Software Maintenance and Reengineering (CSMR’12)!

Abstract – When discussing software, practitioners often reference parts of the project’s source code. Such references have different motivations, such as mentoring and guiding less experienced developers, pointing out code that needs changes, or proposing possible strategies for the implementation of future changes. The fact that particular parts of the source code are being discussed makes these parts of the software special. Knowing which code is being talked about the most can not only help practitioners to guide important software engineering and maintenance activities, but also act as high-level documentation of development activities for managers. In this paper, we use clone detection as a specific instance of a code search based approach for establishing links between code fragments that are discussed by developers and the actual source code of a project. Through a case study on the Eclipse project we explore the traceability links established through this approach, both quantitatively and qualitatively, and compare fuzzy code search based traceability linking to classical approaches, in particular change log analysis and information retrieval. We demonstrate a sample application of code search based traceability links by visualizing those parts of the project that are most discussed in issue reports with a Treemap visualization. The results of our case study show that the traceability links established through fuzzy code search based traceability linking are conceptually different from those of classical approaches based on change log analysis or information retrieval.

Mining Development Repositories to Study the Impact of Collaboration on Software Systems (presentation slides by Nicolas Bettenburg)

Mining Development Repositories To Study the Impact of Collaboration on Software Systems

July 23, 2011 Conferences

I am pleased to announce that I will be presenting my PhD research at the Doctoral Symposium track of the 19th ACM SIGSOFT International Symposium on Foundations of Software Engineering (FSE’2011), which takes place this year in Hungary.

Abstract – Software development is a largely collaborative effort, of which the actual encoding of program logic in source code is a relatively small part. Yet, little is known about the impact of collaboration between stakeholders on software quality. We hypothesize that the collaboration between stakeholders during software development has a non-negligible impact on the software system. Information about collaborative activities can be recovered from traces of their communication, which are recorded in the repositories used for the development of the software system. This thesis contributes the following: 1) to make this information accessible for practitioners and researchers, we present approaches to distill communication information from development repositories, and empirically validate our proposed extractors. 2) By linking back the extracted communication data to the parts of the software system under discussion, we are able to empirically study the impact of communication, as a proxy to collaboration between stakeholders, on a software system. Through case studies on a broad spectrum of open-source software projects, we demonstrate the important role of social interactions between stakeholders with respect to the evolution of a software system.

A Lightweight Approach To Uncover Technical Information in Unstructured Data

May 2, 2011 Publications

I am pleased to announce that our paper “A Lightweight Approach To Uncover Technical Information in Unstructured Data” has been accepted for publication at the 19th IEEE International Conference on Program Comprehension (ICPC’11), which takes place this year in lovely Kingston, Ontario, Canada.

Ontario Graduate Scholarship (International) 2011-2012

April 17, 2011 Awards

Wonderful news! My application for the 2011-2012 OGS was successful! I wasn’t really counting on winning this scholarship for a second year in a row, as competition is extremely strong – especially in the category of international graduate students.

Journal Paper Accepted for Publishing: An Empirical Study on Inconsistent Changes to Code Clones at Release Level

December 18, 2010 Publications

I am pleased to announce that we received notification today that the journal Science of Computer Programming (SCICO) has accepted our paper “An Empirical Study on Inconsistent Changes to Code Clones at Release Level” for publication. The electronic version is available under DOI:10.1016/j.scico.2010.11.010 or by following this link.

Managing Community Contributions: Lessons Learned from a Case Study on Android and Linux

October 18, 2010 Conferences

I’ll be giving a talk at the 2010 Fall Meeting of the Canadian Consortium for Software Engineering Research (CSER) on “Managing Community Contributions: Lessons Learned from a Case Study on Android and Linux”. Abstract – Modern companies realize that collaboration with a thriving community is important for creating innovative technology [...]

PostgreSQL 9.0 – Who contributed the most and what?

October 18, 2010 Special News

Parsing the PostgreSQL 9.0 release notes, we counted how many contributions each named developer made to get this release up and running. The top contributor by far is Tom Lane, whom we also found in previous studies to be one of the most active and central members of the project. Surprisingly, this time we also see that many contributions to the latest version were made by developers outside the core team – demonstrating that the project is once again successful in attracting members from the open source community.
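For readers who want to try a similar count themselves, here is a minimal Python sketch. It is not the script we used. It assumes that each release note item sits on one line and ends with the credited developers in parentheses, separated by commas. The sample text, the placeholder names and the function name `count_contributions` are made up for illustration.

```python
import re
from collections import Counter

# Made-up sample in the assumed format: one item per line, with the
# credited developers in parentheses at the end of the line.
SAMPLE_NOTES = """\
Improve join planning (Developer A)
Add streaming replication option (Developer B, Developer C)
Fix a parser corner case (Developer A, Developer B)
Update documentation
"""

# Matches a trailing "(name, name, ...)" group at the end of a line.
CREDIT = re.compile(r"\(([^()]+)\)\s*$")


def count_contributions(notes: str) -> Counter:
    """Count how many items credit each developer (hypothetical helper)."""
    counts = Counter()
    for line in notes.splitlines():
        match = CREDIT.search(line)
        if not match:
            continue  # item without a credit, skip it
        for name in match.group(1).split(","):
            name = name.strip()
            if name:
                counts[name] += 1
    return counts


if __name__ == "__main__":
    for name, n in count_contributions(SAMPLE_NOTES).most_common():
        print(f"{name}: {n}")
```

Real release notes will need extra cleanup, for example items that span several lines or parentheses that do not contain names.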

What were their contributions about? Read the rest of this post to find out!

Mining Unstructured Data is Like Fishing in Muddy Waters!

August 16, 2010 Conferences

I am pleased to announce that I will be co-organizing the 2010 Workshop on Mining Unstructured Data (MUD) together with Dr. Bram Adams. The workshop is co-located with the 17th Working Conference on Reverse Engineering (WCRE’10), which this year takes place in lovely Beverly, Massachusetts, USA from October 13 to 16.   [...]

What Android is made of

July 29, 2010 Tools

I just ran some statistics on all 242 projects that make up the complete source code of the Android 2.2 platform. In this experiment I used the CLOC utility in connection with R64 and ggplot2. You can find detailed plots for each project here.

Overall, the whole Android platform consists of a little more than 73 million lines of code (73,517,272 to be precise).
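The aggregation step can be sketched in a few lines of Python. The original analysis used R, so this is only an illustration. It assumes CLOC was run with CSV output (for example `cloc --csv --quiet <project-dir>`) and that the output has `language` and `code` columns plus an optional `SUM` row. The sample numbers and the function name `code_lines_by_language` are made up.

```python
import csv

# Made-up sample in the assumed shape of CLOC's CSV output:
# a header row, one row per language, and an optional SUM row.
SAMPLE_CSV = """\
files,language,blank,comment,code
10,Java,200,300,1500
4,C,50,80,700
14,SUM,250,380,2200
"""


def code_lines_by_language(csv_text: str) -> dict:
    """Sum the 'code' column per language (hypothetical helper)."""
    # Drop empty lines so the header is the first row the reader sees.
    lines = [line for line in csv_text.splitlines() if line.strip()]
    totals = {}
    for row in csv.DictReader(lines):
        language = row.get("language")
        if not language or language == "SUM":
            continue  # skip the summary row to avoid double counting
        totals[language] = totals.get(language, 0) + int(row["code"])
    return totals


if __name__ == "__main__":
    totals = code_lines_by_language(SAMPLE_CSV)
    for language, lines_of_code in sorted(totals.items()):
        print(f"{language}: {lines_of_code}")
    print("total:", sum(totals.values()))
```

Running this over the CSV output of each project and adding up the totals gives a platform-wide figure.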

Read the rest of this post to find out more about the 242 largest projects!
