Validating cohesion metrics by mining open source software data with association rules
Competitive pressure on the software industry encourages organizations to examine the effectiveness of their software development and evolutionary processes. It is therefore important to measure software in order to improve its quality. The question is not whether we should measure software but how it should be measured. Software measurement has been in existence for over three decades and is still in the process of becoming a mature science. The many influences of new software development technologies have led to diverse growth in software measurement technologies, which has resulted in various definitions and validation techniques. An important aspect of software measurement is the measurement of the design, which nowadays often means the measurement of object-oriented design. Chidamber and Kemerer (1994) designed a metric suite for object-oriented design, which has provided a new foundation for metrics and acts as a starting point for further development of software measurement science. This study documents theoretical object-oriented cohesion metrics and calculates those metrics for classes extracted from a sample of open source software packages. For each open source software package, the following data is recorded: software size, age, domain, number of developers, number of bugs, support requests, feature requests, etc. The study then uses association rules to test which theoretical cohesion metrics are validated by the following hypotheses: that older software is more cohesive than younger software, that bigger packages are less cohesive than smaller packages, and that the smaller the software program, the more maintainable it is. In this way, the study attempts to validate existing theoretical object-oriented cohesion metrics by mining open source software data with association rules.
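To make the association-rule step concrete, the following is a minimal sketch of how the support and confidence of a rule such as "old software implies high cohesion" could be computed. It is an illustration under stated assumptions, not the procedure used in the study: the attribute labels (`age=old`, `size=large`, `cohesion=high`), the five toy records, and the function names `support` and `confidence` are all hypothetical.

```python
from typing import FrozenSet, Iterable, List


def support(transactions: List[FrozenSet[str]], items: Iterable[str]) -> float:
    """Fraction of transactions that contain every item in `items`."""
    itemset = frozenset(items)
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def confidence(
    transactions: List[FrozenSet[str]],
    antecedent: Iterable[str],
    consequent: Iterable[str],
) -> float:
    """support(antecedent and consequent) / support(antecedent)."""
    antecedent = frozenset(antecedent)
    antecedent_support = support(transactions, antecedent)
    if antecedent_support == 0:
        return 0.0  # rule never applies, so report zero confidence
    both = antecedent | frozenset(consequent)
    return support(transactions, both) / antecedent_support


# Hypothetical toy records: one discretized record per software package.
# These values are invented for illustration and are not data from the study.
packages = [
    frozenset({"age=old", "size=small", "cohesion=high"}),
    frozenset({"age=old", "size=large", "cohesion=high"}),
    frozenset({"age=old", "size=large", "cohesion=low"}),
    frozenset({"age=young", "size=small", "cohesion=high"}),
    frozenset({"age=young", "size=large", "cohesion=low"}),
]

# Rule under test: {age=old} -> {cohesion=high}
rule_support = support(packages, {"age=old", "cohesion=high"})
rule_confidence = confidence(packages, {"age=old"}, {"cohesion=high"})
print(f"support={rule_support:.2f}, confidence={rule_confidence:.2f}")
```

In a sketch like this, a hypothesis would count as supported by a given cohesion metric only if the corresponding rule clears chosen minimum support and confidence thresholds; those thresholds are a design choice and are not specified in the text above.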