"Black. White. Green. Red. Can I take my friend to bed?"
"All Together Now" by The Beatles
Abstract
A majority of open source development today is carried out by companies. Building on open source allows companies to focus their development effort on the points of difference over their competitors. This article discusses the recent trend towards collectives of companies that develop shared assets in the form of open source projects, and creates a model for company-led open source projects around two dimensions: the level of control over the project and the diversity of applications derived from the project. The article then explores how the model can be interpreted from a product line engineering perspective.
Introduction
Open source has become an integral part of commercial software development. Company engagement with open source ranges from the adoption of open source development practices, the use of open source development tools, and the use of open source components in products, to active contributions to open source projects and creating new company-led open source projects. Whereas in the past, free/libre open source software (F/LOSS) development was considered to be driven by volunteer effort, a majority of open source development today is carried out by paid developers. For example, over 70% of changes to the Linux kernel and over 80% of commits to the Eclipse platform have been made by developers who are paid by companies to contribute to those projects.
Companies use open source to reduce their development and maintenance costs, and to improve their time to market. Building on open source allows them to focus their development effort on the points of difference over their competitors. The non-differentiating portion of the software can be obtained from external sources, either commercial off-the-shelf (COTS) or F/LOSS. This has recently motivated networks of companies within the same domain (or collectives) to develop shared assets in the form of open source projects.
Research on product line engineering has also started to examine the relationship between F/LOSS development and product line management. The research differentiates between using F/LOSS in a product line and the adoption of product line practices in F/LOSS. This article is a contribution to the second stream. Its objective is to examine the participation structure of company-led open source projects from the perspective of product line engineering.
Related Work
Open source development and product lines are complementary (Chastek, McGregor, and Northrop, 2007 SPLC Conference). Distinct features of open source are license management, distributed development, and high quality (at least for large open source projects, as a result of peer review and multiple use). Software product lines are, likewise, characterized by asset management, distributed development, and processes to manage quality.
Companies use product lines to manage product diversity and reuse (van der Linden, 2009). Product line engineering separates the development of a common platform from the development of applications. Platforms identify points of commonality and variability. Applications are created by binding the variability (Pohl, Böckle, and van der Linden, 2005). Both practices are used, de facto, in large open source projects (van Gurp, 2007). Many open source projects are structured into platform and application components.
The product line characteristics of Linux, Mozilla, and Eclipse have been studied already (Chastek, McGregor, and Northrop, 2007; van Gurp, 2007). These projects receive most of their contributions from companies and have, thus, adopted more formal processes than their volunteer-driven counterparts. This article focuses on company-led projects.
Participation in Company-led F/LOSS Projects
Evolution of F/LOSS projects
Many open source projects start out with a single developer or company with a need. The need is narrowly defined and focuses on resolving an immediate technical challenge (i.e., "scratching an itch") faced by the project initiator. An example of a project started by an individual is Linux; it began as a personal project by Linus Torvalds to build a freely available Unix operating system. An example of a company-initiated project is Eclipse; it began with IBM donating the codebase for its VisualAge product as open source.
At this point, the project initiator is in full charge of the direction of the open source project. The next stage of evolution occurs when a community forms around the project. Typically, the project initiator is still in charge of the technical roadmap of the project, and the community members (individuals or other companies) create products or services complementary in nature to the project. Growth of the open source project is limited beyond this point, unless it moves from a model where a single entity controls the direction of the project to a model where all community members collectively decide on its course.
Evolution of the project to this model requires that the project initiator moves to arm's length from the project, as documented by West and Gallagher (2006) for a range of open source projects, and joins the community as just another member. The direction of the project is now set by the member organizations. Often, the relationship between the members and the project is also formalized through a neutral organization or foundation, which acts as the legal representative of the project and mediates between the community members. For example, the Eclipse project is coordinated by the Eclipse Foundation.
The project members join the project with different needs. They leverage the common codebase of the project to develop a diverse range of applications. As a case in point, Eclipse has 13 top-level projects with over 200 subprojects between them, contributed by more than 50 member companies as well as individual members. The majority of the contributions, or 80% of the commits, are made by member companies. Furthermore, the Eclipse marketplace lists over 1000 applications built on top of the Eclipse core.
From Green to Red
Take, for example, project Green. Green is a project in the education space that was started at a university by a single developer and was then spun out into a company. The project initially had a small group of core contributors, and control of the direction of the project was with the spin-off company. A small community formed around the project, consisting of companies and individuals that develop custom features and offer complementary services to the project. But, at this point, something interesting happens.
More companies want to join the community, but they do not feel that their needs are met under the current project structure. These companies differentiate themselves from each other through their specific application domains, not in terms of the platform they share. This changes the nature of the project. To reflect the change, a foundation is created to manage the project, and the project is renamed Red. In the Red project, the other companies take a more active role, and the project initiator becomes one of them. The new project is ready to grow in size and diversity in ways that the Green project could never have done.
How companies participate
Company-led open source projects differ significantly along two dimensions: who controls the project, and the diversity of applications derived from the project. Control refers to decision making, and includes control over the direction of the project, the architecture, commits and releases, and who captures the value created by the project. Control can be hierarchical or shared. In a hierarchically controlled project, a single company makes all the decisions. In a project with shared control, decisions are made jointly by the project members.
Figure 1. How Companies Participate in a Company-led Open Source Project
Applications can be either in a narrow domain (such as education) or spread across a variety of domains (such as language training and business intelligence). If the applications are in a narrow domain and the project is controlled by a single company, the project often has an integral architecture. The reason is that the company has little incentive to divide the architecture into modules, as doing so requires additional effort. However, when other companies are involved in the project, the architecture needs to be modular to some degree.
There are four basic ways for companies to participate in a company-led open source project, as shown in Figure 1. This categorization is based on experience with the case study and an examination of extensible open source platforms conducted by the author (Noori and Weiss, 2009). As should be apparent from the earlier discussion, the Green project belongs in the top-left quadrant. In the top-right quadrant, a single company exposes an interface to attract third parties to create applications; an example is the Moodle learning management system. As an example of a company in the bottom-left quadrant, the Zope Europe Association (ZEA) coordinates a group of open source companies, allowing them to compete for large government contracts (Feller, Finnegan, and Hayes, 2006). The bottom-right quadrant is reserved for collectives of companies that jointly control a platform, which provides the basis for a diverse range of applications. The Eclipse project is an example of such a collective.
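The two dimensions and four quadrants can be written down as a small data structure. The following Python sketch is only an illustration of the model, not part of the original study: the class and field names (Control, Diversity, Project, quadrant) are hypothetical, and the quadrant labels follow the hierarchical/shared and narrow/wide terms used in this article.

```python
from dataclasses import dataclass
from enum import Enum


class Control(Enum):
    """Who makes decisions about the project (hypothetical naming)."""
    HIERARCHICAL = "hierarchical"  # a single company decides
    SHARED = "shared"              # members decide jointly


class Diversity(Enum):
    """How widely the derived applications are spread across domains."""
    NARROW = "narrow"
    WIDE = "wide"


@dataclass(frozen=True)
class Project:
    name: str
    control: Control
    diversity: Diversity

    def quadrant(self) -> str:
        # Label in the form used in the text, e.g. "shared-wide".
        return f"{self.control.value}-{self.diversity.value}"


# The four examples named in the text, one per quadrant of Figure 1.
PROJECTS = [
    Project("Green", Control.HIERARCHICAL, Diversity.NARROW),   # top-left
    Project("Moodle", Control.HIERARCHICAL, Diversity.WIDE),    # top-right
    Project("ZEA", Control.SHARED, Diversity.NARROW),           # bottom-left
    Project("Eclipse", Control.SHARED, Diversity.WIDE),         # bottom-right
]

by_quadrant = {p.quadrant(): p.name for p in PROJECTS}
```

Looking up by_quadrant["shared-wide"] then returns the collective example, Eclipse.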
Discussion
Hierarchical-wide F/LOSS projects and F/LOSS projects with shared control are organized like product lines: a platform and applications that extend it. Hierarchical-wide and shared-wide open source projects like Moodle and Eclipse have a plug-in architecture that provides variability through extension points and extensions. As observed by Chastek and colleagues (2007), the products in this product line are new plug-ins and products using existing plug-ins. In Moodle, plug-ins can be added to extend the behavior of the open source platform through preconceived extension points under the control of Moodle.com. The Eclipse platform also allows members to define extension points in the plug-ins they contribute. Both Moodle and Eclipse support a high diversity of applications. However, the amount of variation supported by Eclipse is much higher than that supported by Moodle.
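The difference between the two kinds of plug-in architecture can be illustrated with a toy registry. The Python sketch below is a minimal illustration under stated assumptions: the Platform class, its methods, and the company and extension-point names are all hypothetical, and none of this reflects the actual Moodle or Eclipse APIs. It only models the rule described above: under hierarchical control, only the owner defines extension points; under shared control, any member may define them.

```python
class Platform:
    """Toy platform with extension points (hypothetical, for illustration)."""

    def __init__(self, owner: str, shared_control: bool):
        self.owner = owner
        self.shared_control = shared_control
        # Maps extension point name -> list of contributed extensions.
        self.extension_points: dict[str, list[str]] = {}

    def declare_extension_point(self, name: str, declared_by: str) -> None:
        # Hierarchical control: only the owner may add variation points.
        if not self.shared_control and declared_by != self.owner:
            raise PermissionError(
                f"{declared_by} cannot declare extension points on this platform"
            )
        self.extension_points.setdefault(name, [])

    def contribute(self, point: str, extension: str) -> None:
        # Anyone may bind variability at an existing extension point.
        if point not in self.extension_points:
            raise KeyError(point)
        self.extension_points[point].append(extension)


# Hierarchical-wide: the owner preconceives the extension points.
hierarchical = Platform(owner="owner-co", shared_control=False)
hierarchical.declare_extension_point("activity", declared_by="owner-co")
hierarchical.contribute("activity", "third-party-quiz")

try:
    hierarchical.declare_extension_point("report", declared_by="third-party")
    third_party_blocked = False
except PermissionError:
    third_party_blocked = True

# Shared-wide: a member's plug-in may itself define new extension points.
shared = Platform(owner="foundation", shared_control=True)
shared.declare_extension_point("editor", declared_by="member-a")
shared.contribute("editor", "member-b-syntax-highlighter")
```

In this sketch, the set of variation points on the hierarchical platform can grow only as fast as its owner allows, which is one way to read the observation that the shared platform supports more variation.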
Shared-narrow projects like ZEA allow small companies to compete for much larger contracts than they could individually by providing the members of the collective with a common brand, pooling their assets, and creating a reliable delivery process. Examples of variation are localization and geographic coverage: member companies of the ZEA collective are distributed across all of Europe.
Conclusion
This article develops a model of the participation structure of company-led open source projects. The differences between the participation structures can be interpreted in terms of the product line concepts of commonality (platforms) and variability (applications). Our analysis adds the notion of shared control by a collective. Future work includes validation of the model through a survey.