Special Issue on Innovation through Open Data - A Review of the State-of-the-Art and an Emerging Research Agenda: Guest Editors' Introduction

For decades, good governance scholarship has focused attention on the importance of government openness [26], [34]. Since the 1960s, Freedom of Information (FOI) legislation has formed the backbone of institutional support for opening information and documents [76] and participatory processes [4], [79]. However, FOI represents a passive approach to releasing information: persons or organizations must still request the information they want, through what are casually referred to as freedom of information requests. Since the 1990s, publishing documents on websites and using communication technologies to engage citizens in participation processes have signaled a more proactive approach to releasing government information and to political engagement. Since 2003, governments have re-envisioned their passive and proactive approaches to include an open data agenda [18], [62], in which publishing documents and data in open formats [35], [63] is the preferred way. Collectively, these developments have forged the basis for what is commonly referred to as the open government and open data movements [30]. Open data practices and policies are praised for their potential to generate public value, particularly through innovation, economic growth, and transparency [5], [9], [18], [21], [81]. Open data has been hailed for its innovative capacity and transformative power [19], [35], [40], [45], [80]. Various studies have confirmed that proactively releasing public and private data in open formats creates considerable benefits for citizens, researchers, companies and other stakeholders, such as business creation or the ability to understand public or private problems in new ways through advanced data analytics [5], [9], [18], [21], [81]. Only a handful of articles examine the unintended consequences and negative side effects of opening data [33] or the underlying causal mechanisms that actually lead to the desired open data benefits [5].
Open data research is still in its infancy and, as a result, the extant literature applies and develops theory only to a limited extent in seeking to understand the open data phenomenon. While scholars acknowledge diverse perspectives, it is not clear which theories are most relevant, nor whether a single or integrated theory is needed. This special issue is part of a series of two special issues about open data. This issue focuses on the relationship between innovation and open data, while the second special issue emphasizes research on open data related to transparency and open data policies. To realize the practical benefits of this transformative practice and to develop theory, more research needs to focus on understanding how innovation occurs through open data activities. The papers in this special issue begin to address this gap. This introductory article discusses the state of the art with respect to the context of open data innovation, its developments, challenges and barriers, presents an overview of open data research, and outlines emerging research directions.


Perspectives on Open Data
Seven different perspectives are currently reflected in the literature, although most articles use a single perspective to study open data. Combining perspectives may be more effective in dealing with the issues related to open data and in stimulating innovation. Figure 1 provides a picture of the various perspectives: legislative, political, social, economic, institutional, operational, and technical [37]. A legislative perspective emphasizes the importance of open data legislation, including freedom of information acts, open data policies, open government directives, memorandums and declarations. A political perspective emphasizes the importance of political developments and political differences between countries. A social perspective brings into focus the importance of cultural differences between countries and differences in agendas related to the social benefits of opening data, such as transparency and accountability. An economic perspective points to the financial benefits and gains that can be created with open data. It includes studies estimating the impact of the reuse of open data, often put at billions of euros annually [20]. Innovation is seen as an important driver for stimulating economic growth. An institutional perspective examines the ways in which institutions enable and constrain the publication and adoption of open data, such as the barriers to use from a data provider's point of view [37]. Institutional analysis reveals the importance of examining how the publication of data could become an integral part of the data collection or creation process, rather than a separate activity that is not integrated into daily procedures and routines [82], [85]. [29] and the metadata that are required to enable the reuse of open data [13].

Complexities of Open Data Innovations
Stimulating innovation in general is not easy [43]. Likewise, stimulating innovation through the open data process [86], that is, the efforts required to create, open, find, and use open data, is equally complex. The complexities emerge from several factors, including the large number of actors involved in the process, the variety of social and technical contexts, uncertainty surrounding how open data will be used, and the difficulty of valuing intangible impacts generated through open data innovation.
First, many stakeholders are involved in open data processes [12], [33], including open data providers, open data legislators, open data facilitators and many different types of open data users, such as citizens, researchers, journalists, developers, entrepreneurs, archivists or librarians [31]. These stakeholders have various interests and these interests may conflict [33]. For instance, a journalist may want to access and use sensitive data in order to report on it in a news article, whereas a government open data provider will typically not publish sensitive data because of various legal and personal privacy constraints. This example demonstrates that the stakeholders are often loosely connected and, in some respects, dependent on each other's activities. The ability of journalists to use open data depends on whether open data providers publish the data. However, each stakeholder remains focused on their own activities, often performing them in non-standardized ways. This lack of standardization reduces the interconnectivity between stakeholders' activities and is likely to lessen the opportunity to innovate within the open data process [82]. Learning from how users actually use their data is another potential opportunity for
Second, a variety of contexts play a part in the open data process [14], including legal [44] and cultural [59] contexts, and the variety of data content and types. Different types of data, with different content, may need different legal, cultural, or technical treatment. Each context has its own set of characteristics, which influences the way that open data are collected, disseminated, used and interpreted. For example, data on crimes committed in a city will likely need different levels of privacy protection and interpretation of meaning than data about playground locations within cities or tree types geocoded by city streets.
Third, the publication and use processes of open data are complex [33], [83], [88], and it is not easy to predict how users will use open data, when they will use it, and how it will be used in the future [51]. For example, government agencies may avoid publishing open data as a risk-aversion strategy, driven by their uncertainty about how the data will ultimately be used or combined with other data.
Fourth, understanding how value is created from open data innovation is not straightforward, particularly how public value is generated. Public value is not a new term, but a growing literature examines its definition and characteristics (for instance, [42], [53]-[54], [56], [74]). Public value is seen as the product of governmentally produced benefits, where public value is derived from the direct usefulness, fairness, and equitability of such benefits to a variety of stakeholders [30] (p. 5). There are many levels at which public value can be observed: individual, group, institutional, and societal [10]. The value created from transparency, accountability, and collaboration often takes the form of intangible impacts such as trust, well-being, or being more informed. These are important impacts, but they are not easily measured. The uncertainty surrounding the value of innovation through open data makes it a risky investment.
The complex nature of the open data process and of innovation processes complicates the supply and use of open data, and these complexities are often not taken into account by stakeholders. As a result, the potential of the open data process is not completely exploited [82].

Research on Open Data
In the last decade, open data research has gained more attention due to the European Public Sector Information directive of 2003 [18], Obama's Memorandum for the Heads of Executive Departments and Agencies [62] and other open data policy documents at the international [72] and municipal levels (for example, [28], [46]).
Figure 2
Our review of 143 papers on open data revealed that publications were mainly conceptual papers, descriptions of empirical uses of open data, or descriptions of the design of technology and systems [37]. The use of theory to explain changes or patterns received considerably less attention. The papers for review were collected by searching for open data, open link data and open link government data in various databases, including the databases mentioned above. The Scopus database was used most frequently because our previous analysis showed that out of the four
Table 1 summarizes which theories are mentioned, used and developed, and the related topics examined in the open data field, in the 19 papers. We operationalized the mentioning of theory as any mention of that theory at least once. We operationalized the use or extension of theory as any paper that refers to the use of a theory by applying it to a certain open data topic.
We found that, of the 19 articles, many were written by the same authors and that only a very small group of researchers is involved in theory development in the field to date. Of the 24 authors involved in writing these 19 papers, 8 were involved in writing two or more of them. This group, however, has explored many different types of theories, and only rarely was the same theory used more than twice. Institutional and organizational theories and democratic theories were used several times. Institutional and organizational theories were used to investigate policy development, changing systems, and changing organizational cultures and structures. Democratic theories were used to investigate transparency, trust and participation. In addition, collaboration between the different actors involved in publishing and using open data was investigated more often than the other topics.
Table 1
Topic: Theories
(topic missing): Theories of public accountability [50]; Stakeholder theory [50]; Deutsch's [16] theory of the nerves of government [50]
Collaboration between public and private organizations to support a smart disclosure policy: Institutional theory [68]
Context and expectations: Theory of contextual integrity [60]
Coordination mechanisms for open data activities: Coordination theory [82]
Data sharing, information sharing and knowledge sharing: Motivation theory [67]; General theory of interagency information sharing [11]; Theory of planned behavior [67]
Economics of information: Theory of natural monopolies [58]; Microeconomic theory [58]; Competitive theory [58]; Utility theory [58]; Neoclassical economic theory [58]; Social welfare theory [58]; Theory of public goods [58]
Information storage and communication: Information theory [47]; Theory of digital objects [47]
Innovation: Actor Network Theory [8]; Theory of diffusion [8]; Theory of innovation [8]
Institutional contexts and rational choices: Structuration theory [52]
Moving from closed to open systems, organizational change: Organizational theory [1]; Institutional theory [37]
Network and power relationships between open data owners: Actor-network theory [5]
Open data adoption: Adoption theories (e.g. Theory of Planned Behavior, Technology Acceptance Model, Unified Theory of Acceptance and Use of Technology) [66]; Diffusion of Innovation theory [15]
Organizational cultures, structures and dynamics in relation to market characteristics: Organizational theory [5]
Participation: Democratic theory [

Developments, Challenges and Barriers
Public and private organizations increasingly publish their data on the internet [27], [49], [52]. Yet, the publication and use of open data is still accompanied by many barriers [37], [82]-[83], [86]. In [86], 118 barriers for open data were identified. The barriers were divided into ten categories, namely 1) availability and access, 2) findability, 3) usability, 4) understandability, 5) quality, 6) linking and combining data, 7) comparability and compatibility, 8) metadata, 9) interaction with the data provider, and 10) opening and uploading. Most barriers concerned the use of open data. This research revealed that little attention is paid to the user perspective, which is likely to inhibit innovation. Users need to generate value from open data [84], [86].
Figure 1.
The barrier overview is based on the literature review of barriers and the interviews and workshops about barriers described in [37], [86], and on other literature about barriers (for example, [82]-[83]). There are many diverse barriers in each perspective, suggesting the need for research from each of these perspectives.
In sum, only a few papers to date contribute to theory building and there is no dominant theory. The diversity in the use of theories suggests that there are many opportunities for theory development in the field of open data that have not been explored sufficiently. The field is in its beginning stage and theoretical contributions are still scarce.

Open Data Policies, Use and Innovation
Freedom of information legislation and specific directives represent the legal and political landscape of open data policies to date. The following elaborates on specific research areas regarding policies, process management, innovation and stimulation of use that we believe will push the field forward.
Policies. Open data policies are in place or under development in many different countries worldwide. From a practical point of view, countries may be able to learn from one another [84], and countries with no policies can learn from those that are leading the way and progressing quickly [59].
[87], [89]. For example, Zuiderwijk, Janssen and Jeffery [87] present an open data e-infrastructure which supports data provision, data retrieval and use, data linking, user rating and user cooperation. Moreover, Zuiderwijk, Janssen and Parnia [89] conducted
Metadata. Metadata are important to stimulate interoperability of open data infrastructures. Metadata may yield considerable benefits [12], [90]. For instance, metadata improve the accessibility of open data for others beyond the primary data providers by making it possible to describe, locate and retrieve the data efficiently [3], [17], [41], [65], [71], [73]. By describing the content of open data and making it searchable, metadata improve the chances that the data are found [6], [39], [57], [70], and they increase the chances of a correct interpretation of open data by distilling knowledge from them [25], [39], [70], [75]. Currently, mainly flat metadata, i.e. metadata for the discovery of datasets, are used and described in open data research, including Dublin Core (DC), CKAN (Comprehensive Knowledge Archive Network), eGMS (e-Government Metadata Standard) and DCAT (Data Catalog Vocabulary). The disadvantage of these flat metadata is that they barely provide contextual information, which is required to interpret open data. More research is needed on the use of contextual metadata models, such as CERIF (Common European Research Information Format).
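The difference between discovery and interpretation can be illustrated with a minimal sketch. The field names below follow Dublin Core elements (title, creator, subject, description, date, format), but the record class, the sample catalog and the list of contextual fields are hypothetical and chosen only for illustration. They are not taken from CERIF or from any paper in this special issue.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FlatRecord:
    """Hypothetical flat (discovery) metadata record.

    Field names mirror Dublin Core elements; the class itself is an
    illustrative assumption, not a standard implementation.
    """
    title: str
    creator: str
    subject: List[str] = field(default_factory=list)
    description: str = ""
    date: str = ""
    format: str = ""


def find_datasets(catalog, keyword):
    """Return records whose title, description or subject mentions keyword."""
    needle = keyword.lower()
    return [
        r for r in catalog
        if needle in r.title.lower()
        or needle in r.description.lower()
        or any(needle in s.lower() for s in r.subject)
    ]


# Hypothetical examples of contextual information a user might need
# in order to interpret a dataset rather than merely find it.
CONTEXTUAL_FIELDS = ("collection_method", "unit_of_measurement", "related_project")


def missing_context(record, contextual_fields=CONTEXTUAL_FIELDS):
    """List the contextual fields that the flat record cannot express."""
    return [name for name in contextual_fields if not hasattr(record, name)]


# Made-up sample catalog.
catalog = [
    FlatRecord(title="Playground locations", creator="City parks department",
               subject=["recreation"], format="text/csv"),
    FlatRecord(title="Reported crimes", creator="City police",
               subject=["public safety"],
               description="Crimes committed per district", format="text/csv"),
]

hits = find_datasets(catalog, "crime")
print([r.title for r in hits])
print(missing_context(hits[0]))
```

In this sketch the flat record is sufficient to locate the crime dataset by keyword, yet it carries nothing about how the data were collected or how they relate to other entities, which is the kind of gap a contextual metadata model is meant to close.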
Data quality. Providers and users of open data often do not know what the quality of the data is. This may be a formidable barrier to using open data in meaningful ways. More research is needed on which mechanisms can help in improving data quality and in dealing with low data quality, such as a quality rating system for data and so-called seals of approval from the organizations that published the data. These mechanisms can help potential open data users assess the usefulness of the data for their purposes. The fifth paper of this special issue, written by Behkamal, Kahani, Bagheri and Jeremic, contributes to filling the data quality research gap (A metrics-driven approach for quality assessment of Linked Open Data).

Paper Overview
This special issue covers a diversity of subjects related broadly to innovation through open data processes. It consists of a collection of papers written by experts in this area. In total seven papers were selected for this special issue. All papers have undergone a rigorous blind review process and were reviewed by three reviewers. The papers can be classified in the following three areas:

Semantic Interoperability, Ontologies and Data Quality
The following two papers were accepted in the category semantic interoperability, ontologies and data quality.
Using a method and tool for hybrid ontology engineering: an evaluation in the Flemish Research Information Space - Christophe Debruyne and Pieter de Leenheer. This paper describes a method to create ontologies in which the stakeholder community becomes an integral part of the ontology and ontology-engineering process, as well as the natural language definitions of concepts. These ontologies are referred to as hybrid ontologies. An experiment was conducted in which the participants used the method to build ontologies to establish semantic interoperability between various research information systems and to annotate the data of an existing system provided by a public administration.
A metrics-driven approach for quality assessment of linked open data - Behshid Behkamal, Mohsen Kahani, Ebrahim Bagheri and Zoran Jeremic. The fifth paper in this special issue proposes a set of metrics for evaluating the inherent quality characteristics of open data. The metrics can be used to assess datasets before they are released to the Linked Open Data Cloud. Measurement theory and software measurement techniques are used to assess the quality of datasets. Various quality characteristics of datasets can be assessed, for instance, to help data publishing agencies in evaluating their data. Additionally, users of open data can use the quality metrics to assess the quality of the data that they may want to use and to filter out poor quality data.

Value Creation: Evaluation of Innovation through Open Data
Finally, two papers concerning value creation were included in this special issue. They can be summarized as follows.
Open government data implementation evaluation - Peter Parycek, Johann Höchtl and Michael Ginner.
Data-driven innovation through open government data - Thorhildur Jetzek, Michel Avital, and Niels Bjorn-Andersen. The last paper of this special issue aims to explain how the use of open government data can stimulate the generation of value. A framework with four generative mechanisms is used to explain the complex relationship between openness, data and value. In addition, a critical realist approach is used as a foundation for the in-depth study of open government data in the Opower case. Enabling factors, innovation mechanisms and impacts are identified, and the study has resulted in a conceptual model for the data-driven innovation mechanism.
This special issue aims to contribute to the understanding of the relationship between innovation and open data. Enjoy reading!