Mapping global innovation: analysing patents

Author
Prof. Alessandro Lomi
Università della Svizzera Italiana

Interview with the principal investigator of this NRP75 project.

What was the aim of your project “The global structure of knowledge networks”?

We intended to construct the largest and most complete knowledge network currently available. We tried to do so by merging separate patent data sets provided by the Organisation for Economic Co-operation and Development (OECD) with information on corporate entities to link patents to company-specific information. We used the new data sets to develop and test innovative information retrieval techniques and network analysis models. We then applied these techniques and analytical models to large data sets with complex and evolving network structures.

Results?

A novel approach for the early identification of “influential” patents was developed. Identifying top patents in a particular technology category can help companies gain an overview of important innovations in their field of concern. It can also benefit governments in deciding various policies such as funding particular technology areas. The new approach developed in this project proved to be both qualitatively and quantitatively better than existing state-of-the-art approaches for the identification of milestone patents.

In addition, new algorithms were developed to “scale up” a widely used sophisticated statistical model of social (and other, including citation) networks. This allows the method to be applied to networks far larger than previously possible. These new algorithms were implemented in open-source software and applied to large patent citation networks.

Another result of this project was the creation of a Patent Search tool integrated into a text editor. This system can effectively be used for searching Swiss patents, as it has support for multi-lingual search.

Finally, we now have a working version of software that, for the first time, allows the estimation of exponential random graph models (ERGMs) on networks with millions of nodes. This was not possible when the project started. It is possible now, and the software is freely available for everyone to use (http://www.estimnet.org/).

What are the main messages of the project?

  • The country of origin of a patent often affects how frequently it is cited. This is particularly true for Switzerland, which has long been known as a prolific innovation hub.
  • We proposed and developed a new model called Time-Attentive Ranking, which helps to capture temporal changes and their effect on network nodes. This approach proved both quantitatively and qualitatively better than existing state-of-the-art approaches for the identification of milestone patents.
  • A sophisticated class of statistical network models (known as “exponential random graph models”, or ERGMs), formerly applicable only to relatively small networks, can now be applied to “big data” networks with over one million nodes, using freely available open-source software developed under this project.
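
For readers unfamiliar with ERGMs, the general form of the model can be written as follows. The notation (y, s(y), θ, κ) is the conventional textbook notation and is an assumption here, not taken from the project's own publications.

```latex
P_\theta(Y = y) = \frac{\exp\{\theta^{\top} s(y)\}}{\kappa(\theta)},
\qquad
\kappa(\theta) = \sum_{y' \in \mathcal{Y}} \exp\{\theta^{\top} s(y')\}
```

Here y is the observed network, s(y) is a vector of network statistics (for example counts of ties or triangles), θ is the parameter vector, and κ(θ) sums over every possible network on the same set of nodes. That sum cannot be computed directly for anything but tiny networks, which is one reason estimation on very large networks has been difficult.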

Does your project have any scientific implications?

There is currently a trend toward interdisciplinary research in science. To the best of our knowledge, only a limited number of very recent studies have examined the impact of interdisciplinarity on patent value. A patent can be considered interdisciplinary if it is assigned many categories in its classification. The effect is hard to predict intuitively, but we have shown that interdisciplinarity does not increase patent value. In general, patents with fewer categories are more likely to be cited. Our results also confirm that patents from top companies have higher value and are more likely to be cited. In addition, our results clearly show that, on average, Swiss patents are more likely to be cited.

We demonstrated that various patent characteristics influence citations. Both technical features and social processes can lead to the formation of citations. These processes include homophily (the propensity to cite patents from the same country or within the same class), preferential attachment (the propensity of a small number of patents to receive a large share of citations), and transitivity (the propensity to cite references’ references). In particular, our statistical analyses confirm that patents are much more likely to cite each other if the patent applicants are from the same country, if the patents are classified in the same category, or if the patents are written in the same language. We have found that recent patents are more likely to be cited. We did not observe that interdisciplinarity increases the impact of patents. It is possible that detecting the impact of interdisciplinary patents in patent data requires a different and more sophisticated metric, for example a measure that represents the notion of “reach” rather than the number of citations received per se.
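
To make the three mechanisms concrete, here is a minimal Python sketch that computes simple descriptive counterparts of homophily, preferential attachment and transitivity on a toy citation network. The data, the function names and the choice of statistics are all hypothetical illustrations; the project itself estimated these effects with statistical network models, not with raw shares like these.

```python
from collections import Counter

# Hypothetical toy data: (citing, cited) pairs and each patent's country.
citations = [("B", "A"), ("C", "A"), ("C", "B"), ("D", "A"), ("D", "C"), ("E", "D")]
country = {"A": "CH", "B": "CH", "C": "DE", "D": "CH", "E": "DE"}


def homophily_share(edges, attribute):
    """Fraction of citations whose citing and cited patents share an attribute value."""
    same = sum(1 for citing, cited in edges if attribute[citing] == attribute[cited])
    return same / len(edges)


def top_share(edges, k=1):
    """Share of all citations received by the k most-cited patents."""
    received = Counter(cited for _, cited in edges)
    top = sum(count for _, count in received.most_common(k))
    return top / len(edges)


def transitive_share(edges):
    """Fraction of two-step paths i -> j -> k closed by a direct citation i -> k."""
    edge_set = set(edges)
    cites = {}
    for citing, cited in edges:
        cites.setdefault(citing, set()).add(cited)
    paths = closed = 0
    for i, firsts in cites.items():
        for j in firsts:
            for k in cites.get(j, ()):
                if k != i:
                    paths += 1
                    closed += (i, k) in edge_set
    return closed / paths if paths else 0.0


print("same-country share:", homophily_share(citations, country))
print("share going to the most-cited patent:", top_share(citations, k=1))
print("closed two-step paths:", transitive_share(citations))
```

A high same-country share, a large share of citations going to a few patents, or many closed two-step paths would each hint at the corresponding mechanism, but only a model that accounts for all of them jointly can separate their effects.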

We also provided analyses of the citations among the top companies in the citation network. We observed that companies (or organisations) play different roles in the citation network, depending on their network positions. This corroborated our hypothesis that companies receiving more citations tend to be more influential and that companies that tend to cite other companies through patents are knowledge distributors.

Furthermore, we concluded that raw citation count is not enough to capture the importance of a patent, since it does not consider the age of citations. When citation age is accounted for using a balanced metric like Time-Aware ranking, we can identify patents that are likely to spur technological growth in the near future.
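
The point about citation age can be illustrated with a small Python sketch. It uses a simple exponential decay with a hypothetical five-year half-life, so that older citations count for less. This is an illustrative stand-in chosen for brevity, not the ranking model developed in the project, and the patents and years are invented.

```python
def decayed_citation_score(citing_years, current_year, half_life=5.0):
    """Sum of citation weights; a citation loses half its weight every `half_life` years."""
    return sum(0.5 ** ((current_year - year) / half_life) for year in citing_years)


# Hypothetical patents: the years in which each one was cited.
old_patent = [2000, 2001, 2002, 2003]  # four citations, all long ago
new_patent = [2017, 2018, 2019]        # three citations, all recent

raw = {"old": len(old_patent), "new": len(new_patent)}
scored = {
    "old": decayed_citation_score(old_patent, current_year=2020),
    "new": decayed_citation_score(new_patent, current_year=2020),
}

print("raw counts:", raw)
print("age-weighted scores:", scored)
```

By raw count the older patent ranks first, while the age-weighted score ranks the recently cited patent first, which is the kind of reordering a time-sensitive metric is meant to produce.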

These scientific results have interesting implications for the characteristics that influential Swiss patents currently have, and could have in the future, in the patent network.

Does your project have any policy recommendations?

The objectives of the project were essentially methodological and did not involve providing practical policy recommendations or advice. However, our experience with a considerable variety of sources of patent data has revealed common limitations in the ways patent data are collected and stored. We observe a clear inconsistency between claims that the system of knowledge is increasingly globalised and the ways in which information about knowledge flows is collected, which remain essentially local, i.e., organised on a national basis.

The nature of patents as legal documents constrains the analysis of patent data to those collected, managed and made available by national patent offices. This situation makes it practically impossible to follow patent citations across national systems. Citations of patents registered in any given country are available only if the cited patent is within the same national system as the citing patent. Patent citations across national patent systems are lost because national patent systems are not well coordinated. Patents registered at the International Patent Office (IPO) represent partial exceptions, but the IPO still accounts for a relatively small share of global patent activity.

Furthermore, as recently noted by Meguro and Osabe (2019, Lost in patent classification. World Patent Information, 57, 70–76), tracking patent citations across national systems remains difficult despite the development of a common international patent classification (IPC). If one policy recommendation emerges from the project, it would be to invite additional international efforts to ensure that the global scope of patent data can increase to match the globalisation of knowledge flows and innovation.

Big Data is a very vague term. Can you explain to us what Big Data means to you?

In the case of network data, due to the combinatorial nature of networks, a “small” number of nodes can still mean a “big” number of relationships between nodes, and an exponentially larger number of possible networks of a given size (number of nodes). For network data, a useful way to think about big data is to focus on a small number of generating mechanisms, rather than on the large number of observations that these mechanisms may produce.
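
The combinatorial growth can be checked with a few lines of Python. The sketch assumes a directed network without self-citations, so that n nodes allow n(n-1) possible ties, each of which is either present or absent; the function names are hypothetical.

```python
def possible_ties(n):
    """Number of ordered pairs of distinct nodes, i.e. possible directed ties."""
    return n * (n - 1)


def possible_networks(n):
    """Number of distinct directed networks on n labelled nodes (no self-ties)."""
    return 2 ** possible_ties(n)


for n in (5, 10, 20):
    ties = possible_ties(n)
    digits = len(str(possible_networks(n)))
    print(f"{n} nodes: {ties} possible ties, a {digits}-digit number of possible networks")
```

Even at five nodes there are already more than a million possible networks, which is why describing the few mechanisms that generate ties is more tractable than enumerating outcomes.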
