A
SYNOPSIS
On
“AN ANALYTICAL STUDY OF USE AND MISUSE OF PATENT DATA:
ISSUES FOR CORPORATE FINANCE AND BEYOND”
Submitted to
Rashtrasant Tukadoji Maharaj Nagpur University,
Nagpur
For the award of the degree of
Bachelor of Business Administration
Submitted by
Yash Oswal
Under the Guidance of
G.S. College of Commerce & Economics, Nagpur
Academic Year 2018-19
G.S. COLLEGE OF COMMERCE AND ECONOMICS, NAGPUR
Index
Sr. No. Particulars
1. Introduction
2. Relevance of Study
3. Need of the Study
4. Objectives of the Study
5. Limitations of the Study
6. Research Design
7. Hypothesis
8. Research Methodology
Primary Data
Secondary Data
9. Work Plan
10. Proposed Chapterization Scheme
Bibliography
INTRODUCTION
In the past several years, an increasing number of papers in the finance, accounting,
and related literatures have made use of patent data. This growth has reflected the
broadening of the topics seen as relevant to corporate finance researchers. As
Zingales (2000) argued, the wave of initial public offerings of purely human capital
firms, such as consultant firms, and even technology firms whose main assets are the
key employees, is changing the very nature of the firm…. The changing nature of the
firm forces us to reexamine much of what we take for granted in corporate finance.
Not only is innovation critical in many cases to firm survival—witness the fates of
firms that failed to innovate successfully, such as Kodak, Motorola, and Xerox—but
it illustrates the critical issues that motivate corporate finance theory more generally.
Topics such as uncertainty, information asymmetries, and the intangibility of assets
are central when it comes to financing innovative firms and projects. The growth of
interest in this topic among finance researchers can be seen from Figure 1, which
depicts the number of papers identified in Google Scholar that both cite at least one
article in the Journal of Finance and contain the phrase “patent citations,” the
references in patents to earlier work added by patent examiners and inventors that
serve as “property markers” delineating the scope of the granted claims. The steady
growth in interest is apparent, with more than a hundred-fold increase in the number
of papers since 1996. Even when these papers are presented as a fraction of all papers
citing a Journal of Finance paper, the share has increased substantially, from less than
one-tenth of one percent to approaching one percent. In Appendix 1, we list almost 70
papers using patent data, which have appeared in one of the “top three” finance
journals—the Journal of Finance, the Journal of Financial Economics, and the
Review of Financial Studies— between 2005 and 2017.
In many cases, the papers have used these data to shed fresh light on important
problems. But in other instances, the interpretation of the results has been marred by a
failure to understand some of the peculiarities of patents and patent data. These
misunderstandings have led to conclusions that are not robust to the use of alternative
methodologies. Moreover, the biases that appear are frequently highly predictable.
The presence of such mistakes is understandable. The patent application and review
process is extremely complex. The construction and features of the key database used
for patent research—which originated at the National Bureau of Economic Research
in 1999 and has been updated to 2006—have not been as fully documented as would
be desirable. Rather, much of the knowledge about the use of patent data has been an
oral tradition, shared in workshops of the NBER’s Productivity, Innovation and
Entrepreneurship group. This paper is an attempt to rectify this omission.
This paper consists of three primary parts. After an introductory first section for researchers less
familiar with patent data, we begin in the second section by discussing the key patent-
level features that can lead to problematic inferences from these data. These have to
do with the truncation of patent data in ways that vary with the technology class being
pursued and the region and industry of the inventor. We document the reasons for
these distortions and how they can affect researchers. We also highlight the two broad
classes of corrections used to address these issues, the “fixed-effects” and “quasi-
structural” approaches. In the third section, we explore the consequences of these
biases in patent and citation data for firm-level analyses. We use information on
patent grants in the 2006 NBER patent database and compare it with the newer data
on patents granted to the same firms applied for during the same time period. The
newer data is collected through the end of 2012 using the method employed
in Kogan, et al. (2017). It therefore gives us a time window post-2006 to assess whether
patents that were applied for in earlier years did eventually get granted. The
difference in the actual patents granted relative to what was recorded in the NBER
data is what we call “patent bias”. Similarly, we compute “citation bias,” where we
compare citations to patents in the NBER data that ends in 2006 with the citations
garnered by the same patents over a longer period (until 2012). We first demonstrate
that these biases are large and systematic: they are present more dramatically in
recent years, in some technology classes, in some industries, and in some regions. We
show that the popular methods in the literature to account for these biases only
partially adjust for them. After characterizing these biases in detail, we explore how
they impact inferences when analyzing patenting activity at the firm level. One
solution for accounting for these biases at the firm level is to ignore them. The
rationale could be that, when these biases are aggregated at the firm level, they end up
being classical measurement error: i.e., they do not impact coefficients of the
explanatory variables when patents or citations are used as dependent variables. We
show that this is not the case. In particular, these biases at the firm level are strongly
correlated with firm characteristics that are of key interest to researchers, both when
we look at unadjusted and adjusted biases. Specifically, market capitalization, the
R&D-to-sales ratio, and the ratio of cash to total assets are positively associated with
patent and citation bias, while bid-ask spread is negatively associated with such
biases. These biases are also related to technological and regional characteristics
associated with the firm. Thus, in many empirical settings where firm-level
innovation is explored, several inferences about the phenomenon under study might
be driven by non-classical measurement error.
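The two bias measures defined above can be illustrated with a minimal sketch. It assumes that, for each firm and application year, we already hold one count from the data ending in 2006 and one from the data collected through 2012. The record layout, the normalization by the 2012 count, and the sample numbers are all assumptions made for illustration; they are not taken from the study.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class FirmYearCount:
    firm_id: str           # hypothetical firm identifier
    application_year: int
    count_2006: int        # patents (or citations) recorded in the data ending in 2006
    count_2012: int        # the same measure in the data collected through 2012


def bias(rec):
    """Counts present in the longer window but missing from the 2006 data."""
    return rec.count_2012 - rec.count_2006


def relative_bias(rec):
    """Bias as a share of the 2012 count (an assumed normalization)."""
    if rec.count_2012 == 0:
        return 0.0
    return bias(rec) / rec.count_2012


def mean_relative_bias_by_year(records):
    """Average relative bias for each application year."""
    grouped = defaultdict(list)
    for rec in records:
        grouped[rec.application_year].append(relative_bias(rec))
    return {year: sum(vals) / len(vals) for year, vals in sorted(grouped.items())}


# Made-up numbers, for illustration only.
sample = [
    FirmYearCount("A", 1998, 20, 20),
    FirmYearCount("B", 1998, 9, 10),
    FirmYearCount("A", 2004, 6, 10),
    FirmYearCount("B", 2004, 2, 8),
]
by_year = mean_relative_bias_by_year(sample)
```

The same functions apply to citation bias if the two counts are citations received rather than patents granted. Grouping by technology class, industry, or region instead of application year would follow the same pattern.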
Relevance of Study:
1. Unadjusted and Adjusted Biases in Publicly Traded Firms across Time
2. Unadjusted and Adjusted Biases in Publicly Traded Firms across Technology Class
3. Unadjusted and Adjusted Biases in Publicly Traded Firms across Regions
4. Unadjusted and Adjusted Biases in Publicly Traded Firms across Industries
5. Bias and Firm Characteristics
6. The Impact on Inference
7. The Impact of Firm Exits
8. Misleading Assignments
9. Robustness: Newer Ways of Adjustment
10. Concordance Limitations
11. Strategic Citations
12. A Checklist for Researchers
Need of the study
This section provides a general introduction to patent data. The researcher more
experienced with patent data may wish to skip directly to the second section, where we
turn to a discussion of the major potential biases associated with its use. The less
experienced reader may wish to refer to the description of the patent award process in
Part A of the Online Appendix.
The Necessity of Patent Data
It might be thought that innovation can be studied by examining research and
development expenditures. But these measures are highly imperfect for three reasons:
• First, firms need only report R&D if the expenditures are “material.” It is thus
difficult to interpret non-responders: does this mean that no research was performed,
or that it was for some reason interpreted as immaterial? For instance, under U.S. tax
law, service firms are generally unable to take advantage of the R&D tax credit. As a
result, institutions such as major investment banks, which may employ dozens of
PhDs and have well-defined new product development groups, nonetheless often
report no R&D expenditures.
• Second, R&D expenditures are typically not broken down by product line or
geography: rather, firms just give an indication of activity at a firm-wide level. Thus,
any detailed analysis of divisional or geographic differences within a firm will be
stymied.
• Finally, R&D expenditures are an innovative input, rather than an output. The
effectiveness of the research may vary tremendously. For instance, between the 2001
and 2011 fiscal years, Nokia spent more than three times as much on R&D as
Apple did, yet languished in its ability to introduce innovative, market-relevant
products.
All these problems can, at least in theory, be addressed by patent data. Most
key discoveries are protected through patent filings. Patent awards provide a wealth
of technological, geographical, and industry data. The relationship between an
invention’s economic importance and patent data, particularly citations, is well
documented. All these considerations are leading to a greater interest in patent data by
financial economists and related management researchers.
Patent Data for Researchers
The first, most fundamental U.S. database is from the U.S. Patent and
Trademark Office (USPTO) itself. This information can be accessed online at
http://www.uspto.gov. The database covers patents awarded between 1976 and today.
Earlier patents are included as well, but only in PDF format. These data pose several
issues. First, there is no identifier that uniquely flags each applicant. Moreover, a
huge number of variants of each name appear. In part, this reflects inconsistency on
the part of the applicants, but it also reflects sloppiness on the part of USPTO. For
instance, the patent assignment data contain several hundred variants of IBM,
differing in punctuation, spelling, and the use of corporate and legal suffixes (Thoma,
et al., 2010). An additional problem is that these data are difficult to use. While it is
possible to extract the records into a file suitable for regression analysis using PERL
or a similar program, these data are not “user friendly.” The NBER Patent Citation
Dataset—created under the leadership of Bronwyn Hall, Adam Jaffe, and Manuel
Trajtenberg (HJT)—was designed to address these difficulties, as well as the fact that
USPTO data then (and still now) is not in the easiest form for undertaking research.
The original database sought to capture the key information on each utility patent
awarded between 1963 and 1999 in a readily accessible database. (About 99% of all
patents issued are utility patents; there are also design, plant, and a few other
specialized categories of patents.) The authors created a number of original measures.
They assembled a six-class classification scheme, which consolidated the many
hundreds of patent classes employed by the USPTO into broad categories, such as
drugs and medical, computers and communications, mechanical, and so forth. The
authors recorded the grant date and the application year (though only the final
application date). They computed the generality and originality of patent issues, two
citation-based proxies for the fundamental nature of awards.
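The text above does not spell out how generality and originality are computed. A minimal sketch follows, assuming the Herfindahl-style definition commonly attributed to HJT: one minus the sum of squared shares of citations across technology classes, using citations received for generality and citations made for originality. The function names and the class labels in the usage note are illustrative assumptions.

```python
from collections import Counter


def class_dispersion(classes):
    """Return 1 minus the sum of squared class shares, or None if there are no citations."""
    total = len(classes)
    if total == 0:
        return None  # the measure is undefined without citations
    counts = Counter(classes)
    return 1.0 - sum((n / total) ** 2 for n in counts.values())


def generality(classes_of_citing_patents):
    """Dispersion of the citations a patent receives across technology classes."""
    return class_dispersion(classes_of_citing_patents)


def originality(classes_of_cited_patents):
    """Dispersion of the citations a patent makes across technology classes."""
    return class_dispersion(classes_of_cited_patents)
```

Under this assumed definition, a patent cited only by patents in a single class has a generality of 0, while a patent cited equally by two classes, for example `generality(["drugs", "drugs", "computers", "computers"])`, has a generality of 0.5.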
Objective of the study
The patent system is understood to promote technical progress in four ways:
(i) it encourages research and invention;
(ii) it induces an inventor to disclose his discoveries instead of keeping them
as a trade secret;
(iii) it offers a reward for the expenses of developing inventions to the stage at
which they are commercially practicable; and
(iv) it provides an inducement to invest capital in new lines of production
which might not appear profitable if many competing producers embarked on
them simultaneously.
The object of granting a patent is to encourage and develop new technology
and industry. An inventor may disclose the new invention only if he is
rewarded; otherwise he may work it secretly.
Limitations of the study
The 2006 update of the NBER database addressed several of the issues associated with
the earlier database (Bessen, 2009):
• Patents and citations were included through the end of 2006.
• If a patent had multiple assignees, all were included.
• Patents other than utility awards were included in the sample.
• International Patent Classification classifications were included, in addition to
detailed U.S. subclasses.
The most significant progress, however, was made on matching assignees to Compustat
identifiers (using GVKEYs, the more “permanent” of the two firm identifiers used in
Compustat). In particular, the authors took the following steps, which substantially
increased the detail about and number of matched patents:
• Creating a separate file that links cases where a firm has multiple GVKEYs (e.g.,
because it has multiple securities trading) to a single firm identifier, which was
then associated with each patent.
• Adding an ownership chain. To the extent that firms were acquired and their
ownership changed, the successive GVKEYs were identified. It should be noted that
these were identified through the Thomson SDC M&A database, which misses many
smaller transactions involving private firms, but should have good coverage of
publicly traded firms such as the ones tracked here.
• Extending the number of matches between assignee names and Compustat. This
was done by:
o Using the same mapping between Compustat identifiers and assignee names as
employed in the 1989 data set, and applying it to the more recent awards.
o Using a computerized algorithm which stripped suffixes (e.g., Inc. or LLC) and
standardized abbreviations, and identifying exact matches between Compustat and
patent assignee names.
o Identifying inexact matches that were nonetheless assigned a high probability of
being matches by the program, and then manually examining the Compustat entries
and patent records.
The 2006 data update did not, however, revisit the mapping between parent and
subsidiary firms. To the extent that the mapping in the 1989 data set was no longer
accurate due to subsequent acquisitions and divestitures, corporations might be
assigned fewer or more patents than they actually received. Of course, as long as
firms assign patents to the ultimate corporate parent rather than subsidiaries, this
issue should not surface.
Since the completion of the 2006 NBER database, there have been a number of efforts
to update and enhance these data. One notable effort is the U.S. Patent Inventor
Database (also known as the HBS Patent Database), which sought to rationalize the
(frequently inconsistently reported) names of individual inventors. The database’s
features are described in Li, et al. (2014).
Another issue, not addressed by either of the NBER patent datasets, relates to earlier
patents. Pre-1963 patents are not available from either version of the database (the
2006 version only extends back to 1976), and are included on the USPTO web site only
in scanned (PDF) form. However, these patents have been digitized by Google, albeit
imperfectly due to the limitations of its text recognition software. These have been
used in a variety of recent papers, including Moser and Voena (2012) and Kogan, et al.
(2017). Kogan, et al. make this dataset available at
https://iu.app.box.com/patents. In these cases, the authors have done manual matches
to the publicly listed firms, as no concordance exists.
The state of development of data from other patent offices is much less mature. The
area with the greatest activity has been in Europe. The European Patent Office makes
its data available on-line and through CDs, and certain aspects of the European patent
system (e.g., re-examinations) have been the subject of academic scrutiny (Graham, et
al., 2004). But the development of an EPO research database remains a work in
progress: an initial mapping of UK firms’ filings has been undertaken by Grid Thoma
and co-authors, as well as a mapping between the names in Bureau van Dijk’s Amadeus
dataset and European patent assignees (http://www.epip.eu/datacentre.php). While
there have been recent efforts to make Chinese and Japanese data available online as
well, this information remains much less well scrutinized.
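The name-matching step described earlier in this section (stripping legal suffixes, standardizing abbreviations, then looking for exact matches between Compustat and assignee names) can be sketched as follows. This is an illustrative approximation, not the algorithm actually used for the NBER data: the suffix list, the abbreviation table, the function names, and the sample GVKEY value in the usage note are all assumptions.

```python
import re

# Assumed, deliberately short lists; a real concordance would need far more entries.
LEGAL_SUFFIXES = {"INC", "INCORPORATED", "LLC", "CORP", "CORPORATION",
                  "CO", "COMPANY", "LTD", "LIMITED"}
ABBREVIATIONS = {"INTL": "INTERNATIONAL", "MFG": "MANUFACTURING", "TECH": "TECHNOLOGY"}


def standardize_name(name):
    """Upper-case, drop punctuation, strip trailing legal suffixes, expand abbreviations."""
    text = name.upper().replace("&", " AND ").replace(".", "")
    tokens = re.sub(r"[^A-Z0-9 ]", " ", text).split()
    # Remove trailing suffixes, but always keep at least one token.
    while len(tokens) > 1 and tokens[-1] in LEGAL_SUFFIXES:
        tokens.pop()
    return " ".join(ABBREVIATIONS.get(tok, tok) for tok in tokens)


def exact_matches(assignee_names, compustat_names):
    """Map each assignee name to a GVKEY when the standardized names are identical.

    compustat_names is a dict from firm name to GVKEY.
    """
    index = {standardize_name(n): gvkey for n, gvkey in compustat_names.items()}
    matches = {}
    for assignee in assignee_names:
        key = standardize_name(assignee)
        if key in index:
            matches[assignee] = index[key]
    return matches
```

For example, `exact_matches(["ACME INTERNATIONAL LLC"], {"Acme Intl, Inc.": "001234"})` links the two spellings under these assumed rules. Names that remain unmatched would go on to the inexact-matching and manual review step described in the text.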
5. Additional Issues
Hypothesis of the study
1. The Impact of Technology Class
2. The Central Challenges
3. The Impact of Time
4. The Impact of Region
5. Are Popular Patent-Level Adjustments for Biases Sufficient for Firm-Level Analyses?
Research Methodology
Primary Data
Secondary Data
References
Almeida, Heitor, Po-Hsuan Hsu, and Dongmei Li, 2013, “Less is More: Financial Constraints and Innovative Efficiency,” Unpublished working paper, University of Illinois at Urbana-Champaign, University of Hong Kong, and University of South Carolina.
Amore, Mario, Cedric Schneider, and Alminas Zaldokas, 2013, “Credit Supply and Corporate Innovation,” Journal of Financial Economics 109, 835–855.
Atanassov, Julian, 2013, “Do Hostile Takeovers Stifle Innovation? Evidence from Antitakeover Legislation and Corporate Patenting,” Journal of Finance 68, 1097–1131.
Autor, David, David Dorn, Gordon Hanson, Gary Pisano, and Pian Shu, 2016, “Foreign Competition and Domestic Innovation: Evidence from U.S. Patents,” Working Paper no. 22879, National Bureau of Economic Research.
Becker-Blease, John R., 2011, “Governance and Innovation,” Journal of Corporate Finance 17, 947–958.
Bessen, James, 2009, “NBER PDP Project User Documentation: Matching Patent Data to Compustat Firms,” Unpublished working paper, Boston University.
Chakraborty, Atreya, Zaur Rzakhanov, and Shahbaz Sheikh, 2014, “Antitakeover Provisions, Managerial Entrenchment and Firm Innovation,” Journal of Economics and Business 72, 30–43.
Chava, Sudheer, Alexander Oettl, Ajay Subramanian, and Krishnamurthy Subramanian, 2013, “Banking Deregulation and Innovation,” Journal of Financial Economics 109, 759–774.
Chemmanur, Thomas, and Xuan Tian, 2014, “Anti-Takeover Provisions, Innovation, and Firm Value: A Regression Discontinuity Analysis?,” Unpublished working paper, Boston University and Indiana University.
Cockburn, Iain M., Samuel Kortum, and Scott Stern, 2002, “Are All Patent Examiners Equal? The Impact of Examiner Characteristics,” Working Paper no. 8980, National Bureau of Economic Research.
Cornaggia, Jess, Yifei Mao, Xuan Tian, and Brian Wolfe, 2015, “Does Banking Competition Affect Innovation?,” Journal of Financial Economics 115, 189–209.
Dass, Nishant, Vikram Nanda, and Steven C. Xiao, 2017, “Truncation Bias Corrections in Patent Data: Implications for Recent Research on Innovation,” Journal of Corporate Finance 44, 353–374.
Work Plan
Proposed Chapterization Scheme