Page 16 - IDEA Study 8 2017 Direct subsidies and R&D output in firms

Dataset

The analysis is based on a large micro dataset that provides rich information on firms’ performance, inventive capabilities, structural characteristics and public support history. The dataset was combined from the following sources:

1. The Research, Development and Innovation Information System of the Czech Republic (ISVaV), which contains complete administrative data on public tenders, subsidy providers, programmes, projects, receivers and results in the field of R&D funded from the national budget (Office of the Government of the Czech Republic 2016).⁵

2. The PATSTAT database administered by the European Patent Office, which contains information on IP protection instruments filed with the 40 main patent authorities across the world, including the Czech Industrial Property Office. PATSTAT is the largest international database of its kind and contains details of 90 million IP documents (EPO 2016a).

3. Bureau Van Dijk’s Amadeus database, which provides balance sheets, income statements, employment and demographic micro data on Czech firms with 25 or more employees (Bureau Van Dijk 2016).⁶

The PATSTAT database does not provide unique identification codes for individual applicants. We therefore manually matched the names of the organizations in PATSTAT that list the Czech Republic as their country of origin with the Register of Economic Subjects (ARES)⁷ and assigned the unique taxpayer identification number (IČO) to each organization. We then merged the data from PATSTAT, Amadeus and ISVaV using this identifier.

⁵ The ISVaV data used in this study was valid on January 27, 2016, when a database snapshot was extracted from the original website: (Office of the Government of the Czech Republic 2016). Since then, the database has been moved to a new domain: (Office of the Government of the Czech Republic 2017), which, however, suffers from incomplete records and limited functionality. Note that the ISVaV has unfortunately never provided data on unsuccessful applicants.

⁶ In the Amadeus database, missing data on the number of employees, location, legal form and industry was estimated using a 1-year lag and a 1-year lead.

⁷
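
The linking step described above (assigning an IČO to PATSTAT applicants, then merging the three sources on that identifier) can be illustrated with a minimal sketch. The study matched names manually; the exact normalized-name lookup below is only a hypothetical pre-screening aid that leaves non-matching applicants for manual review. All function names, field names (`applicant`, `appln_id`, `project`, `employees`) and records are assumptions made for illustration, not the authors' code or data.

```python
from collections import defaultdict


def normalize_name(name):
    """Uppercase, replace punctuation with spaces, collapse whitespace."""
    cleaned = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in name.upper())
    return " ".join(cleaned.split())


def assign_ico(applicants, register):
    """Split applicants into exact normalized-name hits and the rest."""
    lookup = {normalize_name(name): ico for name, ico in register.items()}
    matched, unmatched = [], []
    for row in applicants:
        ico = lookup.get(normalize_name(row["applicant"]))
        if ico is None:
            unmatched.append(row)  # left for manual review
        else:
            matched.append({**row, "ico": ico})
    return matched, unmatched


def merge_on_ico(firms, projects, patents):
    """Attach project and patent lists to each firm record, keyed on IČO."""
    projects_by_ico = defaultdict(list)
    for row in projects:
        projects_by_ico[row["ico"]].append(row["project"])
    patents_by_ico = defaultdict(list)
    for row in patents:
        patents_by_ico[row["ico"]].append(row["appln_id"])
    return {
        firm["ico"]: {
            **firm,
            "projects": projects_by_ico.get(firm["ico"], []),
            "patents": patents_by_ico.get(firm["ico"], []),
        }
        for firm in firms
    }


# All records below are invented placeholders, not data from the study.
register = {"Example Works, a.s.": "00000001", "Sample Labs s.r.o.": "00000002"}
applicants = [
    {"applicant": "EXAMPLE WORKS A.S.", "appln_id": 101},
    {"applicant": "Unknown Applicant", "appln_id": 102},
]
firms = [{"ico": "00000001", "employees": 120}, {"ico": "00000002", "employees": 30}]
projects = [{"ico": "00000001", "project": "P-1"}]

matched, unmatched = assign_ico(applicants, register)
merged = merge_on_ico(firms, projects, matched)
```

The merge starts from the firm records, so firms without any subsidy or patent keep empty lists rather than dropping out; whether the study used this kind of left join is not stated in the text.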
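
The estimation of missing Amadeus values from a 1-year lag and a 1-year lead can be sketched as follows. The text does not say which neighbour takes priority or whether filled values are carried further, so this sketch assumes that the previous year is preferred over the following year and that only originally observed values are used. The function name `impute_lag_lead` and the sample figures are hypothetical.

```python
def impute_lag_lead(series):
    """Fill missing years in a {year: value} mapping.

    Assumption: use the observed value from year - 1 if available,
    otherwise the observed value from year + 1. Filled values are never
    propagated, so gaps longer than two years stay partly missing.
    """
    filled = dict(series)
    for year, value in series.items():
        if value is None:
            previous = series.get(year - 1)
            following = series.get(year + 1)
            filled[year] = previous if previous is not None else following
    return filled


# Invented employee counts for one firm; None marks a missing year.
employees = {2010: 40, 2011: None, 2012: 45, 2013: None, 2014: None, 2015: 50}
imputed = impute_lag_lead(employees)

# A three-year gap: the middle year has no observed neighbour and stays missing.
long_gap = impute_lag_lead({2010: 30, 2011: None, 2012: None, 2013: None, 2014: 35})
```

The same function works for categorical fields such as location, legal form or industry, since it only copies neighbouring values and does no arithmetic.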
