Analyzing the citation characteristics of books: edited books, book series and types of publishers in the Book Citation Index

Daniel Torres-Salinas1, Nicolás Robinson-García2, Álvaro Cabezas-Clavijo3, Evaristo Jiménez-Contreras2

1 torressalinas@ EC3: Evaluación de la Ciencia y de la Comunicación Científica, Centro de Investigación Biomédica Aplicada, Universidad de Navarra (Spain)

2 {elrobin, evaristo}@ugr.es, acabezasclavijo@3 EC3: Evaluación de la Ciencia y de la Comunicación Científica, Departamento de Información y Documentación, Universidad de Granada (Spain)

Abstract

This paper presents a first approach to analyzing the factors that determine the citation characteristics of books. For this we use Thomson Reuters' Book Citation Index, a novel multidisciplinary database launched in 2010 which offers bibliometric data on books. We analyze three factors which are considered to affect the citation impact of books: the presence of editors, inclusion in a series and the type of publisher. We also focus on highly cited books, defined as those in the top 5% of the most cited books in the database, to see whether these factors affect them as well. We define these three aspects and present the results for four major scientific areas (Science, Engineering & Technology, Social Sciences and Arts & Humanities) in order to identify field-based differences. Finally, we conclude that differences were noted for edited books and types of publishers, while books included in series showed higher impact in two areas.

Conference Topic

Scientometrics Indicators - Relevance to Science and Technology, Social Sciences and Humanities (Topic 1), Old and New Data Sources for Scientometric Studies: Coverage, Accuracy and Reliability (Topic 2).

1. Introduction

One of the basic outcomes of bibliometrics and citation analysis is the characterization of document types and field-based impact, which allows fair comparisons and a better understanding of the citation patterns of researchers (Bar-Ilan, 2008). These studies are of great relevance within the field as they put into context the impact of research as well as certain 'anomalies' such as, for instance, the higher impact of reviews over research papers (Archambault & Larivière, 2009), the impact of research collaboration (Lambiotte & Panzarasa, 2009) or the importance of monographs within the Humanities (Hicks, 2004). In this sense, the role played by citation indexes in general, and by those developed by Eugene Garfield and maintained by Thomson Reuters in particular, has been of key importance for the development of such analyses (Garfield, 2009). However, these citation indexes are mainly devoted to scientific journals, neglecting other communication channels such as monographs. Hence, despite the many attempts made to analyze their impact (e.g., Torres-Salinas & Moed, 2009; White et al., 2009; Linmans, 2010; Kousha, Thelwall & Rezaie, 2011), little is known about the citation patterns of books.

Many studies have tried to characterize the citation patterns of books. However, these studies are normally based on small data sets drawn from specific disciplines. For instance, Cronin, Snyder & Atkins (1997) compared a list of top influential authors derived from journal citations with one derived from books in Sociology, concluding that these two types of publications reflect two complementary pieces of a fragmented picture. Tang (2008) goes a step further and examines the citation characteristics of a sample of 750 monographs in the fields of Religion, History, Psychology, Economics, Mathematics and Physics, finding significant differences when compared with the findings reported in the literature on citations in journal articles. Georgas & Cullars (2005) adopt a different approach and analyze the citation characteristics of the Linguistics literature in order to determine whether the habits of researchers in this field are more closely related to the Social Sciences than to the Humanities. In general, the conclusions of these studies must be taken with caution, as they cannot be extended to all scientific fields.

This scenario may change radically with the launch of Thomson Reuters' Book Citation Index (henceforth BKCI), which provides large sets of bibliometric data on monographs. This database was launched in October 2010 as a greatly delayed answer to the request of Eugene Garfield, who stated: 'Undoubtedly, the creation of a Book Citation Index is a major challenge for the future and would be an expected by-product of the new electronic media' (Garfield, 1996). At the time of its launch, it indexed 29618 books and 379082 book chapters covering a time period from 2005 to the present (Torres-Salinas et al., 2012). However, it now covers a time span starting in 2003. According to Testa (2010), the BKCI follows a rigorous selection process in which the main criteria are the following: currency of the publications, complete bibliographic information for all cited references, a preference for the English language and the implementation of a peer review process. To date, only two studies have been found that analyze the internal characteristics of the BKCI (Leydesdorff & Felt, 2012; Torres-Salinas et al., 2012). Seminal studies of this kind, which dissect the coverage, caveats and limitations of a source, are highly valued because they serve to validate its accuracy and reliability for bibliometric purposes.

In this context, we present an analysis of the citation characteristics of books relying on the data provided by the BKCI. Specifically, this study aims to analyze whether the following factors influence citation patterns in the four main macro-areas of scientific knowledge:

1) Edited books vs. Non-edited books. There is a perception that edited books usually have a greater impact than non-edited books. To what extent is this true? Are there differences by field?

2) Series books vs. Non-series books. The prestige or impact derived from the collection in which a book is included is considered in certain areas as evidence of the quality of the book. Is there any empirical evidence for such a claim?

3) Type of publishers. Is the publisher's prestige related to the impact of its books? Which publishers have more impact: university presses, commercial publishers or other academic publishers?

2. Material and methods

This section is structured as follows. First we describe the data retrieval and processing procedures, indicating the normalization problems encountered and how these were solved. Also, we define the areas under study and how these were constructed, basing our methodology for this on previous studies and offering an overview of the distribution of books by areas in the BKCI. Then, in subsection 2.2, we define the variables analyzed and we describe the methodology followed as well as the statistical analysis undertaken in order to pursue the goals of the study.

2.1. Data retrieval and processing, and definition of areas

Records indexed as 'book chapter' and as 'book' according to the Book Citation Index were downloaded in May 2012. We selected the 2005-2011 study period, based on the availability of the data at the time of retrieval. The data was then loaded into a relational database created for this purpose. During data processing, publisher names were normalized, as many had variants that differed depending on the location of their head offices in each country. For instance, Springer uses variants such as Springer-Verlag Wien, Springer-Verlag Tokyo and Springer Publishing Co, among others. Next, we unified the citations received by books, adding to them the citations received by their book chapters. The reason for doing so lies in the way the BKCI is designed, as it counts separately the citations received by a book and those received by a book chapter included in it. In this study we considered as citations to a book the sum of those received by its book chapters and those received by the book itself.
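The two processing steps described above, publisher-name normalization and the aggregation of chapter citations into book totals, can be sketched as follows. This is a minimal illustration and not the procedure actually used in the study: the record layout, the field names (`book_id`, `kind`, `publisher`, `cites`), the variant table and the citation counts are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical variant table: the text names these Springer variants but does
# not publish its full mapping, so this dictionary is an assumption.
PUBLISHER_VARIANTS = {
    "springer-verlag wien": "Springer",
    "springer-verlag tokyo": "Springer",
    "springer publishing co": "Springer",
}


def normalize_publisher(name):
    """Map a raw publisher string to a canonical name (unknown names pass through)."""
    key = " ".join(name.lower().split())
    return PUBLISHER_VARIANTS.get(key, name.strip())


def total_book_citations(records):
    """Sum the citations to each book record and to all of its chapter records."""
    totals = defaultdict(int)
    for rec in records:
        totals[rec["book_id"]] += rec["cites"]
    return dict(totals)


# Invented toy records: one book with two chapters, one book with none.
records = [
    {"book_id": "B1", "kind": "book", "publisher": "Springer-Verlag Wien", "cites": 3},
    {"book_id": "B1", "kind": "chapter", "publisher": "Springer-Verlag Wien", "cites": 5},
    {"book_id": "B1", "kind": "chapter", "publisher": "Springer-Verlag Wien", "cites": 2},
    {"book_id": "B2", "kind": "book", "publisher": "Springer Publishing Co", "cites": 1},
]

book_totals = total_book_citations(records)
publishers = {normalize_publisher(r["publisher"]) for r in records}
```

In this toy data, book B1 ends up with the citations to the book record plus those to its two chapters, and both raw publisher strings collapse to a single canonical name.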

It is necessary to mention that a single fixed citation window was applied to all books regardless of their publication year, which means that older books have had a greater chance of being cited than the rest. Also, we must indicate that citations included in the BKCI come from all the citation indexes provided by Thomson Reuters (SCI, SSCI and A&HCI) and not only from the BKCI. Once the total citations of each book were established, we excluded Annual Reviews, which accounts for a total of 234 records, as this publisher does not publish books but journals, as indicated by Torres-Salinas et al. (2013). Hence the final sample analyzed comprised 28634 books.

In order to provide the reader with a general overview, we decided to cluster all subject categories of the BKCI (249) into four macro areas: Arts & Humanities (HUM), Science (SCI), Social Sciences (SOC) and Engineering & Technology (ENG). Aggregating subject categories is a classical perspective followed in many bibliometric studies when adopting a macro-level approach (Moed, 2005; Leydesdorff & Rafols, 2009). These aggregations are needed in order to provide the reader with an overview of the whole database. In this way we minimized the overlap caused by records assigned to more than one subject category. Also, we consider that such areas are easily identifiable by the reader, as they mirror the other Thomson Reuters citation indexes (Science Citation Index, Social Science Citation Index and Arts & Humanities Citation Index). The only exception is the Sciences, which, due to the heterogeneity of such a broad area, were divided into two areas: Science and Engineering & Technology. Table 1 shows the distribution of the sample of books analyzed across the four areas.

Table 1. Distribution of books analyzed in this study by areas as well as total and average citations received according to the Book Citation Index. 2005-2011.

Discipline                      Total Books   % Books from the BKCI   Total Citations   Average Citations
ENGINEERING & TECHNOLOGY               3871                     14%             34705                8.97
ARTS & HUMANITIES                      8251                     29%             52224                6.33
SCIENCE                                9682                     34%            241230               24.91
SOCIAL SCIENCE                        10637                     37%             99943                9.40
Total Books without duplicates        28634                    100%            392429               13.70
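As a quick consistency check on Table 1, the averages and shares can be recomputed from the reported totals. The snippet below is only an illustrative sketch; the variable names are assumptions and the figures are copied from the table. It also shows that the per-area book counts add up to more than the deduplicated total, which is what one would expect if some books are assigned to more than one area.

```python
# Figures copied from Table 1: (total books, total citations, reported average).
table1 = {
    "Engineering & Technology": (3871, 34705, 8.97),
    "Arts & Humanities": (8251, 52224, 6.33),
    "Science": (9682, 241230, 24.91),
    "Social Science": (10637, 99943, 9.40),
}
TOTAL_BOOKS_DEDUP = 28634  # "Total Books without duplicates" row

recomputed = {}
for area, (books, cites, reported_avg) in table1.items():
    average = cites / books                   # citations per book
    share = 100 * books / TOTAL_BOOKS_DEDUP   # % of the deduplicated sample
    recomputed[area] = (average, share)
    print(f"{area}: {average:.3f} per book (reported {reported_avg}), {share:.1f}% of books")

# Largest gap between a recomputed and a reported average.
max_gap = max(abs(recomputed[a][0] - table1[a][2]) for a in table1)

# Each book is counted once per area it belongs to, so the sum exceeds the total.
books_with_overlap = sum(books for books, _, _ in table1.values())
overlap = books_with_overlap - TOTAL_BOOKS_DEDUP
```

The recomputed averages agree with the reported ones to within 0.01, a gap consistent with rounding or truncation in the original table.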

2.2. Definition of variables and indicators

Now, we define and describe the three variables analyzed to characterize books' citations: presence of editors, inclusion of books in a series and type of publisher.

Presence of editors. In order to distinguish edited from non-edited books, we considered as edited those which were indexed as such according to the Book Editor (ED) field provided by the BKCI. We considered as non-edited those books which had no information in this field. For instance, the book entitled 'Power Laws in the Information Production Process: Lotkaian Informetrics', which is single-authored by L. Egghe, has no information in the ED field; therefore it is considered a non-edited book. In contrast, the book 'Web 2.0 and Libraries: Impacts, Technologies and Trends' is edited by D. Parkes and G. Walton and has contributions from different authors; therefore it is considered an edited book.

Inclusion in a series or collection. In order to analyze the inclusion of books in a series or a collection we used the field defined in the BKCI as Series (SE), tagging as series books those records which contained information in this field and as non-series books those which did not. We identified a total of 3374 different series in the BKCI. The series with the highest number of books indexed in the BKCI for each field are: 'Studies in Computational Intelligence' published by Springer (243 books) in Engineering & Technology; 'New Middle Ages' by Palgrave (49 books) in Arts & Humanities; 'Methods in Molecular Biology' by Humana Press Inc (232 books) in Science; and 'Chandos Information Professional Series' by Chandos (118 books) in Social Sciences.
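The two tagging rules above (a record is edited if its ED field is filled in, and belongs to a series if its SE field is filled in) can be sketched as follows. This is a minimal illustration under stated assumptions: records are represented as plain dictionaries keyed by field tag, the `TI` and `AU` keys are included only for readability, and all records are invented placeholders rather than data from the BKCI.

```python
def has_value(record, field):
    """Return True when `field` is present in the record and not blank."""
    return bool((record.get(field) or "").strip())


def tag_record(record):
    """Flag a book as edited and/or in a series from its ED and SE fields."""
    return {
        "edited": has_value(record, "ED"),
        "in_series": has_value(record, "SE"),
    }


# Invented placeholder records; titles, editors and series are not real data.
records = [
    {"TI": "Placeholder monograph", "AU": "Author, A"},
    {"TI": "Placeholder edited volume", "ED": "Editor, B; Editor, C",
     "SE": "Placeholder Series"},
    {"TI": "Placeholder record with blank fields", "ED": "  ", "SE": ""},
]

tags = [tag_record(r) for r in records]
edited_share = sum(t["edited"] for t in tags) / len(tags)
```

Treating a blank or whitespace-only field as missing is a design choice of this sketch; the text only states that records with no information in the field were tagged as non-edited or non-series.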

Type of publisher. In order to define the type of publisher, first we normalized them according to the name variants described above. As a result of such normalization process, 280 publishers were identified in the BKCI. Then, these publishers were distributed across the three following categories:

- University Press. Defined as any publisher belonging to a University such as the Imperial College Press or Duke University Press.

- Non-University Academic Publisher. Publishers belonging or related to organizations such as research institutions, scientific societies or any other type of entity not linked to universities such as the Royal Society of Chemistry or the Technical Research Centre Finland.

- Commercial Publisher. Publishers considered in this group are those not related to universities or any other scientific entity but to firms with profit motive such as Routledge or Elsevier.

Finally, we characterized the factors that determine books' citations using different descriptive statistical indicators. The statistical analysis of the data was carried out with SPSS v 20.0.0. As citation counts were not normally distributed, non-parametric tests were also used to derive levels of statistical significance. These tests were applied for the comparison of means (Mann-Whitney and Kruskal-Wallis tests) between the different factors analyzed at a 0.05 significance level. Furthermore, we analyzed the characteristics of the Highly Cited Books (henceforth HBC), that is, the 5% most highly cited books, according to these three variables. A total of 1534 books were identified as HBC.
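To make the two statistical steps above more concrete, the following sketch implements a Mann-Whitney comparison of two groups and a top-5% selection in plain Python. It is a stand-in for illustration, not the SPSS procedure used in the study. Several assumptions apply: the p-value comes from the normal approximation with tie correction and without continuity correction, the Kruskal-Wallis test is not sketched, and how ties at the 5% cut-off were handled is not stated in the text, so keeping them is an assumption. All function names and sample values are hypothetical.

```python
import math


def midranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Return (U of sample x, two-sided p-value by normal approximation)."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    pooled = list(x) + list(y)
    ranks = midranks(pooled)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    # Tie correction: sum of t^3 - t over groups of t tied values.
    counts = {}
    for v in pooled:
        counts[v] = counts.get(v, 0) + 1
    tie_term = sum(t ** 3 - t for t in counts.values())
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance == 0:  # every value identical: no evidence of a difference
        return u1, 1.0
    z = (u1 - n1 * n2 / 2) / math.sqrt(variance)
    return u1, math.erfc(abs(z) / math.sqrt(2))


def highly_cited(citations, share=0.05):
    """Indices of books at or above the top-`share` threshold (ties are kept)."""
    k = max(1, math.ceil(share * len(citations)))
    threshold = sorted(citations, reverse=True)[k - 1]
    return [i for i, c in enumerate(citations) if c >= threshold]


# Invented citation counts for two small groups of books.
edited = [10, 12, 15]
non_edited = [1, 2, 3]
u_stat, p_value = mann_whitney_u(edited, non_edited)
top_books = highly_cited(list(range(100)))
```

With samples this small an exact test would normally be preferred; the normal approximation is used here only because the groups compared in the study contain thousands of books.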

3. Results

In this section we present the results of our analysis of the impact of books in the BKCI according to three variables: presence of editors, inclusion in series and type of publisher. The section is structured according to these variables.

3.1 Edited vs. Non-edited books

In Table 2 we offer an overview of the sample of books analyzed according to the presence of editors. Overall, of the total sample (ALL), 12646 books (44%) are edited while 15988 books (56%) are not. Edited books have a significantly higher citation rate than non-edited books, as shown by the average and the median values. This occurs in the four areas studied. The largest differences are found in the field of Science, where edited books have an average of 35.51 citations per book as opposed to 10.16 citations per book for non-edited books. Also, edited books reach higher citation values, as indicated by the standard deviation and median values. To a lesser extent, this situation also occurs in the Social Science and Engineering & Technology fields. The smallest differences between edited and non-edited books are found in the field of Arts & Humanities, where edited books have a citation average of 7.61, while non-edited books have an average of 5.81. Differences in citations were statistically significant in all disciplines (CI=95%, p ................