The study deals with research performance assessment as an important aspect of research management and research quality. The case of Kazakhstan clearly demonstrates the impact of the prevailing bibliometric-centered approach. The aim of this study is to suggest an inclusive scale of individual research performance assessment. The method used is a quantitative study of the opinions of researchers and academics on the range of research-related activities they traditionally carry out. The study expands the knowledge base on academic human resource management, and can be of high relevance for substantiating the criteria of performance assessment of researchers by HR managers of universities and public research institutions. The research results can be helpful for setting and complying with individual and institutional criteria for research performance evaluation.
A survey among 264 researchers in Kazakh universities and public research institutions (response rate: 63%) asked them to rate their activities in five groups: supervising activity, professional advancement, publications, public recognition, and scientific & organizational activities.
The results demonstrate that their priorities correspond to national and international priorities: publishing papers in local and international peer-reviewed journals indexed in WoS and Scopus, and monographs.
Findings revealed that “Supervising” and “Professional Advancement” activities have the highest importance among all criteria groups.
Respondents gave the highest preference to participating in overseas and international conferences, seminars and workshops, thus expressing their desire to disseminate their research findings internationally and to build international links.
Scientific & organizational (S&O) activities are the core activities, correlating with all other activities, and the role of S&O criteria is underestimated in the performance assessment of researchers.
The current research performance evaluation system in Kazakhstan is predominantly based on bibliometrics and is one-sided and biased. An inclusive scale of individual research performance assessment needs to be developed, considering researchers’ ideas and preferences.
Performance evaluation plays an important role in management. Performance evaluation is presented as a process of quantification of the performance and effectiveness of actions (
Research management is quite a creative process. There are no exact standards that allow an objective and precise assessment of research performance. The assessment of individual researchers remains an inherently difficult process without a standard solution.
Research performance evaluation has a range of goals. There are numerous studies on the impact of research performance evaluation. Some studies state that employees benefit from performance evaluation which leads to greater motivation (
Nowadays, the two main approaches used in research performance evaluation are peer review and bibliometrics. The latter tends to be more popular due to the growing volume of open information published on the internet. The Web of Science and Scopus citation databases are the main tools available for bibliometric assessment. Bibliometric measures derived from these databases have become target indicators for the performance assessment of researchers, organizations and countries. Global ranking agencies also use bibliometric measures as evaluation criteria. One of the criteria of the Global Competitiveness Index is the number of publications and their citations in the Scopus database, while the Global Innovation Index is based on citation indices derived from the Web of Science. The world university rankings (for instance, QS, Times Higher Education, ARWU) also push research institutions and universities to publish in sources indexed in Web of Science or Scopus. Bibliometrics has become entrenched in the individual, team and institutional assessment of research performance.
Obviously, inappropriate indicators create perverse incentives (
Moreover, researchers do not only publish; they engage in many other activities: reviewing, organizing conferences, writing proposals, supervising students’ research, etc. Despite their high significance, these kinds of activities are not taken into account when measuring research performance. While promoting “inclusive metrics of success and impact to dismantle a discriminatory reward system in science”, Davies et al. (
The current research adopts the inclusive view: the main issue is to recommend inclusive performance assessment criteria through assessing the perceived significance of each type of activity. The inclusive performance assessment motivates high quality of research, high level of productivity and strong performance in all areas.
Therefore, this research aims to assess the importance of different types of activities and metrics from the point of view of researchers in order to recommend an inclusive set of research performance criteria.
The two main approaches of research performance evaluation are peer review and bibliometrics. The debates over the pros and cons of both approaches have continued for about three decades. Here we discuss the recent research findings on both methods.
Peer review is a qualitative tool. The main drawbacks of peer review that have been observed over time are subjectivity and high cost. In addition, the current peer review system has been criticized for failing to represent the opinions and interests of non-peer clients (
Despite the inconsistency of the h-index and the inappropriateness of the journal impact factor for research performance assessment (
However, the stand-alone use of bibliometrics brings distorted results. The emphasis on publications in high impact factor journals diverts the attention of researchers from the real contribution to knowledge production. They shift their focus to tactical actions to match established reward and promotion systems (
The dominance of the bibliometric approach in research performance evaluation has both short and long-term consequences such as encouraging plagiarism, “salami slicing”, demotivation of young researchers to write papers due to slim chances of publication in high-impact-factor journals, etc. (
To apply bibliometric methods it is essential to ensure that international journals are a major means of communication in each particular field (
Based on the literature review, it should be noted that the methodology of bibliometric assessment has a number of advantages over the method of expert assessment: it is inexpensive, non-invasive, easy to implement, provides both rapid updates and intertemporal comparisons, is based on objective quantitative and qualitative data, and provides a high degree of representativeness. Citation metrics can make the process more efficient and cost-effective, but expert evaluation should remain a central element in any research evaluation process (Butler, 2007). The challenge is to combine these two methodologies so that they work effectively together (
One instance where the bibliometric approach can replace peer review is in assessing groups (
The literature review on approaches to research performance assessment shows that both the bibliometric and expert review approaches have their advantages and disadvantages. Recent studies show that there is a trend to combine these two approaches to achieve synergy in research performance evaluation results.
The science evaluation policy reforms in Kazakhstan started in 2011, when the country joined the Bologna education system and announced new PhD thesis requirements of publication in WoS- or Scopus-indexed journals. That led to a surge of publications in predatory journals.
Later, Kazakhstan continued, step by step, its policy of introducing bibliometrics into research project evaluation, setting requirements for research project principals, evaluating candidates for the conferring of academic titles, etc. As a result of the policy, Kazakhstan has entered the list of the countries most affected by a reliance on bibliometrics (
Number of publications of Kazakhstan per year in Scimago Journal Rank.
YEAR | NUMBER OF CITABLE DOCUMENTS | RANK |
---|---|---|
2000 | 240 | 86 |
2005 | 371 | 90 |
2010 | 472 | 98 |
2015 | 2479 | 67 |
2020 | 5339 | 68 |
When tracking changes to the national policy of research performance evaluation for different purposes (conferring PhD degrees and academic titles, granting funds for projects, etc.) we can confirm that the bibliometric approach has become more sophisticated year on year.
The policy related to conferring PhD degrees is still changing and its history resembles the overall policy trends in research performance evaluation. At the initial stage the policymakers set general publication requirements such as publication in Web of Knowledge journals with a non-zero impact factor or indexed in Scopus (
History of changes in required international peer-reviewed publications for conferring PhD degree in Kazakhstan.
YEAR | REQUIRED INTERNATIONAL PEER-REVIEWED PUBLICATIONS |
---|---|
2011 | At least 1 paper in the journals indexed in ISI Web of Knowledge with a non-zero impact factor or indexed in Scopus. |
2016 | |
2018 | |
When the requirements for principal positions in research projects are analyzed it can be seen that they differ not only between research fields but also between types of research. For basic research, the publication requirements are higher than they are for applied research.
In the natural, technical and life sciences and medicine, the requirements for project principal positions are stricter than they are for other sciences. Within the last 5 years candidates must have:
Expectations for the output of research grants in the natural, engineering and life sciences and medicine also differ between types of research and are higher than they are for other sciences:
In the social sciences, humanities and military sciences the expected output includes at least 1 article and/or review in a peer-reviewed scientific journal ranked Q1-Q3 in Web of Science, and/or included in the Social Science Citation Index or Arts and Humanities Citation Index, and/or having a Scopus CiteScore percentile of at least 35. The lower requirements for the social sciences and humanities are due to the bias of these databases in favor of English-language journals. (
Thus, the research performance evaluation system in Kazakhstan tends to strengthen the bibliometric-centered approach. The recent improvements have been related to differentiations in field and types of research.
Researchers have indicated two approaches to studying and assessing research performance: evaluative and explanatory (
Our study can be considered an explanatory performance study, as it aims to contribute to building the knowledge society by constructing an inclusive scale for performance assessment in the academic and research sector. This approach is predictive because it has the embedded feature of motivating researchers to build capacity through supervising students and growing professionally.
Our research is conducted in two stages:
First, an initial list of all types of research-related activities performed by researchers was compiled. The activities are classified into five groups: supervising, professional advancement, publications, public recognition, and scientific & organizational activities.
Second, these types of research activities were assessed on a five-point Likert scale measuring respondents’ attitudes to a particular activity: 1-Unimportant, 2-Not very important, 3-Moderately important, 4-Important, 5-Very important.
The sample includes 264 people, and the response rate is 63%. The statistical analyses were performed with the IBM SPSS Statistics 22 package. Frequency distributions (number, percentage) are given for categorical variables (e.g. gender), while descriptive statistics (mean, standard deviation, median, minimum, maximum) are given for numerical variables. Normality assumptions for the numerical variables were examined with the coefficients of skewness and kurtosis, which were found to be within a ±2 range. Therefore, parametric statistical methods were used in the study. The differences between two independent groups (e.g. gender) were examined with an independent-samples t-test. The differences between more than two independent groups (e.g. age groups) were analyzed with a one-way analysis of variance (ANOVA). When the one-way ANOVA indicated a difference, Tukey’s multiple comparison test was used to determine between which groups it arises. The statistical significance level was set to p < 0.05 (a 95% confidence level).
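The normality screening described above can be illustrated with a minimal sketch. It assumes simple moment-based estimators of skewness and excess kurtosis (SPSS applies small-sample corrections, so its coefficients would differ slightly), and the `scores` list is hypothetical data, not taken from the survey.

```python
def _central_moment(values, order):
    """Population central moment of the given order."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** order for v in values) / n

def skewness(values):
    """Moment-based skewness (no small-sample correction)."""
    return _central_moment(values, 3) / _central_moment(values, 2) ** 1.5

def excess_kurtosis(values):
    """Moment-based kurtosis minus 3, so a normal distribution scores 0."""
    return _central_moment(values, 4) / _central_moment(values, 2) ** 2 - 3.0

def roughly_normal(values, limit=2.0):
    """Apply the +/-2 rule of thumb to both coefficients."""
    return abs(skewness(values)) <= limit and abs(excess_kurtosis(values)) <= limit

# Hypothetical group scores on the 1-5 Likert scale (illustration only).
scores = [2.5, 3.0, 3.5, 3.5, 4.0, 4.0, 4.0, 4.5, 4.5, 5.0]
print(skewness(scores), excess_kurtosis(scores), roughly_normal(scores))
```

A sample dominated by a single extreme value would fail this screen, in which case non-parametric tests would be the more cautious choice.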
The analysis has shown the following make-up of respondents (
Respondents grouped by the criteria of age, %.
The average age of the respondents is 43.4 years; the youngest is 23 years old and the oldest is 84 years old. 63.7% of them are female. In addition, 55.7% are from the Al-Farabi KazNU faculty and 44.3% are from various public research institutes (Institute of Economics, Institute of maths, Institute of mechanics and machine science, Institute of oriental sciences, Institute of ICT, Institute of language sciences, Institute of philosophy, political sciences and religion studies).
The activities surveyed are classified into five groups: Publications, Public Recognition, Scientific & Organizational activities, Professional Advancement, Supervising. The content of each group is given in Appendix 1.
The overall results of the survey are given in
Average scores of activity groups by respondents (Likert Scale, 1 = unimportant, 5 = very important).
According to the survey, Supervising (4.06) is perceived as the most “important” by the respondents. One might question this outcome, since 55% of respondents come from the university and the result may not hold for researchers of public research institutions (PRI). According to
Analysis of Differences between Activity Scores, by Organization.
ACTIVITY | ORGANIZATION | COUNT | MEAN | STANDARD DEVIATION | T | P |
---|---|---|---|---|---|---|
Scientific & Organizational | University | 147 | 3.70 | 0.79 | 0.441 | 0.660 |
Scientific & Organizational | PRI | 117 | 3.66 | 0.68 | | |
Professional Advancement | University | 147 | 3.98 | 0.76 | -0.142 | 0.887 |
Professional Advancement | PRI | 117 | 4.00 | 0.76 | | |
Public Recognition | University | 146 | 3.83 | 0.92 | 1.574 | 0.117 |
Public Recognition | PRI | 115 | 3.65 | 0.96 | | |
Supervising | University | 143 | 4.16 | 1.02 | 1.806 | 0.072 |
Supervising | PRI | 112 | 3.93 | 1.01 | | |
Publications | University | 145 | 3.85 | 0.81 | 0.736 | 0.462 |
Publications | PRI | 117 | 3.78 | 0.75 | | |
t: Independent sample t Test.
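As a rough cross-check of how the t values in this table arise, the statistic can be recomputed from the reported counts, means and standard deviations. The sketch below assumes the pooled-variance (equal variances) form of the independent-samples t-test; the function name `pooled_t` is illustrative. Because the published means and standard deviations are rounded to two decimals, the result only approximates the reported value.

```python
import math

def pooled_t(n1, mean1, sd1, n2, mean2, sd2):
    """Student's t for two independent groups from summary statistics,
    assuming equal variances. Returns (t, degrees of freedom)."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    std_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / std_error, df

# Supervising row: University (n=143, 4.16, 1.02) vs PRI (n=112, 3.93, 1.01).
t_sup, df_sup = pooled_t(143, 4.16, 1.02, 112, 3.93, 1.01)
print(round(t_sup, 3), df_sup)  # close to the reported t = 1.806
```

With 253 degrees of freedom the two-sided 5% critical value is about 1.97, so a t value near 1.8 is consistent with the reported non-significant p = 0.072.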
Professional Advancement (3.99) can be interpreted as “important” as well for both university faculty staff and PRI researchers. The group of Scientific & Organizational activities has the lowest value as seen in
One may doubt whether Public Recognition can be regarded as an activity, since this group is the result of an activity, or an achievement, rather than an activity itself. However, we are studying the attitudes of researchers to different criteria. Therefore, awards (e.g. public recognition indicators) can be considered as criteria as well.
Analysis of Differences between Activity Scores, by Age.
ACTIVITY | AGE | COUNT | MEAN | STANDARD DEVIATION | F | P |
---|---|---|---|---|---|---|
Scientific & Organizational | 1. Age 23-30 | 48 | 3.55 | 0.68 | 0.882 | 0.451 |
Scientific & Organizational | 2. Age 31-40 | 78 | 3.74 | 0.69 | | |
Scientific & Organizational | 3. Age 41-50 | 59 | 3.74 | 0.82 | | |
Scientific & Organizational | 4. Age 51 and over | 71 | 3.72 | 0.69 | | |
Professional Advancement | 1. Age 23-30 | 48 | 4.12 | 0.55 | 2.500 | 0.060 |
Professional Advancement | 2. Age 31-40 | 78 | 4.15 | 0.72 | | |
Professional Advancement | 3. Age 41-50 | 59 | 3.92 | 0.87 | | |
Professional Advancement | 4. Age 51 and over | 71 | 3.87 | 0.68 | | |
Public Recognition | 1. Age 23-30 | 47 | 4.01 | 0.78 | 4.424 | |
Public Recognition | 2. Age 31-40 | 76 | 3.86 | 0.93 | | |
Public Recognition | 3. Age 41-50 | 59 | 3.80 | 0.88 | | |
Public Recognition | 4. Age 51 and over | 71 | 3.45 | 0.97 | | |
Supervising | 1. Age 23-30 | 45 | 3.99 | 0.96 | 0.559 | 0.642 |
Supervising | 2. Age 31-40 | 76 | 4.18 | 0.84 | | |
Supervising | 3. Age 41-50 | 56 | 4.10 | 1.02 | | |
Supervising | 4. Age 51 and over | 70 | 4.00 | 1.14 | | |
Publications | 1. Age 23-30 | 47 | 3.77 | 0.74 | 0.526 | 0.665 |
Publications | 2. Age 31-40 | 77 | 3.92 | 0.79 | | |
Publications | 3. Age 41-50 | 59 | 3.77 | 0.84 | | |
Publications | 4. Age 51 and over | 71 | 3.83 | 0.70 | | |
*: p < 0.05 (statistically significant), F: One-way analysis of variance (ANOVA), “Tukey” test for differences between groups.
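The F statistic of a one-way ANOVA can likewise be approximated from the group counts, means and standard deviations reported in the table. The following is a minimal sketch under that assumption; `anova_f` is an illustrative helper, and since the inputs are rounded to two decimals the result will not match the published F exactly.

```python
def anova_f(groups):
    """One-way ANOVA F from summary statistics.
    groups: list of (count, mean, standard deviation) tuples.
    Returns (F, between-groups df, within-groups df)."""
    total_n = sum(n for n, _, _ in groups)
    grand_mean = sum(n * mean for n, mean, _ in groups) / total_n
    ss_between = sum(n * (mean - grand_mean) ** 2 for n, mean, _ in groups)
    ss_within = sum((n - 1) * sd ** 2 for n, _, sd in groups)
    df_between = len(groups) - 1
    df_within = total_n - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within

# Public Recognition rows of the table, age groups 1 to 4.
public_recognition = [(47, 4.01, 0.78), (76, 3.86, 0.93),
                      (59, 3.80, 0.88), (71, 3.45, 0.97)]
f_pr, df_b, df_w = anova_f(public_recognition)
print(round(f_pr, 2), df_b, df_w)  # near the reported F = 4.424
```

The sketch covers only the omnibus test; identifying which age groups differ would still require a post hoc procedure such as Tukey's test on the raw data.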
To check the relationships between the activities we applied correlation analysis, which demonstrated a moderate positive linear correlation between the criteria scores (
Analysis of relationships between activities.
1 | 2 | 3 | 4 | 5 | ||
---|---|---|---|---|---|---|
1. Scientific & Organizational | r | 1 | ||||
p | ||||||
n | 264 | |||||
2. Professional Advancement | r | 1 | ||||
p | ||||||
n | 264 | |||||
3. Public Recognition | r | 1 | ||||
p | ||||||
n | 261 | |||||
4. Supervising | r | 1 | ||||
p | ||||||
n | 255 | |||||
5. Publications | r | 1 | ||||
p | ||||||
n | 262 | |||||
r: Pearson correlation coefficient; *: p < 0.05; **: p < 0.01.
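For readers who wish to reproduce this kind of analysis, a minimal sketch of the Pearson coefficient follows. The differing n values in the table suggest that some respondents skipped some groups, so the sketch assumes pairwise deletion of missing values (marked as `None`); that handling, the function name and the sample data are assumptions for illustration, not the survey data.

```python
import math

def pearson_r(x, y):
    """Pearson correlation with pairwise deletion of missing (None) values.
    Returns (r, number of complete pairs)."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    mean_x = sum(a for a, _ in pairs) / n
    mean_y = sum(b for _, b in pairs) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in pairs)
    var_x = sum((a - mean_x) ** 2 for a, _ in pairs)
    var_y = sum((b - mean_y) ** 2 for _, b in pairs)
    return cov / math.sqrt(var_x * var_y), n

# Hypothetical group scores for six respondents (None = no answer).
s_and_o = [3.5, 4.0, 2.5, None, 4.5, 3.0]
publications = [3.0, 4.5, 3.0, 4.0, 4.0, None]
r, n_pairs = pearson_r(s_and_o, publications)
print(round(r, 2), n_pairs)
```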
More specifically, there is a moderate positive correlation between the scores of Scientific & Organizational (S&O) activity and other groups.
Also, a statistically significant positive correlation exists between: 1) Public Recognition group scores and Supervising and Publications group scores; 2) Supervising and Publications criteria scores (
One more logical finding is the correlation between Publications and Supervising activities. Supervising PhD students helps their research supervisors to publish more papers. And vice versa - if the researchers have a high publication activity, this indicates their capacity, and hence they are assigned as a research supervisor to more students. By way of comparison, the Higher Education Council of Turkey’s criterion for higher education institutions is “the number of students in master’s and doctoral programs, the number of doctoral specialities, and the number of doctoral graduates”.
The correlation between Publication and Public Recognition criteria seems plausible and clear as well. According to the national rules for awarding academic degrees and ranks, one should publish a certain number of papers to be nominated for the academic rank of professor or associate professor. To apply for a PhD thesis defence, doctoral candidates must have at least one publication in journals of a certain quartile/percentile in the Web of Science Core Collection or the Journal Citation Reports or Scopus. In the current year, 2021, a new amendment has been added to the national rules for doctoral candidates, according to which a PhD student may skip writing a PhD thesis if they have two original research papers in first or second quartile (Q1-Q2) journals indexed in the Journal Citation Reports.
The One Way ANOVA test reveals no statistically significant differences between age groups in the various types of activity, except in Public Recognition activity (p < 0.05). For example, Public Recognition scores for the 51 and over age group are significantly lower than those for the 23-30 and 31-40 age groups.
An independent Sample t Test (p < 0.05) reveals no statistically significant differences between females and males in terms of activity scores except in Public Recognition activities. Public Recognition scores of females are significantly higher than those of males (
Analysis of Differences between Activity Scores, by Sex.
ACTIVITY | SEX | COUNT | MEAN | STANDARD DEVIATION | T | P |
---|---|---|---|---|---|---|
Scientific & Organizational | Female | 158 | 3.67 | 0.75 | 0.326 | 0.745 |
Scientific & Organizational | Male | 90 | 3.64 | 0.76 | | |
Professional Advancement | Female | 158 | 4.04 | 0.78 | 1.213 | 0.226 |
Professional Advancement | Male | 90 | 3.91 | 0.73 | | |
Public Recognition | Female | 157 | 3.89 | 0.85 | 3.040 | |
Public Recognition | Male | 88 | 3.49 | 1.05 | | |
Supervising | Female | 152 | 4.09 | 1.07 | 0.529 | 0.597 |
Supervising | Male | 87 | 4.01 | 1.00 | | |
Publications | Female | 157 | 3.87 | 0.80 | 1.158 | 0.248 |
Publications | Male | 89 | 3.75 | 0.74 | | |
*: p < 0.05 (statistically significant), t: Independent sample t Test.
The activities within each group are also assessed by respondents.
Types of activities scored as “important” within the different groups.
The results of the survey demonstrate that the priorities of local researchers correspond to national and international priorities (publishing papers in local and international peer-reviewed journals indexed in WoS and Scopus, monographs). But respondents gave the highest preference to participation in overseas and international seminars and workshops, and to speeches at overseas conferences, seminars, and round tables. This indicates a desire of researchers to disseminate their research findings internationally and to build international links. By way of comparison, the Higher Education Council of Turkey encourages the internationalization of research by applying the criterion of “joint projects”, which is estimated through the number and budget of projects with international cooperation (both ongoing and completed).
The respondents are also aware of the importance of publishing articles in journals indexed by the Web of Science or Scopus. According to the analysis the average Likert score of the activity “publishing in first and second quartile (Q1-Q2) journals” is slightly higher (4.05) than for “publishing in Q3-Q4 journals” (3.94).
The theoretical and practical analysis of various concepts and tools for assessing research performance shows the growing importance of bibliometrics. Bibliometric criteria remain popular in quantifying the impact of a scientific article (
A performance evaluation system that is predominantly based on bibliometrics is one-sided and biased. Such a system harms science itself, as it pushes scientists to produce more publications and citations instead of doing high-quality research. The quality of research should be the primary concern when practising science; the publication of its results, whilst important, is a secondary concern.
Thus, peer review remains the main method of evaluating the researcher although it may be biased due to subjective factors such as conflicts of interest, lack of research competence, and superficial expertise.
However, one should keep in mind that the production and dissemination of scientific knowledge should primarily provide social and economic benefits through innovations that are not directly reflected in bibliometric scores. It is also essential to take into account that “Sleeping Beauties” can suddenly be awoken many years after publication (
Our findings revealed that “Supervising” and “Professional Advancement” activities have the highest importance among all criteria groups. However, within the “Publication” activities, publishing papers in privileged local and international peer-reviewed journals and conference proceedings indexed in WoS and Scopus, and monographs, are of high importance as well.
“Publication”, “Public Recognition” and “Scientific & Organizational” activities are assessed as moderately important. Unsurprisingly, the “Public Recognition” score is significantly lower for older groups than it is for younger groups.
Surprisingly for the Asian continent, the Public Recognition score for females is much higher than that for males. One possible reason could be that women are undervalued in the conferring of ranks, titles and other awards. This hypothesis could be tested in future research.
An interesting and specific point is that S&O activities correlate with all the other activity groups: there is a moderate positive correlation between the scores of Scientific & Organizational (S&O) activity and those of the other groups. We can suppose that S&O activities are the core activities that underpin all the others, and that the role of S&O criteria is underestimated in the performance assessment of researchers. This confirms the earlier conclusion about the necessity of taking S&O criteria into account when evaluating researchers’ performance. Presumably, they lead to a broader and more intensive dissemination of research findings. This issue could be a subject of future research as well.
Analysis of sub-activities within each group showed the importance of building scientific communications internationally. The high importance of “Participation at foreign and international seminars and workshops” and “Speeches at foreign conferences, seminars, round tables” indicates researchers’ welcome willingness to disseminate their research findings internationally and to build international links.
The study expands the knowledge base on academic human resource management and can be of high relevance for HR managers of universities and public research institutions in substantiating the criteria for the performance assessment of researchers. The research results can be helpful for setting and complying with individual and institutional criteria for research performance evaluation. This study suggests an inclusive scale to measure the performance of researchers. Such a scale can motivate researchers to achieve organizational goals in a less resource-consuming way. Aligning the intentions of university policies and researchers can strengthen research incentives and increase research quality.
The additional file for this article can be found as follows:
Questionnaire. DOI:
According to
The reliability of these five activity scores was examined with the Cronbach’s alpha coefficient, and all of them were found to have high reliability levels (α > 0.700). The skewness and kurtosis coefficients of the activity scores were in the range of ±2. Therefore, the normal distribution assumption is satisfied.
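The reliability check can be sketched as follows, assuming the standard formula α = k/(k−1) · (1 − Σ item variances / variance of total scores) with sample variances. The `responses` matrix is hypothetical (one row per respondent, one column per questionnaire item within a group) and only illustrates the calculation.

```python
def sample_variance(values):
    """Unbiased sample variance (divides by n - 1)."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)

def cronbach_alpha(rows):
    """Cronbach's alpha for a respondents-by-items matrix of Likert ratings."""
    k = len(rows[0])  # number of items in the group
    item_variances = [sample_variance([row[i] for row in rows]) for i in range(k)]
    total_variance = sample_variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)

# Hypothetical ratings of three items by six respondents (illustration only).
responses = [
    [4, 5, 4],
    [3, 3, 4],
    [5, 5, 5],
    [2, 3, 2],
    [4, 4, 5],
    [3, 2, 3],
]
alpha = cronbach_alpha(responses)
print(round(alpha, 2), alpha > 0.7)
```

Values above the 0.700 threshold used in the study would be read as acceptable internal consistency for the group of items.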
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.