
An artificial intelligence-based framework for data-driven categorization of computer scientists: a case study of world’s Top 10 computing departments

Research output: Contribution to journal, Article, peer-review

Abstract

The total number of published articles and the resulting citations are generally acknowledged as suitable criteria for evaluating a scientist. However, ranking scientists is challenging because the value of their scientific work is, at times, not directly reflected in these two measures. Multiple other elements therefore need to be examined in combination to better evaluate the scientific worth of an individual. This work presents a learning-based technique, i.e., an Artificial Intelligence (AI)-based solution, for categorizing scientists using multifaceted criteria. In this context, a novel ranking metric is proposed that is grounded on authorship, experience, publication count, total citations, i10-index, and h-index. To assess the proposed framework's performance, a dataset is collected from the world's top ten computing departments and ten domestic ones, yielding data on 1000 computer scientists. The dataset is preprocessed, and three feature selection techniques, i.e., Mutual Information (MI), Chi-Square (χ²), and Fisher-Test (F-Test), are then employed to rank the features in the data. To validate the collected data, the framework also applies three clustering techniques, namely k-medoids, k-means, and spectral clustering, to identify the optimum number of heterogeneous groups. Three cluster validity indices are used to evaluate the clustering outcomes, namely the Calinski-Harabasz Index (CHI), the Davies-Bouldin Index (DBI), and the Silhouette Coefficient (SC). Once the optimum clusters are obtained, five classification procedures, namely Artificial Neural Network (ANN), k-Nearest Neighbor (k-NN), Decision Tree (DT), Gaussian Naive Bayes (GNB), and Linear Regression Classifier (LRC), are used to predict the category of a previously unknown scientist. Among the classifiers, the ANN shows an average accuracy of 94.44% when predicting the category of an unknown/new scientist. The proposal is also compared with closely related past works. The proposed framework offers the possibility to independently classify scientists based on AI techniques.
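
The clustering-validation and classification steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes scikit-learn, uses synthetic stand-in data in place of the collected dataset, covers only k-means and k-NN out of the listed techniques, and picks the number of clusters by the Silhouette Coefficient alone. All function names are hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import (
    calinski_harabasz_score,
    davies_bouldin_score,
    silhouette_score,
)
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier


def make_synthetic_profiles(n_per_group=100, seed=0):
    """Hypothetical stand-in data: three well-separated groups, six features each
    (standing in for authorship, experience, publications, citations, i10, h-index)."""
    rng = np.random.default_rng(seed)
    centers = np.array([[0.2] * 6, [0.5] * 6, [0.8] * 6])
    blocks = [c + rng.normal(scale=0.03, size=(n_per_group, 6)) for c in centers]
    return np.vstack(blocks)


def score_cluster_counts(X, k_values=range(2, 7), seed=0):
    """Run k-means for each candidate k and record the three validity indices."""
    results = {}
    for k in k_values:
        labels = KMeans(n_clusters=k, n_init=10, random_state=seed).fit_predict(X)
        results[k] = {
            "CHI": calinski_harabasz_score(X, labels),  # higher is better
            "DBI": davies_bouldin_score(X, labels),     # lower is better
            "SC": silhouette_score(X, labels),          # higher is better
        }
    return results


def pick_k_by_silhouette(results):
    """Assumption: choose k with the highest Silhouette Coefficient."""
    return max(results, key=lambda k: results[k]["SC"])


def fit_and_score_classifier(X, k, seed=0):
    """Use cluster labels as categories, then test a k-NN classifier on held-out rows."""
    labels = KMeans(n_clusters=k, n_init=10, random_state=seed).fit_predict(X)
    X_train, X_test, y_train, y_test = train_test_split(
        X, labels, test_size=0.25, stratify=labels, random_state=seed
    )
    clf = KNeighborsClassifier(n_neighbors=5).fit(X_train, y_train)
    return clf.score(X_test, y_test)


if __name__ == "__main__":
    X = make_synthetic_profiles()
    results = score_cluster_counts(X)
    best_k = pick_k_by_silhouette(results)
    print("chosen k:", best_k)
    print("held-out accuracy:", fit_and_score_classifier(X, best_k))
```

In practice the three indices may disagree on the best number of clusters, so a real pipeline would need an explicit rule for reconciling them; the abstract does not state which rule the framework uses.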

Original language: English
Pages (from-to): 1513-1545
Number of pages: 33
Journal: Scientometrics
Volume: 128
Issue number: 3
Early online date: 31 Dec 2022
Publication status: Published - Mar 2023

Bibliographical note

Publisher Copyright:
© 2022, Akadémiai Kiadó, Budapest, Hungary.

Keywords

  • Artificial intelligence
  • Classification
  • Clustering
  • Data driven decision-making
  • Research output measurement
  • Scientists ranking

ASJC Scopus subject areas

  • General Social Sciences
  • Computer Science Applications
  • Library and Information Sciences
