Human Capital Multidisciplinary Research Center

4.1.1 Developing Methods and Tools for Big Data Mining as Applied to Research on Human Potential

Ilya Kuzminov
Project Leader (2020-2022)

Konstantin Vishnevskiy
Project Leader (2022)

Project period

2020-2022

Context of Research Project within a Subject of Human Capital

Identifying promising areas of science and technology that can make a significant contribution to the development of human potential requires continuous scanning, monitoring and analysis of big data and of document collections that are updated daily. To this end, the project envisages intelligent analysis of big data using the latest semantic methods, together with the preparation of comprehensive descriptions and visual representations of the dynamics and structure of these areas.

The scientific project Developing Methods and Tools for Big Data Mining as Applied to Research on Human Potential integrates, modifies and extends the best existing models of vector representations of phrases. This makes it possible to implement a model that partially automates the identification and comparative cross-country analysis of trends in human potential management, and to radically improve the objectivity and effectiveness of research. Artificial-intelligence-based algorithms for collecting, processing and analyzing big data will be used to identify challenges and analyze the global human development agenda, map the scientific and technological landscape, search for and evaluate global and Russian centers of competence (organizations, teams, individuals), and identify emerging trends, weak signals and wild cards ("jokers").

Project Aim

To develop and improve methods and tools for the intelligent analysis of big data, and to test and apply them in practice in order to analyze data on the prospects for human potential development and to strengthen expert analytics in human potential research.

Project Objectives:

  1. Developing and improving methods and tools for the intelligent analysis of big data, and testing and applying them in practice in order to analyze data on the prospects for human development
  2. Testing the informational value and predictive power of text-mining indicators and their combinations (scoring metrics) against expert evaluation (annotation) of the results of text-mining mapping of trends, events, forecasts and human development factors
  3. Improving algorithms for extracting, indexing, ranking and intelligently analyzing scientific, technological, economic, social, humanities and other terms related to human potential and the social sciences and humanities, drawing on a study of applied information technology products for big data processing infrastructure

Key Findings

2020

Algorithms for assessing the thematic proximity between scientific, analytical, regulatory and strategic documents on the one hand and narrative and conceptual documents in the social sciences and humanities on the other have been created and tested. They are based on deep learning, semantic analysis and text mining, and are intended to strengthen expert analytics in human potential research. In particular, tools for analyzing the evolution of terminology and its structural clustering, based on the Word2Vec and SciBERT machine learning models, have been developed and applied. A study has been conducted on the feasibility of using Kubernetes to manage the processing of large volumes of text data and metadata and of large document collections, for strategic mapping of the agendas of both multidisciplinary and specialized research in the field of human development.
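
As an illustration of what a thematic proximity score between two documents can look like, the following minimal Python sketch computes cosine similarity between document vectors. It is not the project's implementation: the word-count vectors are a toy stand-in for Word2Vec or SciBERT embeddings, and the function names and sample texts are hypothetical.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Turn a text into term counts (toy stand-in for a learned embedding)."""
    return Counter(text.lower().split())


def cosine_similarity(vec_a, vec_b):
    """Cosine of the angle between two sparse term-count vectors."""
    shared_terms = vec_a.keys() & vec_b.keys()
    dot = sum(vec_a[t] * vec_b[t] for t in shared_terms)
    norm_a = math.sqrt(sum(v * v for v in vec_a.values()))
    norm_b = math.sqrt(sum(v * v for v in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is treated as unrelated to everything
    return dot / (norm_a * norm_b)


def thematic_proximity(text_a, text_b):
    """Proximity score in [0, 1] for two raw texts."""
    return cosine_similarity(bag_of_words(text_a), bag_of_words(text_b))


# Hypothetical sample documents
strategy_doc = "human capital development education"
research_doc = "education and human capital"
unrelated_doc = "kubernetes cluster scheduling"

print(thematic_proximity(strategy_doc, research_doc))
print(thematic_proximity(strategy_doc, unrelated_doc))
```

With real embeddings, only `bag_of_words` would change: the same cosine measure can be applied to dense vectors produced by a trained model.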

2021

The informational value and predictive power of the calculated text-mining indicators and their combinations (scoring metrics) have been evaluated against the results of expert annotation of text-mining maps of trends, events, forecasts and human development factors, with a view to improving the quality of these indicators.
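
One simple way to compare a scoring metric with expert annotation is precision at k: the share of the k highest-scored terms that experts marked as relevant. The Python sketch below shows this under stated assumptions. The source does not say which evaluation measure the project used, and the function name, scores and labels here are hypothetical.

```python
def precision_at_k(scores, expert_labels, k):
    """Share of the k highest-scored terms that experts marked as relevant.

    scores: dict mapping term -> text-mining indicator value (higher is better)
    expert_labels: dict mapping term -> True if experts confirmed the term
    """
    top_terms = sorted(scores, key=scores.get, reverse=True)[:k]
    if not top_terms:
        return 0.0
    hits = sum(1 for term in top_terms if expert_labels.get(term, False))
    return hits / len(top_terms)


# Hypothetical indicator values and expert annotation
indicator_scores = {
    "remote work": 0.9,
    "lifelong learning": 0.8,
    "page": 0.7,
    "telemedicine": 0.4,
}
expert_annotation = {
    "remote work": True,
    "lifelong learning": True,
    "page": False,
    "telemedicine": True,
}

print(precision_at_k(indicator_scores, expert_annotation, k=2))
print(precision_at_k(indicator_scores, expert_annotation, k=3))
```

Computing the same measure for several indicators or their combinations gives a direct way to rank them by agreement with expert judgment.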

2022

An approach using language markers and language substitution rules has been proposed and tested for the intelligent analysis of scientific, technological, economic, social, humanities and other terms, with the aim of obtaining a comprehensive landscape of trends in human potential and in the social sciences and humanities. Text-mining mapping of trends has been performed with a trend detection model improved by these markers and rules.
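
The general idea of combining substitution rules with language markers can be sketched in Python as follows. This is only an illustration under assumptions: the actual markers and rules used in the project are not given in the source, so the patterns, marker phrases and function names below are hypothetical.

```python
import re

# Hypothetical substitution rules: map term variants to one canonical form
SUBSTITUTION_RULES = {
    r"\bAI\b": "artificial intelligence",
    r"\be-learning\b": "online learning",
}

# Hypothetical language markers that tend to signal a trend statement
TREND_MARKERS = ["emerging", "rapid growth of", "growing demand for"]


def normalize(text):
    """Apply every substitution rule so that term variants are counted together."""
    for pattern, replacement in SUBSTITUTION_RULES.items():
        text = re.sub(pattern, replacement, text, flags=re.IGNORECASE)
    return text


def find_trend_sentences(text):
    """Return normalized sentences that contain at least one trend marker."""
    sentences = re.split(r"(?<=[.!?])\s+", normalize(text).strip())
    return [
        sentence
        for sentence in sentences
        if any(marker in sentence.lower() for marker in TREND_MARKERS)
    ]


sample = "Growing demand for AI skills is reported. The weather was mild."
print(find_trend_sentences(sample))
```

Normalizing before marker matching means that a trend mentioned under different spellings of the same term is attributed to a single entry on the resulting map.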

Semantic map of trends

Educational Programs

Educational Modules at the English-language Master's Program Science, Technology and Innovation Management and Policy (Governance of Science, Technology and Innovation)

Results of Intellectual Activity (RIA)

Gokhberg L., Kuzmin G., Kuzminov I., Sokolov A., Timofeev A. Computer program "Node for Collecting Statistics of a Distributed Task Execution System in the Big Data Mining System iFORA". Certificate of registration No. 2022619136, May 19, 2022.

Conferences

XXII April International Scientific Conference on Economic and Social Development (Moscow, Russia, April 4-22, 2021):
- International Scientific Symposium "Using Open Big Data Analysis to Identify Global Trends and Challenges Related to the Formation and Use of Human Potential" (RU) (April 12, 2021)