4.1.1 Developing Methods and Tools for Big Data Mining as Applied to Research on Human Potential
Ilya Kuzminov
Project Leader (2020-2022)
Konstantin Vishnevskiy
Project Leader (2022)
Project period
2020-2022
Context of Research Project within a Subject of Human Capital
Identifying promising areas of science and technology that can contribute significantly to the development of human potential requires continuous scanning, monitoring and analysis of big data and of document collections that are replenished daily. To this end, the project envisages intelligent analysis of big data using the latest semantic methods, together with the preparation of comprehensive descriptions and visual representations of the dynamics and structure of these areas.
The scientific project Developing Methods and Tools for Big Data Mining as Applied to Research on Human Potential integrates, modifies and extends the best existing models of vector representations of phrases. This makes it possible to implement a model that partially automates the identification and comparative cross-country analysis of trends in human potential management, and to radically improve the objectivity and effectiveness of research. AI-based algorithms for collecting, processing and analyzing big data will be used to identify challenges and analyze the global human development agenda, map the scientific and technological landscape, search for and evaluate global and Russian competence centers (organizations, teams, individuals), and identify emerging trends, weak signals and wild cards ("jokers").
Project Aim
Developing and improving methods and tools for intelligent analysis of big data, and testing and applying them in practice, in order to analyze data on the prospects for human potential development and to strengthen expert analytics in human potential research.
Project Objectives:
- Developing and improving methods and tools for intelligent analysis of big data, and testing and applying them in practice, in order to analyze data on the prospects for human development
- Testing the informational value and predictive power of text-mining indicators and their combinations (scoring metrics) by expert evaluation (markup) of the text-mining mapping results of trends, events, forecasts and human development factors
- Improving algorithms for extracting, indexing, ranking and intelligent analysis of scientific, technological, economic, social, humanitarian and other terms related to human potential and the social sciences and humanities, based on a study of applied information technology products for big data processing infrastructure
Key Findings
2020
Algorithms based on deep learning, semantic analysis and text mining have been created and tested for assessing the thematic proximity between scientific, analytical, regulatory and strategic documents and narrative-conceptual documents with a socio-humanitarian focus, in order to strengthen expert analytics in human potential research. In particular, tools for analyzing the evolution of terminology and its structural clustering, based on the Word2Vec and SciBERT machine learning models, have been developed and applied. A study has been conducted on the feasibility of using Kubernetes to manage the processing of large volumes of text data and metadata and of large document collections, for strategic mapping of the agendas of both multidisciplinary and specialized research in human development.
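As a rough illustration of what thematic proximity between two documents can mean in a vector-space setting, the sketch below averages term vectors per document and compares the averages by cosine similarity. It is not the project's algorithm: the three-dimensional vectors, the vocabulary and the function names are invented for the example, and in practice the vectors would come from a trained embedding model such as Word2Vec or SciBERT.

```python
import math

# Toy term vectors, invented for illustration only. A real pipeline would
# load vectors from a trained embedding model instead.
TOY_VECTORS = {
    "education": [0.9, 0.1, 0.0],
    "skills": [0.8, 0.2, 0.1],
    "health": [0.1, 0.9, 0.1],
    "longevity": [0.0, 0.8, 0.2],
}


def document_vector(terms, vectors):
    """Average the vectors of the terms that are present in the vocabulary."""
    known = [vectors[t] for t in terms if t in vectors]
    if not known:
        raise ValueError("document contains no known terms")
    dim = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def thematic_proximity(doc_a, doc_b, vectors=TOY_VECTORS):
    """Proximity of two documents, each given as a list of terms."""
    return cosine_similarity(
        document_vector(doc_a, vectors), document_vector(doc_b, vectors)
    )


if __name__ == "__main__":
    education_doc = ["education", "skills"]
    health_doc = ["health", "longevity"]
    print(thematic_proximity(education_doc, education_doc))
    print(thematic_proximity(education_doc, health_doc))
```

Averaging term vectors is only one simple way to obtain a document vector; it ignores word order and term weighting, which a production system would likely need to address.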
2021
The informational value and predictive power of the calculated text-mining indicators and their combinations (scoring metrics) have been evaluated against the results of expert markup of the text-mining mapping of trends, events, forecasts and human development factors, with a view to improving their quality.
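One simple way to compare a scoring metric with expert markup is precision at k: the share of the k highest-scored items that experts marked as relevant. The sketch below assumes binary expert labels and uses invented terms and scores; the source does not state which evaluation measure the project actually used.

```python
def precision_at_k(scores, expert_labels, k):
    """Share of the k highest-scored items that experts marked as relevant.

    scores: dict mapping item -> numeric score from a text-mining metric.
    expert_labels: dict mapping item -> True/False from expert markup;
        items missing from the markup are treated as not relevant.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    top_items = sorted(scores, key=scores.get, reverse=True)[:k]
    if not top_items:
        return 0.0
    hits = sum(1 for item in top_items if expert_labels.get(item, False))
    return hits / len(top_items)


if __name__ == "__main__":
    # Hypothetical scores and expert judgments, for illustration only.
    scores = {
        "remote learning": 0.92,
        "platform employment": 0.75,
        "generic term": 0.60,
        "lifelong learning": 0.40,
    }
    expert_labels = {
        "remote learning": True,
        "platform employment": True,
        "generic term": False,
        "lifelong learning": True,
    }
    print(precision_at_k(scores, expert_labels, k=2))
    print(precision_at_k(scores, expert_labels, k=3))
```

Comparing this value across several candidate metrics, or across values of k, gives a quick indication of which metric ranks expert-approved items highest.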
2022
An approach using language markers and language substitution rules has been proposed and tested for the intelligent analysis of scientific, technological, economic, social, humanitarian and other terms, with the aim of obtaining a comprehensive landscape of trends in human potential and the social sciences and humanities. Text-mining mapping of trends has been performed using a trend detection model improved with these markers and rules.
Semantic map of trends
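The source does not spell out what the markers and rules look like. The sketch below is one possible reading, under stated assumptions: a "language marker" is taken to be a phrase that tends to introduce a trend, and a "substitution rule" is taken to be a pattern that maps variant wordings to one canonical term. All marker phrases, rules and function names here are hypothetical.

```python
import re

# Hypothetical marker phrases that may introduce a trend statement.
TREND_MARKERS = ["rise of", "growing demand for", "decline in"]

# Hypothetical substitution rules: variant wording -> canonical term.
SUBSTITUTIONS = {
    r"\be-learning\b": "online learning",
    r"\bdistance learning\b": "online learning",
}


def normalize(text):
    """Lowercase the text and apply every substitution rule in turn."""
    text = text.lower()
    for pattern, replacement in SUBSTITUTIONS.items():
        text = re.sub(pattern, replacement, text)
    return text


def extract_trend_candidates(text):
    """Return (marker, phrase) pairs found in the normalized text.

    The phrase is a crude capture of up to two words after the marker;
    a real system would use proper term extraction instead.
    """
    normalized = normalize(text)
    candidates = []
    for marker in TREND_MARKERS:
        pattern = re.escape(marker) + r"\s+((?:[\w-]+\s?){1,2})"
        for match in re.finditer(pattern, normalized):
            candidates.append((marker, match.group(1).strip()))
    return candidates


if __name__ == "__main__":
    print(extract_trend_candidates("Reports note the rise of e-learning."))
```

Applying the substitution rules before marker matching means that variant spellings are counted as the same trend candidate, which is the main point of combining the two mechanisms in this sketch.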
Educational Programs
RIA
Gokhberg L., Kuzmin G., Kuzminov I., Sokolov A., Timofeev A. Computer program "Node for Collecting Statistics of a Distributed Task Execution System in the Big Data Mining System iFORA". Certificate of registration 2022619136, 05/19/2022.
Conferences
XXII April International Scientific Conference on Economic and Social Development (Moscow, Russia, April 4-22, 2021):
- International Scientific Symposium Using Open Big Data Analysis to Identify Global Trends and Challenges Related to the Formation and Use of Human Potential (RU) (12.04.2021)