Smart-Label project

Machine learning with scarcely labeled data for Industry 4.0

Project summary

Although the benefits of AI are increasingly accepted, its adoption in industry is far from optimal. Some AI techniques, such as fuzzy logic, have found their place, while more recent ones have not yet achieved sufficient penetration. This project focuses on problems in the manufacturing industry for which there is not enough labeled data, where classical methods fail and more recent machine learning methods are therefore necessary.

Within AI, machine learning focuses on creating algorithms that “learn” by analyzing historical data, discovering hidden trends and patterns that support decision-making. Traditionally, a distinction is made between supervised and unsupervised learning. Supervised learning requires recorded data in which both the values of the input variables and the value of the output variable to be predicted are known. If the output variable takes a value from a finite set of possible values, the problem is one of classification; if the output is a continuous value, it is a regression problem. The result of the learning process is a model capable of predicting the output value for new input values. In an industrial context this is very useful, because the predicted output could be the time before the next failure (which helps decide when to carry out preventive maintenance) or the roughness of a part to be machined (which allows the parameters of the machining process to be adjusted to obtain the desired roughness). In unsupervised learning, by contrast, there is no variable to predict; the aim is to discover other types of patterns, such as similarity between the data or the fact that a set of measurements is an outlier (which could indicate a machine malfunction).
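The contrast between the two settings can be illustrated with a minimal sketch that uses only the Python standard library. It is not code from the project: the feed-rate, roughness and vibration values, the function names and the z-score threshold are all hypothetical, chosen only to make the example concrete. The supervised part fits a least-squares line to labeled pairs (a regression problem); the unsupervised part flags measurements that lie far from the mean, with no labels involved.

```python
from statistics import mean, pstdev


def fit_line(xs, ys):
    """Supervised regression: least-squares line y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx


def find_outliers(values, z_threshold=2.0):
    """Unsupervised: values more than z_threshold standard deviations from the mean."""
    mu, sigma = mean(values), pstdev(values)
    return [v for v in values if abs(v - mu) > z_threshold * sigma]


# Hypothetical labeled data: input (feed rate) and known output (roughness).
feed_rate = [0.1, 0.2, 0.3, 0.4]
roughness = [0.8, 1.2, 1.6, 2.0]
slope, intercept = fit_line(feed_rate, roughness)
predicted_roughness = slope * 0.25 + intercept  # prediction for a new input

# Hypothetical unlabeled sensor readings: no output variable, only the data.
vibration = [10.1, 9.9, 10.0, 10.2, 9.8, 14.0]
suspicious = find_outliers(vibration)

print(round(predicted_roughness, 3))  # 1.4
print(suspicious)                     # [14.0]
```

In practice the model and the outlier detector would be far richer than a straight line and a z-score, but the division of roles is the same: the first needs known outputs to learn from, the second does not.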

The problem with supervised learning is that the historical data often has to be labeled manually, which makes obtaining training data very time-consuming and expensive. To alleviate this issue, semi-supervised learning methods have recently emerged. Starting from a limited set of labeled data, they also exploit the structure present in the unlabeled data to learn more robust models than could be obtained from the small labeled set alone.
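One common semi-supervised scheme is self-training, sketched below in plain Python. This is an illustration only, not one of the algorithms developed in the project: the base classifier (nearest class mean on a single feature), the confidence rule (`min_margin`), the class names and all the numbers are assumptions made for the example. Starting from one labeled measurement per class, the model repeatedly labels the unlabeled values it is confident about and refits on the enlarged set.

```python
def class_means(values, labels):
    """Mean feature value of each class."""
    totals, counts = {}, {}
    for v, c in zip(values, labels):
        totals[c] = totals.get(c, 0.0) + v
        counts[c] = counts.get(c, 0) + 1
    return {c: totals[c] / counts[c] for c in totals}


def predict(means, value):
    """Return (label, margin): the nearest class mean and its lead over the runner-up."""
    ranked = sorted((abs(value - m), c) for c, m in means.items())
    return ranked[0][1], ranked[1][0] - ranked[0][0]


def self_train(values, labels, unlabeled, min_margin=2.0, max_rounds=10):
    """Self-training: pseudo-label confident unlabeled values, then refit."""
    values, labels, pool = list(values), list(labels), list(unlabeled)
    for _ in range(max_rounds):
        means = class_means(values, labels)
        still_unlabeled = []
        for v in pool:
            label, margin = predict(means, v)
            if margin >= min_margin:
                values.append(v)        # accept the pseudo-label
                labels.append(label)
            else:
                still_unlabeled.append(v)
        if len(still_unlabeled) == len(pool):
            break                       # nothing new was labeled
        pool = still_unlabeled
    return class_means(values, labels)


# Hypothetical data: two labeled measurements and twelve unlabeled ones.
labeled_values = [3.5, 9.5]
labeled_classes = ["normal", "faulty"]
unlabeled_values = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 6.0, 7.0, 7.5, 8.0, 8.5, 9.0]

supervised_only = class_means(labeled_values, labeled_classes)
model = self_train(labeled_values, labeled_classes, unlabeled_values)

print(predict(supervised_only, 6.2)[0])  # normal
print(predict(model, 6.2)[0])            # faulty
```

With only the two labeled points, the decision boundary sits wherever those two points happen to place it; once the unlabeled measurements pull the class means towards the centers of the two groups, the boundary moves and the prediction for 6.2 changes. The value 6.0 never reaches the confidence margin and is left unlabeled, which is the intended behavior of the confidence rule.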

Our group has its origin in the design of classification algorithms. In different research projects we have been able to adapt them to other types of problems, such as regression problems (continuous output variable) or multi-label and multi-output problems (where several output variables are learned simultaneously, also exploiting the relationships that may exist between them). In this project we propose to adapt these algorithms to problems in which the amount of labeled data is limited; that is, we want to design new semi-supervised learning algorithms and use them to solve industrial problems.

EyeVR dataset (available to download)

As part of the project, a dataset (EyeVR) was created for identifying users operating cranes in industrial contexts in virtual reality.

This dataset contains (a) tags identifying the user operating the system and (b) measurements of variables collected by the eye-tracking sensors integrated in the virtual reality headsets while the participants performed the proposed tasks.

The proposed exercises, 11 in total, vary in difficulty and were carried out over 3 different sessions. More details about them are provided in the metadata folder, in our previous research publications and in the data descriptor accompanying this dataset. The dataset includes data from a total of 71 participants across three experiences.

More information on the dataset and the experiences carried out to collect data can be found in our research works and the dataset documentation:

  • Serrano-Mamolar, A., Miguel-Alonso, I., Checa, D., & Pardo-Aguilar, C. (2023). Towards learner performance evaluation in iVR learning environments using eye-tracking and Machine-learning. Comunicar, 31(76), 9–20. https://doi.org/10.3916/C76-2023-01
  • Ramírez-Sanz, J.M., Peña-Alonso, H.M., Serrano-Mamolar, A., Arnaiz-González, Á., Bustillo, A. (2023). Detection of Stress Stimuli in Learning Contexts of iVR Environments. In: De Paolis, L.T., Arpaia, P., Sacco, M. (eds) Extended Reality. XR Salento 2023. Lecture Notes in Computer Science, vol 14219. Springer, Cham. https://doi.org/10.1007/978-3-031-43404-4_29

Link to the folder of the dataset | Link to download the dataset

Publications

2024

Garrido-Labrador, José Luis; Serrano-Mamolar, Ana; Maudes-Raedo, Jesús; Rodríguez, Juan J.; García-Osorio, César

Ensemble methods and semi-supervised learning for information fusion: A review and future research directions Journal Article

In: Information Fusion, vol. 107, 2024, ISSN: 1566-2535.

Links | BibTeX

Kuncheva, Ludmila I.; Garrido-Labrador, José Luis; Ramos-Pérez, Ismael; Hennessey, Samuel L.; Rodríguez, Juan J.

Semi-supervised classification with pairwise constraints: A case study on animal identification from video Journal Article

In: Information Fusion, vol. 104, 2024, ISSN: 1566-2535.

Links | BibTeX

Ramos-Pérez, Ismael; Barbero-Aparicio, José Antonio; Canepa-Oneto, Antonio; Arnaiz-González, Álvar; Maudes-Raedo, Jesús

An Extensive Performance Comparison between Feature Reduction and Feature Selection Preprocessing Algorithms on Imbalanced Wide Data Journal Article

In: Information, vol. 15, no. 4, 2024, ISSN: 2078-2489.

Abstract | Links | BibTeX

Maestro-Prieto, Jose Alberto; Ramírez-Sanz, José Miguel; Bustillo, Andrés; Rodríguez-Díez, Juan José

Semi-supervised diagnosis of wind-turbine gearbox misalignment and imbalance faults Journal Article

In: Applied Intelligence, 2024, ISSN: 1573-7497.

Abstract | Links | BibTeX

Martin-Melero, Íñigo; Serrano-Mamolar, Ana; Rodríguez-Diez, Juan J.

Evaluation of Semi-Supervised Machine Learning applied to Affective State Detection Bachelor Thesis

2024.

Links | BibTeX

Barbero-Aparicio, José A.; Olivares-Gil, Alicia; Rodríguez, Juan J.; García-Osorio, César; Díez-Pastor, José F.

Addressing data scarcity in protein fitness landscape analysis: A study on semi-supervised and deep transfer learning techniques Journal Article

In: Information Fusion, vol. 102, pp. 102035, 2024, ISSN: 1566-2535.

Abstract | Links | BibTeX

2023

Ramírez-Sanz, José Miguel; Maestro-Prieto, Jose-Alberto; Arnaiz-González, Álvar; Bustillo, Andrés

Semi-supervised learning for industrial fault detection and diagnosis: A systemic review Journal Article

In: ISA Transactions, vol. 143, pp. 255–270, 2023, ISSN: 0019-0578.

Links | BibTeX

Mena-Alonso, Álvaro; Latorre-Carmona, Pedro; González, Dorys C.; Díez-Pastor, José F.; Rodríguez, Juan J.; Mínguez, Jesús; Vicente, Miguel A.

A cost-effective stereo camera-based system for measuring crack propagation in fibre-reinforced concrete Journal Article

In: Archives of Civil and Mechanical Engineering, vol. 23, no. 3, 2023, ISSN: 2083-3318.

Abstract | Links | BibTeX

Kuncheva, Ludmila I.; Garrido-Labrador, José Luis; Ramos-Pérez, Ismael; Hennessey, Samuel L.; Rodríguez, Juan J.

An experiment on animal re-identification from video Journal Article

In: Ecological Informatics, vol. 74, 2023, ISSN: 1574-9541.

Links | BibTeX

Barbero-Aparicio, José A.; Olivares-Gil, Alicia; Díez-Pastor, José F.; García-Osorio, César

Deep learning and support vector machines for transcription start site identification Journal Article

In: PeerJ Computer Science, vol. 9, iss. e1340, 2023, ISSN: 2376-5992.

Abstract | Links | BibTeX

Setó-Rey, Daniel; Santos-Martín, José Ignacio; López-Nozal, Carlos

Vulnerability of Package Dependency Networks Journal Article

In: IEEE Trans. Netw. Sci. Eng., pp. 1–13, 2023, ISSN: 2327-4697.

Links | BibTeX

2022

Pimenov, Danil Yurievich; Bustillo, Andrés; Wojciechowski, Szymon; Sharma, Vishal Santosh; Gupta, Munish Kumar; Kuntoğlu, Mustafa

Artificial intelligence systems for tool condition monitoring in machining: analysis and critical review Journal Article

In: Journal of Intelligent Manufacturing, vol. 2022, 2022, ISSN: 0956-5515.

Abstract | Links | BibTeX

Ramos-Pérez, Ismael; Arnaiz-González, Álvar; Rodríguez, Juan José; García-Osorio, César

When is resampling beneficial for feature selection with imbalanced wide data? Journal Article

In: Expert Systems with Applications, vol. 188, pp. 116015, 2022, ISSN: 0957-4174.

Abstract | Links | BibTeX

Olivares-Gil, Alicia; Arnaiz-Rodríguez, Adrián; Ramírez-Sanz, José Miguel; Garrido-Labrador, José Luis; Ahedo, Virginia; García-Osorio, César; Santos, José Ignacio; Galán, José Manuel

Mapping the scientific structure of organization and management of enterprises using complex networks Journal Article

In: Int. J. Prod. Manag. Eng., vol. 10, no. 1, pp. 65–76, 2022, ISSN: 2340-4876.

Abstract | Links | BibTeX

Checa, David; Urbikain, Gorka; Beranoagirre, Aitor; Bustillo, Andrés; López de Lacalle, Luis Norberto

Using Machine-Learning techniques and Virtual Reality to design cutting tools for energy optimization in milling operations Journal Article

In: International Journal of Computer Integrated Manufacturing, vol. 35, no. 1, pp. 1–21, 2022, ISSN: 0951-192X.

Abstract | Links | BibTeX

2021

Díez-Pastor, José Francisco; Latorre-Carmona, Pedro; Garrido-Labrador, José Luis; Ramírez-Sanz, José Miguel; Rodríguez, Juan J.

Experimental Assessment of Feature Extraction Techniques Applied to the Identification of Properties of Common Objects, Using a Radar System Journal Article

In: Applied Sciences, vol. 11, no. 15, 2021, ISSN: 2076-3417.

Abstract | Links | BibTeX