Projects

This page presents the research projects and doctoral theses I have co-supervised.

Co-Supervised Doctoral Theses

I have been actively involved in co-supervising several doctoral theses, contributing to research in business process management, machine learning automation, logistics, and semantic web technologies.

Business Process Maintenance: Case of Business Processes

Abstract: In this thesis, we focus on managing the propagation of change impact across several levels of granularity and abstraction of business process models (BPM), in order to master change management in this type of process. We start by analyzing the dependencies between activities, data, and the roles of actors in a business process. For this, we developed a prototype tool in which dependency facts are collected and analyzed in matrix form.

We then enriched our approach with a semantic layer by defining an ontology of BPMN 2.0 dependencies. This ontology stores, in a structured manner, the information obtained from different versions of business processes, and makes it possible to map and manipulate the elements needed to predict the impact of changing a business process. A third contribution predicts the level of change in BPM models: we defined three levels of change (low, medium, and high) based on structural metrics used in the predictive model. Five machine learning algorithms were applied and evaluated in this study; the experiments carried out show the best performance with the SVM and Gaussian Naive Bayes algorithms.
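To make the prediction step concrete, here is a minimal, self-contained sketch of classifying a change into low/medium/high from structural metrics with a hand-rolled Gaussian Naive Bayes (the metric names and training values are illustrative assumptions, not the thesis's actual data or features):

```python
import math
from collections import defaultdict

# Hypothetical structural metrics per change:
# (affected activities, dependency fan-out, depth of change).
TRAIN = [
    ((1, 2, 1), "low"), ((2, 1, 1), "low"), ((1, 1, 2), "low"),
    ((4, 5, 3), "medium"), ((5, 4, 2), "medium"), ((4, 6, 3), "medium"),
    ((9, 10, 6), "high"), ((8, 9, 5), "high"), ((10, 8, 6), "high"),
]

def fit_gaussian_nb(samples):
    """Per-class mean/variance of each metric, plus class priors."""
    by_cls = defaultdict(list)
    for x, y in samples:
        by_cls[y].append(x)
    model = {}
    for cls, rows in by_cls.items():
        n = len(rows)
        means = [sum(col) / n for col in zip(*rows)]
        vars_ = [max(sum((v - m) ** 2 for v in col) / n, 1e-6)
                 for col, m in zip(zip(*rows), means)]
        model[cls] = (n / len(samples), means, vars_)
    return model

def predict(model, x):
    """Return the class with the highest Gaussian log-likelihood."""
    def log_lik(prior, means, vars_):
        s = math.log(prior)
        for v, m, var in zip(x, means, vars_):
            s += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        return s
    return max(model, key=lambda c: log_lik(*model[c]))

model = fit_gaussian_nb(TRAIN)
print(predict(model, (2, 2, 1)))   # a small, local edit
print(predict(model, (9, 9, 5)))   # a wide-reaching edit
```

In the thesis this comparison was run over several algorithms (including SVM); the sketch only shows why per-class distributions over structural metrics suffice to separate the three change levels.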

View thesis details →

Towards Effective and Explainable Automation of Machine Learning Processes: Application to Industry 4.0

Doctoral Student: Moncef Garouani
Defense Date: September 27, 2022
Institution: Hassan I University of Casablanca (in co-supervision with ULCO)

Abstract: Machine learning (ML) has penetrated all aspects of modern life and brought convenience to many application domains. However, building such solutions is a time-consuming and challenging process that requires highly technical expertise, even as it attracts many more people, not necessarily experts, to analytics tasks. The selection and parametrization of ML models require tedious episodes of trial and error, and domain experts often lack the expertise to apply advanced analytics, so they resort to frequent consultations with data scientists. These collaborations often increase costs through undesired delays and can create risks such as human-resource bottlenecks.

Consequently, as tasks become more complex, more support solutions are needed to increase ML usability for non-experts. To that end, Automated ML (AutoML) is a data-mining formalism that aims to reduce human effort and improve the development cycle through automation. The field of AutoML aims to make these decisions in a data-driven, objective, and automated way, thereby making ML techniques accessible to domain scientists who are interested in applying advanced analytics but lack the required expertise. This can be seen as a democratization of ML.

AutoML is usually treated as an algorithm selection and parametrization problem. Existing approaches include Bayesian optimization, evolutionary algorithms, and reinforcement learning. These approaches have focused on providing user assistance by automating parts of, or the entire, data analysis process, but without being concerned about their impact on the analysis. The goal has generally been performance, leaving aside other important and even crucial aspects such as computational complexity, confidence, and transparency.

In contrast, this thesis aims to develop alternative methods that assist in building appropriate modeling techniques while providing the rationale for the selected models. In particular, we frame this demand for intelligent assistance as a meta-analysis process and make progress on two challenges in AutoML research. First, to overcome the computational-complexity problem, we studied a formulation of AutoML as a recommendation problem and proposed a new conceptualization of a Meta-Learning (MtL)-based expert system capable of recommending optimal ML pipelines for a given task. Second, we investigated the automatic explainability of the AutoML process, to address the problem of acceptance of, and trust in, such black-box support systems.

To this end, we designed and implemented a framework architecture that leverages ideas from MtL to learn the relationship between the meta-data of new datasets and mining algorithms. This eventually enables recommending ML pipelines according to their potential impact on the analysis. To guide the development of our work, we chose Industry 4.0 as the main field of application, for all the constraints it offers. Finally, this doctoral thesis focuses on user assistance in the algorithm selection and tuning step. We devise an architecture and build a tool, AMLBID, that supports users with the aim of improving the analysis and decreasing the time spent on algorithm selection and parametrization. It is a tool that, for the first time, does not aim at providing data analysis support only; instead, it contributes to trust in such powerful support systems by automatically providing a set of explanation levels through which users can inspect the provided results.
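The core MtL idea — characterize a dataset by meta-features, then recommend the pipeline that worked best on the most similar known dataset — can be sketched as a nearest-neighbour lookup. The meta-features, pipeline names, and knowledge-base entries below are illustrative assumptions, not AMLBID's actual meta-knowledge base:

```python
import math

# Toy meta-knowledge base: dataset meta-features -> best-known pipeline.
# Meta-features (hypothetical): (log #rows, #features, class imbalance).
META_BASE = [
    ((3.0, 10, 0.10), "StandardScaler + SVM(C=10)"),
    ((5.0, 50, 0.40), "SelectKBest + RandomForest(n=200)"),
    ((4.0, 8, 0.05), "StandardScaler + GaussianNB"),
]

def recommend(meta_features):
    """1-nearest neighbour in meta-feature space: return the pipeline
    recorded as best for the most similar known dataset."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    _, pipeline = min(META_BASE, key=lambda entry: dist(entry[0], meta_features))
    return pipeline

print(recommend((3.2, 12, 0.12)))
```

A real meta-learner would use many more meta-features and a trained model rather than raw Euclidean distance, but the lookup structure is the same.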

Ontologies and Semantic Web for an Evolutive Development of Logistic Applications

Doctoral Student: Hayder Ibrahim Hendi
Defense Date: 2017
Institution: Université du Littoral Côte d'Opale (ULCO)

Abstract: Logistics problems are often complex combinatorial problems. They also implicitly involve processes, actors, activities, and methods whose various aspects need to be considered. A single overall process may thus involve sale/purchase, transport/delivery, and stock management processes. These processes are so diverse and interconnected that it is difficult for a logistics expert to master all of them.

In this thesis, we propose to make explicit, with the help of ontologies, the conceptual and semantic knowledge concerning logistic processes. This explicit knowledge is then used to develop a reasoning system that guides the logistics expert in the incremental and semi-automatic construction of a software solution to the problem at hand. We define an ontology covering interconnected logistic processes and the associated optimization problems, thereby establishing an explicit semantic link between the domains of logistics and optimization. This allows the logistics expert to identify precisely and unambiguously the logistic problem faced and the associated optimization problem.

This identification then drives a selection process ranging from the choice of the precise logistic process to be implemented, to the method for solving the combinatorial problem, down to the discovery of the software component to be invoked, implemented as a web service. The approach we adopted and implemented has been experimentally validated on the "Vehicle Routing Problem", the "Passenger Train Problem", and the "Container Terminal" problems.
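The semantic chain described above — logistic problem → optimization problem → solver web service — can be sketched as a simple lookup structure. This is a plain-Python stand-in, not the thesis's actual OWL ontology; all names and URLs are illustrative:

```python
# Toy knowledge base linking logistic problems to optimization problems
# and solver components (entries are illustrative, not the real ontology).
ONTOLOGY = {
    "vehicle routing":    {"optimization": "Capacitated VRP",
                           "service": "http://example.org/solvers/vrp"},
    "passenger train":    {"optimization": "Train timetabling",
                           "service": "http://example.org/solvers/timetable"},
    "container terminal": {"optimization": "Berth allocation",
                           "service": "http://example.org/solvers/berth"},
}

def resolve(logistic_problem):
    """Follow the semantic link from a logistic problem to the associated
    optimization problem and the web service that implements a solver."""
    entry = ONTOLOGY[logistic_problem]
    return entry["optimization"], entry["service"]

problem, service = resolve("vehicle routing")
print(problem, service)
```

In the thesis this mapping is expressed with ontology axioms, so the reasoner can also infer the link for problems described only by their properties.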

View thesis on HAL →

Contribution to the Business Process Evolution Management

Doctoral Student: Mohammed Oussama Kherbouche
Defense Date: December 2, 2013

Abstract: The evolution management of business processes requires an exhaustive understanding of change. An evolution engineer needs to understand the reasons for a change, its application levels, and subsequently its impact on the whole system. In this thesis, we propose an approach for a priori change impact analysis, to better control business process evolution. This may help business experts and process designers evaluate change impact in order to reduce the associated risks and estimate the related costs. It may also help improve the service and quality of business processes.

This work also contributes to verifying the coherence and compliance of business process models after each change, leading to an a priori change impact analysis in both structural and qualitative terms. The multiple perspectives of the proposed approach have been evaluated experimentally. The approach is validated by a prototype platform built by extending the Eclipse development environment with a set of plug-ins.
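An a priori structural impact analysis boils down to propagating a change along the model's dependency edges. A minimal sketch, assuming a toy dependency graph (the element names are hypothetical, not from the thesis):

```python
from collections import deque

# Hypothetical dependency graph of a process model: an edge points from
# an element to the elements that depend on it.
DEPENDENTS = {
    "TaskA": ["TaskB", "GatewayX"],
    "TaskB": ["TaskC"],
    "GatewayX": ["TaskD"],
    "TaskC": [],
    "TaskD": [],
}

def impact_set(changed):
    """A priori impact: every element transitively reachable from the
    changed element through dependency edges (breadth-first traversal)."""
    seen, queue = set(), deque([changed])
    while queue:
        node = queue.popleft()
        for dep in DEPENDENTS.get(node, []):
            if dep not in seen:
                seen.add(dep)
                queue.append(dep)
    return seen

print(sorted(impact_set("TaskA")))
```

The thesis adds qualitative criteria on top of this structural reachability; the sketch only shows the propagation skeleton.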

Research Projects

I am involved in several research projects focusing on Cyber-Physical Systems (CPS), conversational AI, and intelligent recommendation systems.

Development of a Conversational System for CPS Selection Assistance

Objective: This project aims to develop an intelligent conversational system capable of helping users describe their specific needs, in order to receive personalized recommendations for Cyber-Physical Systems (CPS). The chatbot will be capable of understanding complex queries, accessing a rich database, and providing precise answers, thanks to fine-tuning, embeddings, and prompt-engineering techniques.

The system begins by understanding the user's specific needs through questions asked by the chatbot or guided forms. For example:

  • What sport is practiced by the athlete?
  • What performance indicators should be monitored (heart rate, speed, power)?
  • What is the duration or frequency of training sessions?
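The elicitation step above can be sketched as a small question-driven profile builder; the field names and answer handling are simplifying assumptions, not the project's actual dialogue model:

```python
# Questions taken from the text; each maps to a field of the needs profile.
QUESTIONS = [
    ("sport", "What sport is practiced by the athlete?"),
    ("indicators", "What performance indicators should be monitored?"),
    ("frequency", "What is the duration or frequency of training sessions?"),
]

def elicit(answers):
    """Build a structured needs profile from user answers keyed by field;
    unanswered questions are marked so the chatbot can follow up on them."""
    profile = {}
    for field, _question in QUESTIONS:
        profile[field] = answers.get(field, "unspecified")
    return profile

profile = elicit({"sport": "cycling", "indicators": ["heart rate", "power"]})
print(profile)
```

In the full system the chatbot would ask the unanswered questions in natural language instead of marking them "unspecified".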

Conversational System (AI Chatbot)

Description: This module uses a natural language processing (NLP) model based on AI (such as GPT-n) to understand and interpret user queries.

Function: The chatbot guides the user in describing their needs for a CPS system. It proposes relevant questions and provides personalized answers.

Components:

  • Fine-Tuning and Prompts: For each interaction, the chatbot uses a pre-trained model adjusted via fine-tuning or prompt engineering to guarantee contextualized responses.
  • Knowledge Exploration Middleware: Use of embeddings and a semantic search engine to extract relevant information from the CPS knowledge base.
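The knowledge-exploration middleware can be illustrated with a tiny semantic search: embed the query and each component description, then rank by cosine similarity. The bag-of-words "embedding" below is a deliberate simplification (a real middleware would use learned sentence embeddings), and the component entries are hypothetical:

```python
import math
from collections import Counter

# Toy CPS knowledge base: component id -> free-text description.
DOCS = {
    "hr-sensor": "chest strap heart rate sensor for endurance sports",
    "power-meter": "crank power meter for cycling training analysis",
    "gps-watch": "gps watch tracking speed and distance outdoors",
}

def embed(text):
    """Stand-in embedding: a bag-of-words term-count vector."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def search(query):
    """Return component ids ranked by similarity to the query."""
    q = embed(query)
    return sorted(DOCS, key=lambda d: cosine(q, embed(DOCS[d])), reverse=True)

print(search("heart rate monitoring")[0])
```

Swapping `embed` for a neural embedding model turns this into the semantic search described above without changing the ranking logic.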

CPS Recommendation Engine

Description: This module is responsible for analyzing user needs and generating adapted CPS configurations.

Function: The engine analyzes user data (type of sport, performance indicators) and proposes CPS sensors and components, as well as analysis algorithms, based on expressed needs.

Components:

  • Need-based filtering: This sub-module filters sensors, devices, and CPS solutions based on constraints (accuracy, budget, compatibility).
  • Optimization: Optimizes the CPS configuration taking into account performance and environment criteria.
  • Algorithm Recommendation: Proposes processing algorithms adapted to the types of collected data (e.g., machine learning algorithms for performance prediction).
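The filtering and optimization sub-modules above can be sketched together: filter the catalogue by need-based constraints, then pick the best remaining candidate. The catalogue entries, field names, and scoring rule are illustrative assumptions:

```python
# Hypothetical sensor catalogue; fields are assumptions for illustration.
SENSORS = [
    {"name": "HR-Basic", "metric": "heart rate", "accuracy": 0.90, "price": 40},
    {"name": "HR-Pro",   "metric": "heart rate", "accuracy": 0.98, "price": 120},
    {"name": "PowerMax", "metric": "power",      "accuracy": 0.97, "price": 300},
]

def filter_sensors(metric, min_accuracy, budget):
    """Need-based filtering: keep sensors matching the requested metric
    and satisfying the accuracy and budget constraints."""
    return [s for s in SENSORS
            if s["metric"] == metric
            and s["accuracy"] >= min_accuracy
            and s["price"] <= budget]

def optimize(candidates):
    """Simplified optimization step: pick the most accurate remaining
    sensor (a stand-in for the richer multi-criteria optimization)."""
    return max(candidates, key=lambda s: s["accuracy"])["name"] if candidates else None

hits = filter_sensors("heart rate", min_accuracy=0.95, budget=150)
print(optimize(hits))
```

The real engine would also score compatibility between components and recommend a processing algorithm per data type, but the filter-then-optimize pipeline is the same shape.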

Cloud Infrastructure and Model Management

Description: Hosting of the system on a cloud infrastructure to ensure scalability and accessibility of the service.

Function: Allows deployment of the chatbot, recommendation engine, and database in a secure and large-scale manner.

Components:

  • Storage and NLP model management: Hosts and manages the fine-tuned models used by the chatbot.
  • Database management: Stores and organizes information on CPS components and user interactions.

Final Objective: The system aims to provide personalized, reliable, and explainable recommendations for users seeking to choose a CPS system adapted to their specific needs. By integrating fine-tuning and prompt engineering techniques, the chatbot will be capable of responding precisely to user queries, while avoiding common model errors. This project can be extended to include new application domains as models are refined.

CPS Configuration Recommendation System

We are developing a CPS Recommendation Engine capable of generating configurations of CPS (Cyber-Physical Systems). This recommendation system aims to suggest adapted solutions based on users' specific needs.

The CPS recommendation system is designed to provide users with optimized configurations based on their specific needs. Data collection, whether technical data on CPS components, processing algorithms, or user feedback, is fundamental to guarantee the accuracy and relevance of recommendations. Through an approach combining needs analysis, configuration optimization, and decision support, this system will help select the most suitable CPS solutions for scenarios as varied as sports performance monitoring.

The system begins by understanding the user's specific needs through questions asked by the previously developed chatbot or through guided forms (e.g., the sport practiced, the performance indicators to monitor, the duration and frequency of training sessions).

Once the needs are understood, the system recommends an adapted CPS configuration, relying on a database of available devices and sensors. Filtering is done based on criteria such as sensor accuracy, device compatibility with each other, and budget constraints.

In addition to proposing an initial selection of sensors and equipment, the system optimizes the configuration based on the environment (indoor or outdoor), connection mode (wireless or wired), and user requirements (real-time, post-event analysis).
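The configuration step described above — one device per required metric, matching the connection mode, within a total budget — can be sketched as a greedy selection. The catalogue and the greedy rule are illustrative assumptions, not the project's actual optimizer:

```python
# Hypothetical device catalogue for the configuration step.
CATALOGUE = [
    {"metric": "heart rate", "mode": "wireless", "price": 60, "name": "HR-W"},
    {"metric": "heart rate", "mode": "wired",    "price": 35, "name": "HR-C"},
    {"metric": "speed",      "mode": "wireless", "price": 90, "name": "GPS-W"},
]

def configure(metrics, mode, budget):
    """Greedy sketch: for each required metric, pick the cheapest device
    in the requested connection mode that still fits the remaining budget."""
    config, spent = [], 0
    for metric in metrics:
        options = [c for c in CATALOGUE
                   if c["metric"] == metric and c["mode"] == mode
                   and spent + c["price"] <= budget]
        if options:
            pick = min(options, key=lambda c: c["price"])
            config.append(pick["name"])
            spent += pick["price"]
    return config, spent

print(configure(["heart rate", "speed"], mode="wireless", budget=200))
```

A production optimizer would trade accuracy against price and check inter-device compatibility rather than just minimizing cost per metric.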

The system also proposes data analysis algorithms adapted to the sports domain, such as signal processing algorithms to analyze sensor data in real-time, or machine learning algorithms for prediction and long-term performance analysis.

The recommendation system allows the user to adjust the proposed suggestions (choice of sensors, connectivity, etc.), based on their preferences or feedback from experience.

CPS Components Database

Description: A centralized database that stores all information on CPS components (sensors, devices, algorithms), their technical specifications, and their compatibilities.

Function: It feeds the recommendation engine with updated data and ensures the management of interoperability between different components.

Components:

  • Technical data of sensors and devices (accuracy, costs, connectivity).
  • Data processing algorithms and CPS compatibilities.
  • User data and feedback to personalize recommendations.

Data Processing and Intelligence Module

Description: This module manages the processing of data collected by the system, particularly information provided by the user, feedback from experience, and performances of proposed configurations.

Function: It analyzes past interactions to improve future recommendations and avoid configuration errors or bad recommendations.

Components:

  • Feedback management: Analyzes user feedback to refine recommendations and adjust prompts and algorithms.
  • Machine learning: Uses learning models to improve the performance of the recommendation system and processing algorithms.
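One simple way to close the feedback loop described above is to keep a per-component score that is nudged by each accepted or rejected recommendation; this exponential-update rule is an assumption for illustration, not the project's actual learning model:

```python
# Per-component recommendation scores, used to bias future rankings.
scores = {"HR-Pro": 0.5, "HR-Basic": 0.5}

def record_feedback(component, accepted, lr=0.2):
    """Move the score toward 1 on acceptance and toward 0 on rejection;
    lr controls how quickly feedback reshapes the ranking."""
    target = 1.0 if accepted else 0.0
    scores[component] += lr * (target - scores[component])

record_feedback("HR-Pro", accepted=True)
record_feedback("HR-Basic", accepted=False)
print(scores)
```

Repeated feedback drives the scores apart, so components users keep rejecting sink in future recommendation lists.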

Integration and Communication Middleware

Description: An integration layer that connects the different modules (chatbot, recommendation engine, databases) and ensures the management of exchanges between them.

Function: Facilitates communications between the chatbot and the recommendation engine, and between the latter and the CPS components database.

Components:

  • RESTful API: Provides endpoints to access the recommendation engine and CPS database functionalities.
  • Orchestration: Coordinates exchanges and processing between different modules to ensure smooth interaction.
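The RESTful-API layer can be sketched as a route table that dispatches requests to module handlers; the endpoint paths and response payloads are hypothetical, and a real deployment would sit behind an HTTP server:

```python
import json

# Hypothetical handlers standing in for the recommendation engine and
# the CPS components database.
def get_recommendation(params):
    return {"sensors": ["HR-Pro"], "reason": "accuracy >= 0.95"}

def get_component(params):
    return {"id": params.get("id"), "type": "sensor"}

ROUTES = {
    ("GET", "/recommendations"): get_recommendation,
    ("GET", "/components"): get_component,
}

def dispatch(method, path, params=None):
    """Route a request to its handler and serialize the JSON response;
    unknown routes get a 404 payload."""
    handler = ROUTES.get((method, path))
    if handler is None:
        return json.dumps({"error": "not found"}), 404
    return json.dumps(handler(params or {})), 200

body, status = dispatch("GET", "/recommendations")
print(status, body)
```

Keeping the dispatch table separate from the handlers is what lets the orchestration layer coordinate the chatbot, engine, and database modules behind uniform endpoints.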

Final Objective: The system provides explanations on why certain configurations are recommended, highlighting performance, compatibility, or cost criteria, in order to support the user's decision-making.