Mauriana Pesaresi

Seminar Series 2021/2022

In this series of seminars dedicated to the memory of Mauriana Pesaresi, doctoral students of the Computer Science Department of the University of Pisa will present core issues of their research, focusing on open problems and possibly on their innovative contributions.

For any further information, you can reach out to us via email.

🚀 Upcoming

🔮 Next Talks

10th

June

15:00-16:00

Domenico Tortola

Trusted interactions in a trustless network: novel technical solutions

No abstract available

⌛️ Past Talks

21st

January

15:00-16:00

Rudy Semola

CL-as-a-Service: a novel tool for Continual and Real-time machine learning

Currently, companies across different industries find it difficult to move their ML pipelines and systems towards real-time prediction and automated, stateful training. On the other hand, several research trends seem to overcome these limitations: Auto-ML, Continual Learning, online prediction, and ML systems. Recent open-source tools such as Avalanche can be exploited to build novel systems that help companies reduce costs and, at the same time, level up their competitive technological advantage. This presentation analyses the current European and national industrial landscape with step-by-step use cases, considerations, and required technologies. It also presents the concept of a Continual Learning as a Service (CLaaS) system, a general computing platform mainly based on continual learning approaches and the Avalanche tool, and shows how CLaaS is a novel toolkit option for companies to support the development and deployment of machine learning models in an efficient and stateful fashion. To demonstrate the idea, we have developed a lightweight CLaaS demo backend for researchers and engineers to exploit CL services, together with a REST API interface. At the end, we will discuss possible future work and research directions.

28th

January

15:00-16:00

Tarlis Portela

Movelet-based Classification of Multiple Aspect Trajectories

Mobility data, usually called moving object trajectories, represent the movement of objects such as people, vehicles, and ships. In the last decade, trajectory analysis has received significant attention. Existing trajectory classification methods have mainly considered space, time, and numerical data, ignoring the semantic dimensions. A recent line of approaches, called movelet-based methods, has been shown to outperform the state of the art in terms of classification accuracy. However, these methods are computationally infeasible for most real trajectory datasets, which contain large volumes of high-dimensional data. We will present the methods MASTERMovelets, SUPERMovelets, and HiPerMovelets, which provide a good trade-off between classification accuracy and running time, and we will discuss possible future work.

4th

February

15:00-16:00

Riccardo Massidda

Concept-Based Methods for Neural Network Interpretation

Concept-based methods attempt to interpret existing neural networks, or to design inherently interpretable models, by exploiting human-comprehensible concepts. In this talk, I will present a few significant examples of such methods, discussing their commonalities, underlying assumptions, and applications. In more detail, I will focus on the semantic alignment of neural directions and visual concepts in CNNs for computer vision. In this context, different existing approaches can be understood in terms of a unified general framework. Furthermore, I will show the impact of acknowledging semantic relations on such a framework. Finally, the talk discusses the main issues affecting concept-based methods and hints at possible research strategies to tackle them.

11th

February

15:00-16:00

Arslan Siddique

Automatic key frame selection for video-based structure from motion pipelines

Structure from Motion (SfM) is a photogrammetric imaging technique that estimates 3D object models from a set of 2D images, for example a 2D video sequence taken with a hand-held camera. However, processing all frames leads to very high computational complexity and poor results. Automatic key frame selection significantly reduces the computational time and improves the accuracy of SfM pipelines. In the first stage, features are extracted and tracked throughout the sequence. In the second stage, stereo matching is used to obtain a detailed estimate of the geometry of the observed scene and to compute frame-to-frame homography and fundamental matrices. Finally, a Geometric Robust Information Criterion (GRIC) is computed to decide whether a frame should be discarded. The method has been evaluated on a test scenario containing some stationary scenes, and experiments show that it is able to discard the frames corresponding to stationary scenes.

18th

February

15:00-16:00

Andrea Guerra

PIR: Private Information Retrieval

How can a user retrieve an item from a server holding a database without revealing which item was retrieved? Private Information Retrieval (PIR) is the cryptographic protocol that answers this question: it allows a client to download an element (e.g., a movie, a friend's record) from a database held by an untrusted server without revealing to the server which element was downloaded. In this talk, we will see the importance of this problem and the challenges it presents. PIR is a key building block in many privacy-preserving systems; unfortunately, existing constructions remain very expensive. This expense is fundamental: PIR schemes force the server to operate on all elements in the database to answer a single query, and, to protect the privacy of the query, they require the user to send n ciphertexts, where n is the size of the database.
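To make the idea concrete, here is a toy version of the classic two-server information-theoretic PIR (a different setting from the single-server schemes discussed above, and purely our own illustration): the client splits a one-hot query into two random shares, so each server alone sees a uniformly random vector, yet must still touch every record to answer.

```python
import secrets

def query(db_size, index):
    """Client: build two bit-vectors whose XOR is the one-hot
    indicator of the wanted index. Each vector alone is uniformly
    random, so neither server learns which record is requested."""
    q1 = [secrets.randbelow(2) for _ in range(db_size)]
    q2 = q1.copy()
    q2[index] ^= 1
    return q1, q2

def answer(db, q):
    """Server: XOR together the records selected by the query.
    Note the linear scan: the server operates on *all* records."""
    acc = 0
    for record, bit in zip(db, q):
        if bit:
            acc ^= record
    return acc

def reconstruct(a1, a2):
    """Client: the two answers differ only in the wanted record."""
    return a1 ^ a2
```

Usage: for `db = [10, 20, 30, 40]`, sending `query(4, 2)` to the two servers and XORing their answers recovers `db[2]` without either server learning the index 2.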

25th

February

15:00-16:00

Jacopo Massa

Data-aware application placement and routing in the Cloud-IoT continuum

With the widespread adoption of the Internet of Things (IoT), billions of devices are now connected to the Internet and can reach computing facilities along the Cloud-IoT continuum to process the data they produce. This has led to a dramatic increase in deployed IoT-based applications as well as in the data they need to crunch. These applications often have Quality of Service (QoS) requirements, to be met by determining suitable placements for all the processing and data services they are made of, and software-defined routings between the IoT devices and the different application components. Following a continuous reasoning approach, I will present a model of Cloud-IoT infrastructures and multi-service applications, a methodology for determining eligible service placements and data traffic routings over Cloud-IoT resources, and a Prolog prototype implementing both the model and the methodology.

4th

March

15:00-16:00

Cosimo Rulli

Deep Neural Network Compression

Deep Neural Networks (DNNs) deliver state-of-the-art performance in a plethora of tasks, such as Computer Vision, Natural Language Processing, and Speech Recognition. Their effectiveness comes at the price of an elevated computational burden, in terms of memory occupancy, inference time, and energy consumption, which hinders their usage on resource-constrained devices. Model compression techniques tackle this problem by leveraging the widely demonstrated over-parametrization of modern DNNs, aiming to reduce their computational requirements without affecting their accuracy. In this seminar, we illustrate the taxonomy of DNN compression methods, focusing in particular on quantization, pruning, and knowledge distillation. We introduce their technical aspects and discuss the different trade-offs between computational burden and accuracy that each technique achieves.
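Two of the techniques mentioned can be sketched in a few lines of NumPy (an illustrative sketch of the general ideas, not the speaker's methods; function names are ours):

```python
import numpy as np

def magnitude_prune(w, sparsity):
    """Unstructured magnitude pruning: zero out the fraction
    `sparsity` of weights with the smallest absolute value."""
    pruned = w.copy()
    k = int(sparsity * w.size)
    if k > 0:
        # k-th smallest magnitude becomes the pruning threshold.
        thresh = np.sort(np.abs(w), axis=None)[k - 1]
        pruned[np.abs(pruned) <= thresh] = 0.0
    return pruned

def quantize_uniform(w, bits=8):
    """Symmetric uniform quantization: snap each weight to one of
    2^(bits-1) - 1 evenly spaced levels per sign, then de-quantize
    to simulate the accuracy impact of low-precision storage."""
    scale = np.abs(w).max() / (2 ** (bits - 1) - 1)
    return np.round(w / scale) * scale
```

Pruning trades accuracy for sparsity (zeros can be skipped at inference time), while quantization trades a bounded rounding error, at most half the quantization step per weight, for smaller and faster arithmetic.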

11th

March

15:00-16:00

Martina Cinquini

CALIME: Causality-Aware Local Interpretable Model-agnostic Explanations

Over the last few years, eXplainable Artificial Intelligence (XAI) methods have been experiencing a wave of popularity due to their ability to provide human-understandable explanations that express the rationale of the black-box models used by decision-making systems. Despite the widespread adoption of these procedures, a significant drawback of XAI methods is the assumption of feature independence, which prevents them from exploiting knowledge about potential dependencies among variables. Indeed, black-box behaviours are typically approximated by studying the effects on randomly generated feature values that might rarely appear in the original samples. As a consequence, post-hoc explanation methods can capture associations between input features and the target class without guaranteeing the presence of any causal relationship among the input features. We introduce an extended version of a widely used local, model-agnostic explainer that explicitly encodes causal relationships in the data generated around the input instance for which the explanation is required.
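The baseline procedure being extended can be sketched as follows (a minimal LIME-style surrogate of our own; note the independent Gaussian perturbation, which is exactly the feature-independence assumption the talk questions):

```python
import numpy as np

def local_surrogate(black_box, x, n_samples=500, width=1.0, seed=0):
    """LIME-style local explanation sketch: perturb x with
    *independent* Gaussian noise, weight samples by proximity to x,
    and fit a weighted linear model whose coefficients serve as
    per-feature importances."""
    rng = np.random.default_rng(seed)
    # Neighbourhood generated feature-by-feature, independently.
    Z = x + rng.normal(scale=width, size=(n_samples, x.size))
    y = np.array([black_box(z) for z in Z])
    # Exponential proximity kernel: closer samples count more.
    prox = np.exp(-np.sum((Z - x) ** 2, axis=1) / (2 * width ** 2))
    A = np.hstack([Z, np.ones((n_samples, 1))])  # bias column
    W = np.sqrt(prox)[:, None]
    coef, *_ = np.linalg.lstsq(A * W, y * W.ravel(), rcond=None)
    return coef[:-1]  # drop the bias term
```

A causality-aware variant would replace the independent sampling of `Z` with draws that respect dependencies among features, which is the gap the presented work addresses.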

18th

March

15:00-16:00

Flavio Ascari

Extensive, locally complete abstract interpretation

Abstract interpretation is a framework for designing sound static analyses that over-approximate the set of program behaviours. While this can prove correctness, it cannot show incorrectness, because false alarms may arise from the over-approximation. These may undermine the credibility of a static analysis tool, making it less effective in practice. For abstract interpretation, an ideal but very uncommon situation is completeness, in which the abstract interpreter introduces no false alarms and is therefore more reliable. However, to our knowledge, existing methods for showing completeness-like properties deal with intensional (i.e., syntax-dependent) abstractions of a program, while completeness itself is an extensional property of the program semantics only. In the talk, we propose an extension of Local Completeness Logic, a Hoare-style logic for proving completeness of the abstraction of a program c on an input P. Our main contribution is a new rule we call (refine-ext), which allows part of the analysis to be performed in a finer abstract domain and the result to then be moved to the coarser one without any loss of precision. With this addition, the logic can prove all triples where the extensional best correct approximation of the program semantics is locally complete, thus untying the set of provable properties from the way the program is written.
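For reference, in standard abstract-interpretation notation (our addition, not taken from the talk), writing $A$ for the abstraction, $f$ for the concrete transformer, and $f^\sharp$ for its best correct approximation, global completeness and its local variant at a precondition $P$ read:

```latex
% Global completeness: the abstract computation loses nothing
% with respect to abstracting the concrete result, on every input
A \circ f \;=\; f^\sharp \circ A
% Local completeness: the same equality, required only
% at the given precondition P
A(f(P)) \;=\; f^\sharp(A(P))
```

Local completeness is thus a much weaker (and more often attainable) requirement than global completeness, which is what makes a per-input proof logic useful.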

25th

March

15:00-16:00

Federica Di Pasquale

Mixed Integer Programming Solvers

Mixed Integer Programming (MIP) solvers are powerful tools for solving hard optimization problems. Over the past 60 years, both commercial and open-source MIP solvers have made tremendous progress thanks to increasingly sophisticated techniques. The core solution algorithm, common to most current state-of-the-art MIP solvers, is called Branch and Cut, an intermediate approach between the generic Branch and Bound scheme and the Cutting Plane algorithm. However, the implementation details usually have a huge impact on the overall performance of a MIP solver. In this presentation, we will see the basic ideas of a Branch and Cut algorithm and a description of the main components of a MIP solver. Finally, we will discuss some possible research directions for achieving further improvements.
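The Branch and Bound half of the scheme can be illustrated on a tiny 0/1 knapsack (our own toy sketch, with the LP relaxation standing in for the bounding step and no cutting planes):

```python
def knapsack_branch_and_bound(values, weights, capacity):
    """Branch and Bound sketch: bound each node with the fractional
    (LP-relaxation) optimum and prune branches that cannot beat
    the incumbent integer solution."""
    # Sort by value density so the LP bound is a greedy fill.
    items = sorted(zip(values, weights),
                   key=lambda vw: vw[0] / vw[1], reverse=True)
    best = 0

    def lp_bound(i, value, room):
        # Fractional relaxation: take items greedily, split the last.
        for v, w in items[i:]:
            if w <= room:
                value, room = value + v, room - w
            else:
                return value + v * room / w
        return value

    def branch(i, value, room):
        nonlocal best
        best = max(best, value)  # update the incumbent
        if i == len(items) or lp_bound(i, value, room) <= best:
            return  # prune: this subtree cannot improve the incumbent
        v, w = items[i]
        if w <= room:
            branch(i + 1, value + v, room - w)  # include item i
        branch(i + 1, value, room)              # exclude item i

    branch(0, 0, capacity)
    return best
```

A Branch and Cut solver additionally tightens `lp_bound` at each node by adding cutting planes, which is where much of the performance of modern solvers comes from.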

1st

April

15:00-16:00

Chiara Pugliese

MAT-Builder: a System to Build Semantically Enriched Trajectories

The notion of multiple aspect trajectory (MAT) has recently been introduced in the literature to represent movement data heavily enriched with semantic dimensions (aspects) representing various types of information (e.g., stops, moves, weather, traffic, events, and points of interest). Aspects may be large in number, heterogeneous, or structurally complex. Although there is a growing volume of literature addressing the modelling and analysis of multiple aspect trajectories, the community suffers from a general lack of publicly available datasets. This is due to privacy concerns, which make it difficult to publish this type of data, and to the lack of tools capable of linking raw spatio-temporal data to different types of semantic contextual data. In this work we address this last issue by presenting MAT-BUILDER, a system that not only supports users throughout the whole semantic enrichment process, but also allows the use of a variety of external data sources. Furthermore, MAT-BUILDER has been designed with modularity and extensibility in mind, enabling practitioners to easily add new functionalities.

8th

April

15:00-16:00

Njagi Mwaniki

Inferring Viral Transmission Using Closely Related Viral Genomes

The genome of each individual in a species differs from that of every other individual within the same species, and the genome of each species differs from that of every other species. Despite this, most modern tools and methods for genome analysis treat the genome of a species as if it were linear, updated at regular intervals, and homogeneous across individuals. Such a genome is known as a linear reference and is prone to the reference bias problem: a tendency of the reference genome to over-report sequences present in it and under-report those that are not. Genome graphs combat this bias by representing variation as alternative nodes and edges. I demonstrate the plausibility of differentiating related consensus genomes by comparing how each aligns to a genome graph, then using this comparison to construct a phylogeny and further infer transmission.

22nd

April

15:00-16:00

Nicolò Tonci

Targeting scale-out and scale-up platforms with a unified parallel programming model

The proliferation of interconnected devices capable of generating huge amounts of data requires high-performance applications to extract insights. These needs amplify the demand for new scalable algorithms and programming interfaces, both for shared-memory and distributed-memory architectures. Currently, the standard tools for approaching this challenge are either too complex or too inflexible for programmers. This research proposes a new distributed-memory run-time system for FastFlow's building blocks. FastFlow is a C++ structured parallel programming library originally targeting shared-memory platforms; the new support allows the parallel resources of multiple interconnected machines to be exploited. The main goal of the proposed run-time system is to enable an easy and smooth transition from shared-memory FastFlow applications to equivalent distributed-memory ones. A preliminary analysis of the new run-time system reported considerable benefits from the programmability and flexibility points of view. Additionally, a set of experiments, based on both real and artificial programs, validated the presented solution, demonstrating a significant performance gain and the ability to fully exploit the resources of a cluster of machines.

29th

April

15:00-16:00

Alberto Ottimo

Data Stream Processing on FPGAs

State-of-the-art Data Stream Processing (DSP) systems provide high-level programming abstractions that hide the complexity of managing and deploying streaming applications on scale-out architectures. Unfortunately, integrating accelerators such as FPGAs into those systems is not natively supported: the application developer is obliged to manually intermix low-level primitives with the business logic of the application. To overcome this limitation, we propose a code generation approach that produces efficient code from a suitable high-level representation of a DSP application, generating optimized code tailored to architectures equipped with FPGAs. We discuss several challenges a developer faces in designing and integrating FPGA accelerators into their applications. Then, we propose a run-time system and a set of base operators targeting Intel FPGAs, using the OpenCL standard, to address these challenges. We evaluate our work on several representative data stream applications. Our experiments show that our work delivers competitive performance compared to existing DSP systems. We expect this work will accelerate progress in this domain.

6th

May

15:00-16:00

Marco Cardia

IT systems for assessing the environmental impact of industrial processes

The introduction of the Internet of Things (IoT) in the manufacturing field makes it possible to develop information systems that exploit large amounts of data collected in real time from heterogeneous sources. The information thus obtained is functional not only to industrial development, but also to the evaluation and improvement of its sustainability. These large amounts of data can be exploited to address the environmental impact associated with the production systems of enterprises. This can be achieved through a series of strategies, including the monitoring of all relevant data on air, water, or soil quality, which are clearly affected by how production is carried out. In our work we aim to identify, for each case, the indicators signalling potential pollutants. These indicators are obtained through laboratory tests; consequently, it is necessary to find general-purpose sensors whose measurements can emulate the metrics detected in the laboratory. Once the KPIs have been defined, a Machine Learning prototype will be constructed to provide reliable predictions of the pollutants. In the last phase, we aim to understand how companies' processes and business dynamics can be changed to reduce their environmental impact.

13th

May

15:00-16:00

Luca Corbucci

Semantic enrichment of XAI explanation for healthcare

Explaining black-box models decisions is crucial to increase doctors’ trust in AI-based clinical systems. However, eXplainable Artificial Intelligence techniques usually provide explanations that are not easily understandable by experts outside of AI. In this talk I will present a methodology aiming at enabling clinical reasoning by semantically enriching AI explanations. Starting from a medical AI explanation, based only on the input features provided to the algorithm, our methodology leverages medical ontologies and NLP embedding techniques to link relevant information present in the patient’s clinical notes to the original explanation. We validate our methodology with two experiments involving a human expert. Our results highlight promising performance in correctly identifying relevant information about the diseases of the patients.

20th

May

15:00-16:00

Marco Gaglianese

Monitoring the Cloud-IoT continuum for latency-aware application placement

Monitoring will play an enabling role in the orchestration of next-gen Fog applications. In particular, monitoring of Fog computing infrastructures must deal with platform heterogeneity, scarce resource availability at the edge, and high dynamicity all along the Cloud-IoT continuum. In this presentation, we describe FogMon, a C++ distributed monitoring prototype targeting Fog computing infrastructures. FogMon monitors hardware resources at different Fog nodes, end-to-end network QoS between such nodes, and connected IoT devices. Besides, it features a self-organising peer-to-peer topology with self-restructuring mechanisms and differential monitoring updates, which ensure scalability, fault tolerance, and low communication overhead. Experiments on a real testbed show that the footprint of FogMon is limited and that its self-restructuring topology makes it resilient to infrastructure dynamicity.

27th

May

15:00-16:00

Alabbasi Wesam Nitham

Privacy vs Accuracy Trade-Off in Privacy-Aware Face Recognition in Smart Systems

This work proposes a novel approach to privacy-preserving face recognition, aimed at formally defining a trade-off optimization criterion between data privacy and algorithm accuracy. In our methodology, real-world face images are anonymized with Gaussian blurring for privacy preservation. The anonymized images are then processed for face detection, face alignment, face representation, and face verification. The proposed methodology has been validated with a set of experiments on a well-known dataset and three face recognition classifiers. The results demonstrate the effectiveness of our approach in correctly verifying face images at different levels of privacy and accuracy, and in maximizing privacy with the least negative impact on face detection and face verification accuracy.
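The anonymisation step named above can be sketched as a separable Gaussian blur (a minimal NumPy illustration of the standard operation, not the authors' pipeline; the function name and padding choice are ours):

```python
import numpy as np

def gaussian_blur(img, sigma):
    """Separable Gaussian blur: larger sigma means stronger
    anonymisation (more privacy) but less recognisable faces."""
    radius = int(3 * sigma)
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-x ** 2 / (2 * sigma ** 2))
    kernel /= kernel.sum()  # normalise so brightness is preserved
    # Reflect-pad, then convolve rows and columns separately;
    # "valid" convolution restores the original image size.
    pad = np.pad(img, radius, mode="reflect")
    rows = np.apply_along_axis(
        lambda r: np.convolve(r, kernel, "valid"), 1, pad)
    return np.apply_along_axis(
        lambda c: np.convolve(c, kernel, "valid"), 0, rows)
```

Sweeping `sigma` is precisely how a privacy/accuracy trade-off curve can be traced: each value yields one privacy level at which detection and verification accuracy are then measured.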

3rd

June

15:00-16:00

Reshawn Ramjattan

Machine Learning for Industry 4.0

Industry 4.0, or the fourth industrial revolution, refers to the rapid changes that interconnectivity and smart automation have brought to the industrial world. The scalability and automation of Machine Learning can enable significant improvements to efficiency and productivity in several areas of Industry 4.0. Top consulting companies estimate tremendous potential value from these ML-based solutions to industrial problems, and some companies have already started finding success. In this presentation, we will introduce the intersection of ML and Industry 4.0 through a systematic review of the academic literature as well as an industry perspective. We will also discuss the potential of some advancing ML areas, such as Continual Learning, Democratized Technology and Distributed ML.