AI models are algorithms that map inputs to predictions. They are trained on data and aim to replicate human decision-making.
In traditional Machine Learning, client data are sent to a central server where the AI model is trained and updated.
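As a concrete picture of this pipeline, here is a minimal Python sketch of centralized training; the data, model, and function names are purely illustrative, not a specific system's API. Every client uploads its raw data, and a single server fits the model.

```python
import numpy as np

def collect_client_data(clients):
    """All raw client data travels to the central server."""
    X = np.vstack([c["X"] for c in clients])
    y = np.concatenate([c["y"] for c in clients])
    return X, y

def train_linear_model(X, y, lr=0.1, epochs=100):
    """Plain gradient descent on a linear model, run entirely server-side."""
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        grad = X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w

clients = [{"X": np.random.randn(50, 3), "y": np.random.randn(50)} for _ in range(4)]
X, y = collect_client_data(clients)   # privacy cost: raw data leaves the clients
w = train_linear_model(X, y)
```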
Centralized systems are vulnerable by nature as they have a single point of failure.
In traditional machine learning, the central server is a natural target for attacks by malicious actors and may itself act dishonestly, jeopardizing the learning process.
Privacy is also at risk, since local user data are sent to a third party.
Users may be reluctant to share their data, and restrictive regulation (such as the GDPR in Europe) may prevent them from doing so.
To perform well, AI models need enormous quantities of data to train on.
Transferring data from clients to the central server generates significant communication costs, thereby limiting learning scalability.
In Federated learning, the training process is transferred directly to clients who become “learners”.
The global AI model is sent to learners, who train it locally and then send model-parameter updates back to the central server for aggregation, largely improving data privacy and reducing transmission costs.
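The loop below sketches one common realization of this round structure, FedAvg-style weighted averaging, in Python. The names are illustrative, and the aggregation rule an individual system actually uses may differ: learners train locally, and only parameter updates travel back to the server.

```python
import numpy as np

def local_update(w_global, X, y, lr=0.1, steps=10):
    """Each learner trains the received global model on its own data."""
    w = w_global.copy()
    for _ in range(steps):
        grad = X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w  # only parameters are returned, never the raw data

def fedavg(updates, sizes):
    """Server-side aggregation: average updates weighted by dataset size."""
    weights = np.asarray(sizes) / sum(sizes)
    return sum(wk * u for wk, u in zip(weights, updates))

w_global = np.zeros(3)
learners = [{"X": np.random.randn(40, 3), "y": np.random.randn(40)} for _ in range(5)]
for round_ in range(20):                      # one federated round per iteration
    updates = [local_update(w_global, l["X"], l["y"]) for l in learners]
    w_global = fedavg(updates, [len(l["y"]) for l in learners])
```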
Despite the transfer of training tasks to local learners, the central aggregator still represents a single point of failure in the system.
In addition, as learning is decentralized to protect privacy, the system now also becomes vulnerable to contamination resulting from potentially flawed or malicious local updates.
Federated learning greatly improves privacy: user data remain stored locally, and only model parameters are sent by learners to the central aggregator.
However, user data may still be reverse-engineered from these local updates.
Transferring model parameters instead of user data largely decreases transmission costs, improving the system’s ability to absorb high volumes of training and therefore its structural scalability.
However, this transfer of tasks means federated learners must be willing to participate. Attracting and retaining learners is a scalability challenge in this framework.
Without incentives, learners may disengage, degrading both the training process and the global model's performance.
Fantastyc is a blockchain-based federated learning framework that optimizes data privacy, system robustness, and learning scalability all at once.
The central aggregator is replaced by a network of peers, thereby solving the “single point of failure” issue inherent to centralized systems.
Data is stored in a public ledger distributed across the network, consisting of an unalterable chain of interdependent blocks.
Each node has a copy of the ledger, validates the information, and helps reach a consensus about its accuracy.
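The toy Python chain below illustrates why such a ledger is hard to alter; the block format is illustrative, not Fantastyc's actual one. Each block commits to the hash of its predecessor, so tampering with any block invalidates every block after it.

```python
import hashlib, json

def block_hash(block):
    """Hash over the block's full content, including the previous hash."""
    return hashlib.sha256(json.dumps(block, sort_keys=True).encode()).hexdigest()

def append_block(chain, payload):
    """Each block commits to its predecessor, chaining them together."""
    prev = block_hash(chain[-1]) if chain else "0" * 64
    chain.append({"prev": prev, "payload": payload})

def verify(chain):
    """Any node can re-check the links; altering one block breaks all later ones."""
    for i in range(1, len(chain)):
        if chain[i]["prev"] != block_hash(chain[i - 1]):
            return False
    return True

chain = []
append_block(chain, {"round": 1, "contribution": "ab12..."})
append_block(chain, {"round": 2, "contribution": "cd34..."})
assert verify(chain)
chain[0]["payload"]["round"] = 99   # tampering...
assert not verify(chain)            # ...is detected by every verifying node
```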
The Byzantine generals problem describes the difficulty decentralized parties have in reaching consensus without relying on a trusted central party.
Blockchain technology is a modern response to this old problem: it aims to build a system that functions without a central authority and is Byzantine Fault Tolerant (BFT), i.e. resilient to unreliable participants.
However, classic blockchain-based federated learning (BFL) frameworks may still be subject to Byzantine aggregators or learners.
Fantastyc combines a BFT Proof-of-Stake consensus with a BFT aggregation function, ensuring Byzantine Fault Tolerance at both the aggregator and learner levels.
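The exact aggregation function is not detailed here, so the sketch below uses a classic Byzantine-robust rule, the coordinate-wise median, purely as an illustration of how BFT aggregation can tolerate poisoned updates from a minority of learners.

```python
import numpy as np

def coordinate_median(updates):
    """Coordinate-wise median: a classic Byzantine-robust aggregation rule
    (an assumption here; Fantastyc's actual BFT aggregation function may
    differ). Extreme values injected by a minority of malicious learners
    cannot drag the per-coordinate median far from the honest majority."""
    return np.median(np.stack(updates), axis=0)

honest = [np.array([1.0, 2.0, 3.0]) + 0.1 * np.random.randn(3) for _ in range(6)]
byzantine = [np.array([1e6, -1e6, 1e6])]        # a poisoned update
agg = coordinate_median(honest + byzantine)      # stays close to [1, 2, 3]
print(agg)
```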
With Fantastyc, the updated model parameters are encrypted and only partly shared across the network, making it impossible to reverse-engineer local learner data and further improving privacy.
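Fantastyc's precise encryption and sharing scheme is not described here; one widely used building block with the same goal is secure aggregation via pairwise masking, sketched below as an assumption. Each learner shares only a masked update, individually meaningless, yet the masks cancel exactly in the network-wide sum.

```python
import numpy as np

rng = np.random.default_rng(0)
updates = [rng.normal(size=4) for _ in range(3)]   # three learners' true updates

# Pairwise masks: learner i adds m_ij for j > i and subtracts it for j < i,
# so every mask cancels in the sum (a standard secure-aggregation idea, used
# here as an illustrative assumption, not Fantastyc's exact scheme).
n = len(updates)
masks = {(i, j): rng.normal(size=4) for i in range(n) for j in range(i + 1, n)}

def masked(i):
    out = updates[i].copy()
    for j in range(n):
        if i < j:
            out += masks[(i, j)]
        elif j < i:
            out -= masks[(j, i)]
    return out  # individually meaningless, safe to share

shared = [masked(i) for i in range(n)]
assert np.allclose(sum(shared), sum(updates))   # aggregate is still exact
```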
In Fantastyc, learner contributions are shared as digital fingerprints, which are lighter than full model updates; this in turn means faster aggregation and validation, optimizing overall learning scalability.
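As an illustration of the size difference, using a plain SHA-256 digest as the fingerprint (which may differ from Fantastyc's actual construction), a fingerprint commits to a multi-megabyte update in a few dozen bytes:

```python
import hashlib
import numpy as np

update = np.random.randn(1_000_000).astype(np.float32)  # a large model update

# A compact fingerprint: here simply a SHA-256 digest of the serialized
# parameters, as an illustration. It commits to the update in 32 bytes
# instead of shipping 4 MB of parameters across the network.
fingerprint = hashlib.sha256(update.tobytes()).hexdigest()

print(update.nbytes)      # 4000000 bytes to transfer the full update
print(len(fingerprint))   # 64 hex characters to validate against
```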
Moreover, smart contracts embedded in the blockchain enable the model provider to automate incentive mechanisms, keeping learners engaged.
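On-chain contracts are written in a dedicated contract language; the Python sketch below only mimics the incentive logic, with hypothetical names and a hypothetical reward rule, to show how payouts can be triggered automatically once an update passes validation.

```python
# Illustrative incentive logic only: real smart contracts execute on-chain;
# the class, method names, and flat reward rule below are assumptions.

class IncentiveContract:
    def __init__(self, reward_per_round=10):
        self.reward_per_round = reward_per_round
        self.balances = {}

    def record_contribution(self, learner_id, update_accepted):
        """Credit a learner automatically once its update passes validation."""
        if update_accepted:
            self.balances[learner_id] = (
                self.balances.get(learner_id, 0) + self.reward_per_round
            )

contract = IncentiveContract()
contract.record_contribution("learner-7", update_accepted=True)
print(contract.balances)   # {'learner-7': 10}
```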