# This CITATION.cff file was generated with cffinit.
# Visit https://bit.ly/cffinit to generate yours today!
cff-version: 1.2.0
title: >-
  Aergia: Leveraging Heterogeneity in Federated
  Learning Systems
message: >-
  Proceedings of the 23rd ACM/IFIP International
  Middleware Conference
authors:
  - given-names: Bart
    family-names: Cox
    affiliation: >-
      Delft University of Technology, Delft,
      Netherlands
    orcid: 'https://orcid.org/0000-0001-5209-6161'
    email: [email protected]
  - given-names: Lydia
    name-particle: 'Y'
    family-names: Chen
    affiliation: >-
      Delft University of Technology, Delft,
      Netherlands
    orcid: 'https://orcid.org/0000-0002-4228-6735'
  - given-names: Jérémie
    family-names: Decouchant
    affiliation: >-
      Delft University of Technology, Delft,
      Netherlands
    orcid: 'https://orcid.org/0000-0001-9143-3984'
identifiers:
  - type: doi
    value: 10.1145/3528535.3565238
    description: Conference Paper
abstract: >-
  Federated Learning (FL) is a popular deep learning
  approach that prevents centralizing large amounts
  of data, and instead relies on clients that update
  a global model using their local datasets.
  Classical FL algorithms use a central federator
  that, for each training round, waits for all
  clients to send their model updates before
  aggregating them. In practical deployments, clients
  might have different computing powers and network
  capabilities, which might lead slow clients to
  become performance bottlenecks. Previous works have
  suggested to use a deadline for each learning round
  so that the federator ignores the late updates of
  slow clients, or so that clients send partially
  trained models before the deadline. To speed up the
  training process, we instead propose Aergia, a
  novel approach where slow clients (i) freeze the
  part of their model that is the most
  computationally intensive to train; (ii) train the
  unfrozen part of their model; and (iii) offload the
  training of the frozen part of their model to a
  faster client that trains it using its own dataset.
  The offloading decisions are orchestrated by the
  federator based on the training speed that clients
  report and on the similarities between their
  datasets, which are privately evaluated thanks to a
  trusted execution environment. We show through
  extensive experiments that Aergia maintains high
  accuracy and significantly reduces the training
  time under heterogeneous settings by up to 27% and
  53% compared to FedAvg and TiFL, respectively.
keywords:
  - stragglers
  - federated learning
  - task offloading
preferred-citation:
  type: conference-paper
  title: >-
    Aergia: Leveraging Heterogeneity in Federated
    Learning Systems
  authors:
    - given-names: Bart
      family-names: Cox
      affiliation: >-
        Delft University of Technology, Delft,
        Netherlands
      orcid: 'https://orcid.org/0000-0001-5209-6161'
      email: [email protected]
    - given-names: Lydia
      name-particle: 'Y'
      family-names: Chen
      affiliation: >-
        Delft University of Technology, Delft,
        Netherlands
      orcid: 'https://orcid.org/0000-0002-4228-6735'
    - given-names: Jérémie
      family-names: Decouchant
      affiliation: >-
        Delft University of Technology, Delft,
        Netherlands
      orcid: 'https://orcid.org/0000-0001-9143-3984'
  identifiers:
    - type: doi
      value: 10.1145/3528535.3565238
      description: Conference Paper
  abstract: >-
    Federated Learning (FL) is a popular deep learning
    approach that prevents centralizing large amounts
    of data, and instead relies on clients that update
    a global model using their local datasets.
    Classical FL algorithms use a central federator
    that, for each training round, waits for all
    clients to send their model updates before
    aggregating them. In practical deployments, clients
    might have different computing powers and network
    capabilities, which might lead slow clients to
    become performance bottlenecks. Previous works have
    suggested to use a deadline for each learning round
    so that the federator ignores the late updates of
    slow clients, or so that clients send partially
    trained models before the deadline. To speed up the
    training process, we instead propose Aergia, a
    novel approach where slow clients (i) freeze the
    part of their model that is the most
    computationally intensive to train; (ii) train the
    unfrozen part of their model; and (iii) offload the
    training of the frozen part of their model to a
    faster client that trains it using its own dataset.
    The offloading decisions are orchestrated by the
    federator based on the training speed that clients
    report and on the similarities between their
    datasets, which are privately evaluated thanks to a
    trusted execution environment. We show through
    extensive experiments that Aergia maintains high
    accuracy and significantly reduces the training
    time under heterogeneous settings by up to 27% and
    53% compared to FedAvg and TiFL, respectively.
  keywords:
    - stragglers
    - federated learning
    - task offloading