\input texinfo   @c -*-texinfo-*-
@c %**start of header
@setfilename fibers.info
@settitle Fibers
@c %**end of header

@set VERSION 0.2.0
@set UPDATED 11 September 2016

@copying
This manual is for Fibers (version @value{VERSION}, updated
@value{UPDATED}).

Copyright 2016 Andy Wingo

@quotation
@c For more information, see COPYING.docs in the fibers
@c distribution.
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.3 or
any later version published by the Free Software Foundation; with no
Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts.
@end quotation
@end copying

@dircategory The Algorithmic Language Scheme
@direntry
* Fibers: (fibers.info).  Lightweight concurrency for Guile.
@end direntry

@titlepage
@title Fibers
@subtitle version @value{VERSION}, updated @value{UPDATED}
@author Andy Wingo
@page
@vskip 0pt plus 1filll
@insertcopying
@end titlepage

@ifnottex
@node Top
@top Fibers


@insertcopying
@menu
* Introduction::          What's this all about?
* Reference::             API reference.
* Pitfalls::              Stay on the happy path.
* Status::                Fibers is a work in progress.
@end menu

@end ifnottex

@iftex
@shortcontents
@end iftex

@node Introduction
@chapter Introduction

Fibers is a facility for lightweight concurrency in Guile.

@menu
* Context::               How do other systems handle concurrency?
* Design::                Fibers' point in the design space.
@end menu

@node Context
@section A brief history of language facilities for concurrency

Modern machines have the raw capability to serve hundreds of thousands
of simultaneous long-lived connections, but it's often hard to manage
this at the software level.  Fibers tries to solve this problem in a
nice way.  Before discussing the approach taken in Fibers, it's worth
spending some time on history to see how we got here.

One of the most dominant patterns for concurrency these days is
``callbacks'', notably in the Twisted library for Python and the
Node.js run-time for JavaScript.  The basic observation in the
callback approach to concurrency is that the efficient way to handle
tens of thousands of connections at once is with low-level operating
system facilities like @code{poll} or @code{epoll}.  You add all of
the file descriptors that you are interested in to a ``poll set'' and
then ask the operating system which ones are readable or writable, as
appropriate.  Once the operating system says ``yes, file descriptor
7145 is readable'', you can do something with that socket; but what?
With callbacks, the answer is ``call a user-supplied closure'': a
callback, representing the continuation of the computation on that
socket.

Building a network service with a callback-oriented concurrency system
means breaking the program into little chunks that can run without
blocking.  Wherever a program could block, instead of just continuing
the program, you register a callback.  Unfortunately this requirement
permeates the program, from top to bottom: you always pay the mental
cost of inverting your program's control flow by turning it into
callbacks, and you always incur the run-time cost of closure creation,
even when the particular I/O could proceed without blocking.  It's a
somewhat galling requirement, given that this contortion, though
required of the programmer, could be done by a compiler.  We Schemers
demand better abstractions than manual, obligatory
continuation-passing-style conversion.

Callback-based systems also encourage unstructured concurrency, as in
practice callbacks are not the only path for data and control flow in
a system: usually there is mutable global state as well.  Without
strong patterns and conventions, callback-based systems often exhibit
bugs caused by concurrent reads and writes to global state.

Some of the problems of callbacks can be mitigated by using
``promises'' or other library-level abstractions; if you're a Haskell
person, you can think of this as lifting all possibly-blocking
operations into a monad.  If you're not a Haskeller, that's cool,
neither am I!  But if your typey spidey senses are tingling, it's for
good reason: with promises, your whole program has to be transformed
to return promises-for-values instead of values anywhere it would
block.

An obvious solution to the control-flow problem of callbacks is to use
threads.  In the most generic sense, a thread is a language feature
which denotes an independent computation.  Threads are created by
other threads, but fork off and run independently instead of returning
to their caller.  In a system with threads, there is implicitly a
scheduler somewhere that multiplexes the threads so that when one
suspends, another can run.

In practice, the concept of threads is often conflated with a
particular implementation, @dfn{kernel threads}.  Kernel threads are
very low-level abstractions that are provided by the operating system.
The nice thing about kernel threads is that they can use any CPU that
the kernel knows about.  That's an important factor in today's
computing landscape, where Moore's law seems to have ended and we are
getting more cores instead of more gigahertz.

However, as a building block for a highly concurrent system, kernel
threads have a few important problems.

One is that kernel threads simply aren't designed to be allocated in
huge numbers, and instead are more optimized to run in a
one-per-CPU-core fashion.  Their memory usage is relatively high for
what should be a lightweight abstraction: some 10 kilobytes at least
and often some megabytes, in the form of the thread's stack.  There
are ongoing efforts to reduce this for some systems but we cannot
expect wide deployment in the next 5 years, if ever.  Even in the best
case, a hundred thousand kernel threads will take at least a gigabyte
of memory, which seems a bit excessive for book-keeping overhead.

Kernel threads can be a bit irritating to schedule, too: when one
thread suspends, it's for a reason, and it can be that user-space
knows a good next thread that should run.  However because kernel
threads are scheduled in the kernel, it's rarely possible for the
kernel to make informed decisions.  There are some ``user-mode
scheduling'' facilities that are in development for some systems, but
again only for some systems.

The other significant problem is that building non-crashy systems on
top of kernel threads is hard to do, not to mention ``correct''
systems.  It's an embarrassing situation.  For one thing, the
low-level synchronization primitives that are typically provided with
kernel threads, mutexes and condition variables, are not composable.
Also, as with callback-oriented concurrency, one thread can silently
corrupt another via unstructured mutation of shared state.  It's worse
with kernel threads, though: a kernel thread can be interrupted at any
point, not just at I/O.  And though callback-oriented systems can
theoretically operate on multiple CPUs at once, in practice they
don't.  This restriction is sometimes touted as a benefit by
proponents of callback-oriented systems, because in such a system, the
callback invocations have a single, sequential order.  With multiple
CPUs, this is not the case, as multiple threads can run at the same
time, in parallel.

Kernel threads can work.  The Java virtual machine does at least
manage to prevent low-level memory corruption and to do so with high
performance, but still, even Java-based systems that aim for maximum
concurrency avoid using a thread per connection because threads use
too much memory.

In this context it's no wonder that there's a third strain of
concurrency: shared-nothing message-passing systems like Erlang.
Erlang isolates each thread (called a @dfn{process} in the Erlang
world), giving each its own heap and ``mailbox''.  Processes can
spawn other processes, and the concurrency primitive is
message-passing.  A process that tries to receive a message from an empty
mailbox will ``block'', from its perspective.  In the meantime the
system will run other processes.  Message sends never block, oddly;
instead, sending to a process with many messages pending makes it more
likely that Erlang will pre-empt the sending process.  It's a strange
tradeoff, but it makes sense when you realize that Erlang was designed
for network transparency: the same message send/receive interface can
be used to send messages to processes on remote machines as well.

No network is truly transparent, however.  At the most basic level, a
network send is necessarily much slower than a local send: whereas a
message sent to a remote process has to be written
out byte-by-byte over the network, there is no need to copy immutable
data within the same address space.  The complexity of a remote
message send is O(n) in the size of the message, whereas a local
immutable send is O(1).  This suggests that hiding the different
complexities behind one operator is the wrong thing to do.  And
indeed, given byte read and write operators over sockets, it's
possible to implement remote message send and receive as a process
that serializes and parses messages between a channel and a byte sink
or source.  In this way we get cheap local channels, and network shims
are under the programmer's control.  This is the approach that the Go
language takes, and is the one we use in Guile.

Channels with message send/receive are a great start, but they are
not sufficient for all purposes.  At the minimum, a program should be able
to wait for a message send or receive over multiple channels at once.
Go supports waiting on multiple channel operations via its @code{select}
primitive, but there is a more general solution invented by Concurrent
ML, which is an abstraction over messages.  In Concurrent ML, you can
get an event directly from a channel; actually synchronizing on the
reception of the event is a separate operation.  Together with a
primitive to synchronize on one of a number of events, Concurrent ML
allows programs to build @code{select} in user-space.  It looks like
the right direction for Guile to go too.

@node Design
@section Fibers design

In Fibers, the unit of computation is the @dfn{fiber}, a lightweight
thread managed by Guile.  A fiber communicates with the world via
normal Guile ports: @code{get-bytevector}, @code{put-string}, and all
that.  Between themselves, fibers send and receive Scheme values over
@dfn{channels}, lightweight queues.

Whenever a fiber tries to read but no data is available, or tries to
write but no data can be written, Guile will suspend the fiber and
arrange for it to be resumed when the port or channel operation can
proceed.  In the meantime, Guile will run other fibers.  When no fiber
is runnable, Guile will use efficient system facilities to sleep until
input or output can proceed.

When a fiber would block, it suspends to the scheduler from the
current thread.  The scheduler will arrange to re-start the fiber when
the port or channel becomes readable or writable, as appropriate.  For
ports, the scheduler adds the file descriptor associated with the port
to an @code{epoll} set.  In either case, the scheduler remembers which
fibers are waiting and for what, so that the user can inspect the
state of their system.

If no scheduler has been installed in the current thread, the fiber
will... well, we don't know yet!  Either it blocks its thread, or it
aborts.  We don't know.

On the Scheme level, a fiber is a delimited continuation.  When a
scheduler runs a fiber, it does so within a prompt; when the fiber
suspends, it suspends to the prompt.  The scheduler saves the
resulting continuation as part of the fiber's state.  In this way the
per-fiber computational state overhead is just the size of the pending
stack frames of the fiber, which can be just a handful of words.
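
To illustrate the mechanism, here is a toy sketch in plain Guile using
@code{call-with-prompt} and @code{abort-to-prompt}; Fibers' actual
scheduler is of course more involved, and the @code{toy-} names here
are purely for illustration.

@example
;; Toy illustration only, not Fibers' real code.
(define tag (make-prompt-tag "fiber"))

(define (run-toy-fiber thunk)
  ;; Run THUNK inside a prompt.  If the thunk suspends, the handler
  ;; receives the delimited continuation K, which a scheduler would
  ;; save as the fiber's state.
  (call-with-prompt tag
    thunk
    (lambda (k) k)))

(define (toy-suspend)
  ;; Abort to the prompt, handing our continuation to the handler.
  (abort-to-prompt tag))

(define k
  (run-toy-fiber (lambda () (toy-suspend) (display "resumed!\n"))))
(k)  ; later, "the scheduler" resumes the fiber
@print{} resumed!
@end example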

Currently fibers are pinned to the kernel thread in which they are
created.  We should probably implement some kind of work-stealing
behavior so that if you choose to devote multiple CPU cores to
servicing fibers, they can share the workload.

Ports are how fibers communicate with the world; channels are how
fibers communicate with each other.  When a channel is created, it is
created with a maximum number of enqueued messages (by default one).
Sending to a channel with space in the queue will not suspend the
calling fiber.  A fiber that tries to write to a full queue or read
from a channel with an empty queue will add itself to a list of
``waiters'' for the channel and suspend itself to the scheduler.  A
read from a full queue or a write to an empty queue will unblock
waiters on a channel.

Unlike Erlang channels, channels in Fibers are purely local and do
not attempt to provide the illusion of network transparency.  This
has the advantage that we are able to provide better backpressure
support than Erlang, blocking when a channel's queue is full instead
of letting the sender keep sending many messages.

On the other hand, currently fibers are not preemptively scheduled.  A
fiber will only suspend when it would block on channel receive or
send, or on read or write on a port.  This would be an OK point in the
design space if only one kernel thread could be running fibers at
once, as in Node.js.  However, given that this is not the case, Fibers
does not have many of the concurrency invariants that such systems
enjoy, so perhaps we should support preemption in the future.

To avoid starvation, a fiber can only run once within a ``turn''.
Each turn starts with a poll on file descriptors of interest and marks
the associated fibers as runnable.  If no fiber is runnable at the
start of the turn, the poll call will ask the kernel to wait until
some descriptor becomes ready.  Otherwise the poll call will ask the
kernel to return immediately.  There is an additional file descriptor
added to the poll set that is used to interrupt a blocking poll, for
example when a fiber becomes runnable due to a channel send from a
separate kernel thread while the scheduler is still polling.

To enable expressive cross-kernel-thread communications, channel sends
and receives are atomic and thread-safe.

To start scheduling fibers, user code will typically create a
scheduler, instate it on the thread, add some fibers, then run the
scheduler.  That call to run the scheduler will only return when
there are no more fibers waiting to be scheduled.
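
In practice, the @code{run-fibers} procedure (@pxref{Using Fibers})
wraps up this whole lifecycle.  For example:

@example
(use-modules (fibers))
;; Makes a fresh scheduler, runs the thunk in a fiber on it, runs the
;; scheduler until no fibers remain, then destroys the scheduler.
(run-fibers (lambda () (display "hello from a fiber\n")))
@print{} hello from a fiber
@end example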

@node Reference
@chapter API reference

Fibers is a library built on Guile, consisting of a public interface,
a channels library, and an internals interface.

@menu
* Using Fibers::        User-facing interface to fibers
* Channels::            Share memory by communicating.
* Internals::           Scheduler and fiber objects and operations.
@end menu

@node Using Fibers
@section Using Fibers

The public interface of fibers right now is quite minimal.  To use it,
import the @code{(fibers)} module:

@example
(use-modules (fibers))
@end example

To create a new fibers scheduler and run it in the current Guile
thread, use @code{run-fibers}.

@defun run-fibers [init-thunk=@code{#f}] [#:install-suspendable-ports?=@code{#t}] [#:scheduler=@code{(make-scheduler)}] [#:keep-scheduler?]
Run @var{init-thunk} within a fiber in a fresh scheduler, blocking
until the scheduler has no more runnable fibers.  Return the value(s)
returned by the call to @var{init-thunk}.

For example:
@example
(run-fibers (lambda () 1)) @result{} 1
(run-fibers
 (lambda ()
   (spawn-fiber (lambda () (display "hey!\n")))))
  @print{} hey!
  @result{} #<<fiber> ...>
@end example

Calling @code{run-fibers} will ensure that Guile's port implementation
allows fibers to suspend if a read or a write on a port would block.
@xref{Non-Blocking I/O,,,guile.info,Guile Reference Manual}, for more
details on suspendable ports.  If for some reason you want port reads
or writes to prevent other fibers from running, pass @code{#f} as the
@code{#:install-suspendable-ports?} keyword argument.

By default, @code{run-fibers} will create a fresh scheduler.  If you
happen to have a pre-existing scheduler (because you used the
internals interface to create one), you can pass it to
@code{run-fibers} using the @code{#:scheduler} keyword argument.

The scheduler will be destroyed when @code{run-fibers} finishes,
unless the scheduler was already ``current'' (@pxref{Internals}).  If
you need to keep the scheduler, either make sure it is current or
explicitly pass @code{#t} as the @code{#:keep-scheduler?} keyword
argument.
@end defun

@defun spawn-fiber thunk [#:scheduler=@code{(require-current-scheduler)}]
Spawn a new fiber that will run @var{thunk}.  Return the new fiber.
The new fiber will run concurrently with other fibers.

The fiber will be added to the current scheduler, which is usually
what you want.  It's also possible to spawn the fiber on a specific
scheduler, which is useful to ensure that the fiber runs on a
different kernel thread.  In that case, pass the @code{#:scheduler}
keyword argument.

Currently, fibers will only ever be run within the scheduler to which
they are first added, which effectively pins them to a single kernel
thread.  This limitation may be relaxed in the future.
@end defun

@defun current-fiber
Return the current fiber, or @code{#f} if not called within the
dynamic extent of a thunk passed to @code{spawn-fiber}.
@end defun

@defun sleep seconds
Wake up the current fiber after @var{seconds} of wall-clock time have
elapsed.  This definition will replace the binding for @code{sleep} in
the importing module, effectively overriding Guile's ``core''
definition.
@end defun
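
For example, here is a sketch of two fibers interleaving via
@code{sleep}; the exact timing is up to the scheduler, but the prints
should come out in this order:

@example
(run-fibers
 (lambda ()
   (spawn-fiber (lambda () (sleep 1) (display "one\n")))
   (display "zero\n")
   (sleep 2)))
@print{} zero
@print{} one
@end example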

@node Channels
@section Channels

Channels are the way to communicate between fibers.  To use them, load
the channels module:

@example
(use-modules (fibers channels))
@end example

@defun make-channel [#:queue-size=@code{1}]
Create and return a fresh channel.  By default the channel will have
space for one buffered message; pass a larger value as
@code{#:queue-size} to increase the buffer size.
@end defun
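
For example, to make a channel that can buffer four messages before
senders block:

@example
(define ch (make-channel #:queue-size 4))
@end example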

@defun channel? obj
Return @code{#t} if @var{obj} is a channel, and @code{#f} otherwise.
@end defun

@defun put-message channel message
Send @var{message} on @var{channel}, and return zero values.  If
@var{channel} is full (i.e., it already has @var{queue-size} messages
queued), block until some other fiber calls @code{get-message} on the
channel.
@end defun

@defun get-message channel
Receive a message from @var{channel} and return it.  If
@var{channel} is empty (i.e., there are no messages in its queue),
block until some other fiber calls @code{put-message} on the channel.
@end defun

Channels are thread-safe; you can use them to send and receive values
between fibers on different kernel threads.
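
Putting it together, here is a minimal sketch of two fibers
communicating over a channel:

@example
(use-modules (fibers) (fibers channels))

(run-fibers
 (lambda ()
   (let ((ch (make-channel)))
     ;; The spawned fiber sends; the main fiber blocks in
     ;; get-message until the message arrives.
     (spawn-fiber (lambda () (put-message ch 'ping)))
     (display (get-message ch))
     (newline))))
@print{} ping
@end example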

@node Internals
@section Internals

These internal interfaces are a bit dangerous, in the sense that if
they are used wrongly, they can corrupt the state of your program.
For example, the scheduler has some specific mechanisms to ensure
thread-safety, and not all of the procedures in this module can be
invoked on a scheduler from any thread.  We will document them at some
point, but for now this section is a stub.

@example
(use-modules (fibers internal))
@end example

@defun make-scheduler
@end defun

@defvar current-scheduler
@end defvar

@defun run-scheduler sched [#:join-fiber]
@end defun

@defun destroy-scheduler sched
@end defun

@defun add-fd-events! sched fd events fiber
@end defun

@defun add-sleeper! sched fiber seconds
@end defun

@defun create-fiber sched thunk
@end defun

@defvar current-fiber
@end defvar

@defun kill-fiber fiber
@end defun

@defspec fiber-scheduler
@end defspec

@defspec fiber-state
@end defspec

@defun suspend-current-fiber [after-suspend]
@end defun

@defun resume-fiber fiber thunk
@end defun


@node Pitfalls
@chapter Pitfalls

Running Guile code within a fiber mostly ``just works''.  There are a
couple of pitfalls to be aware of though.

@menu
* Blocking::            Avoid calling blocking operations.
* Mutation::            Avoid unstructured mutation of shared data.
@end menu

@node Blocking
@section Blocking

When you run a program under fibers, the fibers library arranges to
make it so that port operations can suspend the fiber instead of
blocking.  This generally works, with some caveats.

@enumerate
@item
The port type has to either never block, or support non-blocking I/O.
Currently the only kind of port in Guile that supports non-blocking
I/O is the file port (including sockets), so for file ports this
condition is fulfilled.  Notably, however, non-blocking I/O is not yet
supported for custom binary I/O ports.  If you need this, get it fixed
in Guile :)
@item
You have to make sure that any file port you operate on is opened in
nonblocking mode.  @xref{Non-Blocking I/O,,,guile.info,Guile Reference
Manual}, for the obscure @code{fcntl} incantation to use on your
ports; it is also sketched just after this list.
@item
You have to avoid any operation on ports that is not yet supported in
Guile for non-blocking I/O.  Since non-blocking I/O is new in Guile,
only some I/O operations are expressed in terms of the primitive
operations.  Notably, Scheme @code{read}, @code{display}, and
@code{write} are still implemented in C, which prevents any fiber that
uses them from suspending and resuming correctly; such a call will
simply block instead of suspending.  If you find a situation like
this, talk to the Guile developers to get it fixed :)
@item
You can enable non-blocking I/O for local files, but Linux at least
will always say that the local file is ready for I/O even if it has to
page in data from a spinning-metal device.  This is a well-known
limitation for which the solution is apparently to do local I/O via a
thread pool.  We could implement this in Fibers, or in Guile... not
sure what the right thing is!
@end enumerate
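
For reference, the incantation mentioned in item 2 looks something
like the following; @code{fcntl}, @code{F_GETFL}, @code{F_SETFL}, and
@code{O_NONBLOCK} are all part of Guile's POSIX interface, while the
helper name @code{make-port-non-blocking!} is just for this sketch:

@example
;; Put PORT into non-blocking mode by setting O_NONBLOCK on its
;; file descriptor, preserving the other flags.
(define (make-port-non-blocking! port)
  (let ((flags (fcntl port F_GETFL)))
    (fcntl port F_SETFL (logior O_NONBLOCK flags))))
@end example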

You also have to avoid any other library or system calls that would
block.  One common source of blocking is @code{getaddrinfo} and
related network address resolution library calls.  Again, apparently
the solution is thread pools?  Probably in Fibers we should implement
a thread-pooled address resolver.

The @code{(fibers)} module exports a @code{sleep} replacement.  Code
that sleeps should import the @code{(fibers)} module to be sure that
it isn't using Guile's @code{sleep} function.

Finally, a fiber itself has to avoid blocking other fibers; it must
reach a ``yield point'' from time to time.  Yield points include reads
and writes on a port or a channel that would block, and @code{sleep}.
Other than that, nothing will pre-empt a fiber, at least not
currently.  If you need to yield to the scheduler, then at least do a
@code{(sleep 0)} or something.
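
For instance, a long-running loop might yield explicitly every so
often; @code{do-some-work} here is a hypothetical stand-in for your
actual computation:

@example
(let loop ((i 0))
  (when (< i 1000000)
    (do-some-work i)            ; hypothetical work procedure
    (when (zero? (modulo i 1000))
      (sleep 0))                ; yield to the scheduler now and then
    (loop (+ i 1))))
@end example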

@node Mutation
@section Mutation

Although code run within fibers looks like normal straight-up Scheme,
it runs concurrently with other fibers.  This means that if you mutate
shared state and other fibers mutate the same shared state using
normal Scheme procedures like @code{set!}, @code{vector-set!}, or the
like, then probably you're going to have a bad time.  Although
currently fibers aren't scheduled pre-emptively, multi-step
transactions may be suspended if your code reaches a yield point in
the middle of performing the transaction.

Likewise if you have multiple kernel threads running fiber schedulers,
then it could indeed be that you have multiple fibers running in
parallel.

The best way around this problem is to avoid unstructured mutation,
and to instead share data by communicating over channels.  Using
channels to communicate data produces much more robust, safe systems.
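
For example, instead of a counter in a global variable mutated with
@code{set!}, one might give the counter to a dedicated fiber and talk
to it over a channel.  This is only a sketch, using the channel API
from the previous chapter together with @code{(ice-9 match)}:

@example
(use-modules (fibers) (fibers channels) (ice-9 match))

(define (spawn-counter)
  ;; Return a channel for talking to a fiber that owns the counter.
  ;; The fiber loops forever, serving requests one at a time.
  (let ((ch (make-channel)))
    (spawn-fiber
     (lambda ()
       (let loop ((n 0))
         (match (get-message ch)
           (('add! k) (loop (+ n k)))
           (('get reply) (put-message reply n) (loop n))))))
    ch))

;; Inside run-fibers:
(define counter (spawn-counter))
(put-message counter '(add! 1))
(let ((reply (make-channel)))
  (put-message counter `(get ,reply))
  (get-message reply)) @result{} 1
@end example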

If you need to mutate global data, do so within a mutex.

@node Status
@chapter Project status

It's early days.  At the time of this writing, no one uses fibers in
production that we are aware of.  Should you be the first?  Well
maybe, if you feel like you understand the implementation, are
prepared to debug, and have some time on your hands.  Otherwise
probably it's better to wait.

See the @code{TODO.md} file in the repository for a current list of
to-do items.

@c @node Concept Index
@c @unnumbered Concept Index
@c @printindex cp
@c @node Function Index
@c @unnumbered Function Index
@c @printindex fn

@bye