allanswers.org - comp.programming.threads FAQ [last mod 97/5/24]


URL: http://www.serpentine.com/~bos/threads-faq/
Posting-Frequency: monthly
Archive-name: threads-faq/part1
Last-modified: Sat May 24 21:52:47 1997

0.    TABLE OF CONTENTS
                                       
1.    Answers to frequently asked questions for comp.programming.threads:
      Part 1 of 1
2.    Introduction
2.1.  Reader contributions and comments
2.2.  How to read this FAQ
2.3.  Acknowledgments and caveats
3.    What are threads?
3.1.  Why are threads interesting?
3.2.  A little history
4.    What are the main families of threads?
4.1.  POSIX-style threads
4.2.  Microsoft-style threads
4.3.  Others
5.    Some terminology
5.1.  (DCE, POSIX, UI) Async safety
5.2.  Asynchronous and blocking system calls
5.3.  Context switch
5.4.  Critical section
5.5.  Lightweight process
5.6.  MT safety
5.7.  Protection boundary
5.8.  Scheduling
6.    What are the different kinds of threads?
6.1.  Architectural differences
6.2.  Performance differences
6.3.  Potential problems with functionality
7.    Where can I find books on threads?
7.1.  POSIX-style threads
7.2.  Microsoft-style threads
7.3.  Books on implementations
7.4.  The POSIX threads standard
8.    Where can I obtain training on using threads?
9.    (Unix) Are there any freely-available threads packages?
10.   (DCE, POSIX, UI) Why does my threaded program not handle signals
      sensibly?
11.   (DCE?, POSIX) Why does everyone tell me to avoid asynchronous
      cancellation?
12.   Why are reentrant library and system call interfaces good?
12.1. (DCE, POSIX, UI) When should I use thread-safe "_r" library
      calls?
13.   (POSIX) How can I perform a join on any thread?
14.   (DCE, UI, POSIX) After I create a certain number of threads, my
      program crashes
15.   Where can I find POSIX thread benchmarks?
16.   Does any DBMS vendor provide a thread-safe interface?
17.   Why is my threaded program running into performance problems?
18.   What tools will help me to program with threads?
19.   What operating systems provide threads?
20.   What about other threads-related software?
21.   Where can I find other information on threads?
21.1. Articles appearing in periodicals
22.   Notice of copyright and permissions
   
2.    Introduction
                                       
   This posting consists of answers to many of the questions most
   frequently asked and summaries of the topics most frequently covered
   on comp.programming.threads, the Usenet newsgroup for discussion of
   issues in multithreaded programming. The purpose of this posting is to
   circulate existing information, and to avoid rehashing old topics of
   discussion and questions. Please read all parts of this document
   before posting to this newsgroup.
   
   The FAQ is posted monthly to comp.programming.threads, in multiple
   parts. It is also available on the World-Wide Web, at the URL given in
   the header of this posting. You may prefer to
   browse the FAQ on the Web rather than on Usenet, as it contains many
   useful hyperlinks (and tables are readable, which is unfortunately not
   the case for the text version).
   
2.1.  Reader contributions and comments

   Your contributions, comments, and corrections are welcomed; mail sent
   to me will be dealt with as quickly as I can
   manage. Generally, performing a reply or followup to this article from
   within your newsreader should do the Right Thing.
   
   While I am more than happy to include submissions of material for the
   FAQ if they seem appropriate, it would make my life a lot easier if
   such text were proof-read in advance, and kept concise. I don't have
   as much time as I would like to digest 15K text files and summarise
   them in three paragraphs for inclusion here. If you are interested in
   contributing material, please see the to-do list at the end of part 3
   of the FAQ.
   
2.2.  How to read this FAQ

   Some headers in this FAQ are preceded by words in parentheses, such as
   "(POSIX)". This indicates that the sections in question are specific
   to a particular threads family, or to the implementation provided by a
   specific vendor.
   
   Wherever it may not otherwise be obvious that a particular section
   refers only to some families or implementations, you will find one or
   more of the following key words to help you.

   Key     Implementation
   DCE     OSF/DCE threads (POSIX draft 4)
   OS/2    IBM OS/2 threads
   POSIX   POSIX 1003.1c-1995 standard threads
   UI      Unix International threads
   Unix    Of general relevance to Unix users
   WIN32   Microsoft Win32 API threads
   
2.3.  Acknowledgments and caveats

   Although this FAQ has been the result of a co-operative effort, any
   blame for inaccuracies and/or errors lies entirely with my work. I
   would like to thank the following people for their part in
   contributing to this FAQ:
   Dave Butenhof 
   Bil Lewis 
   
3.    What are threads?
                                       
   A thread is an encapsulation of the flow of control in a program. Most
   people are used to writing single-threaded programs - that is,
   programs that only execute one path through their code "at a time".
   Multithreaded programs may have several threads running through
   different code paths "simultaneously".
   
   Why are some phrases above in quotes? In a typical process in which
   multiple threads exist, zero or more threads may actually be running
   at any one time. This depends on the number of CPUs in the computer on
   which the process is running, and also on how the threads system is
   implemented. A machine with _n_ CPUs can, intuitively enough, run no
   more than _n_ threads in parallel, but it may give the appearance of
   running many more than _n_ "simultaneously", by sharing the CPUs among
   threads.
   
3.1.  Why are threads interesting?

   A context switch between two threads in a single process is
   _considerably_ cheaper than a context switch between two processes. In
   addition, the fact that all data except for stack and registers are
   shared between threads makes them a natural vehicle for expressing
   tasks that can be broken down into subtasks that can be run
   cooperatively.
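
   As an illustration of subtasks sharing data, here is a minimal
   sketch using POSIX threads. The names parallel_sum, sum_slice and
   struct slice are hypothetical, chosen for this example only. Every
   worker reads the same global array; because all threads share one
   address space, the creating thread can also hand each worker a
   pointer to its own bookkeeping record.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define NWORKERS 4
#define NITEMS   1000

/* Global data: one copy, visible to every thread in the process. */
static int items[NITEMS];

/* Per-worker bookkeeping (hypothetical layout, for illustration only). */
struct slice {
    size_t begin, end;   /* half-open range [begin, end) of items[] */
    long   sum;          /* result written by the worker */
};

/* Thread start routine: sum one slice of the shared array. */
static void *sum_slice(void *arg)
{
    struct slice *s = arg;
    s->sum = 0;
    for (size_t i = s->begin; i < s->end; i++)
        s->sum += items[i];
    return NULL;
}

/* Fill items[] with 1..NITEMS, sum it with NWORKERS threads, return the total. */
long parallel_sum(void)
{
    pthread_t tid[NWORKERS];
    struct slice part[NWORKERS];
    long total = 0;

    for (size_t i = 0; i < NITEMS; i++)
        items[i] = (int)i + 1;

    for (size_t w = 0; w < NWORKERS; w++) {
        part[w].begin = w * NITEMS / NWORKERS;
        part[w].end   = (w + 1) * NITEMS / NWORKERS;
        int rc = pthread_create(&tid[w], NULL, sum_slice, &part[w]);
        assert(rc == 0);   /* a real program would handle the error */
    }
    for (size_t w = 0; w < NWORKERS; w++) {
        int rc = pthread_join(tid[w], NULL);   /* after join, part[w].sum is safe to read */
        assert(rc == 0);
        total += part[w].sum;
    }
    return total;
}
```

   No locking is needed here only because each worker writes to its
   own record and nobody modifies items[] while the workers run.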
   
3.2.  A little history

   If you are interested in reading about the history of threads, see the
   relevant section of the comp.os.research FAQ.
   
4.    What are the main families of threads?
                                       
   There are two main families of threads:
     * POSIX-style threads, which generally run on Unix systems.
     * Microsoft-style threads, which generally run on PCs.
       
   These families can be further subdivided.
   
4.1.  POSIX-style threads

   This family consists of three subgroups:
     * "Real" POSIX threads, based on the IEEE POSIX 1003.1c-1995 (also
       known as the ISO/IEC 9945-1:1996) standard, part of the ANSI/IEEE
       1003.1, 1996 edition, standard. POSIX implementations are, not
       surprisingly, the emerging standard on Unix systems.
          + POSIX threads are usually referred to as Pthreads.
          + You will often see POSIX threads referred to as POSIX.1c
            threads, since 1003.1c is the section of the POSIX standard
            that deals with threads.
          + You may also see references to draft 10 of POSIX.1c, which
            became the standard.
     * DCE threads are based on draft 4 (an early draft) of the POSIX
       threads standard (which was originally named 1003.4a, and became
       1003.1c upon standardisation). You may find these on some Unix
       implementations.
     * Unix International (UI) threads, also known as Solaris threads,
       are based on the Unix International threads standard (a close
       relative of the POSIX standard). The only major Unix variants that
       support UI threads are Solaris 2, from Sun, and UnixWare 2, from
       SCO.
       
   Both DCE and UI threads are fairly compatible with the POSIX threads
   standard, although converting from either to "real" POSIX threads will
   require a moderate amount of work.
   
   Those few tardy Unix vendors who do not yet ship POSIX threads
   implementations are expected to do so "real soon now". If you are
   developing multithreaded applications from scratch on Unix, you would
   do well to use POSIX threads.
   
4.2.  Microsoft-style threads

   This family consists of two subgroups, both originally developed by
   Microsoft.
     * WIN32 threads are the standard threads on Microsoft Windows 95 and
       Windows NT.
     * OS/2 threads are the standard threads on OS/2, from IBM.
       
   Although both of these were originally implemented by Microsoft, they
   have diverged somewhat over the years. Moving from one to the other
   will require a moderate amount of work.
   
4.3.  Others

   Mach and its derivatives (such as Digital UNIX) provide a threads
   package called C threads. This is not very widely used.
   
5.    Some terminology
                                       
   The terms here refer to each other in a myriad of ways, so the best
   way to navigate through this section is to read it, and then read it
   again. Don't be afraid to skip forwards or backwards as the need
   appears.
   
5.1.  (DCE, POSIX, UI) Async safety

   Some library routines can be safely called from within signal
   handlers; these are referred to as async-safe. A thread that is
   executing some async-safe code will not deadlock if it is interrupted
   by a signal. If you want to make some of your own code async-safe, you
   should block signals before you obtain any locks.
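
   The ordering described above (block signals first, then take the
   lock) might be sketched as follows. The routine table_add and its
   data are hypothetical. Note that this only illustrates the ordering:
   POSIX does not list pthread_mutex_lock() among the functions
   guaranteed safe to call from a signal handler, so treat this as a
   sketch of the advice rather than a recipe that is portable
   everywhere.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <signal.h>

static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;
static int table_entries;

/* Hypothetical routine that a signal handler running in this thread
 * must not interrupt while table_lock is held. Returns the new count. */
int table_add(void)
{
    sigset_t all, saved;
    int rc, n;

    sigfillset(&all);
    rc = pthread_sigmask(SIG_BLOCK, &all, &saved);    /* 1: block signals */
    assert(rc == 0);

    pthread_mutex_lock(&table_lock);                  /* 2: only now take the lock */
    n = ++table_entries;
    pthread_mutex_unlock(&table_lock);

    rc = pthread_sigmask(SIG_SETMASK, &saved, NULL);  /* 3: restore the old mask */
    assert(rc == 0);
    return n;
}
```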
   
5.2.  Asynchronous and blocking system calls

   Most system calls, whether on Unix or other platforms, block (or
   "suspend") the calling thread until they complete, and continue its
   execution immediately following the call. Some systems also provide
   asynchronous (or _non-blocking_) forms of these calls; the kernel
   notifies the caller through some kind of out-of-band method when such
   a system call has completed.
   
   Asynchronous system calls are generally much harder for the programmer
   to deal with than blocking calls.
   
5.3.  Context switch

   A context switch is the action of switching a CPU between executing
   one thread and another (or transferring control between them). This
   may involve crossing one or more protection boundaries.
   
5.4.  Critical section

   A critical section of code is one in which data that may be accessed
   by other threads are inconsistent. At a higher level, a critical
   section can be viewed as a section of code in which a guarantee you
   make to other threads about the state of some data may not be true.
   
   If other threads can access these data during a critical section, your
   program may not behave correctly. This may cause it to crash, lock up,
   produce incorrect results, or do just about any other unpleasant thing
   you care to imagine.
   
   Other threads are generally denied access to inconsistent data during
   a critical section (usually through use of locks). If some of your
   critical sections are too long, however, it may result in your code
   performing poorly.
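
   A small sketch of a critical section guarded by a lock, assuming
   POSIX threads. The account variables and function names are
   hypothetical. The guarantee made to other threads is that the two
   balances always add up to START_SUM; it is briefly false inside
   transfer(), so that stretch of code is protected by a mutex.

```c
#include <assert.h>
#include <pthread.h>

#define NMOVERS   4
#define NROUNDS   10000
#define START_SUM 1000L

static pthread_mutex_t acct_lock = PTHREAD_MUTEX_INITIALIZER;
/* Guarantee to other threads: checking + savings == START_SUM. */
static long checking = START_SUM, savings = 0;

void transfer(long amount)
{
    pthread_mutex_lock(&acct_lock);
    checking -= amount;          /* the guarantee is false from here ... */
    savings  += amount;          /* ... until here */
    pthread_mutex_unlock(&acct_lock);
}

long balance_total(void)
{
    pthread_mutex_lock(&acct_lock);
    long t = checking + savings;
    pthread_mutex_unlock(&acct_lock);
    return t;
}

/* Each mover repeatedly transfers and checks the guarantee. */
static void *mover(void *arg)
{
    (void)arg;
    for (int i = 0; i < NROUNDS; i++) {
        transfer(1);
        assert(balance_total() == START_SUM);
    }
    return NULL;
}

/* Run NMOVERS concurrent movers and return the final total. */
long run_movers(void)
{
    pthread_t tid[NMOVERS];
    for (int i = 0; i < NMOVERS; i++) {
        int rc = pthread_create(&tid[i], NULL, mover, NULL);
        assert(rc == 0);
    }
    for (int i = 0; i < NMOVERS; i++)
        pthread_join(tid[i], NULL);
    return balance_total();
}
```

   The critical section is kept to two assignments; anything slow
   (I/O, long computations) is best done outside the lock.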
   
5.5.  Lightweight process

   A lightweight process (also known in some implementations,
   confusingly, as a _kernel thread_) is a schedulable entity that the
   kernel is aware of. On most systems, it consists of some execution
   context and some accounting information (i.e. much less than a
   full-blown process).
   
   Several operating systems allow lightweight processes to be "bound" to
   particular CPUs; this guarantees that those lightweight processes will
   only execute on the specified CPUs.
   
5.6.  MT safety

   If some piece of code is described as MT-safe, this indicates that it
   can be used safely within a multithreaded program, _and_ that it
   supports a "reasonable" level of concurrency. This isn't very
   interesting; what you, as a programmer using threads, need to worry
   about is code that is _not_ MT-safe. MT-unsafe code may use global
   and/or static data. If you need to call MT-unsafe code from within a
   multithreaded program, you may need to go to some effort to ensure
   that only one thread calls that code at any time.
   
   Wrapping a global lock around MT-unsafe code will generally let you
   call it from within a multithreaded program, but since this does not
   permit concurrent access to that code, it is not considered to make it
   MT-safe.
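
   The global-lock approach might look like the sketch below.
   legacy_label() is a made-up stand-in for an MT-unsafe routine that
   returns a pointer to static storage; label_locked() is an equally
   hypothetical wrapper. The result is copied out while the lock is
   still held, since the static buffer may be overwritten as soon as
   the lock is released.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical MT-unsafe routine: returns a pointer to a static buffer. */
static const char *legacy_label(int id)
{
    static char buf[32];
    snprintf(buf, sizeof buf, "item-%d", id);
    return buf;
}

/* One global lock serialises every call into the MT-unsafe code. */
static pthread_mutex_t legacy_lock = PTHREAD_MUTEX_INITIALIZER;

/* Wrapper: call the unsafe routine and copy its result into the
 * caller's buffer before releasing the lock. */
void label_locked(int id, char *out, size_t outlen)
{
    assert(out != NULL && outlen > 0);
    pthread_mutex_lock(&legacy_lock);
    strncpy(out, legacy_label(id), outlen - 1);
    out[outlen - 1] = '\0';
    pthread_mutex_unlock(&legacy_lock);
}
```

   This only helps if every caller in the program goes through the
   wrapper; a single direct call to the unsafe routine defeats it.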
   
   If you are trying to write MT-safe code using POSIX threads, you need
   to worry about a few issues such as dealing correctly with locks
   across calls to fork(2) (if you are wondering what to do, read about
   the pthread_atfork(3) library call).
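
   One common way of using pthread_atfork() is sketched below: take the
   lock just before the fork and release it in both parent and child
   afterwards, so that the child never inherits a lock held by a thread
   that does not exist in it. The lock and the function names are
   hypothetical, and fork_and_check() exists only to demonstrate the
   effect.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static pthread_mutex_t state_lock = PTHREAD_MUTEX_INITIALIZER;

/* Runs in the forking thread, before fork(): acquire the lock so that
 * no other thread can hold it at the moment of the fork. */
static void prepare(void)
{
    pthread_mutex_lock(&state_lock);
}

/* Runs after fork(), in the parent and in the child: release it again. */
static void release(void)
{
    int rc = pthread_mutex_unlock(&state_lock);
    assert(rc == 0);
}

/* Register the handlers once, early in the program. Returns 0 on success. */
int install_fork_handlers(void)
{
    return pthread_atfork(prepare, release, release);
}

/* Demonstration: fork, and have the child report whether it can take
 * the lock. Returns 0 if it could, non-zero otherwise. */
int fork_and_check(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        int ok = (pthread_mutex_trylock(&state_lock) == 0);
        _exit(ok ? 0 : 1);
    }
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```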
   
5.7.  Protection boundary

   A protection boundary protects one software subsystem on a computer
   from another, in such a way that only data that is explicitly shared
   across such a boundary is accessible to the entities on both sides. In
   general, all code within a protection boundary will have access to all
   data within that boundary.
   
   The canonical example of a protection boundary on most modern systems
   is that between processes and the kernel. The kernel is protected from
   processes, so that they can only examine or change its internal state
   in certain strictly-defined ways.
   
   Protection boundaries also exist between individual processes on most
   modern systems. This prevents one buggy or malicious process from
   wreaking havoc on others.
   
   Why are protection boundaries interesting? Because transferring
   control across them is expensive; it takes a lot of time and work.
   
5.8.  Scheduling

   Scheduling involves deciding what thread should execute next on a
   particular CPU. It is usually also taken as involving the context
   switch to that thread.
   
6.    What are the different kinds of threads?
                                       
   There are two main kinds of threads implementations:
     * User-space threads, and
     * Kernel-supported threads.
       
   There are several sets of differences between these different threads
   implementations.
   
6.1.  Architectural differences

   User-space threads live without any support from the kernel; they
   maintain all of their state in user space. Since the kernel does not
   know about them, they cannot be scheduled to run on multiple
   processors in parallel.
   
   Kernel-supported threads fall into two classes.
     * In a "pure" kernel-supported system, the kernel is responsible for
       scheduling all threads.
     * Systems in which the kernel cooperates with a user-level library
       to do scheduling are known as _two-level_, or _hybrid_, systems.
       Typically, the kernel schedules LWPs, and the user-level library
       schedules threads onto LWPs.
       
   Because of its performance problems (caused by the need to cross the
   user/kernel protection boundary twice for _every_ thread context
   switch), the former class has fewer members than does the latter (at
   least on Unix variants). Both classes allow threads to be run across
   multiple processors in parallel.
   
6.2.  Performance differences

   In terms of context switch time, user-space threads are the fastest,
   with two-level threads coming next (all other things being equal).
   However, if you have a multiprocessor, user-level threads can only be
   run on a single CPU, while both two-level and pure kernel-supported
   threads can be run on multiple CPUs simultaneously.
   
6.3.  Potential problems with functionality

   Because the kernel does not know about user threads, there is a danger
   that ordinary blocking system calls will block the entire process
   (this is _bad_) rather than just the calling thread. This means that
   user-space threads libraries need to jump through hoops in order to
   provide "blocking" system calls that don't block the entire process.
   
   This problem also exists with two-level kernel-supported threads,
   though it is not as acute as for user-level threads. What usually
   happens here is that system calls block entire LWPs. This means that
   if more threads exist than do LWPs and all of the LWPs are blocked in
   system calls, then other threads that could potentially make forward
   progress are prevented from doing so.
   
   The Solaris threads library provides a reasonable solution to this
   problem. If the kernel notices that all LWPs in a process are blocked,
   it sends a signal to the process. This signal is caught by the
   user-level threads library, which can create another LWP so that the
   process will continue to make progress.
   
7.    Where can I find books on threads?
                                       
   There are several books available on programming with threads, with
   more due out in the near future. Note also that the programmer's
   manuals that come with most systems that provide threads packages will
   have sections on using those threads packages.
   
7.1.  POSIX-style threads

   David R. Butenhof, _Programming with POSIX Threads_. Addison-Wesley,
          ISBN 0-201-63392-2.
          This book gives a comprehensive and well-structured overview of
          programming with POSIX threads, and is a good text for the
          working programmer to work from. Detailed examples and
          discussions abound.
          
   Steve Kleiman, Devang Shah and Bart Smaalders, _Programming With
          Threads_. SunSoft Press, ISBN 0-13-172389-8.
          
          This book goes into considerably greater depth than the other
          SunSoft Press offering (see below), and is also recommended for
          the working programmer who expects to deal with threads on a
          day-to-day basis. It includes many detailed examples.
          
   Bil Lewis and Daniel J. Berg, _Threads Primer_. SunSoft Press,
          ISBN 0-13-443698-9.
          
          
          This is a good introduction to programming with threads for
          programmers and managers. It concentrates on UI and POSIX
          threads, but also covers use of OS/2 and WIN32 threads.
          
   Charles J. Northrup, _Programming With Unix Threads_. John Wiley &
          Sons, ISBN 0-471-13751-0.
          
          This book details the UI threads interface, focusing mostly on
          the Unixware implementation. This is an introductory book.
          
7.2.  Microsoft-style threads

   Jim Beveridge, Robert Wiener, _Multithreading Applications in Win32_.
          Addison-Wesley, ISBN 0-201-44234-5.
          Seasoned Win32 programmers, neophytes, and programmers being
          dragged kicking and screaming from the Unix world are all
          likely to find this book a useful resource. It doubles as
          primer and reference on writing and debugging robust
          multithreaded code, and provides a thorough exposition on the
          subject.
          
   Len Dorfman, Marc J. Neuberger, _Effective Multithreading with OS/2_.
          Publisher and ISBN unknown.
          This book covers the OS/2 threads API and contains many
          examples, but doesn't have much by way of concepts.
          
   Thuan Q. Pham, Pankaj K. Garg, _Multithreaded Programming with
          Windows NT_. Prentice Hall, ISBN 0-131-20643-5.
          
          Not surprisingly, this book focuses on WIN32 threads, but it
          also mentions other libraries in passing. It also deals with
          some relatively advanced topics, and has a thorough
          bibliography.
          
7.3.  Books on implementations

   If you are interested in how modern operating systems support threads
   and multiprocessors, there are a few excellent books available that
   may be of interest to you.
   
   Curt Schimmel, _Unix Systems for Modern Architectures_.
          Addison-Wesley, ISBN 0-201-63338-8.
          
          This book gives a lucid account of the work needed to get Unix
          (or, for that matter, more or less anything else) working on a
          modern system that incorporates multiple processors, each with
          its own cache. While it has some overlap with the Vahalia book
          (see below), it has a smaller scope, and thus deals with shared
          topics in more detail.
          
   Uresh Vahalia, _Unix Internals: the New Frontiers_. Prentice Hall,
          ISBN 0-13-101908-2.
          
          This is the best kernel internals book currently available. It
          deals extensively with building multithreaded kernels,
          implementing LWPs, and scheduling on multiprocessors. Given a
          choice, I would buy _both_ this and the Schimmel book.
          
   Ben Catanzaro, _Multiprocessor System Architectures_. SunSoft Press,
          ISBN 0-13-089137-1.
          
          I don't know much about this book, but it deals with both the
          hardware and software (kernel and user) architectures used to
          put together modern multiprocessor systems.
          
7.4.  The POSIX threads standard

   To order ISO/IEC standard 9945-1:1996, which is also known as
   ANSI/IEEE POSIX 1003.1-1995 (and includes 1003.1c, the part that deals
   with threads), you can call +1-908-981-1393. The document reference
   number in the IEEE publications catalogue is SH 94352-NYF, and the
   price to US customers is $120 (shipping overseas costs extra).
   
   Unless you are implementing a POSIX threads package, you should not
   ever need to look at the POSIX threads standard. It is the last place
   you should look if you wish to learn about threads!
   
   Neither IEEE nor ISO makes standards available for free; please do not
   ask whether the POSIX threads standard is available on the Web. It
   isn't.
   
8.    Where can I obtain training on using threads?
                                       
   Organisation              Contact           Description
   Sun Microsystems          +1-408-276-3630   Classes at Sun and on-site classes
   Lambda Computer Science   +1-415-328-8952   Seminars and on-site classes
     (Bil Lewis)
   Phoenix Technologies      +1-908-286-2118
     (Chris Crenshaw)
   Marc Staveley
   
9.    (Unix) Are there any freely-available threads packages?
                                       
     * Xavier Leroy has written a POSIX threads implementation for
       Linux 2.x that uses pure kernel-supported threads. While the
       package is currently in alpha testing, it is allegedly very
       stable.
     * Michael T. Peterson has written a user-space POSIX and DCE
       threads package for Intel-based Linux systems; it is called
       PCthreads.
     * Christopher Provenzano has written a fairly portable
       implementation of draft 8 of the POSIX threads standard.
       _Note_: as far as I can see, development of this library has
       halted (at least temporarily), and it still contains many
       serious bugs.
     * Georgia Tech's OS group has a fairly portable user-level threads
       implementation of the Mach C threads package. It is called
       Cthreads.
     * Frank Müller, of the POSIX / Ada-Runtime Project (PART), has made
       available an implementation of draft 6 of the POSIX 1003.4a
       Pthreads specification, which runs under SunOS 4, Solaris 2.x,
       SCO Unix, FreeBSD and Linux.
     * Elan Feingold has written a threads package called ethreads; I
       don't know anything else about it.
     * QuickThreads is a toolkit for building threads packages, written
       by David Keppel. The code as distributed includes ports for the
       Alpha, x86, 88000, MIPS, SPARC, VAX, and KSR1.
       
10.   (DCE, POSIX, UI) Why does my threaded program not handle signals sensibly?
                                       
   Signals and threads do not mix well. A lot of programmers start out by
   writing their code under the mistaken assumption that they can set a
   signal handler for each thread; this is not the way things work. You
   can _block_ or _unblock_ signals on a thread-by-thread basis, but this
   is not the same thing.
   
   When it comes to dealing with signals, the best thing you can do is
   create a thread whose sole purpose is to handle signals for the entire
   process. This thread should loop calling sigwait(2); this allows it to
   deal with signals synchronously. You should also make sure that all
   threads (_including_ the one that calls sigwait) have the signals you
   are interested in handling blocked. Handling signals synchronously in
   this way greatly simplifies things.
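
   A minimal sketch of the dedicated signal-handling thread described
   above. The choice of SIGUSR1 and SIGTERM, the "exit on SIGTERM"
   policy, and the function names are assumptions made for the example.
   The signals are blocked in the calling thread before the new thread
   is created, so the new thread (and any created later) inherits the
   blocked mask.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <signal.h>
#include <unistd.h>

static sigset_t handled;   /* the signals the dedicated thread waits for */

/* Dedicated thread: receive signals synchronously with sigwait().
 * Records the most recent signal in *arg and exits on SIGTERM. */
static void *signal_thread(void *arg)
{
    int *last = arg;
    int signo;

    for (;;) {
        if (sigwait(&handled, &signo) != 0)
            continue;                 /* sigwait failed; try again */
        *last = signo;
        if (signo == SIGTERM)         /* example policy: SIGTERM means stop */
            return NULL;
    }
}

/* Block the signals in the calling thread, then start the dedicated
 * thread. Call this before creating any other threads. Returns 0 on
 * success or an error number. */
int start_signal_thread(pthread_t *tid, int *last)
{
    assert(tid != NULL && last != NULL);
    sigemptyset(&handled);
    sigaddset(&handled, SIGUSR1);
    sigaddset(&handled, SIGTERM);

    int rc = pthread_sigmask(SIG_BLOCK, &handled, NULL);
    if (rc != 0)
        return rc;
    return pthread_create(tid, NULL, signal_thread, last);
}
```

   Because the signal arrives as an ordinary return value from
   sigwait(), the thread is free to take locks and call any library
   routine while dealing with it.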
   
   Note, also, that sending signals to other threads within your own
   process is not a friendly thing to do, unless you are careful with
   signal masks. For an explanation, see the section on asynchronous
   cancellation.
   
   Finally, using sigwait and installing signal handlers for the signals
   you are sigwaiting for is a bad idea.
   
11.   (DCE?, POSIX) Why does everyone tell me to avoid asynchronous cancellation?
                                       
   Asynchronous cancellation of threads is, in general, evil. The reason
   for this is that it is usually (very) difficult to guarantee that the
   recipient of an asynchronous cancellation request will not be in a
   critical section. If a thread should die in the middle of a critical
   section, this will very likely cause your program to misbehave.
   
   Code that can deal sensibly with asynchronous cancellation requests is
   _not_ referred to as async-safe; that means something else (see the
   terminology section of the FAQ). You won't see much code around that
   handles asynchronous cancellation requests properly, and you shouldn't
   try to write any of your own unless you have compelling reasons to do so.
   Deferred cancellation is your friend.
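
   The sketch below shows one way deferred cancellation can be used,
   assuming POSIX threads. The worker and its counter are hypothetical.
   The worker can only be cancelled at an explicit cancellation point,
   which is placed outside the critical section; a cleanup handler is
   registered as well, so that the lock would still be released if a
   cancellation point were ever added inside the locked region.

```c
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t work_lock = PTHREAD_MUTEX_INITIALIZER;
static long work_done;

/* Cleanup handler: release the mutex passed as arg. */
static void unlock_on_cancel(void *arg)
{
    pthread_mutex_unlock(arg);
}

static void *worker(void *arg)
{
    int old;
    (void)arg;
    /* Deferred is the default type; set here only to make it explicit. */
    pthread_setcanceltype(PTHREAD_CANCEL_DEFERRED, &old);

    for (;;) {
        pthread_mutex_lock(&work_lock);
        pthread_cleanup_push(unlock_on_cancel, &work_lock);
        work_done++;                /* critical section */
        pthread_cleanup_pop(1);     /* pop the handler and run it: unlocks */
        pthread_testcancel();       /* cancellation point, no lock held */
    }
    return NULL;
}

/* Start a worker, cancel it, and check that it was cancelled without
 * leaving the lock held. Returns 1 on success, 0 otherwise. */
int cancel_worker(void)
{
    pthread_t tid;
    void *result;

    int rc = pthread_create(&tid, NULL, worker, NULL);
    assert(rc == 0);
    pthread_cancel(tid);
    pthread_join(tid, &result);

    rc = pthread_mutex_trylock(&work_lock);   /* fails if the lock was left held */
    if (rc == 0)
        pthread_mutex_unlock(&work_lock);
    return result == PTHREAD_CANCELED && rc == 0;
}
```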
   
12.   Why are reentrant library and system call interfaces good?
                                       
   There are two approaches to providing system calls and library
   interfaces that will work with multithreaded programs. One is to
   simply wrap all the appropriate code with mutexes, thereby
   guaranteeing that only one thread will execute any such routine at a
   time.
   
   While this approach mostly works, it provides terrible performance.
   For functions that maintain state across multiple invocations
   (e.g. strtok() and friends), this approach simply doesn't work at all,
   hence the existence of "_r" interfaces on many Unix systems (see
   below).
   
   A better solution is to ensure that library calls can safely be
   performed by multiple threads at once.
   
12.1. (DCE, POSIX, UI) When should I use thread-safe "_r" library calls?

   If your system provides threads, it will probably provide a set of
   thread-safe variants of standard C library routines. A small number of
   these are mandated by the POSIX standard, and many Unix vendors
   provide their own useful supersets, including functions such as
   gethostbyname_r().
   
   Unfortunately, the supersets that different vendors support do not
   necessarily overlap, so you can only _safely_ use the standard
   POSIX-mandated functions. The thread-safe routines are conceptually
   "cleaner" than their stateful counterparts, though, so it is good
   practice to use them wherever and whenever you can.
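
   As an example of why the "_r" style is cleaner, here is a sketch
   using strtok_r(), one of the POSIX-mandated routines. The function
   count_fields is hypothetical. Each tokenising loop keeps its state
   in a variable supplied by the caller, so two scans can be in
   progress at once; with strtok(), the inner loop would overwrite the
   hidden state of the outer one.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <string.h>

/* Count the comma-separated fields on every line of text.
 * The buffer is modified in place, as with strtok(). */
int count_fields(char *text)
{
    char *line_state;    /* state of the outer (per-line) scan */
    char *field_state;   /* state of the inner (per-field) scan */
    int n = 0;

    assert(text != NULL);
    for (char *line = strtok_r(text, "\n", &line_state);
         line != NULL;
         line = strtok_r(NULL, "\n", &line_state)) {
        for (char *field = strtok_r(line, ",", &field_state);
             field != NULL;
             field = strtok_r(NULL, ",", &field_state))
            n++;
    }
    return n;
}
```

   The same property makes the routine safe to call from several
   threads at once, provided each thread works on its own buffer.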
   
13.   (POSIX) How can I perform a join on any thread?
                                       
   UI threads allow programmers to join on any thread that happens to
   terminate by passing the appropriate argument to thr_join(). This is
   not possible under POSIX and, yes, there is a rationale behind the
   absence of this feature.
   
   Unix programmers are used to being able to call wait() in such a way
   that it will return when "any" process exits, but expecting this to
   work for threads can cause confusion for programmers trying to use
   threads. The important thing to note here is that Unix processes are
   based around a notion of parent and child; this is a notion that is
   _not_ present in most threads systems. Since threads don't contain
   this notion, joining on "any" thread could have the undesirable effect
   of having the join return once a completely unrelated thread happened
   to exit.
   
   In many (perhaps even most) threaded applications, you do not want to
   be able to join with any thread in your process. Consider, for
   example, a library call that one of your threads might make, which in
   its turn might start a few threads and try to join on them. If another
   of your threads, joining on "any" thread, happened to join on one of
   the library call's threads, that would lead to incorrect program
   behaviour.
   
   If you want to be able to join on any thread so that, for example, you
   can keep track of the number of running threads, you can achieve the
   same functionality by starting detached threads and having them
   decrement a (suitably locked, of course) counter as they exit.
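
   A sketch of the counter approach, assuming POSIX threads. All names
   are hypothetical. The condition variable is an addition not
   mentioned above: it lets a thread sleep until the counter reaches
   zero instead of polling it.

```c
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t count_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  count_zero = PTHREAD_COND_INITIALIZER;
static int running;    /* number of detached threads still running */

/* Thread body: do the work, then decrement the counter on the way out. */
static void *task(void *arg)
{
    (void)arg;
    /* ... the thread's real work would go here ... */
    pthread_mutex_lock(&count_lock);
    if (--running == 0)
        pthread_cond_signal(&count_zero);
    pthread_mutex_unlock(&count_lock);
    return NULL;
}

/* Start up to n detached threads; returns how many were started. */
int spawn_detached(int n)
{
    pthread_attr_t attr;
    int started = 0;

    assert(n >= 0);
    pthread_attr_init(&attr);
    pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
    for (int i = 0; i < n; i++) {
        pthread_t tid;
        pthread_mutex_lock(&count_lock);
        running++;                 /* count it before it can possibly exit */
        pthread_mutex_unlock(&count_lock);
        if (pthread_create(&tid, &attr, task, NULL) != 0) {
            pthread_mutex_lock(&count_lock);
            running--;             /* creation failed: undo the count */
            pthread_mutex_unlock(&count_lock);
            break;
        }
        started++;
    }
    pthread_attr_destroy(&attr);
    return started;
}

/* Current value of the counter. */
int running_count(void)
{
    pthread_mutex_lock(&count_lock);
    int n = running;
    pthread_mutex_unlock(&count_lock);
    return n;
}

/* Block until every counted thread has decremented the counter. */
void wait_for_all(void)
{
    pthread_mutex_lock(&count_lock);
    while (running > 0)
        pthread_cond_wait(&count_zero, &count_lock);
    pthread_mutex_unlock(&count_lock);
}
```

   Only threads started through spawn_detached() are counted, so
   threads created privately by a library cannot disturb the total.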
   
14.   (DCE, UI, POSIX) After I create a certain number of threads, my program crashes
                                       
   By default, threads are created non-detached. You need to perform a
   join on each non-detached thread, or else storage will never be freed
   up when they exit. As an alternative, you can create detached threads,
   for which storage will be freed as soon as they exit. This latter
