Skip to main content
Engineering LibreTexts

11.2: The Varieties of Collections

  • Page ID
    39634
    \( \newcommand{\vecs}[1]{\overset { \scriptstyle \rightharpoonup} {\mathbf{#1}} } \) \( \newcommand{\vecd}[1]{\overset{-\!-\!\rightharpoonup}{\vphantom{a}\smash {#1}}} \)\(\newcommand{\id}{\mathrm{id}}\) \( \newcommand{\Span}{\mathrm{span}}\) \( \newcommand{\kernel}{\mathrm{null}\,}\) \( \newcommand{\range}{\mathrm{range}\,}\) \( \newcommand{\RealPart}{\mathrm{Re}}\) \( \newcommand{\ImaginaryPart}{\mathrm{Im}}\) \( \newcommand{\Argument}{\mathrm{Arg}}\) \( \newcommand{\norm}[1]{\| #1 \|}\) \( \newcommand{\inner}[2]{\langle #1, #2 \rangle}\) \( \newcommand{\Span}{\mathrm{span}}\) \(\newcommand{\id}{\mathrm{id}}\) \( \newcommand{\Span}{\mathrm{span}}\) \( \newcommand{\kernel}{\mathrm{null}\,}\) \( \newcommand{\range}{\mathrm{range}\,}\) \( \newcommand{\RealPart}{\mathrm{Re}}\) \( \newcommand{\ImaginaryPart}{\mathrm{Im}}\) \( \newcommand{\Argument}{\mathrm{Arg}}\) \( \newcommand{\norm}[1]{\| #1 \|}\) \( \newcommand{\inner}[2]{\langle #1, #2 \rangle}\) \( \newcommand{\Span}{\mathrm{span}}\)\(\newcommand{\AA}{\unicode[.8,0]{x212B}}\)

    Some of the key collection classes in Pharo.
    Figure \(\PageIndex{1}\): Some of the key collection classes in Pharo.

    To make good use of the collection classes, the reader needs at least a superficial knowledge of the wide variety of collections that they implement, and their commonalities and differences.

    Programming with collections using high-order functions rather than individual elements is an important way to raise the level of abstraction of a program. The Lisp function map, which applies an argument function to every element of a list and returns a new list containing the results is an early example of this style. Following its Smalltalk root, Pharo adopts this collection-based high-order programming as a central tenet. Modern functional programming languages such as ML and Haskell have followed Smalltalk’s lead.

    Why is this a good idea? Suppose you have a data structure containing a collection of student records, and wish to perform some action on all of the students that meet some criterion. Programmers raised to use an imperative language will immediately reach for a loop. But the Pharo programmer will write:

    students
        select: [ :each | each gpa < threshold ]
    

    This expression returns a new collection containing precisely those elements of students for which the block (the bracketed function) returns true. The block can be thought of as a lambda-expression defining an anonymous function x. x gpa < threshold. This code has the simplicity and elegance of a domain-specific query language.

    The message select: is understood by all collections in Pharo. There is no need to find out if the student data structure is an array or a linked list: the select: message is understood by both. Note that this is quite different from using a loop, where one must know whether students is an array or a linked list before the loop can be set up.

    In Pharo, when one speaks of a collection without being more specific about the kind of collection, one means an object that supports well-defined protocols for testing membership and enumerating the elements. All collections understand the testing messages includes:, isEmpty and occurrencesOf:. All collections understand the enumeration messages do:, select:, reject: (which is the opposite of select:), collect: (which is like Lisp’s map), detect:ifNone:, inject:into: (which performs a left fold) and many more. It is the ubiquity of this protocol, as well as its variety, that makes it so powerful.

    The table below summarizes the standard protocols supported by most of the classes in the collection hierarchy. These methods are defined, redefined, optimized or occasionally even forbidden by subclasses of Collection.

    Table \(\PageIndex{1}\): Standard collection protocols.
    Protocol Methods
    accessing size, capacity, at:, at:put:
    testing isEmpty, includes:, contains:, occurrencesOf:
    adding add:, addAll:
    removing remove:, remove:ifAbsent:, removeAll:
    enumerating do:, collect:, select:, reject:
    detect:, detect:ifNone:, inject:into:
    converting asBag, asSet, asOrderedCollection, asSortedCollection, asArray, asSortedCollection:
    creation with:, with:with:, with:with:with:, with:with:with:with:, withAll:

    Beyond this basic uniformity, there are many different kinds of collection either supporting different protocols, or providing different behaviour for the same requests. Let us briefly survey some of the key differences:

    Sequenceable: Instances of all subclasses of SequenceableCollection start from a first element and proceed in a well-defined order to a last element. Instances of Set, Bag and Dictionary, on the other hand, are not sequenceable.

    Sortable: A SortedCollection maintains its elements in sort order.

    Indexable: Most sequenceable collections are also indexable, that is, elements can be retrieved with message at: anIndex. Array is the familiar indexable data structure with a fixed size; anArray at: n retrieves the nth element of anArray, and anArray at: n put: v changes the nth element to v. LinkedLists and SkipLists are sequenceable but not indexable, that is, they understand first and last, but not the message at:.

    Keyed: Instances of Dictionary and its subclasses are accessed by keys instead of indices.

    Mutable: Most collections are mutable, but Intervals and Symbols are not. An Interval is an immutable collection representing a range of Integers. For example, 5 to: 16 by: 2 is an interval that contains the elements 5, 7, 9, 11, 13 and 15. It is indexable with message at: anIndex, but cannot be changed with message at: anIndex put: aValue.

    Growable: Instances of Interval and Array are always of a fixed size. Other kinds of collections (sorted collections, ordered collections, and linked lists) can grow after creation. The class OrderedCollection is more general than Array; the size of an OrderedCollection grows on demand, and it defines messages addFirst: anElement and addLast: anElement as well as messages at: anIndex and at: anIndex put: aValue.

    Accepts duplicates: A Set filters out duplicates, but a Bag does not. Classes Dictionary, Set and Bag use the = method provided by the elements; the Identity variants of these classes use the == method, which tests whether the arguments are the same object, and the Pluggable variants use an arbitrary equivalence relation supplied by the creator of the collection.

    Heterogeneous: Most collections will hold any kind of element. A String, CharacterArray or Symbol, however, only holds Characters. An Array will hold any mix of objects, but a ByteArray only holds Bytes. A LinkedList is constrained to hold elements that conform to the Link accessing protocol.


    This page titled 11.2: The Varieties of Collections is shared under a CC BY-SA 3.0 license and was authored, remixed, and/or curated by via source content that was edited to the style and standards of the LibreTexts platform; a detailed edit history is available upon request.