10.3: Streaming over Collections
Streams are really useful when dealing with collections of elements. They can be used for reading and writing elements in collections. We will now explore the stream features for the collections.
Reading collections
This section presents features used for reading collections. Using a stream to read a collection essentially provides you with a pointer into the collection. That pointer moves forward as you read, and you can place it wherever you want. The class ReadStream should be used to read elements from collections.
The methods next and next: are used to retrieve one or more elements from the collection.
stream := ReadStream on: #(1 (a b c) false).
stream next. → 1
stream next. → #(#a #b #c)
stream next. → false

stream := ReadStream on: 'abcdef'.
stream next: 0. → ''
stream next: 1. → 'a'
stream next: 3. → 'bcd'
stream next: 2. → 'ef'
The message peek is used when you want to know what the next element in the stream is without moving forward.
stream := ReadStream on: '-143'.
negative := (stream peek = $-). "look at the first element without reading it"
negative. → true
negative ifTrue: [stream next]. "ignores the minus character"
number := stream upToEnd.
number. → '143'
This code sets the boolean variable negative according to the sign of the number in the stream, and sets number to its absolute value. The method upToEnd returns everything from the current position to the end of the stream and sets the stream to its end. This code can be simplified using peekFor:, which moves forward if the next element equals the parameter and does not move otherwise.
stream := '-143' readStream.
(stream peekFor: $-) → true
stream upToEnd → '143'
peekFor: also returns a boolean indicating whether the parameter equals the element.
You might have noticed a new way of constructing a stream in the above example: one can simply send readStream to a sequenceable collection to get a reading stream on that particular collection.
Positioning. There are methods to position the stream pointer. If you have the index, you can go directly to it using position:. You can request the current position using position. Please remember that a stream is not positioned on an element, but between two elements. The index corresponding to the beginning of the stream is 0.
You can obtain the state of the stream depicted in Figure \(\PageIndex{1}\) with the following code:
stream := 'abcde' readStream.
stream position: 2.
stream peek → $c
To position the stream at the beginning or the end, you can use reset or setToEnd. skip: and skipTo: are used to move to a location relative to the current position: skip: accepts a number as argument and skips that number of elements (a negative number moves backward), whereas skipTo: skips all elements in the stream until it finds an element equal to its parameter. Note that skipTo: positions the stream after the matched element.
stream := 'abcdef' readStream.
stream next. → $a "stream is now positioned just after the a"
stream skip: 3. "stream is now after the d"
stream position. → 4
stream skip: -2. "stream is after the b"
stream position. → 2
stream reset.
stream position. → 0
stream skipTo: $e. "stream is just after the e now"
stream next. → $f
stream contents. → 'abcdef'
As you can see, the letter e has been skipped.
The method contents always returns a copy of the entire stream.
Testing. Some methods allow you to test the state of the current stream: atEnd returns true if and only if no more elements can be read, whereas isEmpty returns true if and only if there is no element at all in the collection.
Here is a possible implementation of an algorithm using atEnd that takes two sorted collections as parameters and merges those collections into another sorted collection:
stream1 := #(1 4 9 11 12 13) readStream.
stream2 := #(1 2 3 4 5 10 13 14 15) readStream.

"The variable result will contain the sorted collection."
result := OrderedCollection new.
[stream1 atEnd not & stream2 atEnd not]
    whileTrue: [stream1 peek < stream2 peek
        "Remove the smallest element from either stream and add it to the result."
        ifTrue: [result add: stream1 next]
        ifFalse: [result add: stream2 next]].

"One of the two streams might not be at its end. Copy whatever remains."
result
    addAll: stream1 upToEnd;
    addAll: stream2 upToEnd.

result. → an OrderedCollection(1 1 2 3 4 4 5 9 10 11 12 13 13 14 15)
Writing to collections
We have already seen how to read a collection by iterating over its elements using a ReadStream. We’ll now learn how to create collections using WriteStreams.
WriteStreams are useful for appending a lot of data to a collection at various locations. They are often used to construct strings that are based on static and dynamic parts, as in this example:
stream := String new writeStream.
stream
    nextPutAll: 'This Smalltalk image contains: ';
    print: Smalltalk allClasses size;
    nextPutAll: ' classes.';
    cr;
    nextPutAll: 'This is really a lot.'.

stream contents. → 'This Smalltalk image contains: 2322 classes.
This is really a lot.'
This technique is used in the different implementations of the method printOn:, for example. There is a simpler and more efficient way of creating streams if you are only interested in the content of the stream:
string := String streamContents:
    [:stream |
        stream
            print: #(1 2 3);
            space;
            nextPutAll: 'size';
            space;
            nextPut: $=;
            space;
            print: 3. ].
string. → '#(1 2 3) size = 3'
The method streamContents: creates a collection and a stream on that collection for you. It then executes the block you gave, passing the stream as a parameter. When the block ends, streamContents: returns the content of the collection.
The following WriteStream methods are especially useful in this context:
- nextPut: adds the parameter to the stream;
- nextPutAll: adds each element of the collection, passed as a parameter, to the stream;
- print: adds the textual representation of the parameter to the stream.
There are also methods useful for printing different kinds of characters to the stream, like space, tab and cr (carriage return). Another useful method is ensureASpace, which ensures that the last character in the stream is a space; if the last character isn’t a space, it adds one.
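As a small illustration of ensureASpace, here is a minimal sketch that sends it twice in a row. It assumes that ensureASpace behaves in your image exactly as described above, so the second send should have no effect; the variable name result is only for illustration.

```smalltalk
| result |
result := String streamContents: [:stream |
    stream
        nextPutAll: 'hello';
        ensureASpace;   "adds a space, because the last character is $o"
        ensureASpace;   "adds nothing, because the last character is already a space"
        nextPutAll: 'world'].
result. "→ 'hello world'"
```

This is convenient when several pieces of code write to the same stream and none of them knows whether a separator has already been written.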
About Concatenation. Using nextPut: and nextPutAll: on a WriteStream is often the best way to concatenate characters. Using the comma concatenation operator (,) is far less efficient:
[| temp |
    temp := String new.
    (1 to: 100000)
        do: [:i | temp := temp, i asString, ' ']] timeToRun → 115176 "(milliseconds)"

[| temp |
    temp := WriteStream on: String new.
    (1 to: 100000)
        do: [:i | temp nextPutAll: i asString; space].
    temp contents] timeToRun → 1262 "(milliseconds)"
The reason that using a stream can be much more efficient is that comma creates a new string containing the concatenation of the receiver and the argument, so it must copy both of them. When you repeatedly concatenate onto the same receiver, it gets longer and longer each time, so each step copies more characters than the previous one and the total number of characters copied grows quadratically with the number of concatenations. This also creates a lot of garbage, which must be collected. Using a stream instead of string concatenation is a well-known optimization. In fact, you can use streamContents: to help you do this:
String streamContents: [ :tempStream |
    (1 to: 100000)
        do: [:i | tempStream nextPutAll: i asString; space]]
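To see why the comma version is so much slower, a rough count of copied characters helps. As a simplifying assumption (not taken from the measurements above), suppose each of the \(n\) appended pieces has about \(c\) characters. The \(k\)-th concatenation then builds a new string of about \(k\,c\) characters, so the total copying is roughly

\[ \sum_{k=1}^{n} k\,c \;=\; c\,\frac{n(n+1)}{2} \;=\; O(n^2). \]

A stream, in contrast, writes each character once into its buffer. Assuming the buffer grows geometrically when it is full, the extra copying caused by growth stays proportional to the final size, so the total work is about \(O(n\,c)\).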
Reading and writing at the same time
It’s possible to use a stream to access a collection for reading and writing at the same time. Imagine you want to create a History class which will manage the backward and forward buttons in a web browser. A history would react as shown in Figures \(\PageIndex{2}\) to \(\PageIndex{8}\).
This behaviour can be implemented using a ReadWriteStream.
Object subclass: #History
    instanceVariableNames: 'stream'
    classVariableNames: ''
    poolDictionaries: ''
    category: 'SBE-Streams'

History>>initialize
    super initialize.
    stream := ReadWriteStream on: Array new.
Nothing really difficult here: we define a new class which contains a stream. The stream is created in the initialize method.
We need methods to go backward and forward:
History>>goBackward
    self canGoBackward ifFalse: [self error: 'Already on the first element'].
    stream skip: -2.
    ↑ stream next.

History>>goForward
    self canGoForward ifFalse: [self error: 'Already on the last element'].
    ↑ stream next
Up to this point, the code has been pretty straightforward. Now we have to deal with the goTo: method, which should be activated when the user clicks on a link. A possible solution is:
History>>goTo: aPage
    stream nextPut: aPage.
This version is incomplete, however. When the user clicks on a link, there should be no more future pages to go to, i.e., the forward button must be deactivated. To do this, the simplest solution is to write nil just after the new page to mark the end of the history:
History>>goTo: anObject
    stream nextPut: anObject.
    stream nextPut: nil.
    stream back.
Now, only the methods canGoBackward and canGoForward have to be implemented.
A stream is always positioned between two elements. To go backward, there must be two pages before the current position: one page is the current page, and the other one is the page we want to go to.
History>>canGoBackward
    ↑ stream position > 1

History>>canGoForward
    ↑ stream atEnd not and: [stream peek notNil]
Let us add a method to peek at the contents of the stream:
History>>contents
    ↑ stream contents
And the history works as advertised:
History new
    goTo: #page1;
    goTo: #page2;
    goTo: #page3;
    goBackward;
    goBackward;
    goTo: #page4;
    contents → #(#page1 #page4 nil nil)