Many Enumerable Returns
As threatened, then, here’s the follow-up to my last post on the #Enumerables section from Ruby Monk, how I felt like a bit of an idiot for a few hours, and what I learned from that.
tl;dr - enumerable blocks aren’t magic; yield is magic.
This question is a little further along from the last, and was framed so:
Try implementing a method called occurrences that accepts a string argument and uses inject to build a Hash. The keys of this hash should be unique words from that string. The value of those keys should be the number of times this word appears in that string.
So far, so good. So I wrote this:
def occurrences(str)
  str.scan(/\w+/).inject(Hash.new(0)) do |hashy, i|
    hashy[i.downcase] += 1
  end
end
Which spat out:
TypeError can't convert String into Integer
And left me confused for a good few minutes. OK, getting on for a quarter of an hour. What was going on? What I’d written was very similar to the example above:
[4, 8, 15, 16, 23, 42].inject({}) { |a, i| a.update(i => i) }
So I caved and looked at the answer:
def occurrences(str)
  str.scan(/\w+/).inject(Hash.new(0)) do |build, word|
    build[word.downcase] += 1
    build
  end
end
Which left me none the wiser. Why was the block restating the accumulator on its own at the end? To test this I played around with p-ing the lines of the block… and discovered something interesting. Namely,
a.update(i => i) # => a
But…
build[word.downcase] +=1 # => build[word.downcase], the new value of that key
The block needs to return the accumulator - the first example is just lucky that it does so already!
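A minimal sketch of that difference, runnable on its own. The method name occurrences_with_object is made up for illustration; it leans on Enumerable#each_with_object, which ignores the block's return value and so sidesteps the problem entirely. It is one possible alternative, not the Ruby Monk answer.

```ruby
counts = Hash.new(0)

# Hash#update returns the hash itself, so a block ending in it
# hands the accumulator straight back to inject.
update_result = counts.update("a" => 1)

# An assignment expression evaluates to the value assigned (a number here),
# so a block ending in it hands inject a number instead of the hash.
increment_result = (counts["b"] += 1)

puts update_result.equal?(counts)        # true
puts increment_result.is_a?(Integer)     # true

# Hypothetical alternative: each_with_object passes the same object
# to every iteration, whatever the block returns.
def occurrences_with_object(str)
  str.scan(/\w+/).each_with_object(Hash.new(0)) do |word, tally|
    tally[word.downcase] += 1
  end
end

p occurrences_with_object("The cat saw the other cat")
```

Note that each_with_object swaps the block arguments around: the element comes first and the accumulator second.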
The only reason the accumulator in an Enumerable#inject accumulates is that it’s returned from the block on each iteration. In other words, somewhere in the definition of #inject (which Enumerable provides to any class that defines #each), the method yields to the block, and then keeps the value returned to be passed in again as the new accumulator argument.
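To watch that hand-off happen, here is a small sketch that records what the built-in inject passes into the block on each pass and what the block gives back. The variable names (trace, total) are just for illustration.

```ruby
trace = []

total = [1, 2, 3].inject(10) do |acc, n|
  result = acc + n
  trace << [acc, n, result]  # remember what came in and what goes out
  result                     # this value comes back in as acc on the next pass
end

trace.each { |acc, n, result| puts "acc=#{acc} n=#{n} -> #{result}" }
puts total
```

Each recorded result shows up as the acc of the following row, and the last one is what inject itself returns.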
I’d previously thought of #inject as working by magic, whereas in fact it was working by a method I could probably write myself given enough time. Something like this…
bob = [1,2,3,4,5,6]
def bob.inject(default = nil)
  accumulator = default || self[0]
  if default
    self.each do |element|
      accumulator = yield(accumulator, element)
    end
  else
    self.drop(1).each do |element|
      accumulator = yield(accumulator, element)
    end
  end
  print "all adds up to: " # just to prove it's this method being
                           # called, not the superclass's...
  p accumulator
end
Which gives us such fun as:
bob.inject() {|a,e| a += e}
# => all adds up to: 21
bob.inject(10) {|a,e| a += e}
# => all adds up to: 31
bob.inject([]) {|a,e| a << e**2}
# => all adds up to: [1, 4, 9, 16, 25, 36]
bob.inject({}) {|a,e| a[e] = "x"*e; a}
# => {5=>"xxxxx", 6=>"xxxxxx", 1=>"x", 2=>"xx", 3=>"xxx", 4=>"xxxx"}
I relied on #each here, but we could easily write an each method using a for... in... loop or similar. The genius is in yield, which is the real magic that’s going on here.
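As a rough sketch of that idea, here is a toy class with a hand-written each and an inject built on nothing but that each. The class name TinyList is made up for illustration. I've used a plain while loop with an index rather than for... in..., because for on an object calls that object's #each under the hood, which would be circular when #each is the thing being defined.

```ruby
class TinyList
  def initialize(*items)
    @items = items
  end

  # A hand-rolled each: step through the indexes and yield every element.
  def each
    i = 0
    while i < @items.length
      yield @items[i]
      i += 1
    end
    self
  end

  # inject built only on each. With no initial value,
  # the first element seeds the accumulator.
  def inject(*initial)
    accumulator = initial.first
    seeded = !initial.empty?
    each do |element|
      if seeded
        accumulator = yield(accumulator, element)
      else
        accumulator = element
        seeded = true
      end
    end
    accumulator
  end
end

list = TinyList.new(1, 2, 3, 4, 5, 6)
puts list.inject { |a, e| a + e }       # 21
puts list.inject(10) { |a, e| a + e }   # 31
```

Taking the initial value as a splat lets nil or false be passed as a genuine starting value, which a default = nil parameter can't distinguish from "nothing given".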
Ruby Monk has more about the magic of yield, and why it’s weird in a language that professes that everything is an object. As with a lot in Ruby, I discovered that a small thing didn’t work, patiently played with it until I found out why, and then ‘worked’ that small new piece of knowledge to gain greater insight into what was going on. I’m finding this to be the most satisfying way to learn, both because it makes me feel like I’m learning to a deeper degree than I would by just reading the answers out of a book and because, when the books do cover the subject, I can better apply what’s written there to what I’ve seen in action.
postscript - 27/7/14
Of course, David Black covers the same ground, but better (gets each off the ground using a for loop), in chapter 6 of The Well-Grounded Rubyist. Love that book.