How to use #each_with_object in Ruby, when to use it, and when not to!

Ciaran Morinan
Sep 16, 2018

I am continually delighted at the methods and syntax in Ruby which allow you to carry out complex operations with minimal lines of code.

It will never cease to make me happy that you can achieve, in one clean line, the following: go through a collection and make an array of the results of calling a method on each item in it, then remove duplicate values from that array, then remove any nil value, and return it to me.

my_collection.map(&:some_method).uniq.compact

But today I would like to talk (mainly) about #each_with_object. This handy method offers a superior alternative to using #each — in some situations.

I was curious about how fast #each_with_object was compared to #each, #map, and other alternatives, and so I’ve run a few benchmarks for different use cases.

At the end I finish by suggesting some principles for when to use #each_with_object, and when not to — but I welcome your views!

How to use #each_with_object

#each_with_object, like #each, allows you to iterate over each element of a collection, but also lets you specify an object of your choosing to play with inside the method (a ‘memo object’), and returns that object afterwards.

If you need to create a new object to store the results of your iteration — a hash, say — #each_with_object saves you the bother of assigning one beforehand, and saves you having to return the new object after the method (unlike #each, which returns the original collection you were iterating over). That’s two lines of code saved! It also has the advantage of keeping everything within the scope of the method.

A typical use case is a counter, for example to count the number of times a particular value appears in an array. Here’s a simple counter using #each:
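A minimal sketch of such a counter, assuming a small sample array; the method name `count_with_each` and the variable names are placeholders of mine, not necessarily what the original snippet used.

```ruby
# Hypothetical example: count how often each value appears, using #each.
def count_with_each(items)
  counts = Hash.new(0)                     # assign the hash up front; missing keys default to 0
  items.each { |item| counts[item] += 1 }  # #each itself returns `items`, not `counts`
  counts                                   # so the hash has to be returned explicitly
end

fruit = %w[apple pear banana apple]
each_counts = count_with_each(fruit)
#=> {"apple"=>2, "pear"=>1, "banana"=>1}
```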

…and here’s the #each_with_object version:
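Again as a sketch only, with a placeholder method name (`count_with_each_with_object`) and sample data of my own: the hash is created as the argument and handed back automatically, so the lines that assign and return it disappear.

```ruby
# Hypothetical example: the same counter, using #each_with_object.
def count_with_each_with_object(items)
  # Hash.new(0) is the memo object; the block receives it as its second variable.
  items.each_with_object(Hash.new(0)) { |item, counts| counts[item] += 1 }
end

fruit = %w[apple pear banana apple]
ewo_counts = count_with_each_with_object(fruit)
#=> {"apple"=>2, "pear"=>1, "banana"=>1}
```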

The #each_with_object syntax builds on #each in two ways:

  1. You specify the new object you want as an argument (the ‘memo object’)
  2. You specify a variable name for referring to that object — after the one(s) you’d specify in the code block anyway to refer to items in the collection.

So, if the original collection is an array you specify two local variables. The first refers to each element in the array, and the second to your new object:

arr.each { |element| do_code }  #=> arr
# becomes...
arr.each_with_object({}) { |element, new_obj| do_code }  #=> new_obj

And if the original collection is a hash, you end up with three local variables — the (key, value) pair you need to refer to each item in the original hash, wrapped in parentheses, followed by your new object:

hash.each { |key, value| do_something }  #=> hash
# becomes...
hash.each_with_object({}) { |(key, value), new_obj| do_something }  #=> new_obj
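To make the parentheses concrete, here is a small sketch that groups the keys of a hash by their values. The `colours` data and all variable names are made up for illustration.

```ruby
# Hypothetical data: fruit name => colour.
colours = { "apple" => "green", "banana" => "yellow", "pear" => "green" }

# (name, colour) unpacks each [key, value] pair; grouped is the memo object.
by_colour = colours.each_with_object({}) do |(name, colour), grouped|
  (grouped[colour] ||= []) << name
end
#=> {"green"=>["apple", "pear"], "yellow"=>["banana"]}

# Without the parentheses, the first block variable receives the whole pair.
pairs = colours.each_with_object([]) { |pair, list| list << pair }
first_pair = pairs.first
#=> ["apple", "green"]
```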

Note that:

  1. In the ‘counter’ example above, I’ve used the Hash.new(0) syntax to default each value to 0 so Ruby doesn’t complain when we try and add 1 to something that doesn’t exist — but you can just specify a hash as {} if you don’t need it to have a default value.
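  2. The new object doesn’t have to be a hash, but it does need to be something you can change in place, such as a hash or an array. Numbers, true, false and nil won’t work: they can’t be mutated, and reassigning the block variable doesn’t change what #each_with_object returns. If you want to start with a number and perform mathematical operations on it, you are probably looking for #inject (but watch out for the #inject syntax, where the variable for the memo object comes first!)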
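Both notes can be checked with a short sketch. The sample arrays and variable names here are my own, chosen only for illustration.

```ruby
words = %w[apple pear apple]

# With a plain {}, tally[word] is nil the first time, and nil + 1 raises NoMethodError.
plain_hash_result =
  begin
    words.each_with_object({}) { |word, tally| tally[word] += 1 }
  rescue NoMethodError => error
    error.class
  end
#=> NoMethodError

# With Hash.new(0), a missing key reads as 0, so += 1 works from the first hit.
default_hash_result = words.each_with_object(Hash.new(0)) { |word, tally| tally[word] += 1 }
#=> {"apple"=>2, "pear"=>1}

numbers = [1, 2, 3]

# A number can't be mutated: memo += n only rebinds the block variable,
# so the original 0 comes back out.
ewo_sum = numbers.each_with_object(0) { |n, memo| memo += n }
#=> 0

# #inject passes the memo FIRST and uses the block's return value as the next memo.
inject_sum = numbers.inject(0) { |memo, n| memo + n }
#=> 6
```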

Consider using #each_with_object to save you having to separately assign and return a new object on either side of a call to #each, and to keep the scope tighter, but don’t assume that if you can use it, you should.

Which is faster: #each_with_object or #each?

When I found out about #each_with_object, I was happy. There I was, separately assigning a new hash, calling #each, and then having to explicitly return the hash like a chump, needlessly adding lines to my code. #each_with_object saved me from all that.

But I was curious as to whether efficiency on the screen translated to efficiency in processing — or if it came at a cost.

So I ran some tests to benchmark #each versus #each_with_object, using the counter example above. We’re counting an array of fruit:

fruit_basket = %w[apple pear banana banana apple pear apple apple banana apple grape grape]   # the %w tells Ruby these are strings

I threw in two other ways of achieving the same thing using #inject, and #inject with #merge!.
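The four approaches might look something like the sketch below, using Ruby's standard Benchmark library. This is a reconstruction under assumptions, not the original test code: the lambda names are mine, the #merge! variant is one plausible way to write it, and `runs` is scaled far down so it finishes quickly.

```ruby
require "benchmark"

fruit_basket = %w[apple pear banana banana apple pear apple apple banana apple grape grape]
big_basket = fruit_basket * 20
runs = 1_000 # assumption: far fewer calls than the 100,000 described in the post

each_counter = lambda do |items|
  counts = Hash.new(0)
  items.each { |item| counts[item] += 1 }
  counts
end

ewo_counter = lambda do |items|
  items.each_with_object(Hash.new(0)) { |item, counts| counts[item] += 1 }
end

inject_counter = lambda do |items|
  items.inject(Hash.new(0)) do |counts, item|
    counts[item] += 1
    counts # #inject needs the memo handed back from the block
  end
end

inject_merge_counter = lambda do |items|
  items.inject(Hash.new(0)) do |counts, item|
    # Builds a throwaway one-pair hash on every iteration; merge! returns counts.
    counts.merge!(item => counts[item] + 1)
  end
end

Benchmark.bm(22) do |bm|
  bm.report("#each counter")        { runs.times { each_counter.call(big_basket) } }
  bm.report("#each_with_object")    { runs.times { ewo_counter.call(big_basket) } }
  bm.report("#inject counter")      { runs.times { inject_counter.call(big_basket) } }
  bm.report("#inject with #merge!") { runs.times { inject_merge_counter.call(big_basket) } }
end
```

Timings from a sketch like this will depend on your machine and Ruby version, so treat any ranking it produces as specific to your setup.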

I ran the tests with (a) an array 20 times the size of the one above, with 100,000 method calls, and (b) an array 100,000 times the size of the above, with 20 method calls. The rankings didn’t change. Here’s a typical result for the latter. The final column shows how many seconds each approach took.

                           user     system      total         real
#each counter          4.440000   0.010000   4.450000  (  4.462592)
#each_with_object      5.120000   0.010000   5.130000  (  5.152266)
#inject counter        5.270000   0.020000   5.290000  (  5.326747)
#inject with #merge!  23.040000   0.870000  23.910000  ( 24.101621)

#each was consistently the fastest. #each_with_object was usually faster than #inject, but not by much. Using #merge! within #inject was atrocious. You could also achieve the same thing using #merge within #inject (without the !), which copies the hash into a brand-new one on every iteration, if you didn’t have anywhere to be in the next few months.

I also tested #each versus #each_with_object on a more complex collection, counting through nested hashes. #each was consistently faster there too.

You’re not using Ruby because you need to shave every last millisecond off processing time, and #each_with_object is still a welcome tool for creating beautiful code — but if you’re going to be working with large collections or need to handle a high volume of method calls, it’s worth testing out different approaches for your specific usage to confirm their time cost.

I was a bit disappointed that #each_with_object tested slower than #each — if anything, I assumed they’d work out basically the same. This got me worried about the performance of a favourite in Ruby’s abstraction arsenal: #map.

Should you use #each_with_object, #each or #map?

#map is great for making an array out of the results of executing some code on a collection, and automatically returning that array at the end. It’s a bit like #each_with_object except the object is always an array, and you don’t explicitly add things to the array inside the code block — whatever the code block returns on each iteration will automatically be added to it.

my_collection.map { |element| whatever this block of code returns gets added to a new array }   #=> the new array is returned

A typical use case is to get a list of all the values for one particular property in a collection. For example, to get every ‘name’ for a bunch of objects:

my_collection.map { |element| element.name }

…and if all your code block will do is call one method on each element (like the one above), you can just send that method as a symbol using the & syntax:

my_collection.map(&:name)
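A runnable version of the two forms, assuming a tiny Struct with a `name` attribute as the element type (the struct and its data are invented for the example):

```ruby
# Hypothetical element type: anything that responds to #name would do.
fruit_struct = Struct.new(:name, :colour)
my_collection = [
  fruit_struct.new("apple", "green"),
  fruit_struct.new("banana", "yellow")
]

names_from_block = my_collection.map { |element| element.name }
names_from_symbol = my_collection.map(&:name)
#=> ["apple", "banana"] in both cases

# The original collection is untouched; #map always builds a new array.
same_size = my_collection.size == names_from_block.size
```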

I love #map, and I want to be able to love it unconditionally. After #each tested faster than #each_with_object, I became worried about #map and whether its beauty came at a cost. So I ran some more tests. We want to create and return an array that lists all the lengths of words in our fruit basket.
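The four contenders could be written roughly as follows. This is a sketch with a shortened basket and variable names of my own, not the original benchmark code.

```ruby
fruit_basket = %w[apple pear banana grape]

# 1. #each: assign an array first, push into it, and keep hold of it afterwards.
lengths_each = []
fruit_basket.each { |word| lengths_each << word.length }

# 2. #each_with_object: the array is the memo object and is returned for us.
lengths_ewo = fruit_basket.each_with_object([]) { |word, lengths| lengths << word.length }

# 3. #map: whatever the block returns becomes the next element of the new array.
lengths_map = fruit_basket.map { |word| word.length }

# 4. Short #map: pass the method name as a symbol.
lengths_short = fruit_basket.map(&:length)
#=> [5, 4, 6, 5] for all four
```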

As before, I tested with collections and volumes big and small. The rankings were consistent. Here is a typical result:

                           user     system      total         real
#each mapper           4.010000   0.270000   4.280000  (  4.298923)
#each_with_object      5.520000   0.280000   5.800000  (  5.835564)
#map approach          3.380000   0.090000   3.470000  (  3.480796)
short #map approach    3.480000   0.080000   3.560000  (  3.570483)

Hooray! #map consistently outperforms #each, at least for this operation. Poor old #each_with_object lags behind.

(Side note: the ‘short #map approach’ using the &:method syntax is a little slower than calling #map and sending the method call inside a { code block }; this is to be expected. The & is for sending a Proc to a method where you’d normally send a block. When you use & to send something that isn’t a Proc, like the symbol :length, Ruby first has to call to_proc on it to convert it. This adds time.)
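The conversion can be seen by calling to_proc by hand. A small sketch with made-up sample words:

```ruby
# Symbol#to_proc builds a Proc that calls the named method on its argument.
length_proc = :length.to_proc
single_length = length_proc.call("banana")
#=> 6

words = %w[apple pear]

# These three calls produce the same array.
via_symbol = words.map(&:length)        # Ruby converts :length with to_proc
via_proc   = words.map(&length_proc)    # already a Proc, passed as the block
via_block  = words.map { |word| word.length }
#=> [5, 4]
```

How much the conversion costs can differ between Ruby versions, so it is worth re-running the comparison on your own setup before relying on it.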

So #map not only allows for cleaner, shorter code than #each, but also gets the job done faster too. The same can’t be said for #each_with_object, sadly.

If you want to do something to a collection and store the results in a new array, #map is probably going to do the job for you. Remember that you can execute any code you want inside a #map code block, so long as whatever it returns (the last line evaluated) is what you want added to the new array.

There may be situations where you want more control over what is added to the array — only adding to it if a conditional is met, for example. You can still achieve this with #map by using an if statement, but would need to strip out the nil values it would add when the conditional wasn’t met (because the line would evaluate to nil). These return the same result:
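As an illustration, here are two ways to collect the upper-cased names of words longer than four letters. The condition and the data are hypothetical, picked only to show the pattern.

```ruby
fruit_basket = %w[apple pear banana grape]

# #map with a trailing if: the block evaluates to nil when the condition fails...
with_nils = fruit_basket.map { |word| word.upcase if word.length > 4 }
#=> ["APPLE", nil, "BANANA", "GRAPE"]

# ...so #compact is needed to strip the nils out afterwards.
long_words_map = with_nils.compact

# #each_with_object only pushes when the condition is met, so there is nothing to clean up.
long_words_ewo = fruit_basket.each_with_object([]) do |word, result|
  result << word.upcase if word.length > 4
end
#=> ["APPLE", "BANANA", "GRAPE"] for both
```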

For more complex requirements, you might need to use #each_with_object (or #each) rather than #map. But be sure that you do actually need to use it!

When to use #each_with_object — my conclusions

It may not be the fastest approach, but I still like #each_with_object.

I like the fact that the creation, manipulation and return of the object is contained entirely within the method — avoiding the potential for conflicts introduced by having to assign and return a variable outside of #each to achieve the same thing (maybe a minor risk, but a real one nonetheless).

It is also pleasing to be able to use a method which does exactly the thing you want it to do, without needing to bolt instructions on either side of it. But given its speed penalty, and the alternatives available, I suggest the following principles for its use:

  1. Do use #each_with_object if you want to create and return a hash which stores the results of iterating over a collection — and speed isn’t an issue.
  2. Don’t use #each_with_object if you want to create an array for storing results — unless you’re absolutely sure that #map won’t do the job.
  3. Do test different approaches if speed is an issue!

You may disagree — please let me know if so! I am still in the early stages of learning to code and so I welcome feedback, thoughts and corrections.

You can access the code used to run the tests here:

Thank you for reading.

— Ciaran
