Preface
Using yield and blocks is what makes Ruby so different from other scripting languages. But in some cases yield can lead to unpredictable behavior and it’s crucial to understand what can go wrong.
Consider the following code:
content = ""
File.open("/etc/hosts", "r") do |f|
  content << f.read
end
The file is opened, used inside the block, and automatically closed when the block exits. What could go wrong with that?
Threat from return
Let’s write a with_file method that mimics the behavior of File.open:
def with_file(name, &block)
  puts "Open file"
  f = File.open(name, "r")
  yield f
  puts "Close file"
  f.close
end
And test_yield to use it:
def test_yield
  content = ""
  with_file("/etc/hosts") do |f|
    puts "Read content"
    content << f.read
  end
  content
end
test_yield produces the following output:
Open file
Read content
Close file
Nothing special. Now a more complicated test:
def test_yield_with_return
  content = ""
  with_file("/etc/hosts") do |f|
    puts "Read content"
    content << f.read
    return content
  end
  puts "I will be skipped"
end
Run test_yield_with_return and the output is not what you might expect:
Open file
Read content
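Quite weird. Why isn’t the post-yield code executed? The answer lies in a quirk of how return behaves inside blocks. A return inside a block does not just leave the block: it immediately unwinds the stack and exits the method in which the block was written. In our case that method is test_yield_with_return, so the rest of with_file is skipped along the way.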
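A minimal sketch of the difference, using hypothetical helper names (around, leave_with_return, leave_with_next) that are not part of the original example: return inside a block exits the method in which the block was written, while next only leaves the block and hands control back to the yielding method.

```ruby
# Hypothetical helper: records what happens around yield in the given log.
def around(log)
  log << "before yield"
  yield
  log << "after yield" # skipped if the block exits with return
end

def leave_with_return(log)
  around(log) { return :from_block } # exits leave_with_return itself
  :method_end                        # never reached
end

def leave_with_next(log)
  around(log) { next :from_block }   # leaves only the block
  :method_end
end

log = []
p leave_with_return(log) # => :from_block
p log                    # => ["before yield"]

log = []
p leave_with_next(log)   # => :method_end
p log                    # => ["before yield", "after yield"]
```

With next, the code after yield still runs, which is why only return (and other non-local exits) puts post-yield cleanup at risk.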
To make this clearer, let’s walk through both cases step by step. Here is what happens without return (running test_yield):
1. Enter test_yield # def test_yield
2. Execute code till with_file # content = ""
3. .... Enter with_file # with_file("/etc/hosts") do |f|
.... #
.... # def with_file(name, &block)
4. .... Execute code till yield # puts "Open file"
.... # f = File.open(name, "r")
5. ........ Enter block via yield # yield f
6. ........ Execute code in block # puts "Read content"
........ # content << f.read
7. ........ Leave block #
8. .... Execute code after yield # puts "Close file"
.... # f.close
9. .... Leave with_file #
10. Execute code after with_file # content
11. Leave test_yield #
Now here is how it works with return in the block (running test_yield_with_return):
1. Enter test_yield_with_return # def test_yield_with_return
2. Execute code till with_file # content = ""
3. .... Enter with_file # with_file("/etc/hosts") do |f|
.... #
.... # def with_file(name, &block)
4. .... Execute code till yield # puts "Open file"
.... # f = File.open(name, "r")
5. ........ Enter block via yield # yield f
6. ........ Execute till return # puts "Read content"
........ # content << f.read
7. ........ Leave block via return # return content
8. .... Leave with_file #
9. Leave test_yield_with_return #
Now it’s clear how return affects the whole pipeline. At step 7 it immediately unwinds the execution stack and returns control to the point where test_yield_with_return was called, skipping the post-yield actions we wanted.
Code like this can easily leak resources: a file is never closed, or a database connection is never returned to the connection pool.
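The leak can be observed directly with IO#closed?. The sketch below rests on a few assumptions: it uses a temporary file instead of /etc/hosts so that it runs anywhere, it drops the puts calls, and it introduces a hypothetical leak_handle method that returns the file object from inside the block.

```ruby
require "tempfile"

# Same shape as with_file above, without the logging.
def with_file(name)
  f = File.open(name, "r")
  yield f
  f.close # skipped when the block exits with return
end

# Hypothetical method: returns the handle from inside the block.
def leak_handle(path)
  with_file(path) { |f| return f }
end

tmp = Tempfile.new("leak-demo")
tmp.write("127.0.0.1 localhost\n")
tmp.flush

leaked = leak_handle(tmp.path)
puts leaked.closed? # => false: the handle is still open
leaked.close        # clean up by hand
tmp.close!
```

The handle stays open until someone closes it explicitly or the garbage collector finalizes it, and nothing guarantees when the latter happens.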
Let’s put ensure after yield to make it work properly:
def ensured_with_file(name, &block)
  puts "Open file"
  f = File.open(name, "r")
  yield f
ensure
  puts "Close file"
  f.close if f # f is nil if File.open itself raised
end
def test_yield_and_return_again
  ensured_with_file("/etc/hosts") do |f|
    puts "Read content"
    return f.read
  end
end
Now everything looks fine:
Open file
Read content
Close file
What about exceptions?
You might say this example is contrived, because return is rarely used inside blocks. That may be true, but the same behavior occurs when something in the block raises an exception:
def test_yield_with_exception
  with_file("/etc/hosts") do |f|
    puts "Read content"
    1 / 0 # oops
  end
end
Result:
Open file
Read content
ZeroDivisionError: divided by 0
Again the file isn’t closed, and ensure fixes this issue as well:
def test_yield_with_exception_handling
  ensured_with_file("/etc/hosts") do |f|
    puts "Read content"
    1 / 0 # oops
  end
end
test_yield_with_exception_handling output:
Open file
Read content
Close file
ZeroDivisionError: divided by 0
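A sketch that checks both effects at once: the ensure clause closes the file, and the exception still reaches the caller. It assumes a temporary file in place of /etc/hosts so that it runs anywhere, and ensured_open is a hypothetical, logging-free variant of ensured_with_file.

```ruby
require "tempfile"

# Hypothetical variant of ensured_with_file without the logging.
def ensured_open(name)
  f = File.open(name, "r")
  yield f
ensure
  f.close if f # runs on normal exit, return, and exceptions alike
end

tmp = Tempfile.new("ensure-demo")
tmp.flush

handle = nil
begin
  ensured_open(tmp.path) do |f|
    handle = f
    1 / 0 # oops
  end
rescue ZeroDivisionError => e
  puts "rescued: #{e.message}" # the error still propagates to the caller
end

puts handle.closed? # => true
tmp.close!
```

Note that ensure does not swallow the exception. It only guarantees that the cleanup runs before the exception continues up the stack.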
Ensuring everything
But why didn’t Matz make ensure the default behavior for yield? Unfortunately, that could cause problems too. Consider the ActiveRecord::Base.create method.
Here is some typical code:
User.create do |u|
  u.firstname = "Chuck"
  u.lastname = "Norris"
  u.balance = 1 / 0 # oops
  u.email = "gmail@chucknorris.com"
end
If create ran its post-yield save step under ensure, it would persist a record for Chuck without the balance and email fields. So for cases like this, ensure-by-default isn’t suitable.
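To see the problem without a database, here is a plain-Ruby sketch. FakeUser, the store array, fill, and both create_* methods are hypothetical stand-ins, not ActiveRecord’s implementation; they only model the pattern "build the object, yield it, then save".

```ruby
# Hypothetical stand-in for a model class.
FakeUser = Struct.new(:firstname, :lastname, :balance, :email)

# "Saves" (appends to store) only if the block finished normally.
def create_plain(store)
  user = FakeUser.new
  yield user
  store << user
  user
end

# "Saves" under ensure, so a half-initialized object is stored too.
def create_with_ensure(store)
  user = FakeUser.new
  yield user
  user
ensure
  store << user
end

# Hypothetical initializer that fails halfway through.
def fill(u)
  u.firstname = "Chuck"
  u.lastname = "Norris"
  u.balance = 1 / 0 # oops
  u.email = "gmail@chucknorris.com"
end

plain_store = []
ensured_store = []

begin
  create_plain(plain_store) { |u| fill(u) }
rescue ZeroDivisionError
  # nothing was stored
end

begin
  create_with_ensure(ensured_store) { |u| fill(u) }
rescue ZeroDivisionError
  # the error still propagates, but the record is already stored
end

p plain_store.size          # => 0
p ensured_store.size        # => 1
p ensured_store.first.email # => nil
```

Here skipping the post-yield step on failure is exactly the desired behavior, which is the opposite of the file-closing case.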
Ensure in the wild
It’s interesting to see whether yield is handled properly in real-world Ruby code.
We took grep and ran it over Rails-related gems and the Ruby standard library. The result was surprisingly good: resource-sensitive code is consistently wrapped in ensure and frees its resources properly.
We found only two places with non-critical issues. The first is in activerecord/lib/active_record/connection_adapters/mysql_adapter.rb, in the exec_stmt method. Near the bottom it has the following snippet:
result = yield [cols, stmt]
stmt.result_metadata.free if cols
stmt.free_result
stmt.close if binds.empty?
result
It looks like the stmt.* calls should be placed under ensure protection.
The other one is in activesupport/lib/active_support/core_ext/file/atomic.rb, in the self.atomic_write method:
temp_file = Tempfile.new(basename(file_name), temp_dir)
temp_file.binmode
yield temp_file
temp_file.close
This one is not critical, because the GC eventually closes Tempfile objects properly. But for predictable behavior it’s better to put the close call under ensure.
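One way such a wrapper could look is sketched below. with_temp_file is a hypothetical helper written for illustration, not the ActiveSupport code, and it covers only the open, yield, and close steps shown in the snippet above.

```ruby
require "tempfile"

# Hypothetical helper: yields a binary-mode Tempfile and always closes it.
def with_temp_file(basename)
  temp_file = Tempfile.new(basename)
  temp_file.binmode
  yield temp_file
ensure
  temp_file.close if temp_file # runs even if the block raises or returns
end

seen = nil
begin
  with_temp_file("atomic-demo") do |t|
    seen = t
    t.write("partial data")
    raise IOError, "disk full (simulated)"
  end
rescue IOError => e
  puts "rescued: #{e.message}"
end

puts seen.closed? # => true
seen.unlink       # remove the temporary file explicitly
```

The file is closed at a known point instead of whenever the garbage collector gets to it.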
We also glanced through other popular gems such as redis, unicorn, and resque, but didn’t find anything suspicious. Still, you can check the gems most critical to your production setup yourself and validate their safety.
Summary
1. When using yield, decide which behavior you want: with or without ensure.
- How does your code behave in case of return?
- How does your code behave in case of exception?
2. When using third-party gems with block-based APIs, check whether the gem author handles resource cleanup properly.
3. It looks like the Ruby standard library, Rails, and most popular gems handle yield properly.