Log slow Rails SQL queries
The Rails framework comes with an instrumentation API that allows developers to hook into various internal events, be it the view rendering pipeline, request processing, or any custom application event. There are plenty of instrumentation frameworks and products on the market that work with Rails, but the majority of them still build on the core API, powered by ActiveSupport::Notifications.
In this blog post we will instrument slow SQL queries, the ones that take more than a certain amount of time to execute, and print them either to STDOUT or to a JSON-formatted log file, whichever suits your needs.
First, let's take a look at how ActiveSupport::Notifications subscriptions work. The easiest way to create a subscription is via a Rails initializer file. Here's the example from the documentation:
ActiveSupport::Notifications.subscribe('render') do |name, start, finish, id, payload|
  name    # => String, name of the event (such as 'render' from above)
  start   # => Time, when the instrumented block started execution
  finish  # => Time, when the instrumented block ended execution
  id      # => String, unique ID for the instrumenter that fired the event
  payload # => Hash, the payload
end
All subscribers follow the same block signature, which includes the event name, timestamps, and the payload. The payload is a hash holding the metric data defined by the event producer. In our case, we want to subscribe to all SQL queries executed via ActiveRecord::Base and log the ones that are too slow.
Create a file config/initializers/rails_log_slow_queries.rb:
ActiveSupport::Notifications.subscribe("sql.active_record") do |name, start, finish, id, payload|
  # Print out payload for testing purposes
  pp payload
end
When you restart the Rails server and visit a few pages, you might notice a few extra lines in the log that look like this:
{
  :sql=>"SELECT \"users\".* FROM \"users\" WHERE \"users\".\"id\" = $1 ORDER BY \"users\".\"id\" ASC LIMIT $2",
  :name=>"User Load",
  :binds=>[
    #<ActiveRecord::Relation::QueryAttribute:0x00007fdfda949d98 @name="id", @value_before_type_cast=1, @type=#<ActiveModel::Type::Integer:0x00007fdfeb20b7a0 @precision=nil, @scale=nil, @limit=nil, @range=-2147483648...2147483648>, @original_attribute=nil, @value=1, @value_for_database=1>, #<ActiveRecord::Attribute::WithCastValue:0x00007fdfda948fd8 @name="LIMIT", @value_before_type_cast=1, @type=#<ActiveModel::Type::Value:0x00007fdfeb127f50 @precision=nil, @scale=nil, @limit=nil>, @original_attribute=nil, @value=1>
  ],
  :type_casted_binds=>[1, 1],
  :statement_name=>nil,
  :connection_id=>70299906117300
}
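The start and finish arguments passed alongside this payload are what let us compute the query duration. A quick sketch of the math, using hypothetical Time values in place of the real arguments:

```ruby
# Hypothetical timestamps standing in for the start/finish args
# that a subscriber receives for each event.
start  = Time.at(100.0)   # instrumented block started
finish = Time.at(100.75)  # instrumented block ended

# Time subtraction yields seconds as a Float; convert to milliseconds.
duration_ms = ((finish - start) * 1000).round(4)
# duration_ms => 750.0
```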
The payload output shows every ActiveRecord SQL query flowing through the application, which is great. We have access to the raw query (as well as its binds), and we also get timing information via the start/finish arguments. Now, let's go a bit further and set up a logger class that logs only the slow queries:
# lib/slow_query_logger.rb
require "json"
require "digest"
require "logger"
class SlowQueryLogger
  def initialize(output = nil, opts = {})
    # Log any query slower than the threshold, in milliseconds (default: 500).
    # The value may arrive as nil or a string (e.g. from ENV), hence the
    # nil-guard before to_i; a plain fetch default would be skipped when the
    # key is present but nil, silently dropping the threshold to 0.
    @threshold = (opts[:threshold] || 500).to_i
    @logger = Logger.new(output || STDOUT)
    @logger.formatter = method(:formatter)
  end

  def call(name, start, finish, id, payload)
    # Skip transaction start/end statements
    return if payload[:sql].match?(/\A(BEGIN|COMMIT)\b/)

    duration = ((finish - start) * 1000).round(4)
    return unless duration >= @threshold

    data = {
      time: start.iso8601,
      pid: Process.pid,
      pname: $PROGRAM_NAME,
      # RequestContext is an app-specific helper; substitute your own way
      # of fetching the current request ID here.
      request_id: RequestContext.request_id,
      duration: duration,
      query: payload[:sql].strip.gsub(/(^([\s]+)?$\n)/, ""),
      kind: payload[:sql].match(/INSERT|UPDATE|DELETE/) ? "write" : "read",
      length: payload[:sql].size,
      cached: payload[:cache] ? true : false,
      hash: Digest::SHA1.hexdigest(payload[:sql])
    }.compact

    @logger.info(data)
  end

  private

  def formatter(severity, time, progname, data)
    JSON.dump(data) + "\n"
  end
end
The logger class above wraps Ruby's standard library Logger and conforms to the ActiveSupport::Notifications call signature. Each log message includes useful context: the process ID, program name, request ID (handy for correlating with HTTP logs), the compacted SQL query (for better readability), the kind of query (read/write), its size, and a query hash. The hash is useful because you can search your logs for the SHA1 value to see every occurrence of that query over time.
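To illustrate the hash field: identical query text always produces the same digest, so every occurrence of a query shares one searchable value.

```ruby
require "digest"

# Identical query text always yields the same SHA1 digest, so repeated
# occurrences can be grouped or searched by hash across log lines.
query = 'SELECT "users".* FROM "users" WHERE "users"."id" = $1 LIMIT $2'

first_hash  = Digest::SHA1.hexdigest(query)
second_hash = Digest::SHA1.hexdigest(query.dup)

first_hash == second_hash # => true
```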
Finally, to make use of the logger, we set up the subscription as follows:
# config/initializers/rails_log_slow_queries.rb
if ENV["RAILS_LOG_SLOW_QUERIES"] == "1"
  require "slow_query_logger"

  # Setup the logger
  output = Rails.root.join("log/slow_queries.log")
  logger = SlowQueryLogger.new(output, threshold: ENV["RAILS_LOG_SLOW_QUERIES_THRESHOLD"])

  # Subscribe to notifications
  ActiveSupport::Notifications.subscribe("sql.active_record", logger)
end
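Note that subscribe accepts either a block or any object that responds to call with the (name, start, finish, id, payload) signature, which is why passing the SlowQueryLogger instance works. A stand-in lambda, with hypothetical arguments, shows the same contract:

```ruby
# A minimal sketch of the subscriber contract: any callable taking
# (name, start, finish, id, payload) can be passed to subscribe.
# The lambda below is a stand-in, not part of the logger itself.
handler = lambda do |name, start, finish, id, payload|
  "#{name}: #{((finish - start) * 1000).round(1)} ms"
end

# Invoke it directly with hypothetical arguments to see the shape.
message = handler.call("sql.active_record", Time.at(0), Time.at(0.25), "abc123", sql: "SELECT 1")
# message => "sql.active_record: 250.0 ms"
```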
As a result, the log/slow_queries.log file will contain lines like this (formatting added for readability):
{
  "time": "2020-02-18T23:19:03-06:00",
  "pid": 15028,
  "pname": "puma: cluster worker 0: 15020 [railsapp]",
  "duration": 0.322,
  "query": "SELECT \"users\".* FROM \"users\" WHERE \"users\".\"id\" = $1 LIMIT $2",
  "kind": "read",
  "length": 63,
  "cached": false,
  "hash": "6e277ee23336967d98ad775ae6b24a3b58163696"
}
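Because each line is a standalone JSON object, downstream tooling can process the file line by line. A small sketch, using a sample line trimmed down from the output above:

```ruby
require "json"

# Each log line is a self-contained JSON object, so post-processing is
# a one-liner per line. This sample mirrors (a subset of) the fields above.
line = '{"time":"2020-02-18T23:19:03-06:00","duration":0.322,"kind":"read","hash":"6e277ee23336967d98ad775ae6b24a3b58163696"}'

entry = JSON.parse(line)
entry["kind"]     # => "read"
entry["duration"] # => 0.322
```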
That's it! The JSON log file can then be shipped to a log provider for further analysis and alerting. You could also set RAILS_LOG_SLOW_QUERIES_THRESHOLD to 0 to log ALL the queries, but that would be unnecessary in most cases.