Posts Tagged ‘sql’

Rails bug?

Friday, April 18th, 2008

I discovered what looks like a bug with has_many :finder_sql.  The docs claim that if you supply :finder_sql and not :counter_sql, it makes the counter by replacing “SELECT * FROM …” with “SELECT COUNT(*) FROM …”.

This does not appear to work if your :finder_sql has a subquery.  To make up a fake example:

class User
    has_many :comments_not_mine,
        :class_name => 'Comment',
        :finder_sql => 'SELECT * FROM comments WHERE user_id != "#{id}" AND EXISTS (SELECT * FROM users)'
end

(The EXISTS part is silly, and just there to show a subquery.)

Now I can say user.comments_not_mine and it works fine, but user.comments_not_mine.count gives an error about the SQL "SELECT COUNT(*) FROM users)". It appears to be using a greedy regexp by mistake. (user.comments_not_mine.to_a.size works fine. It’s just the #count method it adds is trying to be clever, and failing.)

For the life of me, I can’t find it, so: dear lazyweb, please find the offending regexp, so I can write a patch.

String searching in MySQL

Tuesday, March 25th, 2008

So today I ran across a line that looked like:

Merchant.find(:first, :conditions => [”LOWER(url) = ?”,  …])

Since I’m kind of on an optimization binge, I decided to make sure this was indexed.  No point in sucking down an entire table if we just want to look up one word.

We have an index on merchants.url, but if you know SQL, you already know the answer: no, it doesn’t use the index for this.  I ran EXPLAIN to verify that, and then looked up the general rule: apparently if you use an expression, it doesn’t use an index.

No big deal: just make an index on LOWER(url), right?  Sorry, no, SQL doesn’t allow indexes on functions of columns.  (Actually, it’s a trick question: SQL itself doesn’t have indexes at all.  PostgreSQL can create an index on an expression, but support for this is not at all consistent across databases.)

It turns out the answer is really simple: MySQL is case-insensitive by default.  Removing the LOWER() call returns the same result, but using the index, at least for the things I tried.

But what about non-ASCII characters?  We do go to the trouble to say “DEFAULT CHARSET=utf8″, because we don’t want to exclude, say, people with Japanese names.  Then it gets more complicated.  MySQL has “collators” which seem to be responsible for determining how to case-insensitively compare strings.  By default, it’s … “latin1_swedish_ci”.

So if you’re wondering why you’re typing in a Turkish name and it’s not quite working as you expect, that’s because we’re assuming you’re Swedish.  Nothing personal.