Tuesday
12Jan2010

CouchDB makes a great data cache (with help from Ruby)

I was working on pulling in some data from this nasty web service (slow, complex, unreliable) where the data is not structured the way I want it to be (oh yeah, and there’s that too…). To get all the data I need, there are about seven queries I have to make. Not wanting my main app to deal with all these queries and store the data in its relational (and rigid) structure, I thought this seemed like a good use case for CouchDB. The semi-structured data format in CouchDB allows me to layer data on top of existing documents as needed, with each query adding more data to the existing data and in the end giving me the full picture of the objects I am after. “Progressive data enhancement” if you will. Once I’ve got this full picture of the data I can sit my CouchDB database between my main app and this ugly, awful web service. My main app will hit CouchDB’s nice, clean, RESTful interface for the data I want. A small Ruby script (on a cron job perhaps) deals with all the aforementioned nastiness, grabs data from the web service and loads it into CouchDB. This setup encapsulates the idiosyncrasies of the outside world in one place and allows my main app to operate in a reality distortion field of my own design. Sweet!

Ultimately, this data will end up (at least part of it) in a relational DB, but setting up the needed tables, columns, etc. requires a lot of upfront work to build out all that good relational-ness. At this point, it’s pretty exploratory. As I add more web service calls I’d like to immediately see that data and use it in my app, not have to write a migration that I may or may not have to roll back if the additional data element is wrong or not needed. I’m trying to keep it agile over here!

So, I have my code query the remote web service and I get back a hash for each object I want to put in my CouchDB database (conveniently, I designed this code to turn ugly SOAP mapping objects into plain old arrays of hashes). The initial web service query returned over 12,000 objects, so iterating over each one to stuff it into CouchDB seemed silly. Luckily CouchDB can handle bulk document creation (using the CouchRest gem this is exposed as ‘bulk_save’). The first time I did this it worked fine and returned all the new IDs it created. After some thought about how to get the rest of my data in there, I decided to use custom document IDs instead of the autogenerated UUIDs Couch assigns. Using the default ID would mean I’d have to hit Couch to look up the ID for each object the web service returns, just to get the document. That extra lookup would be fine because Couch is very fast, but there’s no need for it.

CouchDB bulk create allows you to specify your own unique ID to use instead of the auto-generated one. In my situation this would allow me to just blindly post to Couch with my own ID in the URL and have the new data from my other queries go directly to the document. That gives me the progressive data enhancement I was after, without the redundant lookup to get the right document first.

What I needed to do was wipe out my database that held the CouchDB-created documents with the Couch IDs, then modify the hashes I get back from the web service to have “_id” => “myuniqueid” in there, so that when I did my bulk_save CouchDB would not create IDs for me but just use the ones given to it.

Here’s (finally) where Ruby comes in.

Let’s say that here is the array of hashes I get back from the webservice:

vals =[{:a => "1"},{:a => "2"},{:a => "3"}]

(where :a is the key for the web service unique id value)

I need to modify each hash in this array to add “_id” => (the value of :a), so we want the first element to look like:

{"_id" => "1",:a => "1"}

for example…

Ruby #map to the rescue!

vals.map{|v| v["_id"]=v[:a]}

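The block mutates each hash in place (the return value of #map is just the array of IDs, so #each would work equally well). Afterwards vals looks like this: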

vals
=> [{"_id"=>"1", :a=>"1"}, {"_id"=>"2", :a=>"2"}, {"_id"=>"3", :a=>"3"}]
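If you would rather leave the hashes from the web service untouched, a non-mutating variant is possible. This is only a minimal sketch and not what my script does; the name `docs` is introduced here for illustration.

```ruby
vals = [{ :a => "1" }, { :a => "2" }, { :a => "3" }]

# Hash#merge returns a new hash, so the hashes inside vals are not modified.
docs = vals.map { |v| v.merge("_id" => v[:a]) }

p docs
```

Here #map earns its keep, because its return value (the array of new hashes) is what gets passed along to the bulk save.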

With our hashes formatted this way, bulk_save uses the ID given by the web service as the CouchDB document ID and creates all the documents we need. We can then access them (using this sample data) like:

curl -X GET 'http://127.0.0.1:5984/test_db/1'

Should give you:

{"_id":"1","_rev":"1-10085f96b70ddbb6155710a391194304","a":"1"}

(your rev id will be different of course)
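For the curious, here is roughly what a bulk save boils down to at the HTTP level, sketched with only Ruby's standard library. CouchDB's _bulk_docs endpoint takes a JSON body of the form {"docs": [...]}. The helper names `bulk_docs_payload` and `post_bulk_docs` and the database URL in the comment are assumptions made for this example; this is not CouchRest's implementation.

```ruby
require "json"
require "net/http"
require "uri"

# Wrap an array of hashes in the {"docs": [...]} envelope that
# CouchDB's _bulk_docs endpoint expects.
def bulk_docs_payload(docs)
  JSON.generate("docs" => docs)
end

# POST the payload to <db_url>/_bulk_docs.
# db_url is an assumption, e.g. "http://127.0.0.1:5984/test_db".
def post_bulk_docs(db_url, docs)
  uri = URI.parse("#{db_url}/_bulk_docs")
  request = Net::HTTP::Post.new(uri.request_uri, "Content-Type" => "application/json")
  request.body = bulk_docs_payload(docs)
  Net::HTTP.start(uri.host, uri.port) { |http| http.request(request) }
end

docs = [{ "_id" => "1", :a => "1" }, { "_id" => "2", :a => "2" }]
puts bulk_docs_payload(docs)
```

Only the payload is printed here, so the sketch runs without a CouchDB server. One thing to keep in mind when layering more data on later: CouchDB expects the current _rev when you update a document that already exists.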

That’s all there is to it!

Tuesday
08Dec2009

PGError: ERROR: permission denied: "RI_ConstraintTrigger_xxxxxxx" is a system trigger

This was a fun one.

Today I set out to get one of my apps set up on the CI server we use at work when I ran into this problem. It occurred only for the few remaining specs in the test suite that still use fixtures. The error looked something like this:

ActiveRecord::StatementInvalid in 'RegistryController should show the current registry for the current user - no params sent'
PGError: ERROR:  permission denied: "RI_ConstraintTrigger_2681229" is a system trigger: ALTER TABLE "schema_migrations" ENABLE TRIGGER ALL;ALTER TABLE "users" ENABLE TRIGGER ALL;ALTER
......
ALL;ALTER TABLE "master_answers" ENABLE TRIGGER ALL;ALTER TABLE "questions" ENABLE TRIGGER ALL;ALTER TABLE "answers" ENABLE TRIGGER ALL;ALTER TABLE "people" ENABLE TRIGGER ALL

Turns out this is a permission issue with how Postgres handles triggers. The triggers that enforce foreign key constraints are system triggers, and only a superuser can disable or enable them. If you’ve set up your DB permissions properly, the test user on your DB should NOT be a superuser.

This author correctly identifies the problem but suggests the solution is to change the permissions on your test user:

Here is a better approach to addressing the actual problem (and thorough explanation):

I deviated from this approach slightly in that I only added the ‘require’ statement to pull in the hack in my CI environment (not for all the environments as the author suggests). My thinking is that this change only needs to solve a problem in the CI env, so that is where it should live. This could bite me in the butt because it introduces an inconsistency in the adapter behavior across envs. To be honest, I’m not sure which is worse. We’ll see how it plays out.
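Loading the patch in just one environment can be as simple as a guarded require. The sketch below is illustrative only: the environment name "ci", the file path and the helper `load_trigger_patch?` are all hypothetical, and the patch itself is not shown.

```ruby
# Hypothetical name of the environment the CI server runs under.
CI_ENV = "ci"

# Hypothetical path to the file that holds the adapter patch.
PATCH_PATH = "lib/postgres_trigger_patch"

# True only when the given environment name is the CI environment.
def load_trigger_patch?(env)
  env.to_s == CI_ENV
end

# In a Rails app a line like this might live in an initializer
# or in the CI environment's config file.
require PATCH_PATH if load_trigger_patch?(ENV["RAILS_ENV"])
```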

Wednesday
07Oct2009

Trouble with the Postgres Ruby gem on OSX 10.6

I recently (i.e. earlier today) decided to stop cheating and using the Postgres pure Ruby (postgres-pr) adapter and switch to the real-deal compiled version (formerly known as ‘postgres’ now known as just ‘pg’). I had been putting this off because all prior attempts to sudo gem install postgres had failed with esoteric messages which I will likely never understand.

It also happens that I just upgraded to OS X 10.6 (AKA Snow Leopard). What better time to try again?

Step 1 - Out with the old

gem uninstall postgres-pr

Select gem to uninstall:
1. postgres-pr-0.4.0
2. postgres-pr-0.5.1
3. postgres-pr-0.6.1
4. postgres-pr-0.6.1
5. All versions

I chose option 5

Successfully uninstalled postgres-pr-0.4.0
Successfully uninstalled postgres-pr-0.5.1
Successfully uninstalled postgres-pr-0.6.1
Successfully uninstalled postgres-pr-0.6.1

I don’t know why there were two postgres-pr-0.6.1 entries; perhaps one had been installed in my local .gem directory and the other in the system gem dir.

Step 2 - Installing the newness

sudo gem install pg

Wait for it…

FAIL!

ERROR: Error installing pg:
ERROR: Failed to build gem native extension.

/System/Library/Frameworks/Ruby.framework/Versions/1.8/usr/bin/ruby extconf.rb
extconf.rb:1: command not found: pg_config --version
ERROR: can't find pg_config.
HINT: Make sure pg_config is in your PATH

Crap, that sucks. But there is a glimmer of hope. Note the “HINT”…hmm… thanks fellas!

Step 3? - Where is that pg_config?!

Through the magic of the command line we can search for it. Drop this on the command line.

mdfind pg_config|grep bin|uniq

Your results will probably be different but mine are:

/Library/PostgreSQL/8.3/bin/pg_config
/usr/local (from old Mac)/bin/pg_config
/usr/local (from old Mac)/pgsql/bin/pg_config
/usr/local (from old Mac)/src/postgresql-8.2.5/src/bin/pg_config
/usr/local (from old Mac)/src/postgresql-8.2.5/src/bin/pg_config/pg_config
/usr/local (from old Mac)/src/postgresql-8.2.5/src/bin/pg_config/pg_config.c
/usr/local (from old Mac)/src/postgresql-8.2.5/src/bin/pg_config/pg_config.o

Pay attention to the first line; that’s where the elusive ‘pg_config’ is hiding. (For the record, I have no idea why this file is needed or what it does. Perhaps some sort of pg config?) See the comments below for an explanation by my friend Rhett.
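Since the hint is all about PATH, here is a small Ruby sketch for checking whether a given PATH would actually turn up a pg_config executable. The helper name `find_in_path` is made up for this example, and it only approximates the kind of lookup a shell does.

```ruby
# Return the full path of the first executable called `name` found in the
# PATH-style string `path`, or nil if none of the directories has one.
def find_in_path(name, path = ENV["PATH"].to_s)
  path.split(File::PATH_SEPARATOR).each do |dir|
    candidate = File.join(dir, name)
    return candidate if File.file?(candidate) && File.executable?(candidate)
  end
  nil
end

puts find_in_path("pg_config") || "pg_config is not on PATH"
```

Passing a second argument such as "/Library/PostgreSQL/8.3/bin" checks that one directory instead of the current PATH.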

Note to reader: At this point I got distracted and tried to delete that “local (from old Mac)” folder and almost ran rm -rf (recursive, force) on my /usr/local directory. (By almost… I mean I did run it, but thankfully sudo is required for this operation. For all my complaining about having to use sudo, it totally saved my ass. I will henceforth never complain about having to type it in all the time.)

Step… to hell with the steps because we have derailed.

Time to try the install again. Drop the path on the command as part of the install like so.

PATH=/Library/PostgreSQL/8.3/bin:$PATH sudo gem install pg

This yields a new, somewhat more confusing and more verbose error. The meat of it is:
In file included from compat.c:16:
compat.h:38:2: error: #error PostgreSQL client version too old, requires 7.3 or later.

I have even less of an idea what to do with this… thankfully, after some googling it turns out this is a common issue compiling the pg gem on Snow Leopard and the fix is easy.

So, combining the first fix with this new one we get:
PATH=/Library/PostgreSQL/8.3/bin:$PATH sudo env ARCHFLAGS='-arch i386' gem install pg

Running this we get:

Building native extensions. This could take a while...
Successfully installed pg-0.8.0
1 gem installed

Sweet!!! No more postgres-pr. We are now on the very fast pg gem for Postgres on Ruby.

Friday
08May2009

Vimperator 2.0 crashing Firefox 3.0+

I really like using the Vimperator plugin for Firefox. It’s helped me learn the Vim commands and obviated the mouse for browsing web pages (except for websites heavy on JavaScript or Flash).

Unfortunately, the latest version of the Vimperator plugin (version 2.0) causes my install of Firefox to crash about 80% of the time. Since Firefox is my primary browser, having to force quit it all the time became a serious annoyance.

After doing some digging on the web, it seemed that this problem was not that common (although a co-worker of mine was having the same problem, just less frequently). There was some hint in an email post that it might have something to do with an old bookmark service bug. I’m not really sure what that means, or if that is the actual problem. However, there is a solution that seems to be holding up well, at least for me anyway.

To fix this problem here’s what you need to do:

1) Create a .vimperatorrc file in your home directory (/Users/brian …for example)

2) In this file add the following line

:set nopreload

3) Save the file and restart Firefox.

Note: I’m not entirely sure what the ‘nopreload’ command does. I found out about this setting from this mailing list email.

Wednesday
08Apr2009

Rails environment sanity

Media Temple’s hosting support has a neat little Perl script to check the gems on your system. http://kb.mediatemple.net/questions/784/(gs)+How+do+I+check+my+Ruby+gems+for+proper+versions%3F