Pod4 is not an ORM. No, really it isn't. Because that would be ridiculous.
...okay, it kind of acts a bit like an ORM...
Ah, well. Judge for yourself. Here's a model:
class CustomerModel < Pod4::Model
  class CustomerInterface < Pod4::PgInterface
    set_table  :customer
    set_id_fld :id
  end
  set_interface CustomerInterface.new(CONN_STRING)
  attr_columns :cust_code, :name, :group
end
And here's a method that uses the model:
def change_customer_group(id, group)
  # CustomerModel is the model class defined above
  customer = CustomerModel.new(id).read.or_die
  customer.group = group
  customer.update.or_die
end
Pod4 is a very simple set of classes that sits on top of some other library which gives access to data -- for example, pg, tds, or Sequel (which is an ORM...). It's relatively easy to get it to talk to a new sort of data access library, and you're not limited to databases.
It provides a simple, common framework to talk to all these data sources, using model classes which (to my mind at least) are clean, easy to understand and maintain, using a bare minimum of DSL and vanilla Ruby inheritance.
This is the central motivation behind the project -- to provide a mechanism that allows for model classes which actually represent your data to the rest of your code in a way that you are fully in control of. Because it's your model classes, not the database, which are the canonical representation of the data.
I don't want the people who maintain my code to have to know the differences between ActiveRecord's update and update_all, or Sequel's dataset[] and dataset.where(). Pod4::Model has a dozen or so methods you need to worry about, and six of those are pretty much self-explanatory. Or, you can inherit from Pod4::BasicModel instead, and do without even that.
I honestly don't think of it as an Object-Relational Mapper. I think of it as a Way To Have Nice Models.
If you are looking for something with all the features of, say, ActiveRecord, then this isn't for you. I provide basic access to and maintenance of records, with validation. For anything more, you need to be willing to use a very well established existing DSL within your model code -- SQL.
This code was developed, by me, during working hours at James Hall & Co. Ltd. I'm incredibly grateful that they have permitted me to open-source it.
gem install pod4
Of course you will also need to install whatever other gems you need in order to access the data you want Pod4 to see. Currently there are interfaces for:
- Sequel (which itself of course talks to all manner of databases)
- Tiny_tds
- Pg
- Nebulous (my own STOMP wrapper/protocol/what-have-you)
(But, you can add your own interfaces. It's not hard.)
Pod4 uses my Octothorpe gem to pass information around. An Octothorpe is basically a Hash, except the keys are always symbols, and it's read only.
But you don't really need to know that here. If you mentally substitute "Hash" every time I say "Octothorpe", you'll be fine.
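If the "Hash with symbol keys, read only" description feels abstract, here is a rough plain-Ruby approximation of the idea. It deliberately does not use the Octothorpe gem; readonly_symbol_hash is a hypothetical helper name, and the real class has its own API.

```ruby
# Plain-Ruby approximation of the idea described above; NOT the Octothorpe gem.
# readonly_symbol_hash is a hypothetical helper used only for illustration.
def readonly_symbol_hash(hash)
  hash.transform_keys(&:to_sym).freeze
end

record = readonly_symbol_hash("name" => "Acme", :group => "trade")

record[:name]    # the string key "name" is now reachable as the symbol :name
record.frozen?   # true; attempts to modify the object raise FrozenError

# "Changing" a value means building a new object from a merge.
updated = readonly_symbol_hash(record.merge(group: "retail"))
```

The original object is untouched by the merge, which is the property that makes it safe to pass around.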
Note well that we distinguish between 'models' and 'interfaces':
The model represents the data to your application, in the format that makes most sense to your application: that might be the same format that it is stored in on the database, or it might not. The model doesn't care about where the data comes from. Models are all subclasses of Pod4::Model (or Pod4::BasicModel, but we'll leave that alone for now).
An interface encapsulates the connection to whatever is providing the data. It might be a wrapper for calls to the Sequel ORM, for example. Or it could be making a series of calls to a set of Nebulous verbs. It only cares about dealing with the data source, and it is only called by the model.
An interface is a separate class, which is defined for each model. There are parent classes for a number of the sources you will need, but failing that, you can always create one from the ultimate parent, Pod4::Interface.
# find record 14; raise error otherwise. Update and save.
x = ExampleModel.new(14).read.or_die
x.two = "new value"
x.update
# create a new record from the params hash
y = ExampleModel.new
y.set(params)
y.create
A model is a class, each instance of which represents a single record. On that instance you can call the following for basic operation:
- create -- tells the data source to store this new "record"
- read -- obtains the "record" from the data source
- update -- updates the "record" on the data source
- delete -- deletes the "record" on the data source
- set -- set the column attributes of the object with a hash or Octothorpe
- to_ot -- output an Octothorpe of the object's column attributes
- alerts -- return an array of Alerts (which I'll explain later)
(Note that we say "record" not record. The data source might not be a database. Your model instance might be represented on the data source as several records, or something else entirely.)
There is one more operation -- list. Call this on the model class itself, and it will return an array of model objects that match the criteria you pass. What you can pass to list depends on your model class (of course); by default it also depends on the interface the model uses. But normally it should accept a hash, like so:
ExampleModel.list(:one => "a") #-> Array of ExampleModel where one = "a"
Additionally, you can chain or_die onto any model method to get it to raise exceptions if something is wrong on the model. If you don't want exceptions, you can check the model's model_status attribute, or just look at the alerts.
Those eight (nine) methods are all the methods given by Pod4::Model that you are normally going to want to use, outside of the code actually inside your model.
Here is the model and interface definition that goes with the above example:
require 'pod4'
require 'pod4/pg_interface'
require 'pg'
class ExampleModel < Pod4::Model
  class ExampleInterface < Pod4::PgInterface
    set_table  :example
    set_id_fld :id, autoincrement: true
  end
  set_interface ExampleInterface.new($pg_conn)
  attr_columns :one, :two, :three
end
In this example we have a model that relies on the Pg gem to talk to a table 'example'. The table has a primary key field 'id' and columns which correspond to our three attributes one, two and three. There is no validation or error control.
Note that we have to require pg_interface and pg separately. I won't bother to show this in any more model examples.
Let's start with the interface definition. Remember, the interface class is only there to represent the data source to the model. Yours will most likely be no more complex than the one above. Since they are only accessed by the model, my preference is to define them in an internal class, but if that makes you back away slowly waving your hands placatingly, put it in another file. Pod4 is fine with that.
Inside your interface class you must call some DSLish methods to tell the interface how to talk to the data. What they are depends on the interface, but the ones for PgInterface are pretty common:
- set_schema -- optional -- the name of the schema to find the table in
- set_table -- mandatory -- the name of the database table to use
- set_id_fld -- mandatory -- the name of the column that makes the record unique
Actually, every interface defines set_id_fld. Instances of a model must be represented by a single ID field that provides a unique identifier. Pod4 does not care what it's called or what data type it is -- if you say that's what makes it unique, that's good enough. Additionally you can specify whether your key autoincrements or not. If you don't say, we assume that it does.
Internally, Interfaces talk the same basic language of list / create / read / update / delete that models do. Sometimes you might want to add a special interface method for a specific database operation, but otherwise that's probably going to be it for your Interface definition.
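To give a feel for that list / create / read / update / delete vocabulary, here is a toy in-memory store with the same five operations. It is only a sketch: MemoryInterfaceSketch is a hypothetical class that does not inherit from Pod4::Interface, and the method signatures and return values are assumptions made for illustration rather than the library's actual contract.

```ruby
# A stand-in for the *shape* of an interface. It does NOT inherit from
# Pod4::Interface; signatures and return values here are assumptions.
class MemoryInterfaceSketch
  def initialize
    @rows    = {}   # id => record hash
    @next_id = 1
  end

  # Return every record matching all key/value pairs in the selection hash.
  def list(selection = {})
    @rows.values.select { |row| selection.all? { |k, v| row[k] == v } }
  end

  # Store a new record and return the id it was given (autoincrement style).
  def create(record)
    id = @next_id
    @next_id += 1
    @rows[id] = record.merge(id: id)
    id
  end

  # The remaining operations take the record id as their key.
  def read(id)
    @rows.fetch(id, {})
  end

  def update(id, record)
    @rows[id] = read(id).merge(record).merge(id: id)
  end

  def delete(id)
    @rows.delete(id)
  end
end

store   = MemoryInterfaceSketch.new
id      = store.create(one: "a", two: "b")
store.update(id, two: "c")
matches = store.list(one: "a")
```

A real interface would subclass Pod4::Interface (or one of its children) and talk to an actual data source; the point here is only how small the vocabulary is.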
Models have two of their own DSLish methods:
- set_interface -- here is where you instantiate your Interface class
- attr_columns -- like attr_accessor, but letting the model know to care.
You can see that interfaces are instantiated when the model is required. Exactly what you need to pass to the interface to instantiate it depends on the interface. SequelInterface wants the Sequel DB object; the other interfaces only want connection hashes.
(If you are finding it annoying to have to, for example, figure out your database connection hash or get a Sequel DB object before requiring your models, then see "Connections", below.)
Any attributes you define using attr_columns are treated specially by Pod4::Model. You get all the effect of the standard Ruby attr_accessor call, but in addition, the attribute will be passed to and from the interface, and to and from your external code, by the standard model methods.
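One way to picture "attr_accessor, but the model knows about it" is the plain-Ruby sketch below. This is not Pod4's implementation: SketchModel, its columns list and its to_h method are hypothetical names, used only to show how a class-level DSL method can both define accessors and remember which attributes to copy in and out.

```ruby
# Sketch only: a tiny base class imitating the idea behind attr_columns.
# NOT Pod4's implementation; SketchModel and its internals are hypothetical.
class SketchModel
  class << self
    # Remember the column names, then define ordinary accessors for them.
    def attr_columns(*cols)
      columns.concat(cols)
      attr_accessor(*cols)
    end

    def columns
      @columns ||= []
    end
  end

  # Copy only the known columns from a hash onto the instance.
  def set(hash)
    self.class.columns.each do |col|
      public_send("#{col}=", hash[col]) if hash.key?(col)
    end
    self
  end

  # Gather the known columns back into a hash.
  def to_h
    self.class.columns.to_h { |col| [col, public_send(col)] }
  end
end

class Thing < SketchModel
  attr_columns :one, :two
end

thing = Thing.new.set(one: 1, two: 2, ignored: 3)
thing.two = 20
```

Keys that were never declared (here, ignored) are simply dropped, because the class only moves the attributes it has been told about.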
In addition to the methods above, we have:
- validate -- override this to provide validation
- map_to_model -- controls how the interface sets attributes on the model
- map_to_interface -- controls how the model sends data to the interface
- add_alert -- adds an alert to the model
A model also has some built-in attributes of its own:
- model_id -- this is the value of the ID column you set in the interface.
- model_status -- one of :error :warning :okay :deleted :unknown
We'll deal with all these below.
Built into the model is an array of alerts (Pod4::Alert) which are messages that have been raised against the instance of the model class. Each alert can have a type of :error, :warning, :info or :success. If any alert has a type of :error or :warning, then that is reflected in the model's model_status attribute. A model that has passed validation with no :error or :warning alerts is status :okay.
(There are two other possible statuses -- models are :unknown when validation has yet to be run, and :deleted after a call to delete.)
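As a rough mental model of how alert types relate to status, consider the small function below. It is a sketch under assumptions: status_from_alert_types is a hypothetical helper, and the rule that :error outranks :warning when both are present is my reading rather than something stated above.

```ruby
# Sketch of how a status could be derived from a list of alert types.
# Assumption: :error outranks :warning. The helper name is hypothetical.
def status_from_alert_types(types)
  return :error   if types.include?(:error)
  return :warning if types.include?(:warning)
  :okay
end

status_from_alert_types([:info, :success])   # => :okay
status_from_alert_types([:info, :warning])   # => :warning
status_from_alert_types([:warning, :error])  # => :error
```

The :unknown and :deleted statuses are not covered here, since they depend on what has been called on the model rather than on its alerts.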
You can raise alerts yourself, and you normally do so by overriding validate. This method is called after a read as well as when you write to the database; so that a model object should always have a model_status reflecting its "correctness" regardless of whether it came from the data source or your application.
Here's a model with some validation:
class Customer < Pod4::Model
  class CustomerInterface < Pod4::PgInterface
    set_schema :pod4example
    set_table  :customer
    set_id_fld :id, autoincrement: true
  end
  set_interface CustomerInterface.new($pg_conn)
  attr_columns :cust_code, :name, :group
  def validate(mode)
    super
    return if mode == :delete
    # the name must be present and not just whitespace
    add_alert(:error, :name, "Name cannot be empty") \
      if @name.nil? || @name =~ /\A\s*\z/
    # the customer code must be exactly six characters
    add_alert(:error, :cust_code, "invalid customer code") \
      unless @cust_code && @cust_code.length == 6
  end
end
(Note: as a general principle, you should always call super when overriding a method in a Pod4 model, unless you have good reason not to.)
Validation is run on create, read, update and delete. If the model has a status of :error, then the validation will fail. (Probably you do not want this on delete; test the parameter passed to validate as in the example above.)
In passing I should note that validation is not run on list: every record that list returns should be complete, but the model_status will be :unknown because validation has not been run. (This is partly for the sake of speed.)
You should be aware that validation is not called on set, either. Because of that, it's entirely possible to set a model to an invalid state and not raise any alerts against it until you go to commit to the database. If you want to change the state of the model and then validate it before that, you must call validate yourself.
Pod4 will do the basic work for you when it comes to data types. Integers, decimals, dates and datetimes should all end up as the right type in the model. (It depends on the Interface. You're going to get tired of me saying that, aren't you?) But maybe you want more than that.
Let's imagine you have a database table in PostgreSQL with a column called cost that uses the money type. (This is a terrible idea, by the way.) And you want it to be a BigDecimal in the model.
Well, Pod4 won't do that for you -- for all I know someone might have a problem with my requiring BigDecimal -- but it's not hard to do yourself.
require 'bigdecimal'
class Product < Pod4::Model
  class ProductInterface < Pod4::PgInterface
    set_schema :pod4example
    set_table  :product
    set_id_fld :product_id, autoincrement: true
  end
  set_interface ProductInterface.new($pg_conn)
  attr_columns :description, :cost
  # data coming from the interface: turn cost into a BigDecimal
  def map_to_model(ot)
    super
    @cost = BigDecimal(@cost)
  end
  # data going to the interface: hand cost back as a float
  def map_to_interface
    super.merge(cost: @cost.to_f)
  end
end
map_to_model gets called when the model wants to write data from the interface on the model; it takes an Octothorpe from the interface as a parameter. By default it behaves as set does.
map_to_interface is the opposite: it gets called when the model wants to write data on the interface from the model. It returns an Octothorpe to the interface. By default it behaves as to_ot does. (Since OTs are read only, you must modify it using merge.)
You might also want to ensure that your data types are honoured when your application updates a model object; in which case you will need to override set as well.
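Since the same conversion is wanted in map_to_model, map_to_interface and possibly set, it can help to keep it in a pair of small helpers. The sketch below is plain Ruby and makes assumptions: the helper names are made up, and the "$1,234.50" input format is a guess at what a money column might hand back, which will vary with your data source and locale.

```ruby
require "bigdecimal"

# Hypothetical helpers. The "$1,234.50" input format is an assumption about
# what the data source returns for a money column.
def money_to_decimal(value)
  return nil if value.nil?
  # Strip currency symbols and thousands separators before converting.
  BigDecimal(value.to_s.gsub(/[^0-9.\-]/, ""))
end

# Mirrors the to_f conversion used in the Product example above.
def decimal_to_money(value)
  value.nil? ? nil : value.to_f
end

cost = money_to_decimal("$1,234.50")
```

Passing nil through untouched keeps the helpers safe to call on a record whose cost has not been set yet.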
If this seems like a lot of work, take a look at the Pod4::TypeCasting mixin. It will handle it for you.
Pod4 does not provide relations. But, I'm not sure that it needs to. Look:
class BlogPost < Pod4::Model
  class BlogPostInterface < Pod4::PgInterface
    set_table  :blogpost
    set_id_fld :id, autoincrement: true
  end
  set_interface BlogPostInterface.new($conn)
  attr_columns :text
  # the ID column is held in @model_id; comments point at it via post_id
  def comments; Comment.list(post_id: @model_id); end
end
class Comment < Pod4::Model
  class CommentInterface < Pod4::PgInterface
    set_table  :comment
    set_id_fld :id
  end
  set_interface CommentInterface.new($conn)
  attr_columns :post_id, :text
  def blog_post; BlogPost.new(@post_id).read.or_die; end
end
So the BlogPost model has a comments method that returns an array of Comments, and the Comment model has a blog_post method that returns the BlogPost. (You would probably want to add validation to enforce relational integrity.)
Is this approach inefficient? Possibly. But if you don't like it, you can always try:
Sooner or later you will want to do something more than Pod4::Model will give you automatically. There is a perfectly well documented, very popular DSL with lots of examples to solve this problem. It's called SQL.
If your interface is connected to a SQL database, it should provide two more methods: execute and select.
class BlogPost < Pod4::Model
  class BlogPostInterface < Pod4::PgInterface
    set_table  :blogpost
    set_id_fld :id, autoincrement: true
  end
  set_interface BlogPostInterface.new($conn)
  attr_columns :text
  ##
  # return an array of hashes where each comment has the post joined to it
  #
  def post_and_comments
    interface.select( %Q|select *
                           from blogpost b
                           join comments c on(c.post_id = b.id);| )
  end
  ##
  # delete all comments on this post
  #
  def delete_comments
    interface.execute(
      %Q|delete from comments where post_id = #{@model_id};| )
  end
end
Neither execute nor select cares about the table or ID field you passed to the interface. They only run pure SQL. The only difference between them is that select expects to return an array of results.
To my way of thinking, there is absolutely nothing wrong about using SQL in a model. It will certainly need revisiting if you change database. But how often does that happen, really? And if it ever does, you are likely to need to revisit the affected models anyway...
There are a couple of weird wrinkles in all this which you may or may not have missed. Wrinkle one:
class MyModel < Pod4::Model
  class Interface < Pod4::PgInterface
    set_id_fld :id, autoincrement: true
    set_table  :my_table
  end
  set_interface Interface.new($pgconn) # <- wrinkle
  attr_columns :one, :two, :three
end
You instantiate the interface at the point where you define the model. I admit this is a little odd. It wasn't a specific design decision on my part; rather it was the cleanest way to get where I was going.
In practice this means you need to get your DB connection details from somewhere, maybe create your Sequel DB object; and only then can you require your models.
Leading on from this, wrinkle two: except when using Sequel (which behaves differently) each interface has its own connection to the database. This means that your application has (simplifying a bit here) one database connection for each model class. So if you have a Customer model and an Orders model, your application will hold two connections to the database. All customer enquiries share a single connection, and all order enquiries share a single connection.
I'm finding that, generally, this scales about right. But if you have a lot of different models and a database that runs out of connections easily, it might be problematic.
The solution to both of these wrinkles, if you need one, is to use a Pod4::Connection object:
#
# init.rb -- bootup for my project
#
# Require libraries
require "sequel"
require "pod4"
require "pod4/sequel_interface"
require "pod4/connection"
# require models
$conn = Pod4::Connection.new(interface: Pod4::SequelInterface)
require_relative "models/customer"
# set up database connection
hash = get_sequel_params
$conn.data_layer_options = Sequel.connect(hash)
#############
#
# models/customer.rb ( only a part shown)
#
class Foo < Pod4::Model
  class Interface < Pod4::SequelInterface
    set_table  :foo
    set_id_fld :id, autoincrement: true
  end
  set_interface Interface.new($conn)
  # ... rest of the model not shown ...
end
TL;DR: the only code that needs to go in the middle of your requires is the line defining a Connection object; you pass that to the interface instead. You can set up the parameters the interface needs and pass them to the connection object afterwards.
With TdsInterface and PgInterface you can pass the same connection to multiple models and they will share it. These interfaces take a Pod4::ConnectionPool instead, but otherwise the code looks exactly the same as the above example.
(Technical note: the ConnectionPool object will actually provide multiple connections, one per thread that uses it. This satisfies the requirement of the underlying libraries that connections are not shared between threads, and therefore ensures that Pod4 is more or less thread safe. You get this functionality automatically -- if you don't define a ConnectionPool, then the interface will create one internally. Sequel uses its own thread pool, of course, and we only use the one Sequel DB object for the whole of Pod4, so it doesn't need any of that. That's why it uses Pod4::Connection, instead.)
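The "one connection per thread" idea in the technical note can be pictured with a few lines of plain Ruby. This is an illustration only, not Pod4's ConnectionPool: PerThreadPoolSketch and its methods are hypothetical, and Object.new stands in for a real database connection.

```ruby
# Illustration only: hand each thread its own "connection".
# This is NOT Pod4::ConnectionPool; the class and methods are hypothetical.
class PerThreadPoolSketch
  def initialize(&factory)
    @factory = factory   # block that builds a new client
    @clients = {}        # Thread => client
    @lock    = Mutex.new
  end

  # Return the calling thread's client, creating it on first use.
  def client
    @lock.synchronize { @clients[Thread.current] ||= @factory.call }
  end

  def size
    @lock.synchronize { @clients.size }
  end
end

pool = PerThreadPoolSketch.new { Object.new }  # Object.new stands in for a DB connection

main_client  = pool.client
other_client = Thread.new { pool.client }.value
```

Repeated calls from the same thread get the same client back, while a second thread gets its own. A real pool would also need to release clients belonging to threads that have finished, which this sketch does not attempt.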
Sometimes your model needs to represent data in a way which is so radically different from the data source that the whole list, create, read, update, delete thing that Pod4::Model gives you is no use.
Pod4::BasicModel gives you:
- set_interface
- the model_id, model_status and alerts attributes
- add_alert
...and nothing else. But that's enough to make a model, your way, using the methods on the interface. These are the same CRUDL methods that Pod4::Model provides -- except that on the interface, the CRUD methods take a record id as a key.
A real world example: at James Hall my intranet system has a User model, where each attribute is a parameter that controls how the system behaves for that user -- email address, security settings, etc. Having one object to represent the user is the most sensible thing.
But I don't want to have to add a column to the database each time I change the intranet system and add a user parameter. The logical place to change the parameter is in the User model, not in the database, and certainly not both. So on the database, I have a settings table where the key runs: userid, setting name.
Here's a simplified version of my User model. This one is read only, but it's hopefully enough to get the idea:
class User < Pod4::BasicModel
  class UserInterface < ::Pod4::SequelInterface
    set_table  :settings
    set_id_fld :id, autoincrement: false
  end
  # Here we set what settings always exist for a user
  Setting = Struct.new(:setName, :default)
  DefaultSettings = [ Setting.new( :depot, nil ),
                      Setting.new( :store, nil ),
                      Setting.new( :menu,  nil ),
                      Setting.new( :roles, '' ),
                      Setting.new( :name,  '' ),
                      Setting.new( :nick,  '' ),
                      Setting.new( :mail,  nil ) ]
  set_interface UserInterface.new($db)
  attr_reader :userid, :depot, :store, :menu, :roles, :name, :nick, :mail
  class << self
    def keys; DefaultSettings.map{|x| x.setName }; end
    def list
      array = interface.select(%Q|select distinct userid from settings;|)
      array.map {|r| self.new( r[:userid] ).read }
    end
  end # of class << self
  def initialize(userid=nil)
    super(userid)
    self.class.keys.each do |key|
      instance_variable_set( "@#{key}".to_sym, nil )
    end
  end
  def read
    lst = interface.list(userid: @model_id)
    # pivot the name/value rows into a single hash
    data = lst.each_with_object({}) do |ot,h|
      h[ot.>>.setname] = ot.>>.setvalue
    end
    @userid = @model_id
    set_merge( Octothorpe.new(data) )
    validate
    # if validation raised nothing, the record is okay
    @model_status = :okay if @model_status == :unknown
    self
  end
  def to_ot
    hash = self.class.keys.each_with_object({}) do |k,m|
      m[k] = instance_variable_get("@#{k}".to_sym)
    end
    Octothorpe.new(hash)
  end
  def set_merge(hash)
    self.class.keys.each do |key|
      value = hash[key]
      instance_variable_set( "@#{key}".to_sym, value ) if value
    end
  end
end
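The heart of that read method is the pivot from one-row-per-setting to one-object-per-user. Stripped of Pod4 entirely, the step might look like the sketch below. The column names :setname and :setvalue follow the model above; the defaults, the sample rows and the helper name settings_from_rows are made up for illustration.

```ruby
# Sketch: turn name/value rows into one settings hash, filling in defaults.
# Column names follow the model above; everything else is hypothetical.
DEFAULTS = { depot: nil, roles: "", name: "" }.freeze

def settings_from_rows(rows, defaults = DEFAULTS)
  found = rows.each_with_object({}) do |row, h|
    h[row[:setname].to_sym] = row[:setvalue]
  end
  # Ignore rows for settings we do not know about.
  defaults.merge(found.slice(*defaults.keys))
end

rows = [
  { userid: "jo", setname: "name",  setvalue: "Jo Bloggs" },
  { userid: "jo", setname: "roles", setvalue: "admin" },
  { userid: "jo", setname: "shoe",  setvalue: "9" }
]
settings = settings_from_rows(rows)
```

Adding a new user parameter then means adding one entry to the defaults, with no change to the table structure, which is the point of the design described above.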
There are some mixins that you can use to extend the functionality of Pod4 models. Have a look at the comments at the top of the mixin in question if you want details.
- typecasting -- force columns to be a specific ruby type, validation helpers, some encoding stuff
- encrypting -- encrypt text columns
- tweaking -- adds DSL commands to support custom methods on the interface.
Some hopefully-not-too-unexpected behaviour:
- If you change attributes on a record, then call #delete or #read on it, we don't warn you that your changes are lost.
- If you attempt to update an autoincrement ID field, we don't write that change to the database and we don't warn you about that.
- As mentioned above, we only run validate on #create, #read, #update and #delete. So you can change the attributes of a record to something invalid without it immediately warning you. (You can always run #validate yourself, though.)
- Again, as mentioned above, the array of model instances returned by #list will all be status :unknown. This is because we have run neither #read nor #validate against them.
- I can't stop you writing to @model_id or @model_status in your model. I have no idea what might happen if you do, but I doubt that it would ever be a good idea. (In a non-autoincrement model, write to your ID field directly instead; @model_id will be updated when you call #create, #update or of course #read.)