Login

The "N+1 selects problem" is generally stated as a problem in Object-Relational mapping (ORM) discussions, and I understand that it has something to do with having to make a lot of database queries for something that seems simple in the object world.

Does anybody have a more detailed explanation of the problem?

Suppose you have COMPANY and EMPLOYEE. COMPANY has many EMPLOYEES (i.e. EMPLOYEE has a field COMPANY_ID).

In some O/R configurations, when you have a mapped Company object and go to access its Employee objects, the O/R tool will do one select for every employee, wheras if you were just doing things in straight SQL, you could `select * from employees where company_id = XX`. Thus N (# of employees) plus 1 (company)

This is how the initial versions of EJB Entity Beans worked. I believe things like Hibernate have done away with this, but I'm not too sure. Most tools usually include info as to their strategy for mapping.

SELECT
table1.*
, table2.*
INNER JOIN table2 ON table2.SomeFkId = table1.SomeId

That gets you a result set where child rows in table2 cause duplication by returning the table1 results for each child row in table2. O/R mappers should differentiate table1 instances based on a unique key field, then use all the table2 columns to populate child instances.

SELECT table1.*

SELECT table2.* WHERE SomeFkId = #

The N+1 is where the first query populates the primary object and the second query populates all the child objects for each of the unique primary objects returned.

Consider:

class House
{
int Id { get; set; }
string Address { get; set; }
Person[] Inhabitants { get; set; }
}

class Person
{
string Name { get; set; }
int HouseId { get; set; }
}

and tables with a similar structure. A single query for the address "22 Valley St" may return:

Id Address Name HouseId
1 22 Valley St Dave 1
1 22 Valley St John 1
1 22 Valley St Mike 1

The O/RM should fill an instance of Home with ID=1, Address="22 Valley St" and then populate the Inhabitants array with People instances for Dave, John, and Mike with just one query.

A N+1 query for the same address used above would result in:

Id Address
1 22 Valley St

with a separate query like

SELECT * FROM Person WHERE HouseId = 1

and resulting in a separate data set like

Name HouseId
Dave 1
John 1
Mike 1

and the final result being the same as above with the single query.

The advantages to single select is that you get all the data up front which may be what you ultimately desire. The advantages to N+1 is query complexity is reduced and you can use lazy loading where the child result sets are only loaded upon first request.

The supplied link has a very simply example of the n + 1 problem. If you apply it to Hibernate it's basically talking about the same thing. When you query for an object, the entity is loaded but any associations (unless configured otherwise) will be lazy loaded. Hence one query for the root objects and another query to load the associations for each of these. 100 objects returned means one initial query and then 100 additional queries to get the association for each, n + 1.

[