Relationship Loading Techniques¶

About this Document

This section presents an in-depth view of how to load related objects. Readers should be familiar with Relationship Configuration and basic use.

Most examples here assume the “User/Address” mapping setup similar to the one illustrated at setup for selects.

A big part of SQLAlchemy is providing a wide range of control over how related objects get loaded when querying. By “related objects” we refer to collections or scalar associations configured on a mapper using relationship(). This behavior can be configured at mapper construction time using the relationship.lazy parameter to the relationship() function, as well as by using ORM loader options with the Select construct.

The loading of relationships falls into three categories; lazy loading, eager loading, and no loading. Lazy loading refers to objects that are returned from a query without the related objects loaded at first. When the given collection or reference is first accessed on a particular object, an additional SELECT statement is emitted such that the requested collection is loaded.

Eager loading refers to objects returned from a query with the related collection or scalar reference already loaded up front. The ORM achieves this either by augmenting the SELECT statement it would normally emit with a JOIN to load in related rows simultaneously, or by emitting additional SELECT statements after the primary one to load collections or scalar references at once.

“No” loading refers to the disabling of loading on a given relationship, either that the attribute is empty and is just never loaded, or that it raises an error when it is accessed, in order to guard against unwanted lazy loads.

Summary of Relationship Loading Styles¶

The primary forms of relationship loading are:

lazy loading - available via lazy='select' or the lazyload() option, this is the form of loading that emits a SELECT statement at attribute access time to lazily load a related reference on a single object at a time. Lazy loading is the default loading style for all relationship() constructs that don’t otherwise indicate the relationship.lazy option. Lazy loading is detailed at Lazy Loading.
select IN loading - available via lazy='selectin' or the selectinload() option, this form of loading emits a second (or more) SELECT statement which assembles the primary key identifiers of the parent objects into an IN clause, so that all members of related collections / scalar references are loaded at once by primary key. Select IN loading is detailed at Select IN loading.
joined loading - available via lazy='joined' or the joinedload() option, this form of loading applies a JOIN to the given SELECT statement so that related rows are loaded in the same result set. Joined eager loading is detailed at Joined Eager Loading.
raise loading - available via lazy='raise', lazy='raise_on_sql', or the raiseload() option, this form of loading is triggered at the same time a lazy load would normally occur, except it raises an ORM exception in order to guard against the application making unwanted lazy loads. An introduction to raise loading is at Preventing unwanted lazy loads using raiseload.
subquery loading - available via lazy='subquery' or the subqueryload() option, this form of loading emits a second SELECT statement which re-states the original query embedded inside of a subquery, then JOINs that subquery to the related table to be loaded to load all members of related collections / scalar references at once. Subquery eager loading is detailed at Subquery Eager Loading.
write only loading - available via lazy='write_only', or by annotating the left side of the Relationship object using the WriteOnlyMapped annotation. This collection-only loader style produces an alternative attribute instrumentation that never implicitly loads records from the database, instead only allowing WriteOnlyCollection.add(), WriteOnlyCollection.add_all() and WriteOnlyCollection.remove() methods. Querying the collection is performed by invoking a SELECT statement which is constructed using the WriteOnlyCollection.select() method. Write only loading is discussed at Write Only Relationships.
dynamic loading - available via lazy='dynamic', or by annotating the left side of the Relationship object using the DynamicMapped annotation. This is a legacy collection-only loader style which produces a Query object when the collection is accessed, allowing custom SQL to be emitted against the collection’s contents. However, dynamic loaders will implicitly iterate the underlying collection in various circumstances which makes them less useful for managing truly large collections. Dynamic loaders are superseded by “write only” collections, which will prevent the underlying collection from being implicitly loaded under any circumstances. Dynamic loaders are discussed at Dynamic Relationship Loaders.

Configuring Loader Strategies at Mapping Time¶

The loader strategy for a particular relationship can be configured at mapping time to take place in all cases where an object of the mapped type is loaded, in the absence of any query-level options that modify it. This is configured using the relationship.lazy parameter to relationship(); common values for this parameter include select, selectin and joined.

The example below illustrates the relationship example at One To Many, configuring the Parent.children relationship to use Select IN loading when a SELECT statement for Parent objects is emitted:

from typing import List

from sqlalchemy import ForeignKey
from sqlalchemy.orm import DeclarativeBase
from sqlalchemy.orm import Mapped
from sqlalchemy.orm import mapped_column
from sqlalchemy.orm import relationship


class Base(DeclarativeBase):
    pass


class Parent(Base):
    __tablename__ = "parent"

    id: Mapped[int] = mapped_column(primary_key=True)
    children: Mapped[List["Child"]] = relationship(lazy="selectin")


class Child(Base):
    __tablename__ = "child"

    id: Mapped[int] = mapped_column(primary_key=True)
    parent_id: Mapped[int] = mapped_column(ForeignKey("parent.id"))

Above, whenever a collection of Parent objects are loaded, each Parent will also have its children collection populated, using the "selectin" loader strategy that emits a second query.

The default value of the relationship.lazy argument is "select", which indicates Lazy Loading.

Relationship Loading with Loader Options¶

The other, and possibly more common way to configure loading strategies is to set them up on a per-query basis against specific attributes using the Select.options() method. Very detailed control over relationship loading is available using loader options; the most common are joinedload(), selectinload() and lazyload(). The option accepts a class-bound attribute referring to the specific class/attribute that should be targeted:

from sqlalchemy import select
from sqlalchemy.orm import lazyload

# set children to load lazily
stmt = select(Parent).options(lazyload(Parent.children))

from sqlalchemy.orm import joinedload

# set children to load eagerly with a join
stmt = select(Parent).options(joinedload(Parent.children))

The loader options can also be “chained” using method chaining to specify how loading should occur further levels deep:

from sqlalchemy import select
from sqlalchemy.orm import joinedload

stmt = select(Parent).options(
    joinedload(Parent.children).subqueryload(Child.subelements)
)

Chained loader options can be applied against a “lazy” loaded collection. This means that when a collection or association is lazily loaded upon access, the specified option will then take effect:

from sqlalchemy import select
from sqlalchemy.orm import lazyload

stmt = select(Parent).options(lazyload(Parent.children).subqueryload(Child.subelements))

Above, the query will return Parent objects without the children collections loaded. When the children collection on a particular Parent object is first accessed, it will lazy load the related objects, but additionally apply eager loading to the subelements collection on each member of children.

Adding Criteria to loader options¶

The relationship attributes used to indicate loader options include the ability to add additional filtering criteria to the ON clause of the join that’s created, or to the WHERE criteria involved, depending on the loader strategy. This can be achieved using the PropComparator.and_() method which will pass through an option such that loaded results are limited to the given filter criteria:

from sqlalchemy import select
from sqlalchemy.orm import lazyload

stmt = select(A).options(lazyload(A.bs.and_(B.id > 5)))

When using limiting criteria, if a particular collection is already loaded it won’t be refreshed; to ensure the new criteria takes place, apply the Populate Existing execution option:

from sqlalchemy import select
from sqlalchemy.orm import lazyload

stmt = (
    select(A)
    .options(lazyload(A.bs.and_(B.id > 5)))
    .execution_options(populate_existing=True)
)

In order to add filtering criteria to all occurrences of an entity throughout a query, regardless of loader strategy or where it occurs in the loading process, see the with_loader_criteria() function.

Added in version 1.4.

Specifying Sub-Options with Load.options()¶

Using method chaining, the loader style of each link in the path is explicitly stated. To navigate along a path without changing the existing loader style of a particular attribute, the defaultload() method/function may be used:

from sqlalchemy import select
from sqlalchemy.orm import defaultload

stmt = select(A).options(defaultload(A.atob).joinedload(B.btoc))

A similar approach can be used to specify multiple sub-options at once, using the Load.options() method:

from sqlalchemy import select
from sqlalchemy.orm import defaultload
from sqlalchemy.orm import joinedload

stmt = select(A).options(
    defaultload(A.atob).options(joinedload(B.btoc), joinedload(B.btod))
)

See also

Using load_only() on related objects and collections - illustrates examples of combining relationship and column-oriented loader options.

Note

The loader options applied to an object’s lazy-loaded collections are “sticky” to specific object instances, meaning they will persist upon collections loaded by that specific object for as long as it exists in memory. For example, given the previous example:

stmt = select(Parent).options(lazyload(Parent.children).subqueryload(Child.subelements))

if the children collection on a particular Parent object loaded by the above query is expired (such as when a Session object’s transaction is committed or rolled back, or Session.expire_all() is used), when the Parent.children collection is next accessed in order to re-load it, the Child.subelements collection will again be loaded using subquery eager loading. This stays the case even if the above Parent object is accessed from a subsequent query that specifies a different set of options. To change the options on an existing object without expunging it and re-loading, they must be set explicitly in conjunction using the Populate Existing execution option:

# change the options on Parent objects that were already loaded
stmt = (
    select(Parent)
    .execution_options(populate_existing=True)
    .options(lazyload(Parent.children).lazyload(Child.subelements))
    .all()
)

If the objects loaded above are fully cleared from the Session, such as due to garbage collection or that Session.expunge_all() were used, the “sticky” options will also be gone and the newly created objects will make use of new options if loaded again.

A future SQLAlchemy release may add more alternatives to manipulating the loader options on already-loaded objects.

Lazy Loading¶

By default, all inter-object relationships are lazy loading. The scalar or collection attribute associated with a relationship() contains a trigger which fires the first time the attribute is accessed. This trigger typically issues a SQL call at the point of access in order to load the related object or objects:

>>> spongebob.addressesSELECT
    addresses.id AS addresses_id,
    addresses.email_address AS addresses_email_address,
    addresses.user_id AS addresses_user_id
FROM addresses
WHERE ? = addresses.user_id
[5]
[<Address(u'spongebob@google.com')>, <Address(u'j25@yahoo.com')>]

The one case where SQL is not emitted is for a simple many-to-one relationship, when the related object can be identified by its primary key alone and that object is already present in the current Session. For this reason, while lazy loading can be expensive for related collections, in the case that one is loading lots of objects with simple many-to-ones against a relatively small set of possible target objects, lazy loading may be able to refer to these objects locally without emitting as many SELECT statements as there are parent objects.

This default behavior of “load upon attribute access” is known as “lazy” or “select” loading - the name “select” because a “SELECT” statement is typically emitted when the attribute is first accessed.

Lazy loading can be enabled for a given attribute that is normally configured in some other way using the lazyload() loader option:

from sqlalchemy import select
from sqlalchemy.orm import lazyload

# force lazy loading for an attribute that is set to
# load some other way normally
stmt = select(User).options(lazyload(User.addresses))

Preventing unwanted lazy loads using raiseload¶

The lazyload() strategy produces an effect that is one of the most common issues referred to in object relational mapping; the N plus one problem, which states that for any N objects loaded, accessing their lazy-loaded attributes means there will be N+1 SELECT statements emitted. In SQLAlchemy, the usual mitigation for the N+1 problem is to make use of its very capable eager load system. However, eager loading requires that the attributes which are to be loaded be specified with the Select up front. The problem of code that may access other attributes that were not eagerly loaded, where lazy loading is not desired, may be addressed using the raiseload() strategy; this loader strategy replaces the behavior of lazy loading with an informative error being raised:

from sqlalchemy import select
from sqlalchemy.orm import raiseload

stmt = select(User).options(raiseload(User.addresses))

Above, a User object loaded from the above query will not have the .addresses collection loaded; if some code later on attempts to access this attribute, an ORM exception is raised.

raiseload() may be used with a so-called “wildcard” specifier to indicate that all relationships should use this strategy. For example, to set up only one attribute as eager loading, and all the rest as raise:

from sqlalchemy import select
from sqlalchemy.orm import joinedload
from sqlalchemy.orm import raiseload

stmt = select(Order).options(joinedload(Order.items), raiseload("*"))

The above wildcard will apply to all relationships not just on Order besides items, but all those on the Item objects as well. To set up raiseload() for only the Order objects, specify a full path with Load:

from sqlalchemy import select
from sqlalchemy.orm import joinedload
from sqlalchemy.orm import Load

stmt = select(Order).options(joinedload(Order.items), Load(Order).raiseload("*"))

Conversely, to set up the raise for just the Item objects:

stmt = select(Order).options(joinedload(Order.items).raiseload("*"))

The raiseload() option applies only to relationship attributes. For column-oriented attributes, the defer() option supports the defer.raiseload option which works in the same way.

Tip

The “raiseload” strategies do not apply within the unit of work flush process. That means if the Session.flush() process needs to load a collection in order to finish its work, it will do so while bypassing any raiseload() directives.

See also

Wildcard Loading Strategies

Using raiseload to prevent deferred column loads

Joined Eager Loading¶

Joined eager loading is the oldest style of eager loading included with the SQLAlchemy ORM. It works by connecting a JOIN (by default a LEFT OUTER join) to the SELECT statement emitted, and populates the target scalar/collection from the same result set as that of the parent.

At the mapping level, this looks like:

class Address(Base):
    # ...

    user: Mapped[User] = relationship(lazy="joined")

Joined eager loading is usually applied as an option to a query, rather than as a default loading option on the mapping, in particular when used for collections rather than many-to-one-references. This is achieved using the joinedload() loader option:

>>> from sqlalchemy importselect>>> from sqlalchemy.orm importjoinedload>>> stmt=select(User).options(joinedload(User.addresses)).filter_by(name="spongebob")>>> spongebob=session.scalars(stmt).unique().all()SELECT
    addresses_1.id AS addresses_1_id,
    addresses_1.email_address AS addresses_1_email_address,
    addresses_1.user_id AS addresses_1_user_id,
    users.id AS users_id, users.name AS users_name,
    users.fullname AS users_fullname,
    users.nickname AS users_nickname
FROM users
LEFT OUTER JOIN addresses AS addresses_1
    ON users.id = addresses_1.user_id
WHERE users.name = ?
['spongebob']

Tip

When including joinedload() in reference to a one-to-many or many-to-many collection, the Result.unique() method must be applied to the returned result, which will uniquify the incoming rows by primary key that otherwise are multiplied out by the join. The ORM will raise an error if this is not present.

This is not automatic in modern SQLAlchemy, as it changes the behavior of the result set to return fewer ORM objects than the statement would normally return in terms of number of rows. Therefore SQLAlchemy keeps the use of Result.unique() explicit, so there’s no ambiguity that the returned objects are being uniqified on primary key.

The JOIN emitted by default is a LEFT OUTER JOIN, to allow for a lead object that does not refer to a related row. For an attribute that is guaranteed to have an element, such as a many-to-one reference to a related object where the referencing foreign key is NOT NULL, the query can be made more efficient by using an inner join; this is available at the mapping level via the relationship.innerjoin flag:

class Address(Base):
    # ...

    user_id: Mapped[int] = mapped_column(ForeignKey("users.id"))
    user: Mapped[User] = relationship(lazy="joined", innerjoin=True)

At the query option level, via the joinedload.innerjoin flag:

from sqlalchemy import select
from sqlalchemy.orm import joinedload

stmt = select(Address).options(joinedload(Address.user, innerjoin=True))

The JOIN will right-nest itself when applied in a chain that includes an OUTER JOIN:

>>> from sqlalchemy importselect>>> from sqlalchemy.orm importjoinedload>>> stmt=select(User).options(...     joinedload(User.addresses).joinedload(Address.widgets, innerjoin=True)... )>>> results=session.scalars(stmt).unique().all()SELECT
    widgets_1.id AS widgets_1_id,
    widgets_1.name AS widgets_1_name,
    addresses_1.id AS addresses_1_id,
    addresses_1.email_address AS addresses_1_email_address,
    addresses_1.user_id AS addresses_1_user_id,
    users.id AS users_id, users.name AS users_name,
    users.fullname AS users_fullname,
    users.nickname AS users_nickname
FROM users
LEFT OUTER JOIN (
    addresses AS addresses_1 JOIN widgets AS widgets_1 ON
    addresses_1.widget_id = widgets_1.id
) ON users.id = addresses_1.user_id

Tip

If using database row locking techniques when emitting the SELECT, meaning the Select.with_for_update() method is being used to emit SELECT..FOR UPDATE, the joined table may be locked as well, depending on the behavior of the backend in use. It’s not recommended to use joined eager loading at the same time as SELECT..FOR UPDATE for this reason.

The Zen of Joined Eager Loading¶

Since joined eager loading seems to have many resemblances to the use of Select.join(), it often produces confusion as to when and how it should be used. It is critical to understand the distinction that while Select.join() is used to alter the results of a query, joinedload() goes through great lengths to not alter the results of the query, and instead hide the effects of the rendered join to only allow for related objects to be present.

The philosophy behind loader strategies is that any set of loading schemes can be applied to a particular query, and the results don’t change - only the number of SQL statements required to fully load related objects and collections changes. A particular query might start out using all lazy loads. After using it in context, it might be revealed that particular attributes or collections are always accessed, and that it would be more efficient to change the loader strategy for these. The strategy can be changed with no other modifications to the query, the results will remain identical, but fewer SQL statements would be emitted. In theory (and pretty much in practice), nothing you can do to the Select would make it load a different set of primary or related objects based on a change in loader strategy.

How joinedload() in particular achieves this result of not impacting entity rows returned in any way is that it creates an anonymous alias of the joins it adds to your query, so that they can’t be referenced by other parts of the query. For example, the query below uses joinedload() to create a LEFT OUTER JOIN from users to addresses, however the ORDER BY added against Address.email_address is not valid - the Address entity is not named in the query:

>>> from sqlalchemy importselect>>> from sqlalchemy.orm importjoinedload>>> stmt=(...     select(User)...     .options(joinedload(User.addresses))...     .filter(User.name == "spongebob")...     .order_by(Address.email_address)... )>>> result=session.scalars(stmt).unique().all()SELECT
    addresses_1.id AS addresses_1_id,
    addresses_1.email_address AS addresses_1_email_address,
    addresses_1.user_id AS addresses_1_user_id,
    users.id AS users_id,
    users.name AS users_name,
    users.fullname AS users_fullname,
    users.nickname AS users_nickname
FROM users
LEFT OUTER JOIN addresses AS addresses_1
    ON users.id = addresses_1.user_id
WHERE users.name = ?
ORDER BY addresses.email_address   <-- this part is wrong !
['spongebob']

Above, ORDER BY addresses.email_address is not valid since addresses is not in the FROM list. The correct way to load the User records and order by email address is to use Select.join():

>>> from sqlalchemy importselect>>> stmt=(...     select(User)...     .join(User.addresses)...     .filter(User.name == "spongebob")...     .order_by(Address.email_address)... )>>> result=session.scalars(stmt).unique().all()SELECT
    users.id AS users_id,
    users.name AS users_name,
    users.fullname AS users_fullname,
    users.nickname AS users_nickname
FROM users
JOIN addresses ON users.id = addresses.user_id
WHERE users.name = ?
ORDER BY addresses.email_address
['spongebob']

The statement above is of course not the same as the previous one, in that the columns from addresses are not included in the result at all. We can add joinedload() back in, so that there are two joins - one is that which we are ordering on, the other is used anonymously to load the contents of the User.addresses collection:

>>> stmt=(...     select(User)...     .join(User.addresses)...     .options(joinedload(User.addresses))...     .filter(User.name == "spongebob")...     .order_by(Address.email_address)... )>>> result=session.scalars(stmt).unique().all()SELECT
    addresses_1.id AS addresses_1_id,
    addresses_1.email_address AS addresses_1_email_address,
    addresses_1.user_id AS addresses_1_user_id,
    users.id AS users_id, users.name AS users_name,
    users.fullname AS users_fullname,
    users.nickname AS users_nickname
FROM users JOIN addresses
    ON users.id = addresses.user_id
LEFT OUTER JOIN addresses AS addresses_1
    ON users.id = addresses_1.user_id
WHERE users.name = ?
ORDER BY addresses.email_address
['spongebob']

What we see above is that our usage of Select.join() is to supply JOIN clauses we’d like to use in subsequent query criterion, whereas our usage of joinedload() only concerns itself with the loading of the User.addresses collection, for each User in the result. In this case, the two joins most probably appear redundant - which they are. If we wanted to use just one JOIN for collection loading as well as ordering, we use the contains_eager() option, described in Routing Explicit Joins/Statements into Eagerly Loaded Collections below. But to see why joinedload() does what it does, consider if we were filtering on a particular Address:

>>> stmt=(...     select(User)...     .join(User.addresses)...     .options(joinedload(User.addresses))...     .filter(User.name == "spongebob")...     .filter(Address.email_address == "someaddress@foo.com")... )>>> result=session.scalars(stmt).unique().all()SELECT
    addresses_1.id AS addresses_1_id,
    addresses_1.email_address AS addresses_1_email_address,
    addresses_1.user_id AS addresses_1_user_id,
    users.id AS users_id, users.name AS users_name,
    users.fullname AS users_fullname,
    users.nickname AS users_nickname
FROM users JOIN addresses
    ON users.id = addresses.user_id
LEFT OUTER JOIN addresses AS addresses_1
    ON users.id = addresses_1.user_id
WHERE users.name = ? AND addresses.email_address = ?
['spongebob', 'someaddress@foo.com']

Above, we can see that the two JOINs have very different roles. One will match exactly one row, that of the join of User and Address where Address.email_address=='someaddress@foo.com'. The other LEFT OUTER JOIN will match all Address rows related to User, and is only used to populate the User.addresses collection, for those User objects that are returned.

By changing the usage of joinedload() to another style of loading, we can change how the collection is loaded completely independently of SQL used to retrieve the actual User rows we want. Below we change joinedload() into selectinload():

>>> stmt=(...     select(User)...     .join(User.addresses)...     .options(selectinload(User.addresses))...     .filter(User.name == "spongebob")...     .filter(Address.email_address == "someaddress@foo.com")... )>>> result=session.scalars(stmt).all()SELECT
    users.id AS users_id,
    users.name AS users_name,
    users.fullname AS users_fullname,
    users.nickname AS users_nickname
FROM users
JOIN addresses ON users.id = addresses.user_id
WHERE
    users.name = ?
    AND addresses.email_address = ?
['spongebob', 'someaddress@foo.com']
# ... selectinload() emits a SELECT in order
# to load all address records ...

When using joined eager loading, if the query contains a modifier that impacts the rows returned externally to the joins, such as when using DISTINCT, LIMIT, OFFSET or equivalent, the completed statement is first wrapped inside a subquery, and the joins used specifically for joined eager loading are applied to the subquery. SQLAlchemy’s joined eager loading goes the extra mile, and then ten miles further, to absolutely ensure that it does not affect the end result of the query, only the way collections and related objects are loaded, no matter what the format of the query is.

Select IN loading¶

In most cases, selectin loading is the most simple and efficient way to eagerly load collections of objects. The only scenario in which selectin eager loading is not feasible is when the model is using composite primary keys, and the backend database does not support tuples with IN, which currently includes SQL Server.

“Select IN” eager loading is provided using the "selectin" argument to relationship.lazy or by using the selectinload() loader option. This style of loading emits a SELECT that refers to the primary key values of the parent object, or in the case of a many-to-one relationship to those of the child objects, inside of an IN clause, in order to load related associations:

>>> from sqlalchemy importselect>>> from sqlalchemy.orm importselectinload>>> stmt=(...     select(User)...     .options(selectinload(User.addresses))...     .filter(or_(User.name == "spongebob", User.name == "ed"))... )>>> result=session.scalars(stmt).all()SELECT
    users.id AS users_id,
    users.name AS users_name,
    users.fullname AS users_fullname,
    users.nickname AS users_nickname
FROM users
WHERE users.name = ? OR users.name = ?
('spongebob', 'ed')
SELECT
    addresses.id AS addresses_id,
    addresses.email_address AS addresses_email_address,
    addresses.user_id AS addresses_user_id
FROM addresses
WHERE addresses.user_id IN (?, ?)
(5, 7)

Above, the second SELECT refers to addresses.user_id IN (5, 7), where the “5” and “7” are the primary key values for the previous two User objects loaded; after a batch of objects are completely loaded, their primary key values are injected into the IN clause for the second SELECT. Because the relationship between User and Address has a simple primary join condition and provides that the primary key values for User can be derived from Address.user_id, the statement has no joins or subqueries at all.

For simple many-to-one loads, a JOIN is also not needed as the foreign key value from the parent object is used:

>>> from sqlalchemy importselect>>> from sqlalchemy.orm importselectinload>>> stmt=select(Address).options(selectinload(Address.user))>>> result=session.scalars(stmt).all()SELECT
    addresses.id AS addresses_id,
    addresses.email_address AS addresses_email_address,
    addresses.user_id AS addresses_user_id
    FROM addresses
SELECT
    users.id AS users_id,
    users.name AS users_name,
    users.fullname AS users_fullname,
    users.nickname AS users_nickname
FROM users
WHERE users.id IN (?, ?)
(1, 2)

Tip

by “simple” we mean that the relationship.primaryjoin condition expresses an equality comparison between the primary key of the “one” side and a straight foreign key of the “many” side, without any additional criteria.

Select IN loading also supports many-to-many relationships, where it currently will JOIN across all three tables to match rows from one side to the other.

Things to know about this kind of loading include:

The strategy emits a SELECT for up to 500 parent primary key values at a time, as the primary keys are rendered into a large IN expression in the SQL statement. Some databases like Oracle Database have a hard limit on how large an IN expression can be, and overall the size of the SQL string shouldn’t be arbitrarily large.
As “selectin” loading relies upon IN, for a mapping with composite primary keys, it must use the “tuple” form of IN, which looks like WHERE (table.column_a, table.column_b) IN ((?, ?), (?, ?), (?, ?)). This syntax is not currently supported on SQL Server and for SQLite requires at least version 3.15. There is no special logic in SQLAlchemy to check ahead of time which platforms support this syntax or not; if run against a non-supporting platform, the database will return an error immediately. An advantage to SQLAlchemy just running the SQL out for it to fail is that if a particular database does start supporting this syntax, it will work without any changes to SQLAlchemy (as was the case with SQLite).

Subquery Eager Loading¶

Legacy Feature

The subqueryload() eager loader is mostly legacy at this point, superseded by the selectinload() strategy which is of much simpler design, more flexible with features such as Yield Per, and emits more efficient SQL statements in most cases. As subqueryload() relies upon re-interpreting the original SELECT statement, it may fail to work efficiently when given very complex source queries.

subqueryload() may continue to be useful for the specific case of an eager loaded collection for objects that use composite primary keys, on the Microsoft SQL Server backend that continues to not have support for the “tuple IN” syntax.

Subquery loading is similar in operation to selectin eager loading, however the SELECT statement which is emitted is derived from the original statement, and has a more complex query structure as that of selectin eager loading.

Subquery eager loading is provided using the "subquery" argument to relationship.lazy or by using the subqueryload() loader option.

The operation of subquery eager loading is to emit a second SELECT statement for each relationship to be loaded, across all result objects at once. This SELECT statement refers to the original SELECT statement, wrapped inside of a subquery, so that we retrieve the same list of primary keys for the primary object being returned, then link that to the sum of all the collection members to load them at once:

>>> from sqlalchemy importselect>>> from sqlalchemy.orm importsubqueryload>>> stmt=select(User).options(subqueryload(User.addresses)).filter_by(name="spongebob")>>> results=session.scalars(stmt).all()SELECT
    users.id AS users_id,
    users.name AS users_name,
    users.fullname AS users_fullname,
    users.nickname AS users_nickname
FROM users
WHERE users.name = ?
('spongebob',)
SELECT
    addresses.id AS addresses_id,
    addresses.email_address AS addresses_email_address,
    addresses.user_id AS addresses_user_id,
    anon_1.users_id AS anon_1_users_id
FROM (
    SELECT users.id AS users_id
    FROM users
    WHERE users.name = ?) AS anon_1
JOIN addresses ON anon_1.users_id = addresses.user_id
ORDER BY anon_1.users_id, addresses.id
('spongebob',)

Things to know about this kind of loading include:

The SELECT statement emitted by the “subquery” loader strategy, unlike that of “selectin”, requires a subquery, and will inherit whatever performance limitations are present in the original query. The subquery itself may also incur performance penalties based on the specifics of the database in use.
“subquery” loading imposes some special ordering requirements in order to work correctly. A query which makes use of subqueryload() in conjunction with a limiting modifier such as Select.limit(), or Select.offset() should always include Select.order_by() against unique column(s) such as the primary key, so that the additional queries emitted by subqueryload() include the same ordering as used by the parent query. Without it, there is a chance that the inner query could return the wrong rows:
```
# incorrect, no ORDER BY
stmt = select(User).options(subqueryload(User.addresses).limit(1))

# incorrect if User.name is not unique
stmt = select(User).options(subqueryload(User.addresses)).order_by(User.name).limit(1)

# correct
stmt = (
    select(User)
    .options(subqueryload(User.addresses))
    .order_by(User.name, User.id)
    .limit(1)
)
```
See also

Why is ORDER BY recommended with LIMIT (especially with subqueryload())? - detailed example
“subquery” loading also incurs additional performance / complexity issues when used on a many-levels-deep eager load, as subqueries will be nested repeatedly.
“subquery” loading is not compatible with the “batched” loading supplied by Yield Per, both for collection and scalar relationships.

For the above reasons, the “selectin” strategy should be preferred over “subquery”.

See also

Select IN loading

What Kind of Loading to Use ?¶

Which type of loading to use typically comes down to optimizing the tradeoff between number of SQL executions, complexity of SQL emitted, and amount of data fetched.

One to Many / Many to Many Collection - The selectinload() is generally the best loading strategy to use. It emits an additional SELECT that uses as few tables as possible, leaving the original statement unaffected, and is most flexible for any kind of originating query. Its only major limitation is when using a table with composite primary keys on a backend that does not support “tuple IN”, which currently includes SQL Server and very old SQLite versions; all other included backends support it.

Many to One - The joinedload() strategy is the most general purpose strategy. In special cases, the immediateload() strategy may also be useful, if there are a very small number of potential related values, as this strategy will fetch the object from the local Session without emitting any SQL if the related object is already present.

Polymorphic Eager Loading¶

Specification of polymorphic options on a per-eager-load basis is supported. See the section Eager Loading of Polymorphic Subtypes for examples of the PropComparator.of_type() method in conjunction with the with_polymorphic() function.

Wildcard Loading Strategies¶

Each of joinedload(), subqueryload(), lazyload(), selectinload(), and raiseload() can be used to set the default style of relationship() loading for a particular query, affecting all relationship() -mapped attributes not otherwise specified in the statement. This feature is available by passing the string '*' as the argument to any of these options:

from sqlalchemy import select
from sqlalchemy.orm import lazyload

stmt = select(MyClass).options(lazyload("*"))

Above, the lazyload('*') option will supersede the lazy setting of all relationship() constructs in use for that query, with the exception of those that use lazy='write_only' or lazy='dynamic'.

If some relationships specify lazy='joined' or lazy='selectin', for example, using lazyload('*') will unilaterally cause all those relationships to use 'select' loading, e.g. emit a SELECT statement when each attribute is accessed.

The option does not supersede loader options stated in the query, such as joinedload(), selectinload(), etc. The query below will still use joined loading for the widget relationship:

from sqlalchemy import select
from sqlalchemy.orm import lazyload
from sqlalchemy.orm import joinedload

stmt = select(MyClass).options(lazyload("*"), joinedload(MyClass.widget))

While the instruction for joinedload() above will take place regardless of whether it appears before or after the lazyload() option, if multiple options that each included "*" were passed, the last one will take effect.

Per-Entity Wildcard Loading Strategies¶

A variant of the wildcard loader strategy is the ability to set the strategy on a per-entity basis. For example, if querying for User and Address, we can instruct all relationships on Address to use lazy loading, while leaving the loader strategies for User unaffected, by first applying the Load object, then specifying the * as a chained option:

from sqlalchemy import select
from sqlalchemy.orm import Load

stmt = select(User, Address).options(Load(Address).lazyload("*"))

Above, all relationships on Address will be set to a lazy load.

Routing Explicit Joins/Statements into Eagerly Loaded Collections¶

The behavior of joinedload() is such that joins are created automatically, using anonymous aliases as targets, the results of which are routed into collections and scalar references on loaded objects. It is often the case that a query already includes the necessary joins which represent a particular collection or scalar reference, and the joins added by the joinedload feature are redundant - yet you’d still like the collections/references to be populated.

For this SQLAlchemy supplies the contains_eager() option. This option is used in the same manner as the joinedload() option except it is assumed that the Select object will explicitly include the appropriate joins, typically using methods like Select.join(). Below, we specify a join between User and Address and additionally establish this as the basis for eager loading of User.addresses:

from sqlalchemy.orm import contains_eager

stmt = select(User).join(User.addresses).options(contains_eager(User.addresses))

If the “eager” portion of the statement is “aliased”, the path should be specified using PropComparator.of_type(), which allows the specific aliased() construct to be passed:

# use an alias of the Address entityadalias=aliased(Address)# construct a statement which expects the "addresses" resultsstmt=(select(User).outerjoin(User.addresses.of_type(adalias)).options(contains_eager(User.addresses.of_type(adalias))))# get results normallyr=session.scalars(stmt).unique().all()SELECT
    users.user_id AS users_user_id,
    users.user_name AS users_user_name,
    adalias.address_id AS adalias_address_id,
    adalias.user_id AS adalias_user_id,
    adalias.email_address AS adalias_email_address,
    (...other columns...)
FROM users
LEFT OUTER JOIN email_addresses AS email_addresses_1
ON users.user_id = email_addresses_1.user_id

The path given as the argument to contains_eager() needs to be a full path from the starting entity. For example if we were loading Users->orders->Order->items->Item, the option would be used as:

stmt = select(User).options(contains_eager(User.orders).contains_eager(Order.items))

Using contains_eager() to load a custom-filtered collection result¶

When we use contains_eager(), we are constructing ourselves the SQL that will be used to populate collections. From this, it naturally follows that we can opt to modify what values the collection is intended to store, by writing our SQL to load a subset of elements for collections or scalar attributes.

Tip

SQLAlchemy now has a much simpler way to do this, by allowing WHERE criteria to be added directly to loader options such as joinedload() and selectinload() using PropComparator.and_(). See the section Adding Criteria to loader options for examples.

The techniques described here still apply if the related collection is to be queried using SQL criteria or modifiers more complex than a simple WHERE clause.

As an example, we can load a User object and eagerly load only particular addresses into its .addresses collection by filtering the joined data, routing it using contains_eager(), also using Populate Existing to ensure any already-loaded collections are overwritten:

stmt = (
    select(User)
    .join(User.addresses)
    .filter(Address.email_address.like("%@aol.com"))
    .options(contains_eager(User.addresses))
    .execution_options(populate_existing=True)
)

The above query will load only User objects which contain at least Address object that contains the substring 'aol.com' in its email field; the User.addresses collection will contain only these Address entries, and not any other Address entries that are in fact associated with the collection.

Tip

In all cases, the SQLAlchemy ORM does not overwrite already loaded attributes and collections unless told to do so. As there is an identity map in use, it is often the case that an ORM query is returning objects that were in fact already present and loaded in memory. Therefore, when using contains_eager() to populate a collection in an alternate way, it is usually a good idea to use Populate Existing as illustrated above so that an already-loaded collection is refreshed with the new data. The populate_existing option will reset all attributes that were already present, including pending changes, so make sure all data is flushed before using it. Using the Session with its default behavior of autoflush is sufficient.

Note

The customized collection we load using contains_eager() is not “sticky”; that is, the next time this collection is loaded, it will be loaded with its usual default contents. The collection is subject to being reloaded if the object is expired, which occurs whenever the Session.commit(), Session.rollback() methods are used assuming default session settings, or the Session.expire_all() or Session.expire() methods are used.

See also

Adding Criteria to loader options - modern API allowing WHERE criteria directly within any relationship loader option

SQLAlchemy 2.1 Documentation

SQLAlchemy 2.1 Documentation

SQLAlchemy ORM

Relationship Loading Techniques¶

Summary of Relationship Loading Styles¶

Configuring Loader Strategies at Mapping Time¶

Relationship Loading with Loader Options¶

Adding Criteria to loader options¶

Specifying Sub-Options with Load.options()¶

Lazy Loading¶

Preventing unwanted lazy loads using raiseload¶

Joined Eager Loading¶

The Zen of Joined Eager Loading¶

Select IN loading¶

Subquery Eager Loading¶

What Kind of Loading to Use ?¶

Polymorphic Eager Loading¶

Wildcard Loading Strategies¶

Per-Entity Wildcard Loading Strategies¶

Routing Explicit Joins/Statements into Eagerly Loaded Collections¶

Using contains_eager() to load a custom-filtered collection result¶

Relationship Loader API¶