I've spent some time to integrate sqlalchemy into grpc-mate this weekend, I want to use this article to write about my feelings about this tool. Since I come from java world, I may compare it with JPA in java, there are some other ORM libs in python, e.g: Django's ORM,peewee, but sqlalchemy is the most popular one

sqlalchemy

DB Engine and Session

sqlalchemy comes with core layer and ORM layer, most of the time for new project ORM layer is recommeded.

Engine object in sqlalchemy is in charge of db connection pool management and dialect selection, in java world it's similar to java datasource , e.g: we could use HikariCP to manage database connection pool. to create an Engine, we need at least a db url in format dialect[+driver]://user:password@host/dbname[?key=value..], it also have a rich set of params to contorl the DB engine behavior . it's also recommeded to have only one engine object per db, so in grpc-mate, I create the engine object in the __init__.py for data_store module

db_url = os.getenv('db_url', 'sqlite:///:memory:')
engine = create_engine(db_url, echo=True)

we also get the db_url from system env, if it's not set it will use sqlite in memory db for unit test

The Session class is simlar to EntitManagerFactory in JPA, it's in charge of create a session object which act as db connection proxy to communicate with database, in python we could have different kind of Session class, sqlalchemy offer a helper class sessionmaker to create customized Session Class for the end user, we could use code Session = sessionmaker(bind=engine) to create the Session class.

Manage Session object lifecycle

sqlalchemy has a full document on how to manage session, below is point I got from the long document

  • The Session is very much intended to be used in a non-concurrent fashion, which usually means in only one thread at a time.
  • One Session could have more than one transactions, but it's recommeded for each transaction to be short lived
  • Session object must be closed after usage
  • by default, autoflush is enabled , but autocommit is not enabled

for most of the use case in grpc-mate, I would like to have a contextmanager to help me to manage session life cycle for me.

@contextmanager
def session_scope():
    """Provide a transactional scope around a series of operations."""
    session = Session()
    try:
        yield session
        session.commit()
    except:
        session.rollback()
        raise
    finally:
        session.close()

with the session_scope method, I could use with key word in python to obtain a session object in the safe way, like below to persist a new project object into database

faker = Faker()
product = DBProduct(product_name=faker.name(), 
                        product_price=Decimal(faker.random_int() / 100),
                        product_status=InStock, 
                        category=faker.name())
with session_scope() as session:
    session.add(product)

DB Mapping

in recenty sqlalchemy version, it provide declarative way to delcar the python <-> DB mapping, the mapping is simlar to JPA's Entity class, but it's kind of simple than JPA Entity, first of all, we need a base class, which could be created by code Base = declarative_base(), then we could use Base as a base class to map a python class to DataBase table., the Column, String are all artifact imported from sqlalchemy

from sqlalchemy import Column, SMALLINT, Integer, String, DECIMAL
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()

class DBProduct(Base):
    __tablename__ = 'products'
    product_id = Column(Integer, primary_key=True)
    product_name = Column(String(200))
    product_price = Column(DECIMAL(10, 2))
    product_status = Column(SMALLINT)
    category = Column(String(50))

to create the db for unit test, we could have pytest fixture to create a new table before each test is running and we also have a safe guard to make sure this only execute in unit test for sqlite db
the yield keyword will make sure the init_db function pause on yield and let the unit test run, after unit test complete, it will drop all the database, so that we have a clean state

@pytest.fixture(autouse=True, scope='function')
def init_db():
    if engine.url.__str__() == 'sqlite:///:memory:':
        Base.metadata.create_all(engine)
        yield
        Base.metadata.drop_all(engine_pair)

Queries

compare with JPA's JPQL, sqlalchemy's query ability is a little limit, Query is the main entrance, we could create Query object by session.query(Entities), then it has a chain style of call stack to filter more field, when we need to do complex queries, the sqlalchemy query will be very complex and not as clean as JPQL, I think I will switch to use sqlalchemy core to use raw sql if I need complex sql query, I may change my mind when I discover more.