Pydisk-querybuilder (PQB)
The goal of PQB is to offer a programmatic interface to various SQL
dialects.
It aims to replace stringified Schema definitions and Query writing.
It can be used to design custom ORMs or embedded within database-equipped
applications with a lot of queries.
Most of the SQL query builders available within the Python ecosystem are interwoven within ORMs and/or abandoned projects and/or missing some key features. In particular, they often restrict their focus to the DQL language only. Many project that are reluctant to use ORM end up rebuilding their own tailored query builder. The goal of PQB is to offer a sound interface to the SQL syntax.
The advantage of using a querybuilder are manyfold:
- Avoid complex bug-prone string templating
- Avoid syntax errors
- Avoid SQL injections
- Simplify programmatic query generations
- Provide helpers to transform some queries
- Support DDL/DML/TCL languages
Quickstart
from uuid import UUID, uuid4 as uuid_gen
from querybuilder.helpers import table
@table
class person:
id: UUID
name: str
birth_year: int
This code snippet creates a (non-mutable) table person
. From those object we can build
queries
. Those are not tied to any database.
q = Person.select([Person.c.birth_year, Person.c.name]) # Person.c is a shorthand for Person.columns
We can then get the string representation of the query q
by using str(q)
which returns
SELECT Person.birth_year, Person.name FROM Person
We can apply simple transformation to q
. As queries (and most object)
are immutable by design, each operation returns a new updated query object.
q2 = q.add_where(q.c.birth_year.gt(1800)).set_limit(10).set_columns(["name"])
Then, formatting the query with str(q2)
gives:
SELECT Person.name FROM Person WHERE Person.birth_year > 1800 FETCH FIRST 10 ROWS ONLY
Using an actual database backend
The module provides a standardization of access to the database backend with a rather precise transaction management. It should be considered as a wrapper that handles all the specificities of each database management system.
Because of the variety of actual support of some SQL features, it is however possible that some queries raise errors when sent to some DBMSs.
SQLite
from querybuilder.drivers.sqlite import Connector
db = Connector(":memory:") # The ":memory:" here is SQLite-specific for in-memory non-persistent database.
We can create the table person
using person.create()
which produces the following query:
CREATE TABLE Person (id UUID, name TEXT, birth_year INTEGER)
SQLite does not support UUID
.
Hence, when executing the above query in SQLite,
the querybuilder automatically falls back to another type (here TEXT
),
while raising the following warning:
UserWarning: sqlite does not support UUID, using TEXT type instead
Here is how the query is executed in the SQLite database db
:
db.execute(Person.create())
We can then create a query to populate the table:
q_insert = Person.insert_values(in_columns=(Person.c[["name", "birth_year"]]), values=[("Dijkstra", 1930), ("Euler", 1707), ("Steiner", 1796)])
db.execute(q_insert)
Now, our table admit three tuples:
cursor = db.execute(Person.select(Person.c[1:], orderby=Person.c.name))
results = list(cursor.fetchall())
print(results)
gives:
[('Dijkstra', 1930), ('Euler', 1707), ('Steiner', 1796)]
The values are safely embedded within q_insert
(click me to see details).
q_insert
(click me to see details).The query q_insert
above actually embedded the values within its formatting as shown by its pretty printing q_insert.pretty_print()
:
INSERT INTO Person (name, age) VALUES (?, 1930), (?, 1707), (?, 1796)
-- ↖{0: 'Dijkstra', 1: 'Euler', 2: 'Steiner'}
But it is still safe against injection as the string values will be passed as placeholders to an underlying prepared query:
q_inject = Person.insert_values(in_columns=Person.c[1:], values= [("\";DROP TABLE Person;", 0000)])
db.execute(q_inject)
cursor = db.execute(person.select(person.c[1:], where=person.c.birth_year.eq(0)))
results = list(cursor.fetchall())
print(results)
gives:
[('"; DROP TABLE Person;"', 0)]
We can also define updates easily, e.g., with the following query:
q_upd = Person.update(dict(id=uuid_gen()), where=Person.c.name.eq("Dijkstra"))
db.execute(q_upd)
which pretty prints as:
UPDATE Person SET id = ? WHERE Person.name = ?
-- ↖{0: UUID('33525c22-f938-4256-b673-e595ff6df828'), 1: 'Dijkstra'}
Of course, deletions are also possible, e.g.:
q_del = Person.delete(where=Person.c.id.isnull())
db.execute(q_del)
is formatted as:
DELETE FROM Person WHERE Person.id IS NULL
If we throw in more complicated tables, we will see that the querybuilder can handle some DBMS specificities on its own.
from querybuilder.helpers import table, colspec
@table
class tag:
id: colspec(int, primary_key=True, generated_as_identity=True)
name: str
Calling tag.create()
yields the following query:
CREATE TABLE tag (id INTEGER PRIMARY KEY GENERATED ALWAYS AS IDENTITY, name TEXT)
While the generated always as identity constraint is not supported by SQLite,
it is possible to rely on the AUTOINCREMENT
feature for a similar results.
This fallback can be observed using the db.stringify
method
which returns the dialect-aware (here SQLite) raw formatting of the query,
namely the query that would be executed by db.execute
.
db.stringify(tag.create())
CREATE TABLE tag (id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT)
Note that a warning message notifies the user that the
GENERATED ALWAYS AS IDENTITY
constraint has been imperfectly simulated
by the AUTOINCREMENT
feature of SQLite.
db.execute(tag.create())
We can then populate the table through INSERT queries:
q_tag = tag.insert_values(in_columns=(tag.c.name,), values=[("mathematician",)])
The string value is automatically passed through a placeholder to avoid SQL injections:
q_tag.pretty_print()
INSERT INTO tag (name) VALUES (?) -- ↖{0: 'mathematician'}
Then to actually insert into the database we simply do:
db.execute(q_tag)
We may also want to bulk insert without having to handle values directly by using placeholders.
q_tag_bulk = tag.insert_many(in_columns=(tag.c.name,))
db.executemany(q_tag_bulk, [dict(name=f"value_{i}") for i in range(1000)])
The query q_tag_bulk
contains a placeholder with no embedded values.
q_tag_bulk
contains a placeholder with no embedded values.Looking at q_tag_bulk.pretty_print()
, we can see that it uses a placeholder,
named :name
,
which is associated with the special object MISSING
represented by ø
.
INSERT INTO tag (name) VALUES (:name) -- ↖{name: ø}
Indeed, q_tag_bulk
does not embed the values for the placeholders:
they should be provided at execution time.
Both db.execute
and db.executemany
allows to provide the values
for valueless placeholders.
The former expects one value for each such placeholders,
while the second expects an iterable of these values,
as illustrated by the above execution.
We can also delete those useless tags with the following query:
q_delete = tag.delete(where=tag.c.name.like("value_%"))
db.execute(q_delete)
Finally, we can build a last table with references to connect tag with person:
@table
class tag_person:
person_id: person.c.id
tag_id: tag.c.id
db.execute(tag_person.create())
The query given by str(tag_person.create())
is:
CREATE TABLE tag_person(person_id TEXT REFERENCES Person(id), tag_id INTEGER REFERENCES tag(id))
Again, within SQLite, the UUID
type will fallback to TEXT
.
Remark that it automatically infers the types and add the constraints associated with the table definitions.
Let us label Dijkstra as a Mathematician:
condition = Person.c.name.eq("Dijkstra") & tag.c.name.eq("mathematician")
q_prod = Person.product(tag).select(columns=(Person.c.id, tag.c.id), where=condition)
The query q_prod
is of the following shape:
SELECT Person.id, tag.id
FROM Person, tag
WHERE Person.name = ? AND tag.name = ?
-- ↖{0: 'Dijkstra', 1: 'mathematician'}
We can insert the result in tag_person
with:
db.execute(tag_person.insert(query=q_prod))
Let us check by getting the list of mathematicians using the following complex query
q_math = tag_person\
.inner_join(tag, on=tag.c.id.eq(tag_person.c.tag_id))\
.inner_join(person, on=person.c.id.eq(tag_person.c.person_id))\
.select([person.c.name], where=tag.c.name.eq("mathematician"))
which is formatted as
SELECT Person.name FROM (tag_person INNER JOIN tag ON tag.tag_id = tag_person.id) INNER JOIN Person ON Person.id = tag_person.person_id WHERE tag.name = ?
and whose execution
cursor = db.execute(q_math)
results = list(cursor.fetchall())
print(results)
will return only "Dijkstra"
[("Dijkstra",)]
PostgreSQL
The previous queries work seamlessly on PostgreSQL by simply using the PostgreSQL connector:
from querybuilder.drivers.postgresql import Connector
db = Connector() # Postgresql has default connection information
The only differences with respect to using the SQLite driver are:
- We need to specify that
person.id
is the primary key ofperson
to allow us to use it as a foreign key intag_person
; - We do not have warnings, since PostgreSQL has implementations
of type
UUID
, and ofGENERATED ALWAYS AS IDENTITY
.
@table
class Person:
id : colspec(UUID, primary_key=True)
name : str
birth_year : int
@table
class tag:
id: colspec(int, primary_key=True, generated_as_identity=True)
name: str
@table
class tag_person:
tag_id: tag.c.id
person_id: person.c.id
db.execute(person.create(temporary=True))
db.execute(tag.create(temporary=True))
db.execute(tag_person.create(temporary=True))
Since PostgreSQL has no non-persistent in-memory database,
we indicated temporary=True
when generating the creation query of each table,
so that CREATE TEMPORARY TABLE
queries are produced.
For instance, str(Person.create(temporary=True))
gives the query:
CREATE TEMPORARY TABLE Person (id UUID PRIMARY KEY, name TEXT, birth_year INTEGER)