Hello. Today I share with you a cool macro. It’s neat because it generates code and performs static code checks based on database metadata. Most language’s compilers or type systems are not capable of this, but the Racket compiler is!
The 20ish line macro automatically generates functions for working
with single-table SELECT * ...
queries. It generates one row
accessor function per column in the table. It can be used like this:
#lang racket/base
(require db)
;; Query some data.
(define c (sqlite3-connect #:database "db.sqlite"))
(define todo (query-row c "SELECT * FROM todo LIMIT 1"))
;; Automatically define todo-id, todo-description, etc.
(define-table-accessors "db.sqlite" "todo")
(displayln (todo-id todo))
;; Prints 1
(displayln (todo-description todo))
;; Prints "Take out the trash"
(displayln (todo-dnoe todo))
;; Compile error: Identifier todo-dnoe doesn't exist
(displayln (todo-done todo))
;; Prints 0 (false). Oops!
No external codegen program, no Rails/Django style in-code table schema, and we get static code checks for free. Woohoo!
Let’s learn how this thing works, shall we?
Although we don’t use Lisp at Perpetua today, our team always strives to use and create incredible tools for our customers and in our work. If you want to help build the best e-commerce ad management tools, apply today!
#lang racket/base
(require
(for-syntax
racket/base
racket/syntax
srfi/26
db))
;; (define-table-accessors "<db-path>" "<table-name>")
(define-syntax (define-table-accessors stx)
(syntax-case stx ()
[(_ db-path table)
(begin
;; Query for column names
(define table-name (syntax->datum #'table))
(define c
(sqlite3-connect #:database (syntax->datum #'db-path)))
(define query
(format "select name from pragma_table_info(\"~a\");"
table-name))
(define column-names (map (cut vector-ref <> 0)
(query-rows c query)))
(define num-columns (length column-names))
;; Generate 1 function per column
(with-syntax
([(idx ...)
(datum->syntax stx (build-list num-columns values))]
[(read-column ...)
(datum->syntax stx
(map (cut format-symbol "~a-~a"
table-name <>)
column-names))])
#'(begin
(define (read-column row)
(vector-ref row idx)) ...)))]))
Before explaining this macro line-by-line, let’s take a detour and look at the details of the challenge at hand and the Racket APIs used in the solution.
We are using the Racket db module. When a SELECT *
query is made,
the result is usually one or more vectors. Vector is the Racket/Scheme
lingo for fixed size array. To access elements of the array, we use
the vector-ref function.
;; JS: arr[i]
(vector-ref arr i)
The columns in a table have a default order. The vectors we get from
SELECT *
contain the column data for that row in the default
order. For the todo table, a row vector could look like this:
#(1 "Take out the trash" 0)
This expression is a vector containing the id, description, and done flag for a todo row.
(vector-ref todo 0)
;; 1
(vector-ref todo 1)
;; "Take out the trash"
(vector-ref todo 2)
;; 0
We want a macro which would use the DB during compilation to generate functions to make it easier to pull the data out.
(define (todo-id todo)
(vector-ref todo 0))
(define (todo-description todo)
(vector-ref todo 1))
(define (todo-done todo)
(vector-ref todo 2))
In Racket macro-land, there are informally 3 types of stuff.
First, there’s normal Racket values. Numbers, strings, functions, symbols, lists, vectors (arrays), hash tables, etc.
"string"
'symbol
123 ;; Number
(list 1 2 3)
(vector 1 2 3)
The second type is the syntax object. Syntax objects are like a
serialized version of Racket code, along with optional metadata about
the code, such as where it came from (file, line number, etc). The
syntax->datum
function extracts the serialized code from a syntax
object.
;; This function call
(vector-ref todo 0)
;; Would have a syntax-object datum consisting
;; of a list:
(list 'vector-ref 'todo 0)
The last type of stuff is the pattern variable. Although not strictly necessary, pattern variables are useful for unpacking list-like syntax objects.
An example may be illuminating:
(define-syntax (mixup-list stx)
(syntax-case stx ()
[(mixup-list arg1 args ...)
(syntax (list args ... arg1 arg1 args ...))]))
;; Macro call
(mixup-list 1 2 3)
;; Transforms to this:
(list 2 3 1 1 2 3)
Here, arg1
and args
are the pattern variables. The ellipses after args
lets it capture a list (in this case 2 and 3), while arg1
just grabs
the first argument.
Technically, pattern variables and syntax objects are different things. A pattern variable usually has to be converted to a syntax object using the (syntax …) macro before making use of it.
Also, technically mixup-list
within macro definition is a pattern
variable too. This isn’t often used in code gen since it’s nearly
always just the name of the macro. However, writing it there helps to
show you the similarity between the list of pattern variables and the
actual macro expression.
To avoid unused variable warnings, _
can be used for the first
pattern variable instead of the macro’s name.
(define-syntax (mixup-list stx)
(syntax-case stx ()
[(_ arg1 args ...)
(syntax (list args ... arg1 arg1 args ...))]))
Many of the APIs used to create macros are actually macros themselves. Let’s take a look.
The (syntax XYZ)
stuff is a macro. It’s the syntax macro. It
recursively converts its arguments into a syntax object. The #'
prefix can be used as a shorthand; (syntax XYZ)
and #'XYZ
mean the exact same thing.
The syntax macro is able to recognize pattern variables and substitute
the corresponding syntax. It also recognizes the ellipses to expand
patterns which are bound to lists. The ellipses doesn’t have to come
right after a pattern variable (like (list args ...
before). It can
also come after something which contains a pattern variable—the
whole thing will get repeated with the pattern spliced in.
(define-syntax (list-the-whole-thing stx)
(syntax-case stx ()
[(_ arg ...)
#'(list (list (+ arg 5)) ...)]))
(list-the-whole-thing 1 2 3)
;; Expands to:
(list (list (+ 1 5)) (list (+ 2 5)) (list (+ 3 5)))
When define-syntax
(which is also a macro) is used the way we have
been using it, it defines a compiler-level function which takes the
macro expression’s code as a syntax object. When you return new syntax
from it, that’s what the original syntax gets replaced with.
(define-syntax (silly-macro stx)
(displayln (syntax->datum stx))
#'42)
This silly macro prints itself as it gets compiled, and then replaces itself with 42. What actually gets printed during compilation depends on what arguments it is given. The macro will accepts any number of arguments; it just throws them all away and becomes 42.
(silly-macro 1 2 3)
;; Prints (silly-macro 1 2 3) during compilation.
(silly-macro)
;; Prints (silly-macro) during compilation.
silly-macro
;; Prints silly-macro during compilation.
;; It wasn't even called this time!
(silly-macro (silly-macro 3))
;; Prints (silly-macro (silly-macro 3)) during compilation.
;; (silly-macro 3) is NOT printed! This macro is so silly, the
;; arguments you pass it don't even get compiled.
The syntax-case
macro accepts multiple pattern “clauses”. It finds
the first pattern which matches the macro call, and returns the syntax
object for that pattern. If none match, it’s a syntax error.
(define-syntax (one-two-three stx)
(syntax-case stx ()
[(_ one) #'(displayln "One arg")]
[(_ one two) #'(displayln "Two args")]
[(_ one two three) #'(displayln "Three args")]))
(one-two-three 1 2)
;; Transforms to (displayln "Two args") at runtime
(one-two-three 1 2 3 4)
;; Syntax error
(The extra ()
in syntax-case
needs to be there. It has a purpose
but we aren’t making use of it.)
Lastly, the with-syntax
macro is very similar to syntax-case
in
how it sets up pattern variables. However, instead of picking the
first pattern that matches, it binds all the patterns, and then
evaluates the body expression using them all.
(define-syntax (pair-reverse stx)
(syntax-case stx ()
[(_ arg ...)
(with-syntax ([(forward ...)
(syntax (arg ...))]
[(backward ...)
(datum->syntax stx
(reverse (syntax->datum (syntax (arg ...)))))])
#'(list (list forward backward) ...))]))
(pair-reverse 1 2 3 4)
;; Transforms to (list (list 1 4) (list 2 3) (list 3 2) (list 4 1))
Now that we have most of the prerequisites in place, let’s take a closer look at this 20ish line DB querying compiler extension.
#lang racket/base
(require
;; Imports code at the compile phase
(for-syntax
racket/base
racket/syntax
;; cut
srfi/26
;; sqlite3-connect
db))
;; New macro: define-table-accessors
(define-syntax (define-table-accessors stx)
;; Pattern match on the shape of the arguments
(syntax-case stx ()
;; This pattern clause accepts 2 args
;; db-path and table pattern variables
[(_ db-path table)
;; Begin starts a new block
(begin
;; Extract the table name string from
;; the pattern variable.
;; Same as (syntax->datum (syntax table)).
(define table-name (syntax->datum #'table))
;; Connect to SQLite using extracted db-path.
(define c
(sqlite3-connect #:database (syntax->datum #'db-path)))
;; Construct metadata query and extract list
;; of column names.
(define query
(format "select name from pragma_table_info(\"~a\");"
table-name))
;; Note: the (cut f x <> z) is equivalent to a
;; JS arrow function y => f(x,y,z).
(define column-names (map (cut vector-ref <> 0)
(query-rows c query)))
(define num-columns (length column-names))
(with-syntax
;; This pattern is the indices into the row
;; vector: (list 0 1 2 3 ...)
([(idx ...)
(datum->syntax stx (build-list num-columns values))]
;; This pattern is the list of names for
;; the new functions!
[(read-column ...)
(datum->syntax stx
(map (cut format-symbol "~a-~a"
table-name <>)
column-names))])
;; Create a sequence of functions
#'(begin
(define (read-column row)
(vector-ref row idx)) ...)))]))
Hat tip to real libraries which generate code from the database: SQLDelight from Cash App; Racquel, a Racket ORM which includes a much more advanced OOP version of what we have seen here.
Code generated from DB metadata makes sense compared to Rails or Django. There are clear advantages when a database is shared between codebases. No need to rewrite ORM schemas in a new language, or create dependencies to enable programmatic querying; we just get more from the database dependency we already have. Not to mention playing around with CSVs of unknown provenance.
Considering statically typed Racket exists and type info is in SQLite metadata tables, it is possible to add type hints to the generated functions. Then, not only do we get “dynamic language” static tooling like go-to-definition and unbound-identifier-error, but we get actual mypy/Typescript style type checking too.
And since it’s Racket, why not generate runtime validation functions based on DB types as well? That would be nice. Wouldn’t it be great if the Typescript compiler let us extend it to generate validation code from interface declarations? yeah..oh well.
With SQLite 3.37.0 and above, it is possible to use the
pragma_table_list()
table to generate bindings for every table using
just a reference to a database, not needing to specify any tables:
(define-table-accessors* "db.sqlite")
;; Defines functions for each table
;; using pragma_table_list
;; and pragma_table_info
Of course, if we don’t like the <tablename>-<columnname>
naming
system, we could add a prefix argument:
(define-table-accessors "db.sqlite" "todo" #:prefix t:)
(displayln (t:id todo))
The real heavy hitter would be this: statically checked variable
bindings for ad hoc SELECT
queries. The macro above only works with
SELECT *
queries for one table. It does not work with not ad hoc
SELECT
queries. In nearly every other language, when we have to run
an ad hoc query, we have to expect to pull a dynamically typed value
out of a dictionary using a string, if we are lucky. If we aren’t
lucky, we might have to use resort to using arr[0]
or arr[1]
to
look at query results. The horror! 😉
In Racket, using only a thin layer on top of the sql package, we could support ad hoc queries. Heck, we could put variable declarations right in the automatically-executed query.
(select [t.id tid]
[t.description tdesc]
[a.name assignee]
#:from (as todo t)
(inner-join (as people p)
(on (= p.id t.assignee_id)))
#:where (and (= t.done 0)
(= t.project_id ,project-id)))
(displayln tid)
(displayln assigne) ;; syntax error, typo
That’s all for today. Happy hacking!