I’ve been using the SQLite shell a lot lately to slice and dice data is slightly too large for excel. It’s a very powerful tool to get SQL on the command line without spinning up a large Postres server, and supports most of the SQL standard with syntax similar to Postgres. It’s super fast and supports loading CSV. I want to talk about some quality of life features in it that you probably don’t know about.
What is SQLite?
SQLite is the most widely distributed database in the world. It’s embedded as a library in your application. A lot of programming languages like python include it in the standard library. It stores application data in a single file that you can easily move around 1.
Installation of SQLite is available through your distribution’s package manager. If it’s not available there, go ahead and download the latest version here.
To open a SQLite file, at the command line, in the proper directory for your file run:
sqlite3 file
This opens up a prompt that you can run SQL and other commands at.
Importing Data
You can import CSV data at the command line using the .import file [table]
command where [table]
is the optional table name. If your CSV uses fancy
separators you can use the .sep CHAR
command to set your separator before
importing 2. By default it will load with the header as the column names.
Formatting The Output
You can format the output of your SQL in several ways. The default output
method is just some pipe separators called list
which is difficult to read on
a good day. For interactive use I like to set my .mode
to either table
or
column
. You can even have it generate tables in markdown by setting it to markdown
.
sqlite> .mode table
sqlite> select * from tag limit 10;
+----+-------------------------------------+--------------------------------+
| id | created_at | name |
+----+-------------------------------------+--------------------------------+
| 1 | 2024-06-19 14:40:31.672184981-04:00 | Bills and Statutes |
| 2 | 2024-06-19 14:40:32.487863971-04:00 | Inspector General |
| 3 | 2024-06-19 14:40:32.495319881-04:00 | House of Representatives |
| 4 | 2024-06-19 14:40:32.503659091-04:00 | Senate |
| 5 | 2024-06-19 14:40:32.511587207-04:00 | Department of Veterans Affairs |
| 6 | 2024-06-19 14:40:32.519278835-04:00 | Attorney General |
| 7 | 2024-06-19 14:40:32.52756259-04:00 | Fiscal year |
| 8 | 2024-06-19 14:40:33.103108651-04:00 | Sequoia National Park |
| 9 | 2024-06-19 14:40:33.111942766-04:00 | Kings Canyon National Park |
| 10 | 2024-06-19 14:40:33.121450233-04:00 | Interior Secretary |
+----+-------------------------------------+--------------------------------+
It’s also sometimes useful to format certain output from a query before you
display it. You can use printf
with the standard format specifiers to say how
you want it. When I’m summing up values I find it helpful to use this
printf('%,.2f', sum(x))
. Many times I wish I could define a single argument
function within the shell itself to take care of this, but say la vie.
You can use .output file
to send the results of the next command to a file.
This is especially useful if you’re including the results of the next command
in your blog post, or need to use the results elsewhere.
Examining Your Schema
.schema --indent TBL
shows you a pretty printed version of your tables sql
definition.
sqlite> .schema --indent tag
CREATE TABLE `tag`(
`id` integer PRIMARY KEY AUTOINCREMENT,
`created_at` datetime,
`name` text,
`short_line` text,
`hidden` numeric
);
CREATE UNIQUE INDEX `idx_tag_name` ON `tag`(`name`);
You can list your tables with .tables
and your indexes with .indexes
. Both
of these commands support using like pattern globs to identify tables and
indexes by name.
Profiling Your Queries
SQLite provides a lot of options for profiling your queries and seeing exact
statistics about the query. For instance, you might be aware of EXPLAIN
in
other databases, in SQLite that actually shows you the virtual machine op codes
it runs to execute your query. For a more human readable form you generally
want to EXPLAIN QUERY PLAN
.
However, that’s a lot to write and a lot to move your cursor around instead,
you can use the .eqp on
command to automatically explain your query before
you run the query.
The next command to be familiar is .timer on
. This shows you approximately how
long your query took to execute using wall clock time.
Finally, I find .stats on
to be very useful if you’re analyzing the memory
usage of your queries, and it’s just straight up extremely interesting to see
the details of your query execution. (Although this will not uncover the overhead of your language’s FFI interface or overhead within your program).
sqlite> select * from tag order by name limit 10;
Memory Used: 122240 (max 130544) bytes
Number of Outstanding Allocations: 448 (max 473)
Number of Pcache Overflow Bytes: 4104 (max 4104) bytes
Largest Allocation: 87200 bytes
Largest Pcache Allocation: 4104 bytes
Lookaside Slots Used: 0 (max 0)
Successful lookaside attempts: 0
Lookaside failures due to size: 0
Lookaside failures due to OOM: 0
Pager Heap Usage: 83168 bytes
Page cache hits: 3
Page cache misses: 19
Page cache writes: 0
Page cache spills: 0
Schema Heap Usage: 11720 bytes
Statement Heap/Lookaside Usage: 3560 bytes
Fullscan Steps: 9
Sort Operations: 0
Autoindex Inserts: 0
Virtual Machine Steps: 98
Reprepare operations: 0
Number of times run: 1
Memory used by prepared stmt: 3560
Bytes received by read(): 108797
Bytes sent to write(): 2083
Read() system calls: 130
Write() system calls: 113
Bytes read from storage: 0
Bytes written to storage: 32768
Cancelled write bytes: 0
I’m not going to cover exactly what all these fields mean, but certain things are extremely intersting to look and see the number of allocations. How the statement was prepared. How many bytes were read from disk. How big the prepared statement was. How many times it was used.
Alternative Shells
Although the builtin SQLite shell is pretty great there are many alternative ways to work your databases. Here’s a couple options.
litecli is a smart CLI with automatic suggestions baked right in. It provides syntax highlighting and readline integration.
DB Browser for SQLite is pretty great too. It provides a GUI front end to your database and allows you to visually browse your database. I like it for more advanced queries and interactive usage.
Wrapping Up
I hope you find these commands and in-depth explanations useful when you’re working with SQLite or slicing data up that is just ever slightly too big to use your standard GUI tools and are best handled via SQL. There’s more to the shell than what I covered here, so please go and take a look at the official documentation, it’s way more in depth than what I covered here.