Using the SQLite Shell the Right Way
2024 July 04

I’ve been using the SQLite shell a lot lately to slice and dice data is slightly too large for excel. It’s a very powerful tool to get SQL on the command line without spinning up a large Postres server, and supports most of the SQL standard with syntax similar to Postgres. It’s super fast and supports loading CSV. I want to talk about some quality of life features in it that you probably don’t know about.

What is SQLite?

SQLite is the most widely distributed database in the world. It’s embedded as a library in your application. A lot of programming languages like python include it in the standard library. It stores application data in a single file that you can easily move around 1.

Installation of SQLite is available through your distribution’s package manager. If it’s not available there, go ahead and download the latest version here.

To open a SQLite file, at the command line, in the proper directory for your file run:

sqlite3 file

This opens up a prompt that you can run SQL and other commands at.

Importing Data

You can import CSV data at the command line using the .import file [table] command where [table] is the optional table name. If your CSV uses fancy separators you can use the .sep CHAR command to set your separator before importing 2. By default it will load with the header as the column names.

Formatting The Output

You can format the output of your SQL in several ways. The default output method is just some pipe separators called list which is difficult to read on a good day. For interactive use I like to set my .mode to either table or column. You can even have it generate tables in markdown by setting it to markdown.

sqlite> .mode table
sqlite> select * from tag limit 10;
| id |             created_at              |              name              |
| 1  | 2024-06-19 14:40:31.672184981-04:00 | Bills and Statutes             |
| 2  | 2024-06-19 14:40:32.487863971-04:00 | Inspector General              |
| 3  | 2024-06-19 14:40:32.495319881-04:00 | House of Representatives       |
| 4  | 2024-06-19 14:40:32.503659091-04:00 | Senate                         |
| 5  | 2024-06-19 14:40:32.511587207-04:00 | Department of Veterans Affairs |
| 6  | 2024-06-19 14:40:32.519278835-04:00 | Attorney General               |
| 7  | 2024-06-19 14:40:32.52756259-04:00  | Fiscal year                    |
| 8  | 2024-06-19 14:40:33.103108651-04:00 | Sequoia National Park          |
| 9  | 2024-06-19 14:40:33.111942766-04:00 | Kings Canyon National Park     |
| 10 | 2024-06-19 14:40:33.121450233-04:00 | Interior Secretary             |

It’s also sometimes useful to format certain output from a query before you display it. You can use printf with the standard format specifiers to say how you want it. When I’m summing up values I find it helpful to use this printf('%,.2f', sum(x)). Many times I wish I could define a single argument function within the shell itself to take care of this, but say la vie.

You can use .output file to send the results of the next command to a file. This is especially useful if you’re including the results of the next command in your blog post, or need to use the results elsewhere.

Examining Your Schema

.schema --indent TBL shows you a pretty printed version of your tables sql definition.

sqlite> .schema --indent tag
  `created_at` datetime,
  `name` text,
  `short_line` text,
  `hidden` numeric
CREATE UNIQUE INDEX `idx_tag_name` ON `tag`(`name`);

You can list your tables with .tables and your indexes with .indexes. Both of these commands support using like pattern globs to identify tables and indexes by name.

Profiling Your Queries

SQLite provides a lot of options for profiling your queries and seeing exact statistics about the query. For instance, you might be aware of EXPLAIN in other databases, in SQLite that actually shows you the virtual machine op codes it runs to execute your query. For a more human readable form you generally want to EXPLAIN QUERY PLAN.

However, that’s a lot to write and a lot to move your cursor around instead, you can use the .eqp on command to automatically explain your query before you run the query.

The next command to be familiar is .timer on. This shows you approximately how long your query took to execute using wall clock time.

Finally, I find .stats on to be very useful if you’re analyzing the memory usage of your queries, and it’s just straight up extremely interesting to see the details of your query execution. (Although this will not uncover the overhead of your language’s FFI interface or overhead within your program).

sqlite> select * from tag order by name limit 10;
Memory Used:                         122240 (max 130544) bytes
Number of Outstanding Allocations:   448 (max 473)
Number of Pcache Overflow Bytes:     4104 (max 4104) bytes
Largest Allocation:                  87200 bytes
Largest Pcache Allocation:           4104 bytes
Lookaside Slots Used:                0 (max 0)
Successful lookaside attempts:       0
Lookaside failures due to size:      0
Lookaside failures due to OOM:       0
Pager Heap Usage:                    83168 bytes
Page cache hits:                     3
Page cache misses:                   19
Page cache writes:                   0
Page cache spills:                   0
Schema Heap Usage:                   11720 bytes
Statement Heap/Lookaside Usage:      3560 bytes
Fullscan Steps:                      9
Sort Operations:                     0
Autoindex Inserts:                   0
Virtual Machine Steps:               98
Reprepare operations:                0
Number of times run:                 1
Memory used by prepared stmt:        3560
Bytes received by read():            108797
Bytes sent to write():               2083
Read() system calls:                 130
Write() system calls:                113
Bytes read from storage:             0
Bytes written to storage:            32768
Cancelled write bytes:               0

I’m not going to cover exactly what all these fields mean, but certain things are extremely intersting to look and see the number of allocations. How the statement was prepared. How many bytes were read from disk. How big the prepared statement was. How many times it was used.

Alternative Shells

Although the builtin SQLite shell is pretty great there are many alternative ways to work your databases. Here’s a couple options.

litecli is a smart CLI with automatic suggestions baked right in. It provides syntax highlighting and readline integration.

DB Browser for SQLite is pretty great too. It provides a GUI front end to your database and allows you to visually browse your database. I like it for more advanced queries and interactive usage.

Wrapping Up

I hope you find these commands and in-depth explanations useful when you’re working with SQLite or slicing data up that is just ever slightly too big to use your standard GUI tools and are best handled via SQL. There’s more to the shell than what I covered here, so please go and take a look at the official documentation, it’s way more in depth than what I covered here.

  1. Although when running you may have multiple files present for shared memory and the write ahead log. 

  2. For some reason the Federal government really likes to use pipes (|) for separators in the data files they distribute. 

Remember you can also subscribe using RSS at the top of the page!

Share this on → Mastodon Twitter LinkedIn Reddit

A selected list of related posts that you might enjoy:

Written by Henry J Schmale on 2024 July 04
Hit Counter