This is a lightning talk I gave at the CPLUG November 2020 meeting on how to use jq with bash to deal with JSON in modern tooling.
More and more programs are using JSON as a data interchange format.
See docker inspect
JSON is not easily parsed using grep, sed, and awk in shell pipelines.
JSON is easily consumed via many programming languages, and this makes it
very popular as an interchange format.
Shell pipelines are great for explorations of certain APIs and running programs.
Composable, Fast, Parallel by default.
The Solution: jq
jq can be thought of as a stream editor
(sed) for JSON data. It can slice and filter, map, and transform this
data very easily.
Extract elements out of an array.
Extract things matching a boolean expression
Apply some operation to the resulting value. Ex. add, subtract, concatenate.
Take one JSON input, and move the fields around to become a different JSON object.
Real World Usages
Parsing out health check information. Many services have a health check endpoint.
Not every business has set up proper monitoring for these endpoints.
Extracting fields from your security IDS.
You find your company had a data breach over the VPN. You have a
JSON log of where everyone logged in from. You need to find the
Working with object stores on the command line.
Ex. Pumping information out of MongoDB, and working on it with the
I initially learned about jq through Hacker News, and picked it up to
compete in a capture the flag event.
Usages of jq
Direct file input:
From a pipe:
Sending the output to another operation:
I know this is a useless use of cat, but it’s for example purposes.
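The three invocation styles above can be sketched as follows. This is a minimal sketch, not the talk's original commands: the file people.json and its contents are hypothetical and are created here only so the example is self-contained.

```shell
# Hypothetical sample file, created in a temp directory for this sketch.
tmpdir=$(mktemp -d)
printf '%s\n' '{"name": "Ann", "age": 31}' > "$tmpdir/people.json"

# Direct file input: the file name follows the filter.
from_file=$(jq '.name' "$tmpdir/people.json")

# From a pipe: jq reads standard input when no file is given.
from_pipe=$(cat "$tmpdir/people.json" | jq '.name')

# Sending the output to another operation: -r prints the raw string
# (without JSON quotes) so that tr receives plain text.
upper=$(cat "$tmpdir/people.json" | jq -r '.name' | tr '[:lower:]' '[:upper:]')

echo "$from_file" "$from_pipe" "$upper"
rm -r "$tmpdir"
```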
Examples of jq operations
The Identity Transformation
An identity transformation just shunts the input to the output. jq will pretty print to help the human read the output.
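A minimal sketch of the identity filter, assuming a small hypothetical object as input. The input arrives on one line and jq prints it back indented.

```shell
# The '.' filter passes the input through unchanged; jq pretty-prints it.
pretty=$(echo '{"name":"Ann","age":31}' | jq '.')
echo "$pretty"
```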
Get the value of a key
We want to extract the value found at some path through the JSON object’s keys.
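One way to sketch a key lookup, assuming a hypothetical nested object. Keys are chained with dots to walk down the path.

```shell
# Hypothetical sample object with a nested "address" object.
person='{"name": "Ann", "address": {"city": "Springfield", "zip": "00000"}}'

# .address.city walks from the top-level key to the nested key.
city=$(echo "$person" | jq '.address.city')
echo "$city"
```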
Get the value of multiple keys
Sometimes we want to get more than 1 key out at a time. A comma can separate the multiple keys to pull.
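A small sketch of the comma operator, again on a hypothetical object. Each comma-separated expression produces its own output line.

```shell
# Hypothetical sample object.
person='{"name": "Ann", "age": 31, "city": "Springfield"}'

# .name, .age emits two results, one per expression.
pair=$(echo "$person" | jq '.name, .age')
echo "$pair"
```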
Extracting a key from an array
To access the objects in an array, the array must first be unwrapped with .[]; we can then pipe the objects to the key expressions.
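A sketch of unwrapping an array with the .[] iterator and piping each element to a key expression. The people array is hypothetical sample data.

```shell
# Hypothetical sample array of objects.
people='[{"name": "Ann"}, {"name": "Bob"}, {"name": "Cid"}]'

# .[] emits each element of the array; .name is then applied to each one.
names=$(echo "$people" | jq '.[] | .name')
echo "$names"
```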
Indexing an Array
jq uses the standard array syntax to pull out single elements. In this case, the last object is grabbed.
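A sketch of indexing, assuming a hypothetical three-element array. A negative index counts from the end, which is one way to grab the last object without knowing the array length.

```shell
# Hypothetical sample array of objects.
people='[{"name": "Ann"}, {"name": "Bob"}, {"name": "Dee"}]'

# .[-1] selects the last element; -c prints it on a single line.
last=$(echo "$people" | jq -c '.[-1]')
echo "$last"
```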
Slicing a range of elements from an array
It might be necessary to pull a contiguous series of elements. That is a slicing operation. A range follows the pattern start:end_exclusive. In this example we get the middle 2 elements.
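A sketch of slicing, assuming a hypothetical four-element array so that indexes 1 and 2 are the middle two elements.

```shell
# Hypothetical sample array with four elements (indexes 0 through 3).
names='["Ann", "Bob", "Cid", "Dee"]'

# .[1:3] starts at index 1 and stops before index 3.
middle=$(echo "$names" | jq -c '.[1:3]')
echo "$middle"
```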
Selecting specific elements from an array
In this example, the first and last elements in our example are pulled by index.
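One possible way to pull the first and last elements by index, using the same hypothetical array of names. The comma emits each indexed element as a separate result.

```shell
# Hypothetical sample array.
names='["Ann", "Bob", "Cid", "Dee"]'

# .[0] is the first element and .[-1] is the last.
ends=$(echo "$names" | jq '.[0], .[-1]')
echo "$ends"
```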
Transforming a series of objects
Sometimes we want to change how a JSON object is formatted, such as creating an array where the first element is the name and the remaining elements are that person’s hobbies. The first operation unpacks the array. The second builds a new array from the name followed by the splatted hobbies array.
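A sketch of that reshaping, assuming hypothetical objects with a name key and a hobbies array.

```shell
# Hypothetical sample data: each person has a name and a list of hobbies.
people='[
  {"name": "Ann", "hobbies": ["chess", "running"]},
  {"name": "Bob", "hobbies": ["fishing"]}
]'

# .[] unpacks the outer array; [.name, .hobbies[]] builds a new array
# holding the name followed by every hobby.
rows=$(echo "$people" | jq -c '.[] | [.name, .hobbies[]]')
echo "$rows"
```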
Mapping to a new value
Arithmetic and basic string operations can be applied using jq. It prints out the modified object. In this case, we are doubling every person’s age.
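A sketch of doubling each age, assuming hypothetical objects with a numeric age key. The map function and the *= update operator are one way to express this; the original talk may have used a different filter.

```shell
# Hypothetical sample data.
people='[{"name": "Ann", "age": 31}, {"name": "Bob", "age": 45}]'

# map applies the filter to every element; .age *= 2 updates age in place.
doubled=$(echo "$people" | jq -c 'map(.age *= 2)')
echo "$doubled"
```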
Sometimes we want to filter things. We can use the select function in jq. It applies a boolean expression to each input and keeps only the inputs for which the expression is true.
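A sketch of filtering with select, assuming hypothetical people objects and an arbitrary age threshold of 40.

```shell
# Hypothetical sample data.
people='[
  {"name": "Ann", "age": 31},
  {"name": "Bob", "age": 45},
  {"name": "Cid", "age": 27},
  {"name": "Dee", "age": 52}
]'

# select passes an element through only when the expression is true.
over_40=$(echo "$people" | jq -r '.[] | select(.age > 40) | .name')
echo "$over_40"
```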
Building a histogram
A histogram is a very useful tool for determining the frequency of values. jq has reduce and group_by, but it is often simpler to lean on other shell commands to get us there. In this example we use the last 5 commits from the jq repository to determine who committed the most in that time period. Awk could be used as an alternative to the sort | uniq pattern.
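A sketch of the histogram pipeline. To stay runnable offline, it uses hypothetical inline data rather than a live API call; the nested commit.author.name shape and the author names are assumptions made for illustration.

```shell
# Hypothetical stand-in for a list of five commits.
commits='[
  {"commit": {"author": {"name": "alice"}}},
  {"commit": {"author": {"name": "bob"}}},
  {"commit": {"author": {"name": "alice"}}},
  {"commit": {"author": {"name": "carol"}}},
  {"commit": {"author": {"name": "alice"}}}
]'

# jq extracts one author name per line; sort groups identical names,
# uniq -c counts each group, and sort -rn puts the largest count first.
histogram=$(echo "$commits" | jq -r '.[].commit.author.name' | sort | uniq -c | sort -rn)
echo "$histogram"
```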
Adding a Downloaded Field to a JSON Object
It can be useful to timestamp a downloaded JSON object. For example, we are pulling from a random endpoint that needs to be cached locally on the web server, and the end user needs to be told when the file was last updated. In this example, the updated_at field is added to an object stored in a file.
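A sketch of adding the updated_at field to an object stored in a file. The file name cache.json, its contents, and the timestamp format are assumptions. jq does not edit files in place, so the result is written to a temporary file and then moved over the original.

```shell
# Hypothetical cached file, created in a temp directory for this sketch.
tmpdir=$(mktemp -d)
cache="$tmpdir/cache.json"
printf '%s\n' '{"status": "ok"}' > "$cache"

# Current UTC time as a string; --arg binds it to $updated_at inside jq.
stamp=$(date -u +%Y-%m-%dT%H:%M:%SZ)
jq --arg updated_at "$stamp" '.updated_at = $updated_at' "$cache" > "$cache.tmp" \
  && mv "$cache.tmp" "$cache"

result=$(jq -c '.' "$cache")
echo "$result"
rm -r "$tmpdir"
```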
Useful jq Command Flags
--compact-output / -c
Cut down on the white space and the pretty printing. Pretty printing
is somewhat expensive because of the extra bytes it produces. Useful when
chaining jq calls.
--unbuffered
Are you reading from a slow source? This sends each result to the next
pipe or output as soon as it’s ready.
--arg name value
jq can use variables in your expression. If you call it with jq
--arg foo 123, the string "123" will be bound to $foo in your expression.
--sort-keys / -S
Sort the fields of each output object by key.
--monochrome-output / -M
Do NOT output any color. By default jq colorizes output when writing to a terminal.
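The flags in this section can be tried out with a sketch like the one below, run against a small hypothetical object. Note that --arg always binds a string, so a numeric comparison needs tonumber.

```shell
# Hypothetical sample object.
doc='{"name": "Ann", "age": 31}'

# -c prints each result on one line instead of pretty-printing it.
compact=$(echo "$doc" | jq -c '.')

# -S sorts the keys of every output object.
sorted=$(echo "$doc" | jq -c -S '.')

# --arg binds the string "123" to $foo; -n runs the filter with no input.
bound=$(jq -n --arg foo 123 '$foo')
as_number=$(jq -n --arg foo 123 '$foo | tonumber')

# -M disables color, which helps when output is captured in logs.
plain=$(echo "$doc" | jq -M -c '.name')

printf '%s\n' "$compact" "$sorted" "$bound" "$as_number" "$plain"
```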