Analyzing JSON from the command line
We’ve previously met jq, a command-line utility for parsing JSON data, and used
it to translate JSON documents into CSV files for use in spreadsheet
applications. Sometimes, though, we don’t want to fire up a spreadsheet just to
take a peek at our data. Fortunately, the UNIX command line comes loaded with
tools for wrangling tabular data, and jq can help get our data there.
For a simple example, let’s explore the data returned by npm’s downloads API.
$ curl -s https://api.npmjs.org/downloads/range/last-week
{"downloads":[{"day":"2015-02-27","downloads":43445588},...]}
Raw JSON. Easily processed by a script; not so easy for mere humans to read.
Simply piping into jq for formatting is a huge help:
$ curl -s https://api.npmjs.org/downloads/range/last-week \
| jq .
{
  "downloads": [
    {
      "day": "2015-02-27",
      "downloads": 43445588
    },
    ...
  ]
}
But we can do better, even with this nearly-trivial data set. Let’s refine that
jq invocation to flatten out the "downloads" collection into rows consisting of
the day and the number of downloads. We can then escape it for use with other
command-line utilities using the @sh formatter:
$ curl -s https://api.npmjs.org/downloads/range/last-week \
| jq -r '.downloads[] | [.day, .downloads] | @sh'
'2015-02-27' 43445588
'2015-02-28' 20085417
'2015-03-01' 17694831
'2015-03-02' 40264588
'2015-03-03' 32050972
'2015-03-04' 32772539
'2015-03-05' 38008592
Much easier to read! With a quick glance, we can scan the columns for changes in
downloads over time and notice the big drop over the weekend (02-28 and 03-01).
And now that the output is space-delimited, we can use any of our favorite
command-line tools to parse, process, and shape the results:
$ curl -s https://api.npmjs.org/downloads/range/last-week \
| jq -r '.downloads[] | [.day, .downloads] | @sh' \
| awk '{print $2,$1}' \
| sort -rn
43445588 '2015-02-27'
40264588 '2015-03-02'
38008592 '2015-03-05'
32772539 '2015-03-04'
32050972 '2015-03-03'
20085417 '2015-02-28'
17694831 '2015-03-01'
And there it is. Using awk to swap columns and sort to put things in order,
we’ve turned the JSON results into the top daily downloads for the past week,
all without leaving the comfort of the command line!
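Once the data is in rows, the same approach extends to quick summaries. As a rough sketch, here is one way awk might total and average the week's downloads. The rows are hard-coded from the output shown earlier so the example runs without network access or jq; the `rows` and `summary` variable names are just illustrative, and in practice the rows would come straight from the jq pipeline.

```shell
# Sample rows copied from the jq output earlier in this post.
# Assumption: in real use these would be piped in from curl and jq instead.
rows="'2015-02-27' 43445588
'2015-02-28' 20085417
'2015-03-01' 17694831
'2015-03-02' 40264588
'2015-03-03' 32050972
'2015-03-04' 32772539
'2015-03-05' 38008592"

# Add up the second column, then divide by the number of rows (NR) at the end.
# %d truncates the average to a whole number.
summary=$(printf '%s\n' "$rows" \
  | awk '{ total += $2 } END { printf "total=%d avg=%d", total, total / NR }')

echo "$summary"
```

For the week shown above, this prints total=224322527 avg=32046075. Swapping the hard-coded rows for the curl and jq pipeline should give the same kind of summary for whatever week the API returns.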