Analyzing JSON from the command line

We’ve previously met jq, a command-line utility for parsing JSON data, and used it to translate JSON documents into CSV files for use in spreadsheet applications. Sometimes, though, we don’t want to fire up a spreadsheet just to take a peek at our data. Fortunately, the UNIX command line comes loaded with tools for wrangling tabular data, and jq can help get our data there.

For a simple example, let’s explore the data returned by npm’s downloads API.

$ curl -s

Raw JSON: easily processed by a script, but not so easy for mere humans to read. Simply piping the output into jq for formatting is a huge help:

$ curl -s \
  | jq .
{
  "downloads": [
    {
      "day": "2015-02-27",
      "downloads": 43445588
    },
    ...

But we can do better, even with this nearly trivial data set. Let’s refine that jq invocation to flatten the "downloads" collection into rows consisting of the day and the number of downloads. We can then escape each row for use with other command-line utilities using the @sh formatter:

$ curl -s \
  | jq -r '.downloads[] | [.day, .downloads] | @sh'
'2015-02-27' 43445588
'2015-02-28' 20085417
'2015-03-01' 17694831
'2015-03-02' 40264588
'2015-03-03' 32050972
'2015-03-04' 32772539
'2015-03-05' 38008592

Much easier to read! With a quick glance, we can scan the columns for changes in downloads over time and notice big drops on the weekend (02-28 and 03-01). And now that the output is space-delimited, we can use any of our favorite command-line tools to parse, process, and shape the results:

$ curl -s \
  | jq -r '.downloads[] | [.day, .downloads] | @sh' \
  | awk '{print $2,$1}' \
  | sort -rn
43445588 '2015-02-27'
40264588 '2015-03-02'
38008592 '2015-03-05'
32772539 '2015-03-04'
32050972 '2015-03-03'
20085417 '2015-02-28'
17694831 '2015-03-01'
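Once the data is in columns, awk can aggregate as well as rearrange. As a sketch, summing column 2 gives the week’s total downloads; the rows are inlined here for illustration, but in practice they would come straight from the jq pipeline:

```shell
# Sum the download counts (column 2) across all rows.
printf '%s\n' \
  "'2015-02-27' 43445588" \
  "'2015-02-28' 20085417" \
  "'2015-03-01' 17694831" \
  "'2015-03-02' 40264588" \
  "'2015-03-03' 32050972" \
  "'2015-03-04' 32772539" \
  "'2015-03-05' 38008592" \
  | awk '{sum += $2} END {print sum}'
# 224322527
```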

And there it is. Using awk to swap columns and sort to put things in order, we’ve turned the JSON results into a ranked list of daily downloads for the past week, all without leaving the comfort of the command line!