There are a number of tools specifically designed for manipulating JSON from the command line, and they will be a lot easier and more reliable to use than Awk, such as jq:

curl -s 'https://api.github.com/users/lambda' | jq -r '.name'

You can also do this with tools that are likely already installed on your system, like Python with its json module, avoiding any extra dependencies while still having the benefit of a proper JSON parser. The following examples assume you want to use UTF-8, which the original JSON should be encoded in and which most modern terminals use as well:

Python 3:

curl -s 'https://api.github.com/users/lambda' | \
    python3 -c "import sys, json; print(json.load(sys.stdin)['name'])"

Python 2:

export PYTHONIOENCODING=utf8
curl -s 'https://api.github.com/users/lambda' | \
    python2 -c "import sys, json; print json.load(sys.stdin)['name']"

Frequently Asked Questions

Why not a pure shell solution?

The standard POSIX/Single Unix Specification shell is a very limited language which doesn't contain facilities for representing sequences (lists or arrays) or associative arrays (also known as hash tables, maps, dicts, or objects in other languages). This makes representing the result of parsing JSON somewhat tricky in portable shell scripts. There are somewhat hacky ways to do it, but many of them break if keys or values contain certain special characters.
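To make that concrete, here is a sketch of the usual workarounds: the positional parameters are the only list POSIX sh gives you, and associative data has to be faked with eval and one variable per key, which falls apart as soon as a key isn't a clean identifier (all the values here are made up):

```shell
#!/bin/sh
# The positional parameters are the only "array" in POSIX sh.
set -- alice bob carol
echo "count: $#"
for name in "$@"; do
    echo "name: $name"
done

# Faking an associative array: one shell variable per key.
# This breaks if the key isn't a valid variable name.
user_alice="admin"
key="alice"
eval "role=\$user_$key"
echo "role: $role"
```

Note that `set --` clobbers the script's own arguments and only gives you one list at a time, which is exactly the limitation described above.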

Bash 4 and later, zsh, and ksh support arrays and associative arrays, but these shells are not universally available (macOS stopped updating Bash at Bash 3 due to the license change from GPLv2 to GPLv3, while many Linux systems don't ship zsh out of the box). You could write a script that works in either Bash 4 or zsh, one of which is available on most macOS, Linux, and BSD systems these days, but it would be tough to write a shebang line that works for such a polyglot script.
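For comparison, here is a sketch of what the associative-array syntax looks like in Bash 4+ (zsh and ksh93 are similar); this will not run under /bin/sh or the Bash 3 that ships with macOS, and the values are made up:

```shell
#!/usr/bin/env bash
# Bash 4+ associative array; think of it as one parsed JSON object.
declare -A user
user[name]="lambda"
user[id]="478"

# Iterate over the keys and print key=value pairs
for key in "${!user[@]}"; do
    printf '%s=%s\n' "$key" "${user[$key]}"
done
```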

Finally, writing a full-fledged JSON parser in shell would be a significant enough dependency that you might as well just use an existing dependency like jq or Python instead. A good implementation is not going to be a one-liner, or even a small five-line snippet.

Why not use awk, sed, or grep?

It is possible to use these tools for some quick extraction from JSON that has a known shape and is formatted in a known way, such as one key per line. There are several examples of this in other answers.

However, these tools are designed for line-based or record-based formats; they are not designed for recursive parsing of matched delimiters with possible escape characters.

So these quick-and-dirty solutions using awk/sed/grep are likely to be fragile and to break if some aspect of the input format changes, such as whitespace being collapsed, additional levels of nesting being added to the JSON objects, or an escaped quote appearing within a string. A solution robust enough to handle all JSON input without breaking will also be fairly large and complex, and so not much different from adding another dependency on jq or Python.
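To make the fragility concrete, here is a sketch: a naive grep extraction mangles a perfectly legal JSON document containing an escaped quote, while a real parser (Python here, though jq behaves the same) recovers the value correctly:

```shell
json='{"name":"say \"hi\"","id":1}'

# Naive: grab everything between the quotes after "name":
# The character class stops at the backslash-escaped quote.
naive=$(printf '%s' "$json" | grep -o '"name":"[^"]*"')
echo "naive:  $naive"

# Real parser: escape sequences are decoded for you.
parsed=$(printf '%s' "$json" | python3 -c 'import sys, json; print(json.load(sys.stdin)["name"])')
echo "parsed: $parsed"
```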

I have had to deal with large amounts of customer data being deleted due to poor input parsing in a shell script before, so I never recommend quick and dirty methods that may be fragile in this way. If you're doing some one-off processing, see the other answers for suggestions, but I still highly recommend just using an existing tested JSON parser.

Historical notes

This answer originally recommended jsawk, which should still work, but it is a little more cumbersome to use than jq and depends on a standalone JavaScript interpreter being installed, which is less common than a Python interpreter, so the answers above are probably preferable:

curl -s 'https://api.github.com/users/lambda' | jsawk -a 'return this.name'

This answer also originally used the Twitter API from the question, but that API no longer works, making the examples hard to copy and test, and the new Twitter API requires API keys, so I've switched to the GitHub API, which can be used easily without API keys. The first answer for the original question would have been:

curl 'http://twitter.com/users/username.json' | jq -r '.text'
Answer from Brian Campbell on Stack Overflow
2 of 16
338

To quickly extract the values for a particular key, I personally like to use "grep -o", which only returns the regex's match. For example, to get the "text" field from tweets, something like:

grep -Po '"text":.*?[^\\]",' tweets.json

This regex is more robust than you might think; for example, it deals fine with strings having embedded commas and escaped quotes inside them. I think with a little more work you could make one that is actually guaranteed to extract the value, if it's atomic. (If it has nesting, then a regex can't do it of course.)

And to further clean (albeit keeping the string's original escaping) you can use something like: | perl -pe 's/"text"://; s/^"//; s/",$//'. (I did this for this analysis.)

To all the haters who insist you should use a real JSON parser -- yes, that is essential for correctness, but

  1. To do a really quick analysis, like counting values to check on data cleaning bugs or get a general feel for the data, banging out something on the command line is faster. Opening an editor to write a script is distracting.
  2. grep -o is orders of magnitude faster than the Python standard json library, at least when doing this for tweets (which are ~2 KB each). I'm not sure if this is just because json is slow (I should compare to yajl sometime); but in principle, a regex should be faster since it's finite state and much more optimizable, instead of a parser that has to support recursion, and in this case, spends lots of CPU building trees for structures you don't care about. (If someone wrote a finite state transducer that did proper (depth-limited) JSON parsing, that would be fantastic! In the meantime we have "grep -o".)

To write maintainable code, I always use a real parsing library. I haven't tried jsawk, but if it works well, that would address point #1.

One last, wackier, solution: I wrote a script that uses Python's json module and extracts the keys you want into tab-separated columns; then I pipe it through a wrapper around awk that allows named access to columns: the json2tsv and tsvawk scripts. For this example it would be:

json2tsv id text < tweets.json | tsvawk '{print "tweet " text}'

This approach doesn't address #2, is less efficient than a single Python script, and it's a little brittle: it forces normalization of newlines and tabs in string values, to play nice with awk's field/record-delimited view of the world. But it does let you stay on the command line, with more correctness than grep -o.
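If you don't have those two scripts, jq's @tsv filter can stand in for json2tsv, and plain awk for tsvawk. A self-contained sketch with made-up input:

```shell
# Made-up sample input, one JSON object per line
printf '%s\n' '{"id":7,"text":"hello world"}' > tweets.json

# jq's @tsv plays the role of json2tsv; awk consumes the columns
jq -r '[.id, .text] | @tsv' tweets.json \
    | awk -F'\t' '{print "tweet " $2}'
```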

Top answer
1 of 13
89

The availability of parsers in nearly every programming language is one of the advantages of JSON as a data-interchange format.

Rather than trying to implement a JSON parser, you are likely better off using either a tool built for JSON parsing such as jq or a general purpose script language that has a JSON library.

For example, using jq, you could pull out the ImageID from the first item of the Instances array as follows:

jq '.Instances[0].ImageId' test.json

Alternatively, to get the same information using Ruby's JSON library:

ruby -rjson -e 'j = JSON.parse(File.read("test.json")); puts j["Instances"][0]["ImageId"]'

I won't answer all of your revised questions and comments but the following is hopefully enough to get you started.

Suppose that you had a Ruby script that could read JSON from STDIN and output the second line in your example output[0]. That script might look something like:

#!/usr/bin/env ruby
require 'json'

data = JSON.parse(ARGF.read)
instance_id = data["Instances"][0]["InstanceId"]
name = data["Instances"][0]["Tags"].find {|t| t["Key"] == "Name" }["Value"]
owner = data["Instances"][0]["Tags"].find {|t| t["Key"] == "Owner" }["Value"]
cost_center = data["Instances"][0]["SubnetId"].split("-")[1][0..3]
puts "#{instance_id}\t#{name}\t#{cost_center}\t#{owner}"

How could you use such a script to accomplish your whole goal? Well, suppose you already had the following:

  • a command to list all your instances
  • a command to get the JSON above for any instance on your list and output it to STDOUT

One way would be to use your shell to combine these tools:

echo -e "Instance id\tName\tcost centre\tOwner"
for instance in $(list-instances); do
    get-json-for-instance $instance | ./ugly-ruby-script.rb
done

Now, maybe you have a single command that gives you one JSON blob for all instances, with more items in that "Instances" array. Well, if that is the case, you'll just need to modify the script a bit to iterate through the array rather than simply using the first item.
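If you'd rather not modify the Ruby, iterating over a multi-item Instances array is also straightforward in jq. A sketch, using made-up instance data in the same shape as the example (the filename all-instances.json is hypothetical):

```shell
# Made-up sample: one blob with two items in the Instances array
cat > all-instances.json <<'EOF'
{"Instances": [
  {"InstanceId": "i-111", "Tags": [{"Key": "Name", "Value": "web-1"}]},
  {"InstanceId": "i-222", "Tags": [{"Key": "Name", "Value": "web-2"}]}
]}
EOF

# One tab-separated line per instance: InstanceId, then the Name tag
jq -r '.Instances[]
       | [.InstanceId, (.Tags[] | select(.Key == "Name") | .Value)]
       | @tsv' all-instances.json
```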

In the end, the way to solve this problem, is the way to solve many problems in Unix. Break it down into easier problems. Find or write tools to solve the easier problem. Combine those tools with your shell or other operating system features.

[0] Note that I have no idea where you get cost-center from, so I just made it up.

2 of 13
17

You can use the following Python script to parse that data. Let's assume that you have the JSON data from the arrays in files like array1.json, array2.json, and so on.

import json
import sys

# Python 2; reads the filename given as the first argument
jdata = open(sys.argv[1])
data = json.load(jdata)
jdata.close()

print "InstanceId", " - ", "Name", " - ", "Owner"
print data["Instances"][0]["InstanceId"], " - ", data["Instances"][0]["Tags"][1]["Value"], " - ", data["Instances"][0]["Tags"][2]["Value"]

And then just run:

$ for x in *.json; do python parse.py "$x"; done
InstanceId  -  Name  -  Owner
i-1234576  -  RDS_Machine (us-east-1c)  -  Jyoti Bhanot

I haven't seen cost in your data, which is why I didn't include it.

Following the discussion in the comments, I have updated the parse.py script:

import json
import sys

# Python 2; reads the JSON document from stdin
data = json.loads(sys.stdin.read())

print "InstanceId", " - ", "Name", " - ", "Owner"
print data["Instances"][0]["InstanceId"], " - ", data["Instances"][0]["Tags"][1]["Value"], " - ", data["Instances"][0]["Tags"][2]["Value"]

You can then run the following command:

#ec2-describe-instance <instance> | python parse.py
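For comparison, the same extraction is a one-liner with jq, and selecting tags by Key is safer than hard-coding Tags[1] and Tags[2], since tag order isn't guaranteed (the sample file below is reconstructed from the output shown above):

```shell
# Reconstructed sample matching the example output above
cat > array1.json <<'EOF'
{"Instances": [{"InstanceId": "i-1234576",
  "Tags": [{"Key": "Name", "Value": "RDS_Machine (us-east-1c)"},
           {"Key": "Owner", "Value": "Jyoti Bhanot"}]}]}
EOF

# Pick tags by Key instead of by position
jq -r '.Instances[0]
       | [.InstanceId,
          (.Tags[] | select(.Key == "Name")  | .Value),
          (.Tags[] | select(.Key == "Owner") | .Value)]
       | join("  -  ")' array1.json
```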
Top answer
1 of 1
3

First, consider whether you really need JSON. In your example code, you create a JSON string and then immediately decode it into a Python dict. Would it be simpler to just use a dict directly?

The problem with your current string is that you're not escaping your quotation marks properly. To avoid confusing multi-level escaping, use a triple-quoted string to represent the shell script and json.dumps to convert a dict into a JSON string:

import json
jsonstr = json.dumps({"script": """\
#!/bin/bash
set -e

readonly PRIMARY=/tech01/primary
readonly SECONDARY=/tech02/secondary
readonly LOCATION=(machineA machineB)
readonly MAPPED_LOCATION=/bat/data/snapshot
HOSTNAME=$hostname

dir1=$(ssh -o "StrictHostKeyChecking no" david@${LOCATION[0]} ls -dt1 "$MAPPED_LOCATION"/[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9] | head -n1)
dir2=$(ssh -o "StrictHostKeyChecking no" david@${LOCATION[1]} ls -dt1 "$MAPPED_LOCATION"/[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9] | head -n1)

echo $dir1
echo $dir2

length1=$(ssh -o "StrictHostKeyChecking no" david@${LOCATION[0]} "ls '$dir1' | wc -l")
length2=$(ssh -o "StrictHostKeyChecking no" david@${LOCATION[1]} "ls '$dir2' | wc -l")

echo $length1
echo $length2

if [ "$dir1" = "$dir2" ] && [ "$length1" -gt 0 ] && [ "$length2" -gt 0 ]
then
    rm -rf $PRIMARY/*
    rm -rf $SECONDARY/*
    for el in $primary_partition
    do
        scp david@${LOCATION[0]}:$dir1/weekly_8880_"$el"_5.data $PRIMARY/. || scp david@${LOCATION[1]}:$dir2/weekly_8880_"$el"_5.data $PRIMARY/.
    done
fi"""})

Alternatively, you could put the shell script in its own file and open the file to read the string from it.
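That file-based variant might look like this (a sketch; demo.sh is a made-up filename):

```shell
# Made-up example script to embed
cat > demo.sh <<'EOF'
#!/bin/bash
echo "hello"
EOF

# Read the file and let json.dumps do all the escaping
python3 -c 'import json; print(json.dumps({"script": open("demo.sh").read()}))'
```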

Top answer
1 of 4
6

One suggestion is to use --args with jq to create the two arrays and then collect these in the correct location in the main document. Note that --args is required to be the last option on the command line and that all the remaining command line arguments will become elements of the $ARGS.positional array.

{
    jq -n --arg key APP-Service1-Admin '{($key): $ARGS.positional}' --args a b
    jq -n --arg key APP-Service1-View  '{($key): $ARGS.positional}' --args c d
} |
jq -s --arg key 'AD Accounts' '{($key): add}' |
jq --arg Service service1-name --arg 'AWS account' service1-dev '$ARGS.named + .'

The first two jq invocations create a set of two JSON objects:

{
  "APP-Service1-Admin": [
    "a",
    "b"
  ]
}
{
  "APP-Service1-View": [
    "c",
    "d"
  ]
}

The third jq invocation uses -s to read that set into an array, which becomes a merged object when passed through add. The merged object is assigned to our top-level key:

{
  "AD Accounts": {
    "APP-Service1-Admin": [
      "a",
      "b"
    ],
    "APP-Service1-View": [
      "c",
      "d"
    ]
  }
}

The last jq adds the remaining top-level keys and their values to the object:

{
  "Service": "service1-name",
  "AWS account": "service1-dev",
  "AD Accounts": {
    "APP-Service1-Admin": [
      "a",
      "b"
    ],
    "APP-Service1-View": [
      "c",
      "d"
    ]
  }
}

With jo:

jo -d . \
    Service=service1-name \
    'AWS account'=service1-dev  \
    'AD Accounts.APP-Service1-Admin'="$(jo -a a b)" \
    'AD Accounts.APP-Service1-View'="$(jo -a c d)"

The "internal" object is created using .-notation (enabled with -d .), and a couple of command substitutions for creating the arrays.

Or you can drop the -d . and use a form of array notation:

jo  Service=service1-name \
    'AWS account'=service1-dev \
    'AD Accounts[APP-Service1-Admin]'="$(jo -a a b)" \
    'AD Accounts[APP-Service1-View]'="$(jo -a c d)"
2 of 4
4

I often use heredocs when creating complicated JSON objects in bash:

service=$(thing-what-gets-service)
account=$(thing-what-gets-account)
admin=$(jo -a $(thing-what-gets-admin))
view=$(jo -a $(thing-what-gets-view))

read -rd '' json <<EOF
[
    {
        "Service": "$service",
        "AWS Account": "$account",
        "AD Accounts": {
            "APP-Service1-Admin": $admin,
            "APP-Service1-View": $view
        }
    }
]
EOF

This uses jo to create the arrays, as it's a pretty simple way to do it, but it could be done differently if needed.
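One caveat with the heredoc: if any of those command substitutions ever produces a double quote or a backslash, the resulting document is invalid JSON. jq --arg sidesteps that by doing the escaping for you; a sketch with made-up values:

```shell
service='service1-name with a "quote"'
account='service1-dev'

# --arg escapes the values, so the output is always valid JSON
json=$(jq -n \
    --arg service "$service" \
    --arg account "$account" \
    '[{Service: $service, "AWS Account": $account}]')
echo "$json"
```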
