Very simple:
import json
data = json.loads('{"one" : "1", "two" : "2", "three" : "3"}')
print(data['two']) # or `print data['two']` in Python 2
Answer from John Giotta on Stack Overflow
For a file or other file-like object (such as an open URL response), use json.load(). For a string containing JSON, use json.loads().
#!/usr/bin/env python3
import json
# from pprint import pprint

json_file = 'my_cube.json'
cube = '1'

# json.load() reads from an open file object
with open(json_file) as json_data:
    data = json.load(json_data)

# pprint(data)
print("Dimension: ", data['cubes'][cube]['dim'])
print("Measures: ", data['cubes'][cube]['meas'])
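The load/loads distinction above can be sketched without touching the disk. This is a minimal illustration, not part of the original answer: io.StringIO stands in for an opened file (or any other object with a .read() method), and the sample text reuses the JSON string from the first answer.

```python
import io
import json

text = '{"one": "1", "two": "2", "three": "3"}'

# json.loads() parses a string that is already in memory.
from_string = json.loads(text)

# json.load() parses from a file-like object; StringIO is a stand-in
# for what open('some_file.json') would return.
file_like = io.StringIO(text)
from_file = json.load(file_like)

print(from_string['two'])
print(from_file == from_string)
```

Both calls produce the same dictionary; the only difference is where the JSON text comes from.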
Python Parse JSON array - Stack Overflow
Handling JSON files with ease in Python
How to parse different part of json file (Newbie here)
Better way to parse insanely complex nested json data
In your for loop, each item in json_array is a dictionary, and that dictionary does not have a key store_details. So I modified the program a little bit:
import json

# Open the file in a with block so it is closed automatically
with open('stores-small.json') as input_file:
    json_array = json.load(input_file)

store_list = []
for item in json_array:
    # Each item is a dict; copy only the fields we need
    store_details = {"name": None, "city": None}
    store_details['name'] = item['name']
    store_details['city'] = item['city']
    store_list.append(store_details)

print(store_list)
If you arrived at this question simply looking for a way to read a json file into memory, then use the built-in json module.
with open(file_path, 'r') as f:
    data = json.load(f)
If you have a json string in memory that needs to be parsed, use json.loads() instead:
data = json.loads(my_json_string)
Either way, now data is converted into a Python data structure (list/dictionary) that may be (deeply) nested and you'll need Python methods to manipulate it.
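As a rough illustration of what "a Python data structure" means here: JSON objects become dicts, arrays become lists, true/false become True/False, and null becomes None. The sample string below is hypothetical (it borrows a store name from the example output in this answer), and the key names are assumptions made for the sketch.

```python
import json

raw = ('{"stores": [{"name": "Mall of America", "city": "Bloomington"}], '
       '"count": 1, "open": true, "owner": null}')
data = json.loads(raw)

print(type(data).__name__)            # the top-level JSON object
print(type(data["stores"]).__name__)  # a JSON array

# Nested values are reached by chaining keys and indexes.
first_city = data["stores"][0]["city"]

# dict.get() avoids a KeyError when a key may be absent.
region = data.get("region", "unknown")

print(first_city, region, data["open"], data["owner"])
```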
If you arrived here looking for ways to get values under several keys as in the OP, then the question is about looping over a Python data structure. For a not-so-deeply-nested data structure, the most readable (and possibly the fastest) way is a list / dict comprehension. For example, for the requirement in the OP, a list comprehension does the job.
store_list = [{'name': item['name'], 'city': item['city']} for item in json_array]
# [{'name': 'Mall of America', 'city': 'Bloomington'}, {'name': 'Tempe Marketplace', 'city': 'Tempe'}]
Other types of common data manipulation:
For a nested list where each sub-list is a list of items in json_array:
store_list = [[item['name'], item['city']] for item in json_array]
# [['Mall of America', 'Bloomington'], ['Tempe Marketplace', 'Tempe']]
For a dictionary of lists where each key-value pair is a category and its values in json_array:
store_data = {'name': [], 'city': []}
for item in json_array:
    store_data['name'].append(item['name'])
    store_data['city'].append(item['city'])
# {'name': ['Mall of America', 'Tempe Marketplace'], 'city': ['Bloomington', 'Tempe']}
For a "transposed" nested list where each sub-list is a "category" in json_array:
store_list = list(store_data.values())
# [['Mall of America', 'Tempe Marketplace'], ['Bloomington', 'Tempe']]
I have finished writing the third article in the Data Engineering with Python series. This is about working with JSON data in Python. I have tried to cover every necessary use case. If you have any other suggestions, let me know.
Working with JSON in Python
Data Engineering with Python series
Hi all,
Long time programmer here, but new to Python. This will be a long and, I think, complicated issue, so appreciate anyone who reads through it all and has any suggestions. I've looked up different ways to pull this data and don't seem to be making any progress. I'm sure there's a much better way.
I'm writing a program that will connect to our library to pull a list of everything we have checked out and I want to output a sorted list by due date and whether it has holds or not. I've got the code working to log in and pull a json data structure, but I cannot get it to export the data in the correct order. The json data is (to me) hideously complex, with some data (due date) in one section and other data in another section. I'm able to pull the fields I want, but keeping them together is proving challenging.
For example, the title and subtitle are in the 'bibs/briefinfo' section with a key value of 'title' or 'subtitle'. Due Date is also in the 'checkouts' section with a key value of 'dueDate'. When I loop through them, though, the Titles are in one order, the due dates are in another order and the subtitles another.
I used BeautifulSoup because it's a webpage with json in it, so used BS to read the webpage.
I'm wanting to pull the following fields for each book so I can display the info for each book:
title, subtitle, contentType from the briefInfo section
dueDate from the checkouts section
heldCopies and availableCopies from the availability section
Here's the pertinent section of my code:
# item_generator and index_page are defined elsewhere in my program (not shown)
soup = BeautifulSoup(index_page.text, 'html.parser')
all_scripts = soup.find_all('script', {"type":"application/json"})
for script in all_scripts:
    jsondata = json.loads(script.text)
    print(jsondata)
    output = []
    for i in item_generator(jsondata, "bibTitle"):
        ans = {i}
        print(i)
        output.append(ans)
    for i in item_generator(jsondata, "dueDate"):
        ans = {i}
        output.append(ans)
    print("Subtitle----------------------")
    for i in item_generator(jsondata, "subtitle"):
        ans = {i}
        print(i)
        output.append(ans)
    print(output)

Here's the json output from my print statement so I can see what I'm working with. I tried to format it so it's easier to read. I removed a lot of other elements to keep the size down. Hopefully I didn't break any of the brackets.
{
'app':
{
'coreCssFingerprint': '123123123',
'coreAssets':
{
'cdnHost': 'https://xyz.com',
'cssPath': '/dynamic_stylesheet',
'defaultStylesheet': 'xyz.css'
},
},
'entities':
{
'listItems': {},
'cards': {},
'accounts':
{
'88888888':
{'barcode': '999999999',
'expiryDate': None,
'id': 88888888,
}
},
'shelves':
{'88888888':
{'1111222222':
{
'id': 1111222222,
'metadataId': 'S00A1122334',
'shelf': 'for_later',
'privateItem': True,
'dateAdded': '2023-12-30',
},
}
},
'users':
{'88888888':
{ 'accounts': [88888888],
'status': 'A',
'showGroupingDebug': False,
'avatarUrl': '',
'id': 88888888,
}
},
'eventPrograms': {}, 'checkouts':
{'112233445566778899':
{
'checkoutId': '112233445566778899',
'materialType': 'PHYSICAL',
'dueDate': '2024-08-26',
'metadataId': 'S99Z000000',
'bibTitle': "The Lord of the Rings"
},
'998877665544332211':
{
'checkoutId': '998877665544332211',
'materialType': 'PHYSICAL',
'dueDate': '2024-08-26',
'metadataId': 'S88Y00000',
'bibTitle': 'The Lord of the Rings'
},
},
'eventSeries': {}, 'catalogBibs': {},'bibs':
{'S88Y00000':
{
'id': 'S88Y00000',
'briefInfo':
{
'superFormats': ['BOOKS', 'MODERN_FORMATS'],
'genreForm': [],
'callNumber': '123.456',
'authors': ['Tolkien, J.R.R.'],
'metadataId': 'S88Y00000',
'jacket':
{
'type': 'hardcover',
'local_url': None
},
'contentType': 'FICTION',
'format': 'BK',
'subtitle': 'The Two Towers',
'title': 'The Lord of the Rings',
'id': 'S88Y00000',
},
'availability':
{
'heldCopies': 0,
'singleBranch': False,
'metadataId': 'S88Y00000',
'statusType': 'AVAILABLE',
'totalCopies': 3,
'availableCopies': 2
}
},
'S77X12345':
{
'id': 'S77X12345',
'briefInfo':
{
'superFormats': ['BOOKS', 'MODERN_FORMATS'],
'genreForm': [],
'callNumber': '123.457',
'authors': ['Tolkien, J.R.R.'],
'metadataId': 'S77X12345',
'jacket':
{
'type': 'hardcover',
'local_url': None
},
'contentType': 'FICTION',
'format': 'BK',
'subtitle': 'The Fellowship of the Ring',
'title': 'The Lord of the Rings',
'id': 'S77X12345',
},
'availability':
{
'heldCopies': 0,
'singleBranch': False,
'metadataId': 'S77X12345',
'statusType': 'AVAILABLE',
'totalCopies': 2,
'availableCopies': 1
}
}
Anyone know of a better way to parse this data? Thanks!
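One possible way to keep the fields together, sketched under assumptions: each checkout record carries a metadataId, and the bibs section appears to be keyed by that same id, so the due date can be joined to the title, subtitle and availability by looking the id up rather than walking the whole tree once per field. The sample data below is a trimmed, hypothetical stand-in shaped like the dump above (the due dates are altered so the sort is visible), and build_rows is an illustrative name, not part of any library.

```python
# Trimmed, hypothetical sample shaped like the dump above.
jsondata = {
    "entities": {
        "checkouts": {
            "c1": {"dueDate": "2024-08-26", "metadataId": "S88Y00000"},
            "c2": {"dueDate": "2024-08-19", "metadataId": "S77X12345"},
        },
        "bibs": {
            "S88Y00000": {
                "briefInfo": {"title": "The Lord of the Rings",
                              "subtitle": "The Two Towers",
                              "contentType": "FICTION"},
                "availability": {"heldCopies": 0, "availableCopies": 2},
            },
            "S77X12345": {
                "briefInfo": {"title": "The Lord of the Rings",
                              "subtitle": "The Fellowship of the Ring",
                              "contentType": "FICTION"},
                "availability": {"heldCopies": 0, "availableCopies": 1},
            },
        },
    }
}


def build_rows(data):
    """Return one dict per checkout, joined to its bib record by metadataId."""
    entities = data["entities"]
    bibs = entities["bibs"]
    rows = []
    for checkout in entities["checkouts"].values():
        # .get() with an empty default tolerates a checkout with no matching bib.
        bib = bibs.get(checkout["metadataId"], {})
        brief = bib.get("briefInfo", {})
        avail = bib.get("availability", {})
        rows.append({
            "title": brief.get("title"),
            "subtitle": brief.get("subtitle"),
            "contentType": brief.get("contentType"),
            "dueDate": checkout["dueDate"],
            "heldCopies": avail.get("heldCopies", 0),
            "availableCopies": avail.get("availableCopies", 0),
        })
    # ISO dates (YYYY-MM-DD) sort correctly as plain strings.
    return sorted(rows, key=lambda r: (r["dueDate"], r["heldCopies"]))


rows = build_rows(jsondata)
for row in rows:
    print(row["dueDate"], row["title"], row["subtitle"], row["heldCopies"])
```

Because every row is built from a single checkout, the title, subtitle and due date can no longer drift out of order relative to each other.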
So I do cloud devops and have managed to create a lot of automations using BASH. For example, using AWS cli tools with the default output format of json, I have written many scripts using the AWS cli where I pipe the output to jq and get the results I am looking for. Combined with tools like jqplay, I have accomplished a lot. But there is a limit to BASH's usefulness when you are doing more complex operations. For that reason I have tried to lean in and do more stuff in python. I have gotten pretty good at modifying existing code and have written some pretty useful smaller python scripts.
But several times over the last few months, I have tried and failed to really comprehend Python's handling of json, to the point that I have given up and gone back to bash to complete a project.
So I am asking for help with two things from r/learnpython.
Solving the particular problem I am having right now.
Finally understanding how to parse any json with python.
ONE - - - - - - My current problem.
So using the code below, I have learned how to just get my data using boto3, convert it to json using json.dumps() and pretty print the json. (By the way, I need to use the standard json library here)
import boto3, json
from sys import argv

account = argv[1]

## THE FUNCTION BELOW WORKS FINE AND IS NOT REALLY RELEVANT TO MY QUESTION
def get_app_vpc_name(account):
    if 'sbx' in account:
        return 'sbx-app-' + account
    elif 'dev' in account:
        return 'dev-app-' + account
    elif 'tst' in account:
        return 'tst-app-' + account
    elif 'prd' in account:
        return 'prd-app-' + account

## SETUP boto3 FOR AWS API
boto3.setup_default_session(profile_name=account)
ec2 = boto3.client('ec2')

## GET A PARTICULAR VPC OUTPUT
def get_app_vpc_cidr_block(account):
    app_vpc_cidr_blk_name = '-'.join([account, 'app-vpc-cidr-block'])
    vpc = ec2.describe_vpcs(
        Filters=[
            {
                'Name': 'tag:Name',
                'Values': [
                    get_app_vpc_name(account)
                ]
            }
        ]
    )
    ## CONVERT PYTHON DICTIONARY TO JSON USING json.dumps
    vpc_json = json.dumps(vpc, indent=6)
    ## PRETTY PRINT THE JSON
    print(vpc_json)

---- output FROM ABOVE CODE
{
"Vpcs": [
{
"CidrBlock": "10.215.188.0/22",
"DhcpOptionsId": "dopt-a370e999",
"State": "available",
"VpcId": "vpc-046b1f660f8337999",
"OwnerId": "999092819999",
"InstanceTenancy": "default",
"CidrBlockAssociationSet": [
{
"AssociationId": "vpc-cidr-assoc-027d7fe136117b999",
"CidrBlock": "10.215.188.0/22",
"CidrBlockState": {
"State": "associated"
}
}
],
"IsDefault": false,
"Tags": [
{
"Key": "network_tier",
"Value": "app"
},
{
"Key": "network_name",
"Value": "llh-devapp"
},
{
"Key": "ingress_support",
"Value": "false"
},
{
"Key": "usage",
"Value": "central-network"
},
{
"Key": "network_environment",
"Value": "dev"
},
{
"Key": "Name",
"Value": "dev-app-llh-devapp"
},
{
"Key": "account_name",
"Value": "N/A"
}
]
}
],
"ResponseMetadata": {
"RequestId": "9619218c-be04-4e36-bc4c-e8c8411a7999",
"HTTPStatusCode": 200,
"HTTPHeaders": {
"x-amzn-requestid": "9619218c-be04-4e36-bc4c-e8c8411a7999",
"cache-control": "no-cache, no-store",
"strict-transport-security": "max-age=31536000; includeSubDomains",
"content-type": "text/xml;charset=UTF-8",
"content-length": "1966",
"date": "Fri, 30 Sep 2022 19:40:09 GMT",
"server": "AmazonEC2"
},
"RetryAttempts": 0
}
}
So I have made a lot of progress, but where I have struggled for weeks is parsing the json (or even the dictionary before I convert it to json) to get the exact data I need. Over and over, I keep running into type and other errors, and I have not succeeded in parsing out just the data I need from the json/dictionary.
In this particular case, all I want is to get the CidrBlock from the data. I have had partial success by working with the code below, appended to the above script. (I will just show the function with the extra code)
def get_app_vpc_cidr_block(account):
    app_vpc_cidr_blk_name = '-'.join([account, 'app-vpc-cidr-block'])
    vpc = ec2.describe_vpcs(
        Filters=[
            {
                'Name': 'tag:Name',
                'Values': [
                    get_app_vpc_name(account)
                ]
            }
        ]
    )
    vpc_json = json.dumps(vpc, indent=6)
    print(vpc_json)
    ## EXTRA CODE. USING DICTIONARY ITEMS. JSON CODE ABOVE IS IRRELEVANT
    for key, value in vpc.items():
        results = value
        print(results[0]['CidrBlock'])

I would like to add that I have tried a LOT of different things to simply retrieve the CidrBlock data. Depending on whether or not I dumped it to json, I have gotten so many errors (often type errors), but no real success.
Here is what the code above returns after the json print. (NOTE that it does return the CidrBlock before the error).
10.215.188.0/22
Traceback (most recent call last):
  File "./caller.py", line 6, in <module>
    get_app_vpc_cidr_block(aws_acct)
  File "/var/lib/jenkins/testscripts/getAppVpcCidrBlock.py", line 35, in get_app_vpc_cidr_block
    print(results[0]['CidrBlock'])
KeyError: 0
Regarding this problem in particular, my only question is this.
What is the correct pythonic way to retrieve the damned CidrBlock value from the above dictionary/json and assign the value to a variable?
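A likely explanation for the traceback, with a sketch of one direct way to read the value. The response has two top-level keys: "Vpcs" holds a list, so results[0] works on the first pass of the loop, but "ResponseMetadata" holds a dict, so results[0] on the second pass looks for a key 0 and raises KeyError: 0. Indexing the "Vpcs" key directly avoids the loop altogether. The vpc dict below is a trimmed stand-in for what ec2.describe_vpcs() returned in the output above, and first_cidr_block is a hypothetical helper name. The response is already a dict, so json.dumps() is only needed for pretty-printing.

```python
# Trimmed stand-in for the dict returned by ec2.describe_vpcs(...).
vpc = {
    "Vpcs": [
        {
            "CidrBlock": "10.215.188.0/22",
            "VpcId": "vpc-046b1f660f8337999",
        }
    ],
    "ResponseMetadata": {"HTTPStatusCode": 200, "RetryAttempts": 0},
}


def first_cidr_block(response):
    """Return the CidrBlock of the first VPC, or None if no VPC matched."""
    vpcs = response.get("Vpcs", [])
    return vpcs[0]["CidrBlock"] if vpcs else None


cidr_block = first_cidr_block(vpc)
print(cidr_block)
```

Returning None for an empty list is one design choice among several; raising an exception may suit a script better if a missing VPC should stop the run.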
2----MORE GENERAL JSON QUESTIONS
Do I even need to convert output like above from a dictionary to json? (After retrieving the data using boto3, it is of the dictionary type.)
Once you have an object (json or dictionary), what is the proper way to parse it and get particular data from it. In my case I will almost always want to retrieve certain values from the data and assign variables to those values.
To illustrate the above question, suppose I wanted to retrieve the following values from the json and assign each value to a variable
CidrBlock
VpcId
AssociationId (from CidrBlockAssociationSet)
Name Value (from tags)
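A minimal sketch of pulling those four values out of the dictionary, assuming the response has the shape shown in the output above (sample trimmed to the relevant keys). Dict keys are followed with ["key"], list positions with [0], and the list of Key/Value tag pairs is turned into an ordinary dict so a tag can be looked up by name.

```python
# Trimmed stand-in for the describe_vpcs response shown above.
vpc = {
    "Vpcs": [
        {
            "CidrBlock": "10.215.188.0/22",
            "VpcId": "vpc-046b1f660f8337999",
            "CidrBlockAssociationSet": [
                {
                    "AssociationId": "vpc-cidr-assoc-027d7fe136117b999",
                    "CidrBlock": "10.215.188.0/22",
                    "CidrBlockState": {"State": "associated"},
                }
            ],
            "Tags": [
                {"Key": "network_tier", "Value": "app"},
                {"Key": "Name", "Value": "dev-app-llh-devapp"},
            ],
        }
    ]
}

first_vpc = vpc["Vpcs"][0]  # "Vpcs" is a list; take its first element

cidr_block = first_vpc["CidrBlock"]
vpc_id = first_vpc["VpcId"]
association_id = first_vpc["CidrBlockAssociationSet"][0]["AssociationId"]

# Convert [{"Key": k, "Value": v}, ...] into {k: v, ...} for easy lookup.
tags = {tag["Key"]: tag["Value"] for tag in first_vpc.get("Tags", [])}
name = tags.get("Name")

print(cidr_block, vpc_id, association_id, name)
```

Exploring the structure interactively in the Python REPL (for example, printing first_vpc.keys() at each level) can play a role similar to experimenting with a jq expression.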
Is there a python tool similar to jqplay that I can use to play with json to find the right python to get the data I want?
Thanks and I appreciate any help y'all can give me on this.