I notice this question was asked a few years ago, but if someone else finds this, here are some newer projects trying to address the same problem:

  • ObjectPath (for Python and JavaScript): http://objectpath.org/
  • jsonpath (Python reimplementation of the JavaScript equivalent): https://pypi.org/project/jsonpath/
  • yaql: https://yaql.readthedocs.io/en/latest/readme.html
  • pyjq (Python bindings for jq https://stedolan.github.io/jq/): https://pypi.org/project/pyjq/
  • JMESPath: https://github.com/jmespath/jmespath.py

I personally went with pyjq because I use jq all the time for data exploration, but ObjectPath seems very attractive and not limited to JSON.

Answer from Josep Valls on Stack Overflow
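To make the comparison concrete, the dotted-path style of lookup these libraries offer can be sketched in plain Python; `path_get` below is a hypothetical stdlib-only helper, not part of any of the packages listed:

```python
import json
from functools import reduce

def path_get(obj, path, default=None):
    """Resolve a dotted path like 'store.book.0.title' against nested
    dicts/lists -- a tiny stand-in for what ObjectPath, JMESPath, or jq
    expressions do."""
    def step(current, key):
        if isinstance(current, list):
            return current[int(key)]
        return current[key]
    try:
        return reduce(step, path.split('.'), obj)
    except (KeyError, IndexError, TypeError, ValueError):
        return default

data = json.loads('{"store": {"book": [{"title": "Sayings"}]}}')
print(path_get(data, 'store.book.0.title'))  # Sayings
```

The real libraries add what this sketch lacks: filters, wildcards, slices, and functions over the matched values.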
SQL query to JSON formatted result - Python Help - Discussions on Python.org
April 23, 2024 - Hi All, my first task in Python: I would like to convert a SQL query (not the query result) into JSON. I'm searching for pre-built libraries and found a few, like mo_sql_parsing and sqlparse, but I need more structured JSON. Like below, any quick help is appreciated. INPUT: SELECT A, B FROM (SELECT A, B FROM FROMTABLE WHERE FROMCONDITION = 'FROMCONDITION') WHERE C = (SELECT C FROM WHERETABLE WHERE WHERECONDITION = 'WHERECONDITION') ORDER BY A EXPECTED OUTPUT: { "CRUD": ["SELECT"], "COLUMN": [ ...

return SQL table as JSON in python - Stack Overflow
What's the best way to convert a SQL table to JSON using python? ... if you use Postgres, use the to_json capabilities, which output the data directly as a Python object you can dump as a JSON string easily. (stackoverflow.com)
The fastest tool for querying large JSON files is written in Python! (benchmark)
Normally C code is faster than Python code, assuming the same programmer experience. But I could imagine that the Python JSON reader module written in C received many more performance optimizations than some generic C JSON reader, because there are too many competing ones to focus on optimizing each.
r/Python, April 22, 2022
Converting JSON to SQL Table
Have you considered the pandas library? You can read JSON and then dump it to a flat file to upload into your database, or write it directly to your database.
r/learnpython, June 24, 2020
Python: SQL to JSON and beyond!. Getting your data out of your database… | by Charles Stauffer | Medium
September 30, 2017 - Just in the Python language alone we have the Django REST Framework, Flask-RESTful, the now-deprecated simplejson, and its replacement, the built-in json module. Django REST will rip your rows and columns out of your database and return some very neatly formatted JSON. Using Flask-RESTful with SQLAlchemy achieves a very similar effect. However, what if you want to write some raw queries...
Run SQL on JSON files — Python documentation - JupySQL
Running query in 'duckdb://'. Or using our .jsonl file:

    %%sql --save clean_data_jsonl
    SELECT
        json ->> '$.name' AS name,
        json ->> '$.friends[0]' AS first_friend,
        json ->> '$.likes.pizza' AS likes_pizza,
        json ->> '$.likes.tacos' AS likes_tacos
    FROM read_json_objects('people.jsonl', format="auto")

    %%sql
    SELECT * FROM clean_data_jsonl
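The DuckDB `->>` extraction shown above has a rough stdlib counterpart: Python's bundled SQLite usually ships the JSON1 functions, so `json_extract` can pull fields out of JSON stored in a text column (hypothetical table and data; requires a SQLite build with JSON1 enabled, the default in recent Python releases):

```python
import json
import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE people (body TEXT)')
conn.execute('INSERT INTO people VALUES (?)',
             (json.dumps({"name": "Ada", "likes": {"pizza": True}}),))

# JSON true comes back as the SQL integer 1
row = conn.execute(
    "SELECT json_extract(body, '$.name'),"
    "       json_extract(body, '$.likes.pizza') FROM people"
).fetchone()
print(row)  # ('Ada', 1)
conn.close()
```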
Top answer (1 of 15, score 116)

Here is a really nice example of a Pythonic way to do that:

import json
import psycopg2

def db(database_name='pepe'):
    return psycopg2.connect(database=database_name)

def query_db(query, args=(), one=False):
    cur = db().cursor()
    cur.execute(query, args)
    r = [{cur.description[i][0]: value
          for i, value in enumerate(row)} for row in cur.fetchall()]
    cur.connection.close()
    return (r[0] if r else None) if one else r

my_query = query_db("select * from majorroadstiger limit %s", (3,))

json_output = json.dumps(my_query)

You get an array of JSON objects:

>>> json_output
'[{"divroad": "N", "featcat": null, "countyfp": "001",...

Or with the following:

>>> j2 = query_db("select * from majorroadstiger where fullname= %s limit %s",\
 ("Mission Blvd", 1), one=True)

you get a single JSON object:

>>> j2 = json.dumps(j2)
>>> j2
'{"divroad": "N", "featcat": null, "countyfp": "001",...
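The same `cursor.description` pattern works with any DB-API driver; here is a sketch against an in-memory SQLite database (hypothetical table, standing in for `majorroadstiger`):

```python
import json
import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE roads (fullname TEXT, divroad TEXT)')
conn.execute("INSERT INTO roads VALUES ('Mission Blvd', 'N')")

cur = conn.execute('SELECT * FROM roads')
# cursor.description[i][0] is the i-th column name, as in the answer above
cols = [d[0] for d in cur.description]
rows = [dict(zip(cols, row)) for row in cur.fetchall()]

json_output = json.dumps(rows)
print(json_output)  # [{"fullname": "Mission Blvd", "divroad": "N"}]
conn.close()
```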
Answer 2 of 15 (score 44)
import sqlite3
import json

DB = "./the_database.db"

def get_all_users( json_str = False ):
    conn = sqlite3.connect( DB )
    conn.row_factory = sqlite3.Row # This enables column access by name: row['column_name'] 
    db = conn.cursor()

    rows = db.execute('''
    SELECT * from Users
    ''').fetchall()

    conn.commit()
    conn.close()

    if json_str:
        return json.dumps( [dict(ix) for ix in rows] ) #CREATE JSON

    return rows

Calling the method without JSON...

print get_all_users()

prints:

[(1, u'orvar', u'password123'), (2, u'kalle', u'password123')]

Calling the method with JSON...

print get_all_users( json_str = True )

prints:

[{"password": "password123", "id": 1, "name": "orvar"}, {"password": "password123", "id": 2, "name": "kalle"}]
Working with JSON data in Python
April 2, 2021 - JSON data can be structured, semi-structured, or completely unstructured. It is also used in the responses generated by REST APIs and represents objects in key-value pairs, just like the Python dictionary object.
jsonquery (PyPI)
Tip Tuesday | Use Python For Querying JSON Files
May 14, 2024 - To load the file in Python, only a few lines of code are required:

    import json
    with open("./data.json", "r") as json_file:
        data = json.load(json_file)
    customers = data['customers']

The resulting "customers" variable is a list of dictionaries, analogous to a relational table, enabling SQL-like querying operations:
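To illustrate that point, a WHERE/ORDER BY-style query over such a list of dictionaries is a one-liner with comprehensions (hypothetical data):

```python
customers = [
    {"name": "Ada", "country": "UK", "orders": 3},
    {"name": "Linus", "country": "FI", "orders": 7},
    {"name": "Grace", "country": "US", "orders": 5},
]

# Roughly: SELECT name FROM customers WHERE orders > 4 ORDER BY orders DESC
result = [c["name"]
          for c in sorted(customers, key=lambda c: c["orders"], reverse=True)
          if c["orders"] > 4]
print(result)  # ['Linus', 'Grace']
```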
GitHub - s1s1ty/py-jsonq: A simple Python package to Query over Json Data
py-jsonq is a simple, elegant Python package to Query over any type of JSON Data. It'll make your life easier by giving the flavour of an ORM-like query on your JSON.
How to return sql data in json format Python
July 31, 2023 -

    # Execute SQL query
    cursor = conn.cursor()
    sql_query = "SELECT * FROM your_table;"
    cursor.execute(sql_query)

    # Fetch data
    columns = [column[0] for column in cursor.description]
    data = [dict(zip(columns, row)) for row in cursor.fetchall()]

    # Convert to JSON
    json_data = json.dumps(data, indent=4)

    # Save JSON data to a file:
    with open('data.json', 'w') as json_file:
        json_file.write(json_data)

    # Alternatively, use the JSON data directly in your program:
    print(json_data)
GitHub - pythonql/pythonql: A Query Language extension for Python: Query files, objects, SQL and NoSQL databases with a built-in query language
PythonQL is an extension to Python that allows language-integrated queries against relational, XML and JSON data, as well as Python's collections. Python has pretty advanced comprehensions that cover a big chunk of SQL, to the point where ...
r/Python on Reddit: The fastest tool for querying large JSON files is written in Python! (benchmark)
April 22, 2022 -

spyql is a tool (and python lib) for querying and transforming data. It is fully written in Python.

In the latest benchmark, spyql outperformed all other tools, including jq, one of the most popular tools written in C.

Here is one example extracted from the benchmark that shows spyql achieving the lowest processing time while keeping memory requirements low when the dataset size is >= 100MB.

Processing time and memory requirements vs size of input JSON data

IMO, these results might challenge some preconceived opinions about Python's performance and interpreted languages in general.

The benchmark is very easy to reproduce without installing any software, since it runs on a Google Colab notebook.

Happy to hear your thoughts!

UPDATE 2022/04/22

Thank you all for your feedback. The benchmark was updated and the fastest tool is NOT written in Python. Here are the highlights:

  • Added ClickHouse (written in C++) to the benchmark: I was unaware that the clickhouse-local tool would handle these tasks. ClickHouse is now the fastest (together with OctoSQL);

  • OctoSQL (written in Go) was updated as a response to the benchmark: updates included switching to fastjson, short-circuiting LIMIT, and eagerly printing when outputting JSON and CSV. Now, OctoSQL is one of the fastest and memory is stable;

  • SPyQL (written in Python) is now third: SPyQL leverages orjson (Rust) to parse JSONs, while the query engine is written in Python. When processing 1GB of input data, SPyQL takes 4x-5x more time than the best, while still achieving up to 2x higher performance than jq (written in C);

  • I removed Pandas from the benchmark and focused on command-line tools. I am planning a separate benchmark on Python libs where Pandas, Polars and Modin (and eventually others) will be included.

This benchmark is a living document. If you are interested in receiving updates, please subscribe to the following issue: https://github.com/dcmoura/spyql/issues/72

Thank you!
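For readers who want to poke at parsing performance locally before opening the Colab notebook, a minimal stdlib micro-benchmark (not the benchmark from the post) looks like this:

```python
import json
import timeit

# Hypothetical one-record payload, repeated to simulate JSON-lines input
record = '{"name": "test", "values": [1, 2, 3]}'
lines = [record] * 10_000

def parse_all():
    # Parse every line, as a JSON-lines tool would
    return [json.loads(line) for line in lines]

elapsed = timeit.timeit(parse_all, number=5)
rows = parse_all()
print(f"parsed {len(rows)} objects per run, {elapsed:.3f}s for 5 runs")
```

Swapping `json.loads` for a third-party parser such as orjson is the kind of change the benchmark measures at scale.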

SQL query to json list in python | Dive into Code
    import pyodbc
    import json
    from decimal import Decimal  # needed for the isinstance check below

    MS_SQL_Host = "host"
    MS_SQL_Db = "db_name"
    MS_SQL_User = "user"
    MSSQL_Paswd = "pass"

    class DecimalEncoder(json.JSONEncoder):
        def default(self, obj):
            # if the passed-in object is an instance of Decimal,
            # convert it to a string
            if isinstance(obj, Decimal):
                return str(obj)
            # otherwise use the default behavior
            return json.JSONEncoder.default(self, obj)

    def write_json_data(data, filename):
        with open("{}.json".format(filename), "w", encoding="utf-8") as f:
            # json.dump(data, f, ensure_ascii=False, indent=4, cls=DecimalEncoder)
            json.dump(data, f, ensure_ascii=False,
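A self-contained sketch of that encoder pattern, with hypothetical data and the `Decimal` import the snippet relies on:

```python
import json
from decimal import Decimal

class DecimalEncoder(json.JSONEncoder):
    def default(self, obj):
        # Convert Decimal (which json cannot serialize) to a string
        if isinstance(obj, Decimal):
            return str(obj)
        return super().default(obj)

data = {"price": Decimal("19.99"), "qty": 2}
encoded = json.dumps(data, cls=DecimalEncoder)
print(encoded)  # {"price": "19.99", "qty": 2}
```

For one-off scripts, `json.dumps(data, default=str)` achieves a similar effect without a custom class.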
Return SQL Data in JSON Format in Python : Stepwise Solution
March 9, 2023 - This will create the query output as a Python dict object. After that, we will use the Python json library to convert it to JSON format. Here is the complete implementation:

    import json
    dict_obj = { 'key': obj }
    json_obj = json.dumps(dict_obj) ...
Python JSON to SQL - JSON to SQL Converter | products.aspose.com
November 13, 2025 - Aspose Excel. This comprehensive solution provides Python developers with a fully integrated approach to convert JSON to SQL format, enabling seamless saving of JSON data into SQL format using the Aspose.Cells library, all through efficient ...
Python: Converting SQL Data to JSON - AskPython
August 29, 2023 - It enables you to deserialize JSON data back into Python objects and serialize Python objects (such as dictionaries and lists) into JSON format. You must employ the json module in Python to convert SQL data to JSON.
How to parse json output into SQL Server in python - Microsoft Q&A
February 9, 2022 - We understand that you are trying to store JSON output to Azure SQL using Python script, correct me if my understanding is not accurate. For working with Python I would suggest using pyodbc and try if it helps. We do have some examples of queries using pyodbc on this site please check Here
Top answer (1 of 2, score 13)

I would do it this way:

import json
import pandas as pd

fn = r'D:\temp\.data\40450591.json'

with open(fn) as f:
    data = json.load(f)

# some of your records seem NOT to have `Tags` key, hence `KeyError: 'Tags'`
# let's fix it
for r in data['Volumes']:
    if 'Tags' not in r:
        r['Tags'] = []

v = pd.DataFrame(data['Volumes']).drop(['Attachments', 'Tags'],1)
a = pd.io.json.json_normalize(data['Volumes'], 'Attachments', ['VolumeId'], meta_prefix='parent_')
t = pd.io.json.json_normalize(data['Volumes'], 'Tags', ['VolumeId'], meta_prefix='parent_')

# `engine` is assumed to be a SQLAlchemy engine created beforehand
v.to_sql('volume', engine)
a.to_sql('attachment', engine)
t.to_sql('tag', engine)

Output:

In [179]: v
Out[179]:
                      AvailabilityZone                CreateTime    Iops  Size              SnapshotId      State VolumeType
VolumeId
vol-049df61146c4d7901       us-east-1a  2013-12-18T22:35:00.084Z     NaN     8  snap-1234567890abcdef0     in-use   standard
vol-1234567890abcdef0       us-east-1a  2014-02-27T00:02:41.791Z  1000.0   100                    None  available        io1

In [180]: a
Out[180]:
                 AttachTime DeleteOnTermination     Device           InstanceId     State               VolumeId        parent_VolumeId
0  2013-12-18T22:35:00.000Z                True  /dev/sda1  i-1234567890abcdef0  attached  vol-049df61146c4d7901  vol-049df61146c4d7901
1  2013-12-18T22:35:11.000Z                True  /dev/sda1  i-1234567890abcdef1  attached  vol-049df61146c4d7111  vol-049df61146c4d7901

In [217]: t
Out[217]:
         Key              Value        parent_VolumeId
0       Name  DBJanitor-Private  vol-049df61146c4d7901
1      Owner          DBJanitor  vol-049df61146c4d7901
2    Product           Database  vol-049df61146c4d7901
3  Portfolio         DB Janitor  vol-049df61146c4d7901
4    Service         DB Service  vol-049df61146c4d7901

Test JSON file:

{
    "Volumes": [
        {
            "AvailabilityZone": "us-east-1a",
            "Attachments": [
                {
                    "AttachTime": "2013-12-18T22:35:00.000Z",
                    "InstanceId": "i-1234567890abcdef0",
                    "VolumeId": "vol-049df61146c4d7901",
                    "State": "attached",
                    "DeleteOnTermination": true,
                    "Device": "/dev/sda1"
                },
                {
                    "AttachTime": "2013-12-18T22:35:11.000Z",
                    "InstanceId": "i-1234567890abcdef1",
                    "VolumeId": "vol-049df61146c4d7111",
                    "State": "attached",
                    "DeleteOnTermination": true,
                    "Device": "/dev/sda1"
                }
            ],
            "Tags": [
                {
                    "Value": "DBJanitor-Private",
                    "Key": "Name"
                },
                {
                    "Value": "DBJanitor",
                    "Key": "Owner"
                },
                {
                    "Value": "Database",
                    "Key": "Product"
                },
                {
                    "Value": "DB Janitor",
                    "Key": "Portfolio"
                },
                {
                    "Value": "DB Service",
                    "Key": "Service"
                }
            ],
            "VolumeType": "standard",
            "VolumeId": "vol-049df61146c4d7901",
            "State": "in-use",
            "SnapshotId": "snap-1234567890abcdef0",
            "CreateTime": "2013-12-18T22:35:00.084Z",
            "Size": 8
        },
        {
            "AvailabilityZone": "us-east-1a",
            "Attachments": [],
            "VolumeType": "io1",
            "VolumeId": "vol-1234567890abcdef0",
            "State": "available",
            "Iops": 1000,
            "SnapshotId": null,
            "CreateTime": "2014-02-27T00:02:41.791Z",
            "Size": 100
        }
    ]
}
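The normalization that `json_normalize` performs here (child lists split into their own tables, keyed back to the parent `VolumeId`) can be sketched with the stdlib alone, using a trimmed-down version of the JSON above:

```python
import json

doc = json.loads('''{
  "Volumes": [
    {"VolumeId": "vol-049df61146c4d7901", "Size": 8,
     "Attachments": [{"InstanceId": "i-1234567890abcdef0", "State": "attached"}],
     "Tags": [{"Key": "Name", "Value": "DBJanitor-Private"}]},
    {"VolumeId": "vol-1234567890abcdef0", "Size": 100, "Attachments": []}
  ]
}''')

volumes, attachments, tags = [], [], []
for v in doc["Volumes"]:
    # Pop child lists into their own "tables", tagging rows with the parent key
    for a in v.pop("Attachments", []):
        attachments.append({**a, "parent_VolumeId": v["VolumeId"]})
    for t in v.pop("Tags", []):
        tags.append({**t, "parent_VolumeId": v["VolumeId"]})
    volumes.append(v)

print(len(volumes), len(attachments), len(tags))  # 2 1 1
```

Each of the three lists of flat dicts can then be inserted with `executemany` or handed to `to_sql`, as the answer does.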
Answer 2 of 2 (score 2)

Analogous to this example: https://github.com/zolekode/json-to-tables/blob/master/example.py

The following script exports the data as HTML, but you might as well export it as SQL:

table_maker.save_tables(YOUR_PATH, export_as="sql", sql_connection=YOUR_CONNECTION)
# See the full script below
import json
from extent_table import ExtentTable
from table_maker import TableMaker

Volumes = [
    {
        "AvailabilityZone": "us-east-1a",
        "Attachments": [
            {
                "AttachTime": "2013-12-18T22:35:00.000Z",
                "InstanceId": "i-1234567890abcdef0",
                "VolumeId": "vol-049df61146c4d7901",
                "State": "attached",
                "DeleteOnTermination": "true",
                "Device": "/dev/sda1"
            }
        ],
        "Tags": [
            {
                "Value": "DBJanitor-Private",
                "Key": "Name"
            },
            {
                "Value": "DBJanitor",
                "Key": "Owner"
            },
            {
                "Value": "Database",
                "Key": "Product"
            },
            {
                "Value": "DB Janitor",
                "Key": "Portfolio"
            },
            {
                "Value": "DB Service",
                "Key": "Service"
            }
        ],
        "VolumeType": "standard",
        "VolumeId": "vol-049df61146c4d7901",
        "State": "in-use",
        "SnapshotId": "snap-1234567890abcdef0",
        "CreateTime": "2013-12-18T22:35:00.084Z",
        "Size": 8
    },
    {
        "AvailabilityZone": "us-east-1a",
        "Attachments": [],
        "VolumeType": "io1",
        "VolumeId": "vol-1234567890abcdef0",
        "State": "available",
        "Iops": 1000,
        "SnapshotId": "null",
        "CreateTime": "2014-02-27T00:02:41.791Z",
        "Size": 100
    }
]

volumes = json.dumps(Volumes)
volumes = json.loads(volumes)

extent_table = ExtentTable()
table_maker = TableMaker(extent_table)
table_maker.convert_json_objects_to_tables(volumes, "volumes")
table_maker.show_tables(8)
table_maker.save_tables("./", export_as="html") # you can also pass in export_as="sql" or "csv". In the case of sql, there is a parameter to pass the engine.

Output in HTML:

<table border="1" class="dataframe">
  <thead>
    <tr style="text-align: right;">
      <th>ID</th>
      <th>AvailabilityZone</th>
      <th>VolumeType</th>
      <th>VolumeId</th>
      <th>State</th>
      <th>SnapshotId</th>
      <th>CreateTime</th>
      <th>Size</th>
      <th>Iops</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>0</td>
      <td>us-east-1a</td>
      <td>standard</td>
      <td>vol-049df61146c4d7901</td>
      <td>in-use</td>
      <td>snap-1234567890abcdef0</td>
      <td>2013-12-18T22:35:00.084Z</td>
      <td>8</td>
      <td>None</td>
    </tr>
    <tr>
      <td>1</td>
      <td>us-east-1a</td>
      <td>io1</td>
      <td>vol-1234567890abcdef0</td>
      <td>available</td>
      <td>null</td>
      <td>2014-02-27T00:02:41.791Z</td>
      <td>100</td>
      <td>1000</td>
    </tr>
    <tr>
      <td>2</td>
      <td>None</td>
      <td>None</td>
      <td>None</td>
      <td>None</td>
      <td>None</td>
      <td>None</td>
      <td>None</td>
      <td>None</td>
    </tr>
  </tbody>
</table>
<table border="1" class="dataframe">
  <thead>
    <tr style="text-align: right;">
      <th>ID</th>
      <th>PARENT_ID</th>
      <th>is_scalar</th>
      <th>scalar</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>0</td>
      <td>0</td>
      <td>False</td>
      <td>None</td>
    </tr>
  </tbody>
</table>

<table border="1" class="dataframe">
  <thead>
    <tr style="text-align: right;">
      <th>ID</th>
      <th>AttachTime</th>
      <th>InstanceId</th>
      <th>VolumeId</th>
      <th>State</th>
      <th>DeleteOnTermination</th>
      <th>Device</th>
      <th>PARENT_ID</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>0</td>
      <td>2013-12-18T22:35:00.000Z</td>
      <td>i-1234567890abcdef0</td>
      <td>vol-049df61146c4d7901</td>
      <td>attached</td>
      <td>true</td>
      <td>/dev/sda1</td>
      <td>0</td>
    </tr>
    <tr>
      <td>1</td>
      <td>None</td>
      <td>None</td>
      <td>None</td>
      <td>None</td>
      <td>None</td>
      <td>None</td>
      <td>None</td>
    </tr>
  </tbody>
</table>

<table border="1" class="dataframe">
  <thead>
    <tr style="text-align: right;">
      <th>ID</th>
      <th>PARENT_ID</th>
      <th>is_scalar</th>
      <th>scalar</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>0</td>
      <td>0</td>
      <td>False</td>
      <td>None</td>
    </tr>
    <tr>
      <td>1</td>
      <td>0</td>
      <td>False</td>
      <td>None</td>
    </tr>
    <tr>
      <td>2</td>
      <td>0</td>
      <td>False</td>
      <td>None</td>
    </tr>
    <tr>
      <td>3</td>
      <td>0</td>
      <td>False</td>
      <td>None</td>
    </tr>
    <tr>
      <td>4</td>
      <td>0</td>
      <td>False</td>
      <td>None</td>
    </tr>
  </tbody>
</table>

<table border="1" class="dataframe">
  <thead>
    <tr style="text-align: right;">
      <th>ID</th>
      <th>Value</th>
      <th>Key</th>
      <th>PARENT_ID</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>0</td>
      <td>DBJanitor-Private</td>
      <td>Name</td>
      <td>0</td>
    </tr>
    <tr>
      <td>1</td>
      <td>DBJanitor</td>
      <td>Owner</td>
      <td>1</td>
    </tr>
    <tr>
      <td>2</td>
      <td>Database</td>
      <td>Product</td>
      <