🚀 I built Jambo, a tool that converts JSON Schema definitions into Pydantic models — dynamically, with zero config!
✅ What my project does:
- Takes JSON Schema definitions and automatically converts them into Pydantic models
- Supports validation for strings, integers, arrays, nested objects, and more
- Enforces constraints like minLength, maximum, pattern, etc.
- Built with AI frameworks like LangChain and CrewAI in mind, making it a good fit for structured data workflows
🧪 Quick Example:
from jambo.schema_converter import SchemaConverter

schema = {
    "title": "Person",
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"},
    },
    "required": ["name"],
}

Person = SchemaConverter.build(schema)
print(Person(name="Alice", age=30))

🎯 Target Audience:
- Developers building AI agent workflows with structured data
- Anyone needing to convert schemas into validated models quickly
- Pydantic users who want to skip writing models manually
- Those working with JSON APIs or dynamic schema generation
🙌 Why I built it:
My name is Vitor Hideyoshi. I needed a tool to dynamically generate models while working on AI agent frameworks — so I decided to build it and share it with others.
Check it out here:
GitHub: https://github.com/HideyoshiNakazone/jambo
PyPI: https://pypi.org/project/jambo/
Would love to hear what you think! Bug reports, feedback, and PRs all welcome! 😄
#ai #crewai #langchain #jsonschema #pydantic
This has been discussed some time ago and Samuel Colvin said he didn't want to pursue this as a feature for Pydantic.
If you are fine with code generation instead of actual runtime creation of models, you can use the datamodel-code-generator.
To be honest, I struggle to see the use case for generating complex models at runtime, since their main purpose is validation, which implies you think about the correct schema before running your program. But that is just my view.
For simple models I guess you can throw together your own logic for this fairly quickly.
If you do need something more sophisticated, the aforementioned library does offer some extensibility. You should be able to import and inherit from some of its classes, like the JsonSchemaParser. Maybe that will get you somewhere.
Ultimately I think this becomes non-trivial very quickly, which is why Pydantic's maintainer didn't want to deal with it and why there is a whole separate project for this.
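To illustrate the "throw together your own logic" route for simple models, here is a minimal sketch using Pydantic's create_model. The function name and the small constraint map are my own choices, and it deliberately handles only flat objects with a handful of constraint keywords (no $ref, enums, or nesting):

```python
from pydantic import Field, create_model

# Minimal sketch: flat objects only, a few constraint keywords mapped to Field kwargs
TYPES = {"string": str, "integer": int, "number": float, "boolean": bool}
CONSTRAINTS = {
    "minLength": "min_length",
    "maxLength": "max_length",
    "minimum": "ge",
    "maximum": "le",
    "pattern": "pattern",
}


def simple_schema_to_model(schema: dict):
    required = set(schema.get("required", []))
    fields = {}
    for name, props in schema.get("properties", {}).items():
        # Translate JSON Schema constraint keywords into Field() kwargs
        kwargs = {CONSTRAINTS[k]: v for k, v in props.items() if k in CONSTRAINTS}
        # Required fields get Pydantic's "required" sentinel, others a default
        default = ... if name in required else props.get("default")
        fields[name] = (TYPES.get(props.get("type"), object), Field(default, **kwargs))
    return create_model(schema.get("title", "DynamicModel"), **fields)


Person = simple_schema_to_model({
    "title": "Person",
    "type": "object",
    "properties": {
        "name": {"type": "string", "minLength": 1},
        "age": {"type": "integer", "minimum": 0},
    },
    "required": ["name"],
})
print(Person(name="Alice", age=30))
```

This covers the easy 80% of cases; anything with nesting, enums, or references is where the complexity mentioned above starts.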
Updated @Alon's answer to handle nested models:
from typing import Any, Type, Optional
from enum import Enum
from pydantic import BaseModel, Field, create_model
def json_schema_to_base_model(schema: dict[str, Any]) -> Type[BaseModel]:
    type_mapping: dict[str, type] = {
        "string": str,
        "integer": int,
        "number": float,
        "boolean": bool,
        "array": list,
        "object": dict,
    }

    properties = schema.get("properties", {})
    required_fields = schema.get("required", [])
    model_fields = {}

    def process_field(field_name: str, field_props: dict[str, Any]) -> tuple:
        """Recursively processes a field and returns its type and Field instance."""
        json_type = field_props.get("type", "string")
        enum_values = field_props.get("enum")

        # Handle enums
        if enum_values:
            enum_name: str = f"{field_name.capitalize()}Enum"
            field_type = Enum(enum_name, {v: v for v in enum_values})
        # Handle nested objects
        elif json_type == "object" and "properties" in field_props:
            field_type = json_schema_to_base_model(
                field_props
            )  # Recursively create submodel
        # Handle arrays with nested objects
        elif json_type == "array" and "items" in field_props:
            item_props = field_props["items"]
            if item_props.get("type") == "object":
                item_type: type[BaseModel] = json_schema_to_base_model(item_props)
            else:
                item_type: type = type_mapping.get(item_props.get("type"), Any)
            field_type = list[item_type]
        else:
            field_type = type_mapping.get(json_type, Any)

        # Handle default values and optionality
        default_value = field_props.get("default", ...)
        nullable = field_props.get("nullable", False)
        description = field_props.get("title", "")

        if nullable:
            field_type = Optional[field_type]
        if field_name not in required_fields:
            default_value = field_props.get("default", None)

        return field_type, Field(default_value, description=description)

    # Process each field
    for field_name, field_props in properties.items():
        model_fields[field_name] = process_field(field_name, field_props)

    return create_model(schema.get("title", "DynamicModel"), **model_fields)
Example Schema
schema = {
    "title": "User",
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"},
        "is_active": {"type": "boolean"},
        "address": {
            "type": "object",
            "properties": {
                "street": {"type": "string"},
                "city": {"type": "string"},
                "zipcode": {"type": "integer"},
            },
        },
        "roles": {
            "type": "array",
            "items": {
                "type": "string",
                "enum": ["admin", "user", "guest"]
            }
        }
    },
    "required": ["name", "age"]
}
Generate the Pydantic model
DynamicModel = json_schema_to_base_model(schema)
Example usage
print(DynamicModel.schema_json(indent=2))
One solution is to hack the utils out of datamodel-code-generator, specifically their JsonSchemaParser. This generates an intermediate text representation of all pydantic models which you can then dynamically import. You might reasonably balk at this, but it does allow for self-referencing and multi-model setups at least:
import importlib.util
import json
import re
import sys
from contextlib import contextmanager
from pathlib import Path
from tempfile import NamedTemporaryFile
from types import ModuleType
from datamodel_code_generator.parser.jsonschema import JsonSchemaParser
from pydantic import BaseModel
NON_ALPHANUMERIC = re.compile(r"[^a-zA-Z0-9]+")
UPPER_CAMEL_CASE = re.compile(r"[A-Z][a-zA-Z0-9]+")
LOWER_CAMEL_CASE = re.compile(r"[a-z][a-zA-Z0-9]+")
class BadJsonSchema(Exception):
    pass


def _to_camel_case(name: str) -> str:
    if any(NON_ALPHANUMERIC.finditer(name)):
        return "".join(term.lower().title() for term in NON_ALPHANUMERIC.split(name))
    if UPPER_CAMEL_CASE.match(name):
        return name
    if LOWER_CAMEL_CASE.match(name):
        return name[0].upper() + name[1:]
    raise BadJsonSchema(f"Unknown case used for {name}")


def _load_module_from_file(file_path: Path) -> ModuleType:
    spec = importlib.util.spec_from_file_location(
        name=file_path.stem, location=str(file_path)
    )
    module = importlib.util.module_from_spec(spec)
    sys.modules[file_path.stem] = module
    spec.loader.exec_module(module)
    return module


@contextmanager
def _delete_file_on_completion(file_path: Path):
    try:
        yield
    finally:
        file_path.unlink(missing_ok=True)


def json_schema_to_pydantic_model(json_schema: dict, name_override: str) -> BaseModel:
    json_schema_as_str = json.dumps(json_schema)
    pydantic_models_as_str: str = JsonSchemaParser(json_schema_as_str).parse()

    with NamedTemporaryFile(suffix=".py", delete=False) as temp_file:
        temp_file_path = Path(temp_file.name).resolve()
        temp_file.write(pydantic_models_as_str.encode())

    with _delete_file_on_completion(file_path=temp_file_path):
        module = _load_module_from_file(file_path=temp_file_path)

    main_model_name = _to_camel_case(name=json_schema["title"])
    pydantic_model: BaseModel = module.__dict__[main_model_name]

    # Override the pydantic model/parser name for nicer ValidationError messaging and logging
    pydantic_model.__name__ = name_override
    pydantic_model.parse_obj.__func__.__name__ = name_override

    return pydantic_model
The main drawback, as I see it: datamodel-code-generator has non-dev dependencies (isort and black), which are not ideal to have in your deployments.
If I understand correctly, you are looking for a way to generate Pydantic models from JSON schemas. Here is an implementation of a code generator - meaning you feed it a JSON schema and it outputs a Python file with the Model definition(s). It is not "at runtime" though. For this, an approach that utilizes the create_model function was also discussed in this issue thread a while back, but as far as I know there is no such feature in Pydantic yet.
If you know that your models will not be too complex, it might be fairly easy to implement a crude version of this yourself. Essentially the properties in a JSON schema are reflected fairly nicely by the __fields__ attribute of a model. You could write a function that takes a parsed JSON schema (i.e. a dictionary) and generates the Field definitions to pass to create_model.
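As a quick sketch of that correspondence (assuming Pydantic v2, where __fields__ is exposed as model_fields): the keys of a model's field map line up with the "properties" keys of its generated schema, which is what makes the reverse direction tractable for simple cases.

```python
from pydantic import create_model

# Build a model dynamically: each field is a (type, default) tuple,
# with ... marking a required field
User = create_model("User", name=(str, ...), age=(int, 0))

# The generated schema's "properties" mirror the model's field map
assert set(User.model_json_schema()["properties"]) == set(User.model_fields)
assert User.model_fields["name"].is_required()
assert not User.model_fields["age"].is_required()
```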
After digging a bit deeper into the pydantic code I found a nice little way to prevent this. There is a method called field_title_should_be_set(...) in GenerateJsonSchema which can be subclassed and provided to model_json_schema(...).
I'm not sure if the way I've overwritten the method is sufficient for each edge case but at least for this little test class it works as intended.
from pydantic import BaseModel
from pydantic._internal._core_utils import is_core_schema, CoreSchemaOrField
from pydantic.json_schema import GenerateJsonSchema
class Test(BaseModel):
    a: int


class GenerateJsonSchemaWithoutDefaultTitles(GenerateJsonSchema):
    def field_title_should_be_set(self, schema: CoreSchemaOrField) -> bool:
        return_value = super().field_title_should_be_set(schema)
        if return_value and is_core_schema(schema):
            return False
        return return_value
json_schema = Test.model_json_schema(schema_generator=GenerateJsonSchemaWithoutDefaultTitles)
assert "title" not in json_schema["properties"]["a"]
You can do it in following way with Pydantic v2:
from typing import Any

from pydantic import BaseModel, ConfigDict


def my_schema_extra(schema: dict[str, Any]) -> None:
    for prop in schema.get('properties', {}).values():
        prop.pop('title', None)


class Model(BaseModel):
    a: int

    model_config = ConfigDict(
        json_schema_extra=my_schema_extra,
    )


print(Model.model_json_schema())