forked from ARMmbed/mbed-os
Use Pydantic schemas to validate Mbed's JSON files (part 1) #516
Open
multiplemonomials wants to merge 28 commits into master from dev/pydantic-schemas-part-1
tools/python/mbed_tools/targets/_internal/targets_json_parsers/accumulating_attribute_parser.py (Outdated)
tools/python_tests/mbed_tools/targets/_internal/test_target_attributes.py
4709771 to 10efa70
27ff3cd to 9626898
10efa70 to 426c039
VictorWTang reviewed Dec 1, 2025
tools/python/mbed_tools/targets/_internal/targets_json_parsers/overriding_attribute_parser.py
tools/python/mbed_tools/targets/_internal/targets_json_parsers/accumulating_attribute_parser.py (Outdated)
…s now being used)
426c039 to 606e650
VictorWTang approved these changes Dec 6, 2025
Summary of changes
Ever since I started using Mbed, one of the biggest pain points has been working with JSON configuration files. Mbed uses these files as its configuration system, and they are used to store information such as the list of known targets and their properties, the options that can be configured for each macro and library, and the memory bank info for each target. This system isn't bad in concept, and basically any project as large and configurable as Mbed OS needs something like it.
However, the implementation has never been super solid: Mbed relies on Python scripts that load tons of these JSON files, combine them together and munge the data in various ways into a single dict with every conceivable setting in it, and then pass that dict off to the code that generates build flags and other configuration (which used to also be in Python, but is now in CMake). This implementation meant that the valid properties for each JSON file, and the effect they'd have within the configuration system, were in large part a mystery to everyone who did not directly work on the Mbed build system.
ARM did at least make some attempt to document the valid settings, but looking at the real JSON files shows that there were many more undocumented ones that are not mentioned in those pages. When you factor in that the way the config system merges JSONs means that target JSON attributes can be put in mbed_app.json and will "work" (though they might override other attributes!), and the fact that it has never issued warnings for unknown/unrecognized attributes, AND the fact that all of this stuff was JSON until recently and didn't allow comments, you get... a system made almost radioactive by a decade of cruft that no one wants to change for fear of breaking something. Even to relatively seasoned users, like me a few years ago, there were plenty of JSON settings that seemed like borderline magic incantations -- you put 'em in and something happens but you have no idea why.
Well, I am here to tell you that this ends today. Well, mostly. I've had the pleasure of using pydantic at my day job, and it's a super cool library -- it basically lets you define a schema for structured data as a Python class, and then use that schema to parse, validate, and dump data. @VictorWTang and I both thought that this would be a great use for it.
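For anyone who hasn't used pydantic, here is a minimal sketch of the "schema as a Python class" idea. It assumes pydantic v2, and it is not the schema from this PR: the class name `ExampleLibConfig` and its fields are made up purely for illustration.

```python
from pydantic import BaseModel, ConfigDict, Field, ValidationError


class ExampleLibConfig(BaseModel):
    """Hypothetical schema; the field names are illustrative, not Mbed's real ones."""

    # Reject keys the schema does not declare instead of silently ignoring them.
    model_config = ConfigDict(extra="forbid")

    name: str
    macros: list[str] = Field(default_factory=list)


# Parse and validate a plain dict, such as one produced by json.load().
parsed = ExampleLibConfig.model_validate({"name": "example-lib", "macros": ["FOO=1"]})

# Dump the validated model back to a plain dict.
dumped = parsed.model_dump()

# A misspelled (undeclared) key is reported as a validation error.
try:
    ExampleLibConfig.model_validate({"name": "example-lib", "macroz": []})
    unknown_key_rejected = False
except ValidationError:
    unknown_key_rejected = True
```

The `extra="forbid"` setting is what turns an unrecognized attribute into an error rather than letting it pass unnoticed.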
For this PR, I read through many, many existing JSON files as well as nearly all the existing config code, and reverse engineered a schema for all of Mbed's JSON files. This schema should cover the large, large majority of existing use cases for these files, but removes all the legacy attributes which haven't had meaning since Mbed CLI 1 was being used. It also provides, at last, real documentation for every single legal field of each JSON file (right now it's in the form of a Python class, but we also have options to convert this to a JSON schema and, from there, markdown docs if we'd like to). Right now, the schema is being enforced for all mbed_lib.json and targets/custom_targets.json files, as these are mostly used only within Mbed. mbed_app.json, meanwhile, is validated against the schema, but validation errors are only treated as warnings. This way, compatibility will not be broken for projects using mbed_app.json in unexpected ways.
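The enforce-versus-warn split described above could look roughly like the sketch below. This is an assumption about the shape of the logic, not the code in this PR: `check_config` and `require_name` are hypothetical names, and the stand-in validator just raises `ValueError` (pydantic's `ValidationError` is a `ValueError` subclass, so a real model's `model_validate` could be passed in the same way).

```python
import warnings
from typing import Any, Callable


def check_config(
    data: dict[str, Any],
    validate: Callable[[dict[str, Any]], Any],
    *,
    strict: bool,
) -> bool:
    """Run `validate` on `data` and return True if it passed.

    In strict mode a failure propagates to the caller; otherwise it is
    downgraded to a warning and False is returned.
    """
    try:
        validate(data)
        return True
    except ValueError as err:  # pydantic's ValidationError subclasses ValueError
        if strict:
            raise
        warnings.warn(f"config failed schema validation: {err}", stacklevel=2)
        return False


def require_name(data: dict[str, Any]) -> None:
    """Stand-in validator (hypothetical): only checks that a 'name' key exists."""
    if "name" not in data:
        raise ValueError("missing required key 'name'")
```

A caller would use `strict=True` for files that live inside Mbed and `strict=False` for mbed_app.json, so existing projects keep building while still seeing what the schema objects to.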
For this PR, I did some refactoring of how the configuration is processed internally, mostly to keep things in the pydantic-model format instead of a dict where it makes sense. However, I did not change the fundamental method used to generate the final configuration (stuffing everything into a single god dict). This was both out of fear of making breaking changes, and because I didn't want to break mbed_app.json compatibility (since, as I said, this file was the total Wild West that could override any configuration setting). I called this PR "part 1" because eventually (years down the road), I'd like to go through and replace the god dictionary (config.Config) with a proper data model that stores each thing individually. But that will have to wait until users have gotten used to the new schema rules (and we've made any changes to the schema that we end up wanting!).
Oh, also, since I was in the guts of this code anyway, I took the chance to conquer one of the smaller evils of Mbed programming: the lack of any naming standard for config settings. I am defining here and now that all settings shall be in lowercase-skewer-case, and may not contain uppercase letters or underscores. Mbed will now print a warning if it sees an underscore in a setting name, and transform it into a hyphen. This means it's no longer possible to screw up your configuration by writing target.application_profile instead of target.application-profile and the like. This removes one of the easiest ways to make config mistakes and the biggest thing that kept me from remembering these names without having to check each time.
Impact of changes
overrides and target_overrides in the same JSON file would conflict with each other. This could cause settings that were previously not being applied to now be applied.
Migration actions required
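The underscore rule from the summary (warn, then convert to a hyphen) can be sketched as below. This is only an illustration of the described behavior; `normalize_setting_name` is a hypothetical helper, not a function from mbed_tools, and it deliberately leaves uppercase letters alone because the summary only describes a transformation for underscores.

```python
import warnings


def normalize_setting_name(name: str) -> str:
    """Return `name` with underscores replaced by hyphens, warning if any were found."""
    if "_" not in name:
        return name
    fixed = name.replace("_", "-")
    warnings.warn(
        f"setting name '{name}' contains underscores; using '{fixed}' instead",
        stacklevel=2,
    )
    return fixed
```

With this rule, `target.application_profile` and `target.application-profile` resolve to the same setting, and renaming the former in existing JSON files makes the warning go away.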
Documentation
Pull request type
Test results