I see Python used to do a fair amount of code generation for C/C++ header and source files. Usually the input files that store the parameters are in JSON or YAML format, though most of what I see is YAML. Why not just use Python files directly? Why use YAML at all in this case?
That also got me thinking: since Python is a scripting language, its files, when they contain only data and data structures, could be used in much the same way as XML, JSON, YAML, etc. Do people do this? Is there a good use case for it?
What if I want to import a configuration file into a C or C++ program? What about into a Python program? In the Python case there seems to be no point in using YAML at all, since you can store your configuration parameters and variables directly in pure Python files. In the C or C++ case, you could still store your data in Python files and then have a Python script import them and auto-generate header and source files as part of the build process (sketched below). Again, perhaps there's no need for YAML or JSON in that case either.
Thoughts?
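For concreteness, here's a minimal sketch of the kind of build-step code generation I mean. The file names (gen_header.py, params_autogen.h) and the #define naming scheme are placeholders I made up; it assumes my_params.py defines the nested dict of strings named data shown further down:

# gen_header.py -- hypothetical build step: turn a Python config into a C header.
from my_params import data  # the nested dict shown further down

def flatten(d, prefix=""):
    # Turn nested dicts into (NAME, value) pairs such as
    # DICT_KEY1_DICT_KEY2_DICT_KEY3A.
    for key, value in d.items():
        name = (prefix + key).upper()
        if isinstance(value, dict):
            yield from flatten(value, name + "_")
        else:
            yield name, value

with open("params_autogen.h", "w") as f:
    f.write("// Auto-generated from my_params.py; do not edit by hand.\n")
    f.write("#pragma once\n\n")
    for name, value in flatten(data):
        f.write('#define {} "{}"\n'.format(name, value))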
Here's an example of storing some nested key/value hash table pairs in a YAML file:
---
dict_key1:
  dict_key2:
    dict_key3a: my string message
    dict_key3b: another string message
And the same exact thing in a pure Python file:
data = {
    "dict_key1": {
        "dict_key2": {
            "dict_key3a": "my string message",
            "dict_key3b": "another string message",
        }
    }
}
And to read in both the YAML and Python data and print it out:
import yaml  # module for reading in YAML files
import json  # module for pretty-printing Python dictionary types
# See: https://stackoverflow.com/a/34306670/4561887

# 1) import the .yml file
with open("my_params.yml", "r") as f:
    data_yml = yaml.safe_load(f)  # safe_load; plain yaml.load() is unsafe
                                  # and newer PyYAML requires a Loader arg

# 2) import the .py file
from my_params import data as data_py
# OR, an alternative way of doing the above:
# import my_params
# data_py = my_params.data

# 3) print them both out
print("data_yml = ")
print(json.dumps(data_yml, indent=4))
print("\ndata_py = ")
print(json.dumps(data_py, indent=4))
Reference for using json.dumps: https://stackoverflow.com/a/34306670/4561887
SAMPLE OUTPUT of running python3 import_config_file.py:
data_yml =
{
    "dict_key1": {
        "dict_key2": {
            "dict_key3a": "my string message",
            "dict_key3b": "another string message"
        }
    }
}

data_py =
{
    "dict_key1": {
        "dict_key2": {
            "dict_key3a": "my string message",
            "dict_key3b": "another string message"
        }
    }
}
Yes, people do this, and have been doing it for years.
But many make the same mistake you do and make it unsafe, by using import my_params. That is the equivalent of loading YAML with YAML(typ='unsafe') in ruamel.yaml (or with yaml.load() in PyYAML, which is unsafe).
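To see why, remember that importing a module executes everything in it, with the full privileges of the importing program. A contrived example of what a malicious or careless my_params.py could contain:

# my_params.py -- looks like a plain data file, but import runs ALL of it:
import os
os.system("echo this could have been any shell command")  # runs on import!

data = {"dict_key1": "..."}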
What you should do is use the ast module that comes with Python to parse your "data" structure, which makes such an import safe. My package pon has code to update these kinds of structures, and in each of my __init__.py files there is such a piece of data, named _package_data, that is read by some code (the function literal_eval) in the setup.py for the package. The ast-based code in setup.py takes around 100 lines.
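The core of the technique fits in a few lines. This is a minimal sketch, not the actual code from pon or my setup.py: parse the file with ast, find the assignment, and evaluate only the literal on its right-hand side, so nothing in the file is ever executed:

import ast

with open("my_params.py") as f:
    tree = ast.parse(f.read())

data = None
for node in tree.body:
    # Look for the top-level assignment `data = {...}`.
    if isinstance(node, ast.Assign) and any(
            isinstance(t, ast.Name) and t.id == "data" for t in node.targets):
        # literal_eval only accepts literals (dicts, lists, strings,
        # numbers, ...), so it cannot run arbitrary code.
        data = ast.literal_eval(node.value)

print(data)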
The advantages of doing this in a structured way are the same as with YAML: you can programmatically update the data structure (version numbers!), although I consider PON (Python Object Notation) less readable than YAML and slightly less easy to update by hand.