Config File
Dorsal uses a TOML file to manage settings, including API authentication and the local Annotation Model Pipeline.
Managing your Config
The easiest and safest way to manage your config is programmatically, through the CLI or the Python API.
Config File Locations
Dorsal has a two-tier configuration system. Settings from a file with higher precedence will always override settings from a file with lower precedence.
Precedence Order: Project > Global
- Project Config (Highest Precedence):
  - Location: A file named dorsal.toml or .dorsal.toml in your current directory or any parent directory.
  - Use Case: This is the recommended place to define project-specific settings, such as a custom [[model_pipeline]].
- Global Config (User-level):
  - Location: A dorsal.toml file inside your user's .dorsal directory (e.g., /home/user/.dorsal/dorsal.toml on Linux, or C:\Users\user\.dorsal\dorsal.toml on Windows).
  - Use Case: This is the base-level configuration for your user. It's the best place to store settings that should apply everywhere, like your [auth] credentials.
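The lookup described above can be pictured with a short sketch. This is not Dorsal's actual implementation: the function names are hypothetical, and the order in which the two file names are tried within a single directory is an assumption.

```python
from pathlib import Path
from typing import Optional

# Candidate project config names, as listed above. Trying "dorsal.toml"
# before ".dorsal.toml" within one directory is an assumption of this sketch.
PROJECT_CONFIG_NAMES = ("dorsal.toml", ".dorsal.toml")


def find_project_config(start: Path) -> Optional[Path]:
    """Return the nearest project config at or above `start`, or None."""
    start = start.resolve()
    # Check the starting directory first, then each parent up to the root.
    for directory in (start, *start.parents):
        for name in PROJECT_CONFIG_NAMES:
            candidate = directory / name
            if candidate.is_file():
                return candidate
    return None


def global_config_path() -> Path:
    """Return the user-level config location described above."""
    return Path.home() / ".dorsal" / "dorsal.toml"
```

Calling find_project_config(Path.cwd()) from a nested project folder would return the closest matching file, which is why a config placed at the project root applies to every subdirectory.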
In-Memory Fallback
In addition to these two files, Dorsal maintains an in-memory fallback config, identical to the on-disk default configuration. It ensures the library can run "out of the box" if the global dorsal.toml file is missing or unreadable. The content of this fallback is what Dorsal writes to your new global dorsal.toml the first time it runs.
Finding Your Config
Dorsal will create a default global dorsal.toml in the .dorsal directory under your user home directory the first time it runs.
When you modify config settings using the CLI, a new dorsal.toml file will be created in your project's root directory.
To see your current config path, and any other config-related settings, run the dorsal config show command in the terminal.
Config Field Reference
auth
This section stores your DorsalHub API credentials. You can manage these settings using dorsal auth.
- api_key: Your active API key.
- email: The email address associated with your account.
ui
This section controls the visual theme for the command-line interface. You can manage these settings using dorsal config.
- theme: The name of the theme to use (e.g., "default", "dark").
report.collection.panels
This section controls which data panels are enabled by default in the HTML directory report generated by dorsal.api.generate_html_directory_report. To change these settings, edit the values in the config file directly.
[report.collection.panels]
summary_stats = true
collection_overview = true
dynamic_size_histogram = false
duplicates_report = true
file_explorer = true
model_pipeline
This section defines the entire Annotation Model Pipeline as an array of TOML tables. The order of these blocks in the file defines the execution order.
Each pipeline step has several keys:
- annotation_model: A list containing two strings: the importable Python module and the AnnotationModel class name.
- schema_id: The string ID that will be used as the key for this annotation in the File Record's annotations object.
- validation_model: (Optional) A list containing the module and class name of a Pydantic model or JsonSchemaValidator used to validate the model's output.
- dependencies: (Optional) A list of dependency tables that act as rules. The model will only run if all dependencies pass.
- options: (Optional) A dictionary of keyword arguments to be passed directly to the model's main() method during execution.
Example: A single [[model_pipeline]] step:
# --- MediaInfo Annotation (Runs on media files) ---
[[model_pipeline]]
annotation_model = ["dorsal.file.annotation_models", "MediaInfoAnnotationModel"]
schema_id = "file/mediainfo"
validation_model = ["dorsal.file.validators.mediainfo", "MediaInfoValidationModel"]
# Dependencies: This model only runs if the file media type matches.
dependencies = [
    { type = "media_type", include = ["audio", "image", "video"], exclude = ["image/svg"] },
]
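To make the "all dependencies must pass" rule concrete, here is a minimal sketch of how a media_type dependency like the one above could be evaluated. The function names are hypothetical, and the matching semantics (prefix matching, exclusions checked before inclusions, an empty include list accepting everything) are assumptions rather than Dorsal's documented behaviour.

```python
from typing import Iterable


def media_type_rule_passes(
    media_type: str,
    include: Iterable[str] = (),
    exclude: Iterable[str] = (),
) -> bool:
    """Evaluate one media_type rule using prefix matching (an assumption)."""
    # Exclusions win: "image/svg" rejects "image/svg+xml".
    if any(media_type.startswith(pattern) for pattern in exclude):
        return False
    include = list(include)
    if include:
        return any(media_type.startswith(pattern) for pattern in include)
    # With no include list, anything not excluded passes (an assumption).
    return True


def step_should_run(step: dict, media_type: str) -> bool:
    """A step runs only if every one of its dependency rules passes."""
    for rule in step.get("dependencies", []):
        if rule["type"] != "media_type":
            # This sketch only models the rule type shown in the example.
            raise ValueError(f"unsupported dependency type: {rule['type']}")
        if not media_type_rule_passes(
            media_type, rule.get("include", ()), rule.get("exclude", ())
        ):
            return False
    return True


# The MediaInfo step from the example above, reduced to its dependencies.
mediainfo_step = {
    "schema_id": "file/mediainfo",
    "dependencies": [
        {
            "type": "media_type",
            "include": ["audio", "image", "video"],
            "exclude": ["image/svg"],
        },
    ],
}
```

Under these assumptions, step_should_run(mediainfo_step, "video/mp4") is true, while "image/svg+xml" and "text/plain" are both rejected: the first by the exclude list, the second because it matches no include entry.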
Adding a Custom Model to the Pipeline
If you want to add your own custom annotation model, you have two options:
- Use dorsal.api.register_model (Recommended)
This is the safest and easiest method. This function finds the correct config file (based on the scope argument) and appends your new model, preserving any existing models in your pipeline.
If no pipeline is defined in the target file, it automatically includes all the default models (like dorsal/base), preventing common configuration errors.
from dorsal.api import register_model
from my_project.models import MyCustomModel, MyCustomValidator

# This will add the model to the 'dorsal.toml' in your project root
register_model(
    model=MyCustomModel,
    schema_id="my-project/custom-data",
    validator=MyCustomValidator,
    media_type_includes=["image/jpeg", "image/png"],
)
- Edit the dorsal.toml file manually
  If you choose to edit the file manually, be aware of the following:
  Manual Pipeline Edits Replace the Default
  If you manually define [[model_pipeline]] in a project-level dorsal.toml, it will replace the entire default pipeline. The model_pipeline array itself is not merged with the one from your global config.
  You must copy the full default pipeline into your project-level config file first, and then append your custom model to that list.
  You can find the full default pipeline in this document (under "Full Default Configuration") or in your global config file (e.g., /home/user/.dorsal/dorsal.toml on Linux, or C:\Users\user\.dorsal\dorsal.toml on Windows).
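The replace-not-merge behaviour is easier to see with plain dictionaries standing in for the two parsed files. This is a sketch, not Dorsal's merge code: merge_config is a hypothetical name, the recursive merging of tables is an assumption, and the pipeline steps are abbreviated to their schema_id.

```python
def merge_config(base: dict, override: dict) -> dict:
    """Overlay `override` on `base`; tables merge, everything else replaces."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Tables such as [ui] are merged key by key (an assumption).
            merged[key] = merge_config(merged[key], value)
        else:
            # Scalars and arrays, including model_pipeline, replace wholesale.
            merged[key] = value
    return merged


# Stand-ins for the parsed global and project files (abbreviated).
global_config = {
    "auth": {"email": "user@example.com"},
    "ui": {"theme": "default"},
    "model_pipeline": [
        {"schema_id": "dorsal/base"},
        {"schema_id": "file/mediainfo"},
    ],
}
project_config = {
    "ui": {"theme": "dark"},
    "model_pipeline": [{"schema_id": "my-project/custom-data"}],
}

effective = merge_config(global_config, project_config)
print([step["schema_id"] for step in effective["model_pipeline"]])
# prints ['my-project/custom-data']
```

The default steps are gone from the effective pipeline because the project file defined the array on its own, which is why the full default pipeline has to be copied in before a custom step is appended.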
Full Default Configuration
You can view the default configuration that is built into Dorsal in the GitHub repository: dorsal/src/dorsal/common/config.py.
A copy of this is written to your Global Config file the first time you run Dorsal.