Config File

Dorsal uses a TOML file to manage settings, including API authentication and the local Annotation Model Pipeline.

Managing your Config

The easiest and safest way to manage your config is programmatically, using the CLI or the Python API.

Config File Locations

Dorsal has a two-tier configuration system. Settings from a file with higher precedence will always override settings from a file with lower precedence.

Precedence Order: Project > Global

  1. Project Config (Highest Precedence):

    • Location: A file named dorsal.toml or .dorsal.toml in your current directory or any parent directory.
    • Use Case: This is the recommended place to define project-specific settings, such as a custom [[model_pipeline]].
  2. Global Config (User-level):

    • Location: A dorsal.toml file inside your user's .dorsal directory (e.g., /home/user/.dorsal/dorsal.toml on Linux, or C:\Users\user\.dorsal\dorsal.toml on Windows).
    • Use Case: This is the base-level configuration for your user. It's the best place to store settings that should apply everywhere, like your [auth] credentials.
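
The project-level lookup described above (current directory first, then each parent) can be sketched as follows. This is a minimal illustration, not Dorsal's actual implementation: `find_project_config` and `CONFIG_NAMES` are hypothetical names, and the choice of which file name wins when both exist in one directory is an assumption.

```python
from pathlib import Path
from typing import Optional

# Candidate file names, taken from the list above. Which one wins when both
# exist in the same directory is an assumption made for this sketch.
CONFIG_NAMES = ("dorsal.toml", ".dorsal.toml")


def find_project_config(start: Path) -> Optional[Path]:
    """Return the nearest project config at or above `start`, or None."""
    start = start.resolve()
    # Check the starting directory first, then walk up through its parents.
    for directory in (start, *start.parents):
        for name in CONFIG_NAMES:
            candidate = directory / name
            if candidate.is_file():
                return candidate
    return None


print(find_project_config(Path.cwd()))
```

Because the nearest directory is checked first, a config file in a subdirectory shadows one higher up the tree in this sketch.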

In-Memory Fallback

In addition to these two files, Dorsal maintains an in-memory fallback config, identical to the on-disk default configuration. It ensures the library can run "out of the box" if the global dorsal.toml file is missing or unreadable. Dorsal writes the content of this fallback to your new global dorsal.toml the first time it runs.

Finding Your Config

Dorsal will create a default global dorsal.toml in the .dorsal directory under your user home directory the first time it runs.

When you modify config settings using the CLI, a new dorsal.toml file will be created in your project's root directory.

To see your current config path, and any other config-related settings, run the dorsal config show command in the terminal.
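
If you only need the documented global location from a script, it can be computed directly. This is a small sketch based on the path given under "Config File Locations"; `global_config_path` is a hypothetical helper, not part of the Dorsal API, and `dorsal config show` remains the authoritative source.

```python
from pathlib import Path


def global_config_path() -> Path:
    """Build the documented global location: <home>/.dorsal/dorsal.toml."""
    return Path.home() / ".dorsal" / "dorsal.toml"


path = global_config_path()
print(path, "(exists)" if path.is_file() else "(not created yet)")
```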

Config Field Reference

auth

This section stores your DorsalHub API credentials. You can manage these settings using dorsal auth.

  • api_key: Your active API key.
  • email: The email address associated with your account.
[auth]
api_key = "dorsal_key_..."
email = "user@example.com"

ui

This section controls the visual theme for the command-line interface. You can manage these settings using dorsal config.

  • theme: The name of the theme to use (e.g., "default", "dark").
[ui]
theme = "dark"

report.collection.panels

This section controls which data panels are enabled by default in the HTML directory report generated by dorsal.api.generate_html_directory_report. To change these settings, edit the values in the config file directly.

[report.collection.panels]
summary_stats = true
collection_overview = true
dynamic_size_histogram = false
duplicates_report = true
file_explorer = true

model_pipeline

This section defines the entire Annotation Model Pipeline as an array of TOML tables. The order of these blocks in the file defines the execution order.

Each pipeline step has several keys:

  • annotation_model: A list containing two strings: the importable Python module and the AnnotationModel class name.
  • schema_id: The string ID that will be used as the key for this annotation in the File Record's annotations object.
  • validation_model: (Optional) A list containing the module and class name of a Pydantic model or JsonSchemaValidator used to validate the model's output.
  • dependencies: (Optional) A list of dependency tables that act as rules. The model will only run if all dependencies pass.
  • options: (Optional) A dictionary of keyword arguments to be passed directly to the model's main() method during execution.
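
The two-string form used by annotation_model and validation_model maps naturally onto Python's importlib. The sketch below shows one way such a pair could be turned into a class; `resolve_class` is a hypothetical helper for illustration, and how Dorsal resolves these pairs internally is not stated here.

```python
import importlib
from typing import Sequence


def resolve_class(spec: Sequence[str]) -> type:
    """Import and return the class named by a [module, class_name] pair."""
    if len(spec) != 2:
        raise ValueError(f"expected [module, class_name], got {spec!r}")
    module_name, class_name = spec
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


# Demonstrated with a standard-library class so the sketch runs without Dorsal.
cls = resolve_class(["collections", "OrderedDict"])
print(cls.__name__)
```

A practical consequence: the module named in the first string must be importable from the environment in which Dorsal runs.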

Example: a single [[model_pipeline]] step:

# --- MediaInfo Annotation (Runs on media files) ---
[[model_pipeline]]
annotation_model = ["dorsal.file.annotation_models", "MediaInfoAnnotationModel"]
schema_id = "file/mediainfo"
validation_model = ["dorsal.file.validators.mediainfo", "MediaInfoValidationModel"]

# Dependencies: This model only runs if the file media type matches.
dependencies = [
    { type = "media_type", include = ["audio", "image", "video"], exclude = ["image/svg"] },
]
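
To make the include/exclude rule in the example concrete, here is a rough sketch of how a media_type dependency could be evaluated. The matching semantics are assumptions made for illustration (prefix matching, exclusions taking priority, an empty include list meaning "no restriction"); `media_type_passes` is a hypothetical function, not Dorsal's own.

```python
from typing import Sequence


def media_type_passes(
    media_type: str,
    include: Sequence[str] = (),
    exclude: Sequence[str] = (),
) -> bool:
    """Return True if `media_type` satisfies the include/exclude lists.

    Assumption: each pattern matches by prefix, so "image" covers "image/png"
    and "image/svg" covers "image/svg+xml". Exclusions win over inclusions.
    """
    if any(media_type.startswith(pattern) for pattern in exclude):
        return False
    # Assumption: an empty include list places no restriction.
    if not include:
        return True
    return any(media_type.startswith(pattern) for pattern in include)


# The rule from the MediaInfo example above.
rule = {"include": ["audio", "image", "video"], "exclude": ["image/svg"]}
for candidate in ("image/png", "image/svg+xml", "text/plain"):
    print(candidate, media_type_passes(candidate, **rule))
```

Under these assumptions, a PNG image passes, an SVG is rejected by the exclude list, and a plain-text file is rejected because it matches nothing in the include list.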

Adding a Custom Model to the Pipeline

If you want to add your own custom annotation model, you have two options:

  • Use the register_model function

    This is the safest and easiest method. The function finds the correct config file (based on the scope argument) and appends your new model, preserving any existing models in your pipeline.

    If no pipeline is defined in the target file, it automatically includes all the default models (like dorsal/base), preventing common configuration errors.

Add a new Annotation Model to the Dorsal config file
from dorsal.api import register_model
from my_project.models import MyCustomModel, MyCustomValidator

# This will add the model to the 'dorsal.toml' in your project root
register_model(
    model=MyCustomModel,
    schema_id="my-project/custom-data",
    validator=MyCustomValidator,
    media_type_includes=["image/jpeg", "image/png"],
)
  • Edit the dorsal.toml file manually

    If you choose to edit the file manually, be aware of the following:

    Manual Pipeline Edits Replace the Default

    If you manually define [[model_pipeline]] in a project-level dorsal.toml, it will replace the entire default pipeline. The model_pipeline array itself is not merged with the one from your global config.

    You must copy the full default pipeline into your project-level config file first, and then append your custom model to that list.

    You can find the full default pipeline in this document (under "Full Default Configuration") or in your global config file (e.g., /home/user/.dorsal/dorsal.toml on Linux, or C:\Users\user\.dorsal\dorsal.toml on Windows).
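
The replace-not-merge behaviour of model_pipeline is easy to miss, so here is a small sketch of layered config resolution that reproduces it. The merge granularity for ordinary tables (key by key) is an assumption, and `overlay` is a hypothetical helper; only the whole-array replacement of model_pipeline comes from the description above.

```python
from copy import deepcopy
from typing import Any, Dict


def overlay(base: Dict[str, Any], override: Dict[str, Any]) -> Dict[str, Any]:
    """Return `base` with `override` applied on top.

    Tables (dicts) are merged key by key; this granularity is an assumption.
    Every other value, including the model_pipeline list, is replaced whole.
    """
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = overlay(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


# Illustrative data only: a global config and a project config.
global_cfg = {
    "auth": {"email": "user@example.com"},
    "model_pipeline": [{"schema_id": "file/mediainfo"}],
}
project_cfg = {
    "ui": {"theme": "dark"},
    "model_pipeline": [{"schema_id": "my-project/custom-data"}],
}

effective = overlay(global_cfg, project_cfg)
# The project pipeline replaced the global one; the MediaInfo step is gone.
print([step["schema_id"] for step in effective["model_pipeline"]])
```

In this sketch the [auth] and [ui] settings from both files survive, while the pipeline contains only the custom step. That is why a manually edited project file needs the default steps copied in before the custom one.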

Full Default Configuration

You can view the default configuration that is built into Dorsal in the GitHub repo: dorsal/src/dorsal/common/config.py.

A copy of this is written to your global config file the first time you run Dorsal.