Directories

The dorsal dir command group covers operations on a local directory of files.

These commands allow you to scan entire folders, generate metadata, find duplicate files, and push metadata for all files in a directory to DorsalHub.

This guide covers the four sub-commands:

  1. scan: Scan all files, view metadata, and save reports (JSON/CSV).
  2. info: Get a quick summary and high-level overview for a directory.
  3. duplicates: Find files with identical content within a directory.
  4. push: Upload metadata for all files in a directory to DorsalHub.

dorsal dir scan

This is your primary command for local directory inspection. It scans all files in a directory (and optionally, all sub-directories too), displays a summary table, and allows you to save the full metadata for all files as a JSON or CSV report.

Example

  1. Find a directory on your system and copy its path.

  2. Run dorsal dir scan with the path to your directory:

    dorsal dir scan /path/to/my-documents
    

When the scan is complete, you'll see a summary panel and a table of the files found:

✨ Found and processed 4 file(s) in C:\stuff in 1.573 seconds.
╭─ Directory Scan Summary ───────────────────────────────────╮
│        Total Files: 4                                      │
│         Total Size: 74 MiB                                 │
│ Newest Modified File: 2025-09-02 11:58:19 (bill.pdf)       │
│ Oldest Modified File: 2024-06-06 10:57:47 (DorsalHub.pptx) │
│                                                            │
│        Media Types: 4                                      │
╰────────────────────────────────────────────────────────────╯
                                        File Scan Details
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━┓
┃ Filename                       ┃   Size ┃ Media Type                     ┃ Modified Date       ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━┩
│ bill.pdf                       │ 80 KiB │ application/pdf                │ 2025-09-02 11:58:19 │
│ desktop-version-latest.msi     │ 21 MiB │ application/x-msi              │ 2025-07-04 17:58:14 │
│ DorsalHub.pptx                 │ 40 KiB │ application/vnd.openxmlformat… │ 2024-06-06 10:57:47 │
│ Game Files.7z                  │ 53 MiB │ application/x-7z-compressed    │ 2024-09-28 22:15:17 │
└────────────────────────────────┴────────┴────────────────────────────────┴─────────────────────┘

Options

  • Scan all sub-directories with -r or --recursive:

    dorsal dir scan /path/to/my-documents -r
    
  • Save a full JSON report of all file metadata with -s or --save:

    dorsal dir scan /path/to/my-documents -s
    
  • Save just the summary table as a CSV file with -c or --csv:

    dorsal dir scan /path/to/my-documents -c
    
  • Specify a custom output file path for your report (the format is inferred from the extension):

    dorsal dir scan /path/to/my-docs -o "my-scan.json"
    
  • Output the full, raw JSON metadata to your terminal with --json:

    dorsal dir scan /path/to/my-documents --json
    
  • Change the number of files in the summary table with -l or --limit:

    dorsal dir scan /path/to/my-documents -l 50
    
  • Sort the summary table by file size, type, or date (default is name):

    dorsal dir scan /path/to/my-documents --sort-by size --sort-order desc
    
  • Run the command without using the local cache:

    dorsal dir scan /path/to/my-documents --skip-cache
    
  • Run the command, and overwrite the local cache entries with the result:

    dorsal dir scan /path/to/my-documents --overwrite-cache
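
Because --json writes the full metadata to stdout, a scan can also be driven from a script. The following is a minimal Python sketch, not part of Dorsal itself. The helper names scan_command and run_scan are hypothetical, only flags listed in this guide are used, and nothing is assumed about the structure of the JSON beyond it being valid JSON.

```python
import json
import subprocess


def scan_command(path, recursive=False, skip_cache=False):
    """Build the argument list for `dorsal dir scan PATH --json`.

    Hypothetical helper: it only uses flags documented in this guide.
    """
    cmd = ["dorsal", "dir", "scan", str(path), "--json"]
    if recursive:
        cmd.append("--recursive")
    if skip_cache:
        cmd.append("--skip-cache")
    return cmd


def run_scan(path, **options):
    """Run the scan and return the parsed JSON report.

    Requires the `dorsal` CLI on PATH and raises CalledProcessError if the
    command exits with an error. The shape of the returned object is not
    assumed here; inspect it before relying on specific keys.
    """
    result = subprocess.run(
        scan_command(path, **options),
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(result.stdout)
```

For example, run_scan("/path/to/my-documents", recursive=True) would run the recursive scan and hand back whatever JSON object the command printed.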
    

CLI Docs

 Usage: dorsal dir scan [OPTIONS] PATH                                         

 Scans a directory, generates metadata for all files, and displays or saves   
 the results.                                                                 

╭─ Arguments ─────────────────────────────────────────────────────────────────╮
│ * path      DIRECTORY  The path to the directory to scan. [required]        │
╰─────────────────────────────────────────────────────────────────────────────╯
╭─ Options ───────────────────────────────────────────────────────────────────╮
│ --help          Show this message and exit.                                 │
╰─────────────────────────────────────────────────────────────────────────────╯
╭─ Output Options ────────────────────────────────────────────────────────────╮
│ --output  -o      PATH   Path to save an output file. Extension (.json or   │
│                          .csv) determines format.                           │
│ --save    -s             Save the full JSON metadata report to the default  │
│                          scan directory.                                    │
│ --csv     -c             Save the summary file table as a CSV report.       │
│ --json                   Output the full metadata as a raw JSON object to   │
│                          stdout for scripting.                              │
╰─────────────────────────────────────────────────────────────────────────────╯
╭─ Scan Options ──────────────────────────────────────────────────────────────╮
│ --recursive  -r         Scan subdirectories recursively.                    │
│ --limit      -l    INTEGER  Limit the number of files displayed in the      │
│                          summary table. [default: 20]                       │
│ --sort-by      TEXT    Column to sort by. One of: name, size, type,         │
│                          date. [default: name]                              │
│ --sort-order   TEXT    Sort order. One of: asc, desc. [default: asc]        │
╰─────────────────────────────────────────────────────────────────────────────╯
╭─ Cache Options ─────────────────────────────────────────────────────────────╮
│ --use-cache           Force the use of the cache, overriding any global     │
│                       setting.                                              │
│ --skip-cache          Bypass the local cache and re-process all files,      │
│                       overriding any global setting.                        │
│ --overwrite-cache     Re-process files and overwrite the local cache        │
│                       entries.                                              │
╰─────────────────────────────────────────────────────────────────────────────╯

dorsal dir info

This command generates a quick summary of a directory.

Example

Run dorsal dir info on a directory:

dorsal dir info /path/to/my-project

Alternatively, use -r or --recursive to include all sub-directories:

dorsal dir info /path/to/my-project -r

When complete, it will display a panel with a summary:

📊 Summary of C:\my-project
╭────────────────────────────────── Directory Summary ───────────────────────────────────╮
│          Total File Count: 10,450                                                      │
│         Total Directories: 1,200                                                       │
│              Hidden Files: 350                                                         │
│                Total Size: 4.51 GiB                                                    │
│             Scan Duration: 0.85 seconds                                                │
│                                                                                        │
│         Average File Size: 453.2 KiB                                                   │
│              Largest File: ./build/output.zip (1.20 GiB)                               │
│             Smallest File: ./src/config.empty (0 B)                                    │
│                                                                                        │
│      Newest Modified File: 2025-10-30 10:15:00 (./src/main.py)                         │
│      Oldest Modified File: 2021-05-10 11:00:00 (./LICENSE)                             │
│      Oldest Creation Date: 2021-05-10 10:00:00 (./.git/config)                         │
│                                                                                        │
│        Executable Files: 50                                                            │
│         Read-Only Files: 12                                                            │
╰────────────────────────────────────────────────────────────────────────────────────────╯

Options

  • Scan all sub-directories with -r or --recursive:

    dorsal dir info /path/to/large-project -r
    
  • Include a summary table of all media types with -m or --media-type (this slows the scan slightly):

    dorsal dir info /path/to/large-project -r -m
    
  • Save the full JSON report with -s or --save:

    dorsal dir info /path/to/large-project -s
    
  • Output the raw JSON to your terminal with --json:

    dorsal dir info /path/to/large-project --json
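
To make the figures in the summary panel concrete, here is a rough Python sketch of how a few of them (file count, total size, average size, largest and newest file) could be computed with the standard library. It is an illustration only, not how Dorsal computes them, and summarize is a hypothetical name.

```python
import os


def summarize(root, recursive=False):
    """Collect a few of the figures shown in the Directory Summary panel."""
    count = 0
    total = 0
    largest = None  # (size_in_bytes, path)
    newest = None   # (modification_time, path)
    for dirpath, dirnames, filenames in os.walk(root):
        if not recursive:
            dirnames[:] = []  # tell os.walk not to descend any further
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path)
            except OSError:
                continue  # skip unreadable entries such as broken symlinks
            count += 1
            total += st.st_size
            if largest is None or st.st_size > largest[0]:
                largest = (st.st_size, path)
            if newest is None or st.st_mtime > newest[0]:
                newest = (st.st_mtime, path)
    return {
        "total_files": count,
        "total_size": total,
        "average_size": total / count if count else 0,
        "largest_file": largest[1] if largest else None,
        "newest_file": newest[1] if newest else None,
    }
```

Sizes are returned in bytes; the panel above shows them in binary units such as KiB and GiB.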
    

CLI Docs

 Usage: dorsal dir info [OPTIONS] PATH                                         

 Displays summary of files in a directory.                                    

╭─ Arguments ─────────────────────────────────────────────────────────────────╮
│ * path      DIRECTORY  The directory path to analyze. [required]            │
╰─────────────────────────────────────────────────────────────────────────────╯
╭─ Options ───────────────────────────────────────────────────────────────────╮
│ --help          Show this message and exit.                                 │
╰─────────────────────────────────────────────────────────────────────────────╯
╭─ Scan Options ──────────────────────────────────────────────────────────────╮
│ --recursive / --no-recursive  -r/-R  Scan subdirectories recursively.       │
│ --media-type                 -m      Include Media Type summary table.      │
│                                      Reduces scan speed.                    │
╰─────────────────────────────────────────────────────────────────────────────╯
╭─ Output Options ────────────────────────────────────────────────────────────╮
│ --json                   Output a raw JSON object to stdout for scripting.  │
│ --save     -s              Save the JSON report to the default directory or │
│                          --output path.                                     │
│ --output   -o       PATH   Custom path to save the JSON report (e.g.,       │
│                          'info.json').                                      │
╰─────────────────────────────────────────────────────────────────────────────╯

dorsal dir duplicates

This command scans a directory to find files with identical content. It groups the duplicate files and shows how much space they occupy.

Example

Run dorsal dir duplicates on a directory:

dorsal dir duplicates /path/to/my-downloads

When complete, it will display a panel for each set of duplicates found:

Mode: Hybrid. Using fast scan with secure verification.

⚠️ Found 2 set(s) of duplicate files (8 hashes from cache) in 0.276 seconds.

Displaying the 2 largest duplicate sets.
╭─────────────── Duplicate Set 1 of 2 | Size: 2 MiB (each) ────────────────╮
│ SHA256: a840b23dbc0d0bd90d042bedc5ba69d050686663ac402d6d8924ca48c48003dc │
│ 📄 C:\my-project\Secret Project FINAL 1 (2).docx                         │
│ 📄 C:\my-project\2025-11-03.docx                                         │
╰──────────────────────────────────────────────────────────────────────────╯

╭────────────── Duplicate Set 2 of 2 | Size: 43 MiB (each) ────────────────╮
│ SHA256: 4383fb2ab568ca7019834d438f9a14b9d2ccaa2f37f319373848350005779368 │
│ 📄 C:\my-project\build_234911.exe                                        │
│ 📄 C:\my-project\release.exe                                             │
╰──────────────────────────────────────────────────────────────────────────╯

Options

  • Include all sub-directories in the scan with -r or --recursive:

    dorsal dir duplicates /path/to/my-downloads -r
    
  • Use the fast "QUICK" hash only (may lead to rare false positives) via -q or --quick:

    dorsal dir duplicates /path/to/my-downloads -q
    
  • Use "SHA256" secure hashing only (slower than the default hybrid approach) via --sha256:

    dorsal dir duplicates /path/to/my-downloads --sha256
    
  • Only consider files that are larger than a certain size using --min-size:

    dorsal dir duplicates /path/to/my-downloads --min-size 1MB
    
  • Only consider files that are smaller than a certain size using --max-size:

    dorsal dir duplicates /path/to/my-downloads --max-size 1000GiB
    
  • Limit the number of duplicate sets shown in the terminal with -l or --limit:

    dorsal dir duplicates /path/to/my-downloads -l 10
    
  • Save the full JSON report of all duplicates found with -s or --save:

    dorsal dir duplicates /path/to/my-downloads -s
    
  • Run the command without using the local cache:

    dorsal dir duplicates /path/to/my-documents --skip-cache
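
The default hybrid mode is described as a fast initial scan followed by SHA-256 verification. The Python sketch below illustrates that general idea; it is not Dorsal's implementation. Dorsal's QUICK hash is not specified in this guide, so a SHA-256 of the first 64 KiB stands in for it as an assumption, and find_duplicates is a hypothetical name.

```python
import hashlib
import os
from collections import defaultdict


def _digest(path, limit=None, chunk=1 << 20):
    """SHA-256 of a file; if `limit` is set, hash only the first `limit` bytes."""
    h = hashlib.sha256()
    remaining = limit
    with open(path, "rb") as f:
        while remaining is None or remaining > 0:
            size = chunk if remaining is None else min(chunk, remaining)
            block = f.read(size)
            if not block:
                break
            h.update(block)
            if remaining is not None:
                remaining -= len(block)
    return h.hexdigest()


def _group(paths, key):
    """Group paths by key(path), keeping only groups with two or more members."""
    groups = defaultdict(list)
    for p in paths:
        groups[key(p)].append(p)
    return [g for g in groups.values() if len(g) > 1]


def find_duplicates(root, recursive=False):
    """Return {sha256: [paths]} for files under `root` with identical content."""
    paths = []
    for dirpath, dirnames, filenames in os.walk(root):
        if not recursive:
            dirnames[:] = []  # stay in the top-level directory
        paths += [os.path.join(dirpath, n) for n in filenames]
    paths = [p for p in paths if os.path.isfile(p)]

    duplicates = {}
    # Files of different sizes cannot be identical, so compare within sizes.
    for same_size in _group(paths, os.path.getsize):
        # Cheap pass: partial hash, an assumed stand-in for a fast hash.
        for candidates in _group(same_size, lambda p: _digest(p, limit=65536)):
            # Definitive pass: a full SHA-256 confirms or rejects each match.
            by_hash = defaultdict(list)
            for p in candidates:
                by_hash[_digest(p)].append(p)
            for sha, group in by_hash.items():
                if len(group) > 1:
                    duplicates[sha] = group
    return duplicates
```

The cheap pass alone corresponds loosely to what a quick-only mode trades away: two files can share a size and a partial hash yet differ later on, which is why the full hash is used to confirm.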
    

CLI Docs

 Usage: dorsal dir duplicates [OPTIONS] PATH                                   

 Finds and reports files with identical content hashes.                       

╭─ Arguments ─────────────────────────────────────────────────────────────────╮
│ * path      DIRECTORY  The directory path to scan for duplicates.           │
│                      [required]                                             │
╰─────────────────────────────────────────────────────────────────────────────╯
╭─ Options ───────────────────────────────────────────────────────────────────╮
│ --help          Show this message and exit.                                 │
╰─────────────────────────────────────────────────────────────────────────────╯
╭─ Scan Options ──────────────────────────────────────────────────────────────╮
│ --recursive / --no-recursive  -r/-R  Scan subdirectories recursively.       │
│ --limit                      -l    INTEGER  The number of duplicate sets to │
│                                      display. [default: 5]                  │
│ --min-size                   TEXT    Only find duplicates larger than this  │
│                                      size (e.g., '1MB', '500KB').           │
│                                      [default: 0]                           │
│ --max-size                   TEXT    Only find duplicates smaller than this │
│                                      size (e.g., '1GB').                    │
╰─────────────────────────────────────────────────────────────────────────────╯
╭─ Output Options ────────────────────────────────────────────────────────────╮
│ --output  -o      PATH   Path to save the full JSON report (e.g.,           │
│                          'duplicates.json').                                │
│ --save    -s             Save the full JSON report to the default           │
│                          directory or --output path.                        │
│ --json                   Output the full duplicate report as a raw JSON     │
│                          object to stdout for scripting.                    │
╰─────────────────────────────────────────────────────────────────────────────╯
╭─ Hashing Options ───────────────────────────────────────────────────────────╮
│ --hybrid                Use hybrid mode (default): fast initial scan with   │
│                         definitive SHA-256 verification.                    │
│ --quick   -q            Use QUICK hash only. Fastest, but may have rare     │
│                         false positives.                                    │
│ --sha256                Use SHA-256 hash only. Slower, but provides         │
│                         definitive results.                                 │
╰─────────────────────────────────────────────────────────────────────────────╯
╭─ Cache Options ─────────────────────────────────────────────────────────────╮
│ --use-cache           Force the use of the cache, overriding any global     │
│                       setting.                                              │
│ --skip-cache          Bypass the local cache and re-calculate all hashes,   │
│                       overriding any global setting to enable it.           │
╰─────────────────────────────────────────────────────────────────────────────╯

dorsal dir push

This command scans all files in a directory (like dorsal dir scan) and pushes their metadata to DorsalHub.

By default, all pushed records are private.

You can also use this command to create a new remote Collection on DorsalHub from the directory's contents.

Example

  1. Push metadata for all files in a directory as private records (the default):

    dorsal dir push "C:\My Stuff"
    

    On success, you'll see a confirmation:

    📡 Preparing to push metadata from C:\My Stuff
    ╭────────────── Push Complete ──────────────╮
    │             Access Level:Private          │
    │            Files Scanned:4 (4 from cache) │
    │     File Records to Push:4                │
    │    File Records Accepted:4                │
    ╰───────────────────────────────────────────╯
    

    Alternatively, pass --public to create public records on DorsalHub instead:

    dorsal dir push "C:\My Stuff" --public
    
  2. Create a Collection on DorsalHub for the files in the directory using --create-collection:

    dorsal dir push "C:\My Stuff" --create-collection
    

    When it's complete you'll see something like:

    📡 Preparing to publish metadata from C:\My Stuff
    ╭─────────────────────────────────── Publish Complete ───────────────────────────────────╮
    │ Successfully pushed 4 files and created collection.                                    │
    │                                                                                        │
    │ URL: http://dorsalhub.com/dashboard/collections/vrTz4OKeG9onWMzIqn7RRY9jJvh6KZQ3 │
    ╰────────────────────────────────────────────────────────────────────────────────────────╯
    

    The URL provided will take you to a page on your DorsalHub dashboard, where you can view and (if public) share the metadata.

    Free users can have up to five collections.

    You can see your remaining collection limit on your DorsalHub Dashboard.

Options

  • Push all files as public records with --public:

    dorsal dir push /path/to/my-public-dataset -r --public
    
  • Scan and push recursively with -r:

    dorsal dir push /path/to/my-dataset -r
    
  • Scan all files and create a new private collection on DorsalHub named "My New Project":

    dorsal dir push /path/to/my-dataset -r --create-collection --name "My New Project"
    
  • Scan all files and create a new public collection on DorsalHub named "My New Project":

    dorsal dir push /path/to/my-dataset -r --create-collection --name "My New Project" --public
    
  • Use --dry-run to scan files and show what would be pushed, without sending any data to the server:

    dorsal dir push /path/to/my-dataset --dry-run
    
  • If the directory contains duplicate files, this command will fail. You can force it to push only the first copy of each duplicate and ignore the rest using --ignore-duplicates:

    dorsal dir push /path/to/my-dataset --ignore-duplicates
    
  • Get the raw JSON API response after the push, instead of an info panel, with --json:

    dorsal dir push /path/to/my-dataset --json
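
One way to combine these options in a script is to preview with --dry-run and only push after confirming. The Python sketch below shows that pattern under stated assumptions: push_command and preview_then_push are hypothetical helper names, the dorsal CLI must be on PATH, and only flags documented in this guide are used.

```python
import subprocess


def push_command(path, recursive=False, public=False,
                 collection_name=None, dry_run=False):
    """Build the argument list for `dorsal dir push` (hypothetical helper)."""
    cmd = ["dorsal", "dir", "push", str(path)]
    if recursive:
        cmd.append("--recursive")
    if public:
        cmd.append("--public")
    if collection_name is not None:
        cmd += ["--create-collection", "--name", collection_name]
    if dry_run:
        cmd.append("--dry-run")
    return cmd


def preview_then_push(path, **options):
    """Run a dry run, ask for confirmation, then push for real."""
    subprocess.run(push_command(path, dry_run=True, **options), check=True)
    answer = input("Push these records to DorsalHub? [y/N] ")
    if answer.strip().lower() == "y":
        subprocess.run(push_command(path, **options), check=True)
```

Passing the arguments as a list rather than a single string avoids quoting problems with paths such as "C:\My Stuff" and names such as "My New Project".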
    

CLI Docs

Usage: dorsal dir push [OPTIONS] PATH

 Scans a directory, pushes all file metadata to DorsalHub, and optionally creates a new remote
 collection from the contents.

╭─ Arguments ────────────────────────────────────────────────────────────────────────────────────╮
│ *    path      DIRECTORY  The directory path containing files to scan and push. [required]     │
╰────────────────────────────────────────────────────────────────────────────────────────────────╯
╭─ Options ──────────────────────────────────────────────────────────────────────────────────────╮
│ --private                --public                  Index records as private or public.         │
│                                                    [default: private]                          │
│ --recursive          -r  --no-recursive  -R        Scan subdirectories recursively.            │
│                                                    [default: no-recursive]                     │
│ --create-collection                                Create a private collection on DorsalHub    │
│                                                    containing the pushed files.                │
│ --name                                       TEXT  Name for the new collection. Defaults to    │
│                                                    the directory name if not provided.         │
│ --desc                                       TEXT  Description for the new collection.         │
│ --dry-run                                          Scan files and show what would be pushed,   │
│                                                    without sending data to the server.         │
│ --ignore-duplicates                                Keep the first file of any duplicates and   │
│                                                    push it, ignoring subsequent copies.        │
│ --json                                             Output the final summary as a raw JSON      │
│                                                    object to stdout for scripting.             │
│ --help                                             Show this message and exit.                 │
╰────────────────────────────────────────────────────────────────────────────────────────────────╯
╭─ Cache Options ────────────────────────────────────────────────────────────────────────────────╮
│ --use-cache           Force the use of the cache, overriding any global setting.               │
│ --skip-cache          Bypass the local cache and re-process all files, overriding any global   │
│                       setting.                                                                 │
│ --overwrite-cache     Re-process all files and overwrite the local cache with new entries.     │
╰────────────────────────────────────────────────────────────────────────────────────────────────╯
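
The "keep the first file, ignore subsequent copies" behaviour of --ignore-duplicates can be pictured as an order-preserving de-duplication by content hash. The Python sketch below illustrates that idea only. It is not Dorsal's code, keep_first_copies is a hypothetical name, and which copy counts as "first" in practice depends on Dorsal's own scan order, which this guide does not specify.

```python
import hashlib


def keep_first_copies(paths):
    """Return paths in order, dropping any file whose content was already seen."""
    seen = set()
    kept = []
    for path in paths:
        # Reads the whole file into memory; fine for a sketch, not for huge files.
        with open(path, "rb") as f:
            digest = hashlib.sha256(f.read()).hexdigest()
        if digest not in seen:
            seen.add(digest)
            kept.append(path)
    return kept
```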

➡️ Continue to: dorsal record Guide