
Advanced Usage

Regular Expressions

Basic glob patterns cover simple cases. Regular expressions allow more precise matching, such as anchoring to the start or end of a name or matching digit sequences.

Pattern Examples

Use the --regex flag with --include or --exclude:

# Match files ending in numbers
hashreport scan --regex --include ".*[0-9]$" /path/to/directory

# Match specific date formats in filenames
hashreport scan --regex --include ".*\d{4}-\d{2}-\d{2}.*" /path/to/directory

# Exclude files with specific patterns
hashreport scan --regex --exclude "^(backup|temp).*" /path/to/directory

# Multiple patterns
hashreport scan --regex \
  --include ".*\.(jpg|png)$" \
  --exclude "thumb.*\.jpg$" \
  /path/to/directory
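To check a pattern before running a scan, it can help to try it against a few sample names. The sketch below uses Python's `re` module and assumes the patterns are applied with search semantics against the file name. How hashreport applies them internally (full path or base name, `match` or `search`) is not stated here, so `matches` and `select` are hypothetical helpers for illustration only.

```python
import re

# Patterns copied from the commands above.
ENDS_IN_DIGIT = re.compile(r".*[0-9]$")
ISO_DATE = re.compile(r".*\d{4}-\d{2}-\d{2}.*")
BACKUP_OR_TEMP = re.compile(r"^(backup|temp).*")


def matches(pattern, name):
    """Return True if the pattern is found in the name (assumed semantics)."""
    return pattern.search(name) is not None


def select(names, include, exclude=None):
    """Keep names that match `include` and do not match `exclude`."""
    kept = []
    for name in names:
        if not matches(include, name):
            continue
        if exclude is not None and matches(exclude, name):
            continue
        kept.append(name)
    return kept


names = ["photo.jpg", "thumb_photo.jpg", "logo.png", "notes.txt"]
images = select(
    names,
    include=re.compile(r".*\.(jpg|png)$"),
    exclude=re.compile(r"thumb.*\.jpg$"),
)
# images == ["photo.jpg", "logo.png"]
```

Note that `^(backup|temp).*` only matches names that start with `backup` or `temp`; a name such as `my_backup.sql` is not excluded by it.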

Filelist with Patterns

The filelist command supports the same pattern matching.

Email Notifications

hashreport can email reports upon completion using SMTP.

Basic Email Setup

hashreport scan /path/to/directory \
  --email recipient@example.com \
  --smtp-host smtp.example.com \
  --smtp-user username \
  --smtp-password password

Testing Email Configuration

Test your email settings without processing files:

hashreport scan /path/to/directory \
  --email recipient@example.com \
  --smtp-host smtp.example.com \
  --smtp-user username \
  --smtp-password password \
  --test-email

Gmail Example

Using Gmail's SMTP server:

hashreport scan /path/to/directory \
  --email recipient@gmail.com \
  --smtp-host smtp.gmail.com \
  --smtp-port 587 \
  --smtp-user your.email@gmail.com \
  --smtp-password "your-app-password"

Note

For Gmail, you'll need to use an App Password if you have 2-factor authentication enabled. Generate one at: Google Account → Security → 2-Step Verification → App passwords
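For readers who want to verify SMTP credentials outside hashreport, the following is a minimal sketch using Python's standard `smtplib` and `email` modules. It assumes a STARTTLS connection on port 587, as in the Gmail example. The function names, subject line, and attachment type are illustrative assumptions, not hashreport's actual implementation.

```python
import smtplib
from email.message import EmailMessage


def build_report_email(sender, recipient, report_name, report_bytes):
    """Build a message with the report attached (hypothetical helper)."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = f"Hash report: {report_name}"
    msg.set_content("The hash report is attached.")
    msg.add_attachment(
        report_bytes,
        maintype="application",
        subtype="octet-stream",
        filename=report_name,
    )
    return msg


def send_report(msg, host, port, user, password):
    """Send over STARTTLS; requires network access and valid credentials."""
    with smtplib.SMTP(host, port) as smtp:
        smtp.starttls()
        smtp.login(user, password)
        smtp.send_message(msg)


message = build_report_email(
    "your.email@gmail.com",
    "recipient@gmail.com",
    "report.csv",
    b"path,hash\n",
)
# send_report(message, "smtp.gmail.com", 587, "your.email@gmail.com", "your-app-password")
```

The send call is commented out so the sketch can be run without contacting a mail server.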

Environment Variables

You can store SMTP credentials in environment variables:

export HASHREPORT_SMTP_HOST=smtp.example.com
export HASHREPORT_SMTP_USER=username
export HASHREPORT_SMTP_PASSWORD=password

# Now run without exposing credentials on the command line
hashreport scan /path/to/directory --email recipient@example.com
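A wrapper script can confirm that the variables are set before starting a long scan. The sketch below reads the three variable names shown above. `smtp_settings_from_env` is a hypothetical helper, and the way hashreport itself resolves these variables is not described here.

```python
import os


def smtp_settings_from_env(env=None):
    """Collect SMTP settings from HASHREPORT_SMTP_* variables.

    Returns (settings, missing), where `missing` lists the unset keys.
    """
    env = os.environ if env is None else env
    settings = {
        "host": env.get("HASHREPORT_SMTP_HOST"),
        "user": env.get("HASHREPORT_SMTP_USER"),
        "password": env.get("HASHREPORT_SMTP_PASSWORD"),
    }
    missing = [key for key, value in settings.items() if not value]
    return settings, missing


# Example with an explicit mapping instead of the real environment.
settings, missing = smtp_settings_from_env(
    {"HASHREPORT_SMTP_HOST": "smtp.example.com", "HASHREPORT_SMTP_USER": "username"}
)
# missing == ["password"]
```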

Performance Tuning

Worker Threads

hashreport automatically uses multiple threads based on your CPU cores. You can override this in the configuration:

# pyproject.toml
[tool.hashreport]
max_workers = 4  # Set specific number of worker threads
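As an illustration of what a worker limit controls, the sketch below hashes several inputs with a bounded thread pool from Python's `concurrent.futures`. It is a generic example of the technique, assuming SHA-256 and in-memory data. It is not hashreport's internal code, and `hash_all` is a hypothetical name.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def digest(data):
    """Return the SHA-256 hex digest of a bytes object."""
    return hashlib.sha256(data).hexdigest()


def hash_all(blobs, max_workers=4):
    """Hash every blob using at most `max_workers` threads.

    pool.map preserves input order, so results line up with `blobs`.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(digest, blobs))


results = hash_all([b"alpha", b"beta", b"gamma"], max_workers=2)
```

A lower value can help on shared machines or slow disks, where more threads compete for the same I/O.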

Chunk Size

For large files, you can adjust the chunk size used when calculating hashes:

# pyproject.toml
[tool.hashreport]
chunk_size = 8192  # Default is 4096
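Chunked hashing reads a file in fixed-size pieces so that memory use stays constant regardless of file size. The sketch below shows the general technique with Python's `hashlib`. It assumes the chunk size is measured in bytes and uses SHA-256 as a stand-in algorithm; `hash_stream` is a hypothetical helper, not hashreport's API.

```python
import hashlib
import io


def hash_stream(stream, chunk_size=4096, algorithm="sha256"):
    """Hash a binary stream by reading `chunk_size` bytes at a time."""
    hasher = hashlib.new(algorithm)
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:  # an empty read means end of stream
            break
        hasher.update(chunk)
    return hasher.hexdigest()


data = b"x" * 10_000
small_chunks = hash_stream(io.BytesIO(data), chunk_size=4096)
large_chunks = hash_stream(io.BytesIO(data), chunk_size=8192)
# Both calls return the same digest: chunk size changes the number of
# reads, not the result.
```

For a real file, pass an object opened with `open(path, "rb")` instead of `io.BytesIO`.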

Custom Report Names

Timestamp Format

Customize the timestamp format used in report filenames:

# pyproject.toml
[tool.hashreport]
timestamp_format = "%Y%m%d_%H%M%S"  # Default is "%y%m%d-%H%M"
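The format strings use the standard `strftime` directives. As a quick check of what each one produces, the sketch below formats a fixed, arbitrarily chosen moment with Python's `datetime`.

```python
from datetime import datetime

# An arbitrary fixed moment, chosen only for illustration.
moment = datetime(2024, 3, 5, 14, 7, 9)

default_stamp = moment.strftime("%y%m%d-%H%M")     # "240305-1407"
custom_stamp = moment.strftime("%Y%m%d_%H%M%S")    # "20240305_140709"
```

The custom format adds the century and seconds, which keeps filenames unique when several reports are generated within the same minute.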

Output Examples

# Custom named reports
hashreport scan /path/to/directory -o custom_report.csv

# Multiple formats with custom names
hashreport scan /path/to/directory \
  -o report.csv \
  -f csv -f json

Report Comparison

Understanding Changes

The comparison functionality identifies several types of changes:

  • Modified: File exists in both reports but has different hash values
  • Moved: Same file (identical hash) exists in different locations
  • Added: File exists only in the newer report
  • Removed: File exists only in the older report
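The four categories above can be sketched as a comparison of two path-to-hash mappings. This is a simplified illustration, assuming each report reduces to a dictionary of `{path: hash}`. `classify_changes` and its tie-breaking for duplicate hashes are hypothetical and may differ from hashreport's actual logic.

```python
def classify_changes(old, new):
    """Compare two {path: hash} mappings.

    Returns a list of (change_type, old_path, new_path) tuples.
    """
    changes = []

    # Modified: same path in both reports, different hash.
    for path in sorted(old.keys() & new.keys()):
        if old[path] != new[path]:
            changes.append(("modified", path, path))

    only_old = {p: h for p, h in old.items() if p not in new}
    only_new = {p: h for p, h in new.items() if p not in old}

    # Index paths that appear only in the new report by their hash.
    new_by_hash = {}
    for path in sorted(only_new):
        new_by_hash.setdefault(only_new[path], []).append(path)

    # Moved: a vanished path whose hash reappears at a new path.
    # Removed: a vanished path with no matching hash.
    moved_targets = set()
    for path in sorted(only_old):
        candidates = new_by_hash.get(only_old[path])
        if candidates:
            target = candidates.pop(0)
            moved_targets.add(target)
            changes.append(("moved", path, target))
        else:
            changes.append(("removed", path, None))

    # Added: a new path that was not paired with a move.
    for path in sorted(only_new):
        if path not in moved_targets:
            changes.append(("added", None, path))

    return changes


old_report = {"a.txt": "h1", "b.txt": "h2", "c.txt": "h3", "d.txt": "h4"}
new_report = {"a.txt": "h1", "b.txt": "h2x", "sub/c.txt": "h3", "e.txt": "h5"}
changes = classify_changes(old_report, new_report)
# [("modified", "b.txt", "b.txt"), ("moved", "c.txt", "sub/c.txt"),
#  ("removed", "d.txt", None), ("added", None, "e.txt")]
```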

Output Format

Comparison output is formatted for readability:

  • Change type is displayed in bold
  • File paths show the original and new locations
  • Details include hash values and change descriptions
  • Complete path information is shown in separate columns

Saving Comparisons

Comparison results can be saved to CSV format:

hashreport compare old_report.csv new_report.csv -o /path/to/output/

The output filename will be generated automatically using the format: compare_<old_report>_<new_report>.csv

Using with Version Control

Example workflow for tracking file changes:

# Generate baseline report
hashreport scan /project/dir -o baseline.csv

# Later, generate new report
hashreport scan /project/dir -o current.csv

# Compare changes
hashreport compare baseline.csv current.csv