Advanced Usage
Basic Concepts
Before diving into advanced features, make sure you're familiar with the basic functionality:
- Basic Directory Scanning - Simple directory scanning
- Hash Algorithm Selection - Choosing hash algorithms
- File Type Filtering - Basic file filtering
- Output Format Selection - Basic report formats
- File Listing - Basic file listing functionality
- Report Viewing - Basic report viewing
- Basic Report Comparison - Simple report comparison
Memory Management
Control memory usage through various settings:
# settings.toml
[hashreport]
# Resource monitoring settings
memory_threshold = 0.85 # % of total memory
mmap_threshold = 10485760 # 10MB - Use mmap for files larger than this
chunk_size = 4096 # bytes
Thread Configuration
Optimize thread usage for your system:
# settings.toml
[hashreport]
# Thread pool settings
min_workers = 2
batch_size = 1000
max_retries = 3
retry_delay = 1.0
resource_check_interval = 1.0 # seconds
Performance Tuning
Fine-tune performance settings:
# settings.toml
[hashreport.progress]
refresh_rate = 0.1
show_eta = true
show_file_names = true
show_speed = true
Input Validation and Sanitization
hashreport implements robust input validation and sanitization to ensure safe operation.
Path Validation
- Absolute and relative paths are supported
- Path traversal attempts are blocked
- Special characters are properly escaped
- Unicode paths are handled correctly
Pattern Validation
- Glob patterns are validated for syntax
- Regular expressions are checked for validity
- Pattern length is limited to prevent DoS
- Special characters are properly escaped
Size Validation
- File size limits are enforced
- Memory limits are respected
- Buffer sizes are validated
- Chunk sizes are checked
Email Notifications
hashreport can email reports upon completion using SMTP.
Basic Email Setup
hashreport scan /path/to/directory \
--email recipient@example.com \
--smtp-host smtp.example.com \
--smtp-user username \
--smtp-password password
Security Best Practices
-
Use Environment Variables
-
Use App Passwords
- For Gmail, use App Passwords instead of account passwords
-
Generate at: Google Account → Security → 2-Step Verification → App passwords
-
Use TLS/SSL
-
Secure Configuration File
Testing Email Configuration
Test your email settings without processing files:
hashreport scan /path/to/directory \
--email recipient@example.com \
--smtp-host smtp.example.com \
--smtp-user username \
--smtp-password password \
--test-email
Gmail Example
Using Gmail's SMTP server:
hashreport scan /path/to/directory \
--email recipient@gmail.com \
--smtp-host smtp.gmail.com \
--smtp-port 587 \
--smtp-user your.email@gmail.com \
--smtp-password "your-app-password"
Pattern Matching
Glob Patterns
Use glob patterns to match files:
# Match all PDF files
hashreport scan --include "*.pdf" /path/to/directory
# Match specific file types
hashreport scan --include "*.{jpg,png,gif}" /path/to/directory
# Exclude temporary files
hashreport scan --exclude "*.tmp" /path/to/directory
Regular Expressions
Use regular expressions for more complex patterns:
# Match files with date in name
hashreport scan --regex --include ".*\d{8}.*" /path/to/directory
# Match specific file extensions
hashreport scan --regex \
--include ".*\.(jpg|png)$" \
--exclude "thumb.*\.jpg$" \
/path/to/directory
Filelist with Patterns
The filelist command supports the same pattern matching:
# List all PDF files
hashreport filelist --include "*.pdf" /path/to/directory
# List files with regex
hashreport filelist --regex --include ".*\d{8}.*" /path/to/directory
Report Generation
Report Formats
hashreport supports multiple report formats:
# Generate CSV report (default)
hashreport scan -f csv /path/to/directory
# Generate JSON report
hashreport scan -f json /path/to/directory
# Generate both formats
hashreport scan -f csv -f json /path/to/directory
Report Configuration
Configure report generation behavior:
# settings.toml
[hashreport.reports]
include_metadata = true
include_timing = true
max_concurrent_writes = 4
compression = false
Custom Report Names
Control report filenames:
# Custom named reports
hashreport scan /path/to/directory -o custom_report.csv
# Multiple formats with custom names
hashreport scan /path/to/directory \
-o report.csv \
-f csv -f json
Report Comparison
Understanding Changes
The comparison functionality identifies several types of changes:
- Modified: File exists in both reports but has different hash values
- Moved: Same file (identical hash) exists in different locations
- Added: File exists only in the newer report
- Removed: File exists only in the older report
graph LR
A[Old Report] --> D[Comparison Engine]
B[New Report] --> D
D --> E{Change Detection}
E -->|Modified| F[Hash Changed]
E -->|Moved| G[Location Changed]
E -->|Added| H[New File]
E -->|Removed| I[Deleted File]
F --> J[Generate Report]
G --> J
H --> J
I --> J
J --> K[Output CSV]
Output Format
Changes are displayed with bold text for better visibility:
- Change type is displayed in bold
- File paths show the original and new locations
- Details include hash values and change descriptions
- Complete path information is shown in separate columns
Saving Comparisons
Comparison results can be saved to CSV format:
The output filename will be generated automatically using the format:
compare_<old_report>_<new_report>.csv
Using with Version Control
Example workflow for tracking file changes:
# Generate initial report
hashreport scan /path/to/project -o initial_report.csv
# Make changes to files
git add .
git commit -m "Update project files"
# Generate new report
hashreport scan /path/to/project -o new_report.csv
# Compare reports
hashreport compare initial_report.csv new_report.csv
For more information, see: - Basic Usage - Command Reference - Configuration Guide - Troubleshooting