Command documentation sourced from the linux-command project.
bzcmp - Compare bzip2 compressed files
The bzcmp command compares bzip2-compressed files without requiring you to decompress them by hand first. It is a wrapper around the standard cmp command: it decompresses each bzip2 input and passes the uncompressed data to cmp for a byte-by-byte comparison. The decompression still takes place, but bzcmp handles it for you and does not leave decompressed copies behind. This is useful for compressed log files, backups, and archives when you need to check for differences or verify integrity quickly.
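The wrapper behaviour described above can be sketched in a few lines of shell. This is a minimal illustration, not the real bzcmp script: the function name bzcmp_sketch is hypothetical, it assumes both inputs are bzip2-compressed, and it ignores cmp options.

```shell
# Hypothetical sketch of what a bzcmp-style wrapper does for two
# compressed inputs. Not the actual bzcmp implementation.
bzcmp_sketch() {
    # Decompress the second file to a temporary file.
    tmp=$(mktemp) || return 2
    bzip2 -dc -- "$2" > "$tmp" || { rm -f "$tmp"; return 2; }

    # Stream the first file into cmp and compare against the temp file.
    bzip2 -dc -- "$1" | cmp - "$tmp"
    status=$?            # exit status of cmp (last command in the pipeline)

    rm -f "$tmp"
    return "$status"
}
```

Because cmp only ever sees standard input and a temporary file, any message it prints names those rather than the original files.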
Basic Syntax
bzcmp [cmp_options] file1 [file2]
bzdiff [diff_options] file1 [file2]
Common Options
Basic Comparison Options
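- -l, --verbose - Print the byte number (decimal) and the differing byte values (octal) for every difference
- -s, --quiet - Silent mode; report the result only through the exit status
- -x - BSD cmp only: like -l, but with hexadecimal output. GNU cmp does not accept it. Without -l, cmp already stops at the first difference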
Output Format Options
- -b, --print-bytes - Print the differing bytes
- -h - BSD cmp only: do not follow symbolic links. GNU cmp does not accept it
Skip and Ignore Options
- -i num1[:num2], --ignore-initial=num1[:num2] - Skip the first num1 bytes of the first input and the first num2 bytes of the second (num1 for both if num2 is omitted)
- -n num, --bytes=num - Compare at most num bytes
Usage Examples
Basic File Comparisons
Simple Comparisons
# Compare two bzip2 compressed files
bzcmp document1.txt.bz2 document2.txt.bz2
# Compare a compressed file with an uncompressed file
bzcmp document.txt.bz2 document.txt
# Compare a compressed file with its uncompressed version
bzcmp config.json.bz2 config.json
# Silent comparison (only exit status)
bzcmp -s backup1.tar.bz2 backup2.tar.bz2
Single File Comparisons
# Compare file with its compressed version
bzcmp log.txt
# This compares log.txt with log.txt.bz2
# Roughly equivalent to: bzcat log.txt.bz2 | cmp - log.txt
Detailed Comparison Analysis
Verbose Output
# Show detailed byte-by-byte differences
bzcmp -l old_config.conf.bz2 new_config.conf.bz2
# Print differing byte values
bzcmp -lb data1.csv.bz2 data2.csv.bz2
# Stop at the first difference (cmp's default when -l is not given)
bzcmp file1.bz2 file2.bz2
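The -l output lists one difference per line: a decimal byte offset followed by the two byte values in octal. A small filter can turn those octal values into decimal for easier reading. The helper name decode_cmp_l is hypothetical and the function assumes plain -l output (without -b).

```shell
# Hypothetical helper: read "offset octal octal" lines, as printed by
# cmp -l, and print the byte values in decimal instead.
decode_cmp_l() {
    while read -r offset left right; do
        # A leading 0 makes shell arithmetic parse the value as octal.
        printf '%s %d %d\n' "$offset" "$((0$left))" "$((0$right))"
    done
}

# Example use:
#   bzcmp -l old_config.conf.bz2 new_config.conf.bz2 | decode_cmp_l
```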
Limited Comparisons
# Compare only first 1024 bytes
bzcmp -n 1024 header1.txt.bz2 header2.txt.bz2
# Skip first 100 bytes from both files
bzcmp -i 100 file1.log.bz2 file2.log.bz2
# Skip different amounts from each file
bzcmp -i 50:75 backup1.tar.bz2 backup2.tar.bz2
Practical Examples
System Administration
Log File Analysis
# Compare rotated log files
bzcmp /var/log/application.log.1.bz2 /var/log/application.log.2.bz2
# Compare today's log with yesterday's compressed log
bzcmp -l /var/log/error.log /var/log/error.log.1.bz2
# Quick integrity check of backup logs
bzcmp -s system_backup.log.bz2 system_backup_current.log.bz2 && echo "Logs match" || echo "Logs differ"
Configuration File Management
# Compare current config with compressed backup
bzcmp /etc/ssh/sshd_config /backups/sshd_config.20231201.bz2
# Batch compare multiple config backups
for backup in /backups/nginx/*.bz2; do
    echo "Comparing with $backup"
    bzcmp /etc/nginx/nginx.conf "$backup"
done
# Verify configuration consistency across servers
bzcmp -s server1_nginx.conf.bz2 server2_nginx.conf.bz2
Backup Verification
# Verify database backup integrity
bzcmp -l database_dump.sql.bz2 database_dump_current.sql
# Compare incremental backups
bzcmp -i 1024 backup_incremental1.sql.bz2 backup_incremental2.sql.bz2
# Quick validation of compressed backups
bzcmp -s full_backup.tar.bz2 /reference/full_backup.tar.bz2
Development Workflow
Source Code Comparison
# Compare compressed source code archives
bzcmp -l project_v1.0.tar.bz2 project_v1.1.tar.bz2
# Compare with previous version
bzcmp current_source.tar.bz2 previous_source.tar.bz2
# Verify build artifacts
bzcmp -s build_artifacts.tar.bz2 expected_artifacts.tar.bz2
Testing and Quality Assurance
# Compare test results with baseline
bzcmp test_results.log.bz2 baseline_results.log.bz2
# Validate data processing outputs
bzcmp -i 512 output_data.csv.bz2 expected_output.csv.bz2
# Regression testing
bzcmp -l regression_test.log.bz2 baseline_test.log.bz2
Data Processing
Large File Operations
# Compare only the first 1 MiB of two large datasets
bzcmp -n 1048576 large_dataset1.csv.bz2 large_dataset2.csv.bz2
# Skip headers and compare data
bzcmp -i 100:100 financial_data1.csv.bz2 financial_data2.csv.bz2
# Verify data migration
bzcmp -s migrated_data.tar.bz2 source_data.tar.bz2
Archive Management
# Compare archive contents
bzcmp archive_2023Q1.tar.bz2 archive_2023Q2.tar.bz2
# Verify archive integrity
bzcmp -l backup.tar.bz2 backup_verification.tar.bz2
# Check for changes in log archives
bzcmp /logs/archives/january.log.bz2 /logs/archives/february.log.bz2
Advanced Usage
Script Integration
Automated Comparison Scripts
#!/bin/bash
# Automated log comparison script
LOG_DIR="/var/log"
ARCHIVE_DIR="/archives/logs"
THRESHOLD=100   # warn when more than this many bytes differ

compare_logs() {
    local current="$1"
    local previous="$2"
    local count

    if bzcmp -s "$current" "$previous"; then
        echo "Logs $current and $previous are identical"
        return 0
    else
        echo "Logs $current and $previous differ"
        # Count differing bytes (-l prints one line per differing byte)
        count=$(bzcmp -l "$current" "$previous" 2>/dev/null | wc -l)
        echo "Differing bytes: $count"
        if [ "$count" -gt "$THRESHOLD" ]; then
            echo "Warning: more than $THRESHOLD bytes differ"
        fi
        return 1
    fi
}

# Compare today's log with yesterday's
today_log="$LOG_DIR/application.log"
yesterday_log="$ARCHIVE_DIR/application.log.1.bz2"
if [ -f "$today_log" ] && [ -f "$yesterday_log" ]; then
    compare_logs "$today_log" "$yesterday_log"
fi
Backup Integrity Checker
#!/bin/bash
# Backup integrity verification script
BACKUP_DIR="/backups"
REFERENCE_DIR="/reference/backups"

check_backup() {
    local backup="$1"
    local reference="$2"
    local backup_name
    backup_name=$(basename "$backup")

    echo "Checking $backup_name..."
    if bzcmp -s "$backup" "$reference"; then
        echo "✓ $backup_name matches reference"
        return 0
    else
        echo "✗ $backup_name differs from reference"
        # Show first few differences
        bzcmp -l -n 1000 "$backup" "$reference" | head -20
        return 1
    fi
}

# Check all backup files
for backup in "$BACKUP_DIR"/*.bz2; do
    if [ -f "$backup" ]; then
        base_name=$(basename "$backup" .bz2)
        reference="$REFERENCE_DIR/$base_name"
        if [ -f "$reference" ]; then
            check_backup "$backup" "$reference"
        else
            echo "⚠ No reference found for $base_name"
        fi
    fi
done
Batch Processing
Multiple File Comparison
# Compare all compressed logs in a directory
for file in /logs/*.log.bz2; do
    echo "Comparing $file with uncompressed version"
    uncompressed="${file%.bz2}"
    if [ -f "$uncompressed" ]; then
        bzcmp -s "$file" "$uncompressed" && echo "Match" || echo "Different"
    fi
done
# Find differences in archive series
for i in {1..10}; do
    if [ -f "backup_$i.tar.bz2" ] && [ -f "backup_$((i+1)).tar.bz2" ]; then
        echo "Comparing backup_$i with backup_$((i+1))"
        bzcmp -l "backup_$i.tar.bz2" "backup_$((i+1)).tar.bz2"
    fi
done
Performance Monitoring
# Monitor comparison performance for large files
start_time=$(date +%s.%N)
bzcmp -l large_file1.bz2 large_file2.bz2 > /dev/null
end_time=$(date +%s.%N)
runtime=$(echo "$end_time - $start_time" | bc)
echo "Comparison completed in $runtime seconds"
# Memory usage during comparison
/usr/bin/time -v bzcmp -l file1.bz2 file2.bz2
Integration with Other Tools
Working with Pipelines
# Compare decompressed output with another file
bzcat log_file.bz2 | cmp - current_log.txt
# Use bzcmp output in scripts
if bzcmp -s file1.bz2 file2.bz2; then
    echo "Files are identical"
else
    echo "Files differ - showing differences:"
    bzcmp -l file1.bz2 file2.bz2 | head -20
fi
# Same comparison written with bzcat and process substitution (bash)
bzcat file1.bz2 | cmp - <(bzcat file2.bz2)
Combining with Text Processing
# Count differences between files
difference_count=$(bzcmp -l file1.bz2 file2.bz2 | wc -l)
echo "Found $difference_count differing bytes"
# Show differences in context (bzdiff passes -u through to diff)
bzdiff -u file1.bz2 file2.bz2
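Counting lines of -l output gives the number of differing bytes, but it is often just as useful to know where the differences start and end. The following filter is a sketch of that idea; the name summarize_cmp_l is hypothetical and it assumes the offset is the first field of each line, as in cmp -l output.

```shell
# Hypothetical helper: summarize cmp -l output read from standard input.
# Each input line starts with the decimal offset of a differing byte.
summarize_cmp_l() {
    awk '
        NR == 1 { first = $1 }
        { last = $1 }
        END {
            if (NR == 0) {
                print "no differing bytes"
            } else {
                printf "%d differing bytes between offsets %d and %d\n", NR, first, last
            }
        }'
}

# Example use:
#   bzcmp -l file1.bz2 file2.bz2 | summarize_cmp_l
```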
Troubleshooting
Common Issues
File Access Problems
# Permission denied errors
sudo bzcmp protected_file.bz2 another_file.bz2
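# Symbolic links: bzcmp reads through them, so check where a link points if results look wrong
# (-h is a BSD cmp option and is rejected by GNU cmp)
ls -l symlink_file.bz2
bzcmp symlink_file.bz2 target_file.bz2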
# Check if files are actually bzip2 compressed
file document.txt.bz2
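When the file utility is not available, or a script needs a yes/no answer, the same check can be approximated by looking at the first three bytes, which are "BZh" in a bzip2 stream. This is a sketch only: the function name is_bzip2 is hypothetical, and it checks the magic bytes without validating the rest of the file.

```shell
# Hypothetical helper: succeed if the file starts with the bzip2 magic
# bytes "BZh" (hex 42 5a 68). Does not validate the rest of the stream.
is_bzip2() {
    magic=$(dd if="$1" bs=3 count=1 2>/dev/null | od -An -tx1 | tr -d '[:space:]')
    [ "$magic" = "425a68" ]
}

# Example use:
#   is_bzip2 document.txt.bz2 && echo "looks like bzip2"
```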
Memory Issues with Large Files
# Compare only portions of large files
bzcmp -n 1048576 huge_file1.bz2 huge_file2.bz2 # First 1MB only
# Use incremental comparison
bzcmp -i 1048576:1048576 huge_file1.bz2 huge_file2.bz2 # Skip first 1MB
# Monitor memory usage
/usr/bin/time -v bzcmp file1.bz2 file2.bz2
Output Interpretation
# Understanding exit codes
bzcmp file1.bz2 file2.bz2
echo "Exit code: $?" # 0 = identical, 1 = different, 2 = error
# Show detailed comparison
bzcmp -l file1.bz2 file2.bz2 | head -10
# Silent comparison for scripting
bzcmp -s file1.bz2 file2.bz2 && echo "Match" || echo "No match"
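The && / || one-liner above prints "No match" both when the files differ (exit status 1) and when the comparison failed (exit status 2, for example a missing file). A script that needs to tell these apart can branch on the status explicitly. The helper name describe_cmp_status is hypothetical.

```shell
# Hypothetical helper: turn a cmp/bzcmp exit status into a message.
describe_cmp_status() {
    case "$1" in
        0) echo "identical" ;;
        1) echo "different" ;;
        *) echo "error (exit status $1)" ;;
    esac
}

# Example use:
#   bzcmp -s file1.bz2 file2.bz2
#   describe_cmp_status "$?"
```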
Related Commands
- cmp - Compare two files byte by byte
- diff - Compare files line by line
- bzdiff - Compare bzip2 files using diff
- bzcat - Decompress bzip2 files to stdout
- bzless - View bzip2 compressed files
- bzip2 - Compress/decompress files using bzip2
- bunzip2 - Decompress bzip2 files
Best Practices
- Use silent mode (-s) in scripts to only check if files differ
- Limit comparison scope with -n for large files to improve performance
- Skip headers with -i when comparing structured data files
- Check exit codes to handle comparison results programmatically
- Use verbose mode (-l) sparingly for large files due to output volume
- Verify file types before comparing to avoid unexpected results
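- Avoid -h and -x in portable scripts: they are BSD cmp options and GNU cmp rejects them
- Make sure the temporary directory has enough free space when comparing very large compressed files, since bzcmp may use temporary files for decompressed data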
- Combine with other tools for comprehensive file analysis
- Document comparison criteria when using in automated workflows
Performance Tips
- Single file comparison is efficient for checking against compressed versions
- Byte-limited comparison (-n) can shorten large file checks, because cmp stops reading after N bytes
- Skip initial bytes (-i) when comparing files with headers
- Use exit status instead of verbose output for automated checks
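- Batch loops save typing, but each comparison still decompresses its inputs, so total runtime grows with the amount of data
- cmp streams its input, so memory use stays small regardless of file size; the main costs are CPU time for decompression and temporary disk space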
- Consider file sizes before comparison to estimate processing time
- Use appropriate comparison levels based on your accuracy needs
The bzcmp command is a convenient tool for comparing bzip2-compressed files without decompressing them by hand. Because it passes its options through to cmp, it behaves like cmp and is easy to adopt. The data is still decompressed during the comparison, but bzcmp handles that step and cleans up afterwards.