Command documentation sourced from the linux-command project.
wc - Word Count
The wc (word count) command is a fundamental Unix/Linux utility that counts lines, words, characters, and bytes in files or from standard input. It provides essential text analysis capabilities for system administration, development workflows, data processing, and quick file statistics gathering. With support for multiple counting modes, Unicode handling, and efficient processing of large files, wc is an indispensable tool for text file analysis, log monitoring, code metrics, and data validation tasks.
Basic Syntax
wc [OPTIONS] [FILE]...
Common Options
Counting Options
Primary Count Modes
- -c, --bytes - Print only the byte counts
- -m, --chars - Print only the character counts (Unicode-aware)
- -l, --lines - Print only the newline counts
- -w, --words - Print only the word counts
- -L, --max-line-length - Print the length of the longest line
Default Behavior
- (no options) - Print lines, words, and bytes in that order
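Because -l counts newline characters rather than visible lines, a final line with no trailing newline is not included in the count. A minimal sketch of that behavior, assuming a POSIX shell (the variable names are illustrative only):

```shell
# wc -l counts newline characters, so an unterminated last line is not counted.
# tr -d ' ' strips the padding that some wc implementations add.
terminated=$(printf 'one\ntwo\n' | wc -l | tr -d ' ')
unterminated=$(printf 'one\ntwo' | wc -l | tr -d ' ')
echo "with trailing newline: $terminated"
echo "without trailing newline: $unterminated"
```

This is worth keeping in mind when a count looks off by one for files written by tools that omit the final newline.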
Input Options
- --files0-from=F - Read input from the files whose NUL-terminated names are listed in file F; if F is -, read the names from standard input
Output Format Options
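- --total=MODE - Control when the total line is printed (available in recent GNU coreutils releases)
  - auto - Show the total line only when more than one file is given (default)
  - always - Always show the total line, even for a single file
  - only - Show only the total counts
  - never - Never show the total line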
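Where the installed wc does not accept --total, a grand total can still be produced by concatenating the inputs into one stream. A hedged sketch; total_lines is a hypothetical helper name, not part of wc:

```shell
# total_lines FILE... : print the combined newline count of all files.
# wc sees a single concatenated stream, so no --total support is needed.
total_lines() {
    cat -- "$@" | wc -l | tr -d ' '
}

# Demo with two throwaway files
demo_dir=$(mktemp -d)
printf 'a\nb\n' > "$demo_dir/one.txt"
printf 'c\n' > "$demo_dir/two.txt"
total_lines "$demo_dir/one.txt" "$demo_dir/two.txt"
rm -rf "$demo_dir"
```

Here cat is doing real work (joining several files), so it is not the redundant cat that should be avoided for a single file.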
Help and Version
- --help - Display help information
- --version - Output version information
Usage Examples
Basic Counting Operations
Default Output (Lines, Words, Bytes)
# Count all metrics for a single file
wc document.txt
# Output: 150 750 4500 document.txt
# Format: lines words bytes filename
# Count for multiple files with total
wc file1.txt file2.txt file3.txt
# 50 250 1500 file1.txt
# 100 500 3000 file2.txt
# 75 375 2250 file3.txt
# 225 1125 6750 total
# Count all text files in directory
wc *.txt
# Count with a recursive pattern (in bash, enable ** first with: shopt -s globstar)
wc **/*.md
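Scripts often need the three default counts as separate values rather than as one line of text. One possible approach, sketched for a POSIX shell; wc_fields is a hypothetical helper name:

```shell
# wc_fields FILE : print the three default counts as labelled values.
# Reading via redirection keeps the filename out of wc's output.
wc_fields() {
    # Word splitting is intentional: wc prints three whitespace-separated numbers,
    # which become $1, $2 and $3 inside this function.
    set -- $(wc < "$1")
    echo "lines=$1 words=$2 bytes=$3"
}

sample=$(mktemp)
printf 'hello world\nfoo\n' > "$sample"
wc_fields "$sample"
rm -f "$sample"
```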
Individual Count Modes
# Count only lines
wc -l access.log
# Output: 1250 access.log
# Count only words
wc -w essay.txt
# Output: 2500 essay.txt
# Count only bytes
wc -c data.bin
# Output: 1048576 data.bin
# Count only characters (Unicode-aware)
wc -m unicode_file.txt
# Output: 1000 unicode_file.txt
# Find longest line length
wc -L source_code.py
# Output: 120 source_code.py
# Multiple specific counts
wc -l -w document.txt
# Output: 100 500 document.txt
Standard Input Processing
Reading from stdin
# Count lines from command output
ls -la | wc -l
# Output: 25
# Count words from user input
echo "Hello world this is a test" | wc -w
# Output: 6
# Count characters from multiple commands
{ date; who; pwd; } | wc -c
# Process pipeline output
find . -name "*.py" | wc -l
# Count from here document
wc -l <<EOF
line 1
line 2
line 3
EOF
# Output: 3
Process Substitution
# Count from process substitution
wc -l < <(find . -type f)
# Count words in command output
wc -w < <(git log --oneline)
# Compare the line counts of two files (redirect so the filename is not part of the output)
diff <(wc -l < file1.txt) <(wc -l < file2.txt)
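When only a yes/no answer is needed, the comparison can be wrapped in a small function instead of diff. A sketch under the assumption that both arguments are regular, readable files; same_line_count is a hypothetical name:

```shell
# same_line_count A B : succeed when both files have the same newline count.
# Returns 2 if either file cannot be read.
same_line_count() {
    count_a=$(wc -l < "$1") || return 2
    count_b=$(wc -l < "$2") || return 2
    [ "$count_a" -eq "$count_b" ]
}

# Usage: same_line_count file1.txt file2.txt && echo "line counts match"
```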
Advanced File Processing
Null-separated Input
# Process files with null separators
find . -type f -print0 | wc -l --files0-from=-
# Count in files with spaces in names
find . -name "*.txt" -print0 | xargs -0 wc -l
# Safe processing of unusual filenames
printf "file1.txt\0file2.txt\0" | wc --files0-from=/dev/stdin
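To see why NUL-separated names matter, the following sketch builds a directory containing a filename with a space and totals its lines. It assumes GNU wc (for --files0-from) and a find that supports -print0; the directory and file names are made up for the demonstration:

```shell
# Create two small files, one with a space in its name
workdir=$(mktemp -d)
printf 'x\ny\n' > "$workdir/has space.txt"
printf 'z\n' > "$workdir/plain.txt"

# find emits NUL-terminated names; wc reads them from stdin via --files0-from=-
# The last output line is the total, and awk keeps its first field.
total=$(find "$workdir" -type f -print0 | wc -l --files0-from=- | awk 'END {print $1}')
echo "total lines: $total"

rm -rf "$workdir"
```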
Total Control Options
# Always show total (even single file)
wc --total=always file.txt
# Never show total
wc --total=never *.txt
# Only show total summary
wc --total=only *.txt
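# Output: the combined line, word, and byte counts, with no per-file rows
# Print only the total line for a file list (filelist.txt must contain NUL-terminated names)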
wc -l --files0-from=filelist.txt | tail -1
Specialized Counting Scenarios
Text and Document Analysis
Document Statistics
# Get comprehensive document stats
wc -l -w -c -L document.txt
# Output: 50 250 1250 80 document.txt
# Count code lines excluding comments
grep -v '^[[:space:]]*#' script.py | grep -v '^[[:space:]]*$' | wc -l
# Count effective lines (non-empty)
grep . config.txt | wc -l
# Count empty lines
grep -c '^$' data.txt
# Count lines with specific content
grep -c "ERROR" logfile.txt
Multilingual Text Processing
# Unicode character counting
wc -m unicode_text.txt
# Byte counting for encoding analysis
wc -c utf8_file.txt
# Compare byte vs character counts (encoding efficiency)
echo "Byte count: $(wc -c < file.txt), Char count: $(wc -m < file.txt)"
# Handle different line endings
wc -l windows_file.txt # Counts \r\n as one line
# Count words in multilingual text (use a UTF-8 locale so multibyte whitespace is recognized; the locale name varies by system)
LC_ALL=en_US.UTF-8 wc -w multilingual.txt
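The byte-versus-character comparison above can be turned into a quick check for multibyte content. A rough sketch, assuming the current locale matches the file's encoding (in the C locale wc -m simply counts bytes); has_multibyte is a hypothetical helper name:

```shell
# has_multibyte FILE : succeed when the byte count exceeds the character count.
# In a UTF-8 locale this means the file contains multibyte characters.
has_multibyte() {
    byte_count=$(wc -c < "$1")
    char_count=$(wc -m < "$1")
    [ "$byte_count" -gt "$char_count" ]
}

# Usage: has_multibyte notes.txt && echo "contains non-ASCII text"
```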
Development and Code Analysis
Project Statistics
# Count lines in all source files
find . -name "*.py" -exec wc -l {} + | tail -1
# Detailed file-by-file analysis
wc -l src/**/*.py | sort -nr | head -10
# Count files by type and total lines
for ext in py js html css md; do
count=$(find . -name "*.$ext" | wc -l)
lines=$(find . -name "*.$ext" -exec wc -l {} + 2>/dev/null | tail -1 | awk '{print $1}')
echo "$ext: $count files, $lines lines"
done
# Function count in code
grep -c '^[[:space:]]*def\|^[[:space:]]*function' *.js
# Comment density analysis
total_lines=$(wc -l < code.py)
comment_lines=$(grep -c '^[[:space:]]*#' code.py)
echo "Comment density: $((comment_lines * 100 / total_lines))%"
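The comment density calculation divides by the total line count, so it fails on an empty file. A guarded variant might look like the following; comment_density is a hypothetical function name and the comment pattern is the same one used above:

```shell
# comment_density FILE : print the integer percentage of lines that are # comments.
comment_density() {
    total=$(wc -l < "$1")
    # Guard against division by zero for empty files
    if [ "$total" -eq 0 ]; then
        echo 0
        return 0
    fi
    # grep -c still prints 0 when nothing matches
    comments=$(grep -c '^[[:space:]]*#' "$1")
    echo $((comments * 100 / total))
}
```

Integer division truncates, so a file with one comment in three lines reports 33.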
Code Quality Metrics
# Find longest lines (potential style issues)
wc -L *.py | sort -nr | head -5
# Count blank lines (-h suppresses the per-file name prefix so the counts can be summed)
find . -name "*.py" -exec grep -hc '^$' {} + | paste -sd+ - | bc
# Count TODO/FIXME comments
grep -r "TODO\|FIXME" . --include="*.py" | wc -l
# Analyze file sizes
find . -name "*.js" -exec wc -c {} + | sort -nr
System Administration
Log File Analysis
# Count log entries by severity
grep "ERROR" /var/log/syslog | wc -l
grep "WARNING" /var/log/syslog | wc -l
grep "INFO" /var/log/syslog | wc -l
# Monitor log growth in real-time
watch "wc -l /var/log/syslog"
# Count unique IP addresses in access log
awk '{print $1}' access.log | sort | uniq | wc -l
# Log rotation statistics
for log in /var/log/*.log; do
echo "$(basename "$log"): $(wc -l < "$log") lines"
done
# System call statistics
strace -c -p $(pidof process) 2>&1 | tail -1
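The per-severity counts shown earlier in this section can be gathered in one loop so that the levels stay in a single place. A small sketch that uses grep -c rather than grep | wc -l; count_by_severity is a hypothetical name and the three levels are the ones used above:

```shell
# count_by_severity LOGFILE : print one "LEVEL count" line per severity.
count_by_severity() {
    for level in ERROR WARNING INFO; do
        # grep -c prints 0 (and exits non-zero) when nothing matches
        count=$(grep -c "$level" "$1")
        echo "$level $count"
    done
}

# Usage: count_by_severity /var/log/syslog
```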
User and Process Monitoring
# Count logged-in users
who | wc -l
# Count running processes (the result includes the ps header line)
ps aux | wc -l
# Count open files for user
lsof -u username | wc -l
# Network connections count
netstat -an | wc -l
# Count file descriptors
ls /proc/$$/fd | wc -l
Performance Optimization Techniques
Efficient Large File Processing
Memory-Efficient Counting
# Process very large files efficiently
wc -l huge_file.log # wc is already memory-efficient
# Progress monitoring for large files
pv huge_file.txt | wc -l
# Split into chunks and count them (this form counts the chunks sequentially)
split -l 1000000 huge_file.txt chunk_
wc -l chunk_* | tail -1
rm chunk_*
# Use appropriate counting mode
wc -l file.txt # Faster than full wc when only lines needed
wc -c file.txt # Faster than -m for byte counting
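For the split-and-count idea to actually run in parallel, the chunks have to be handed to several wc processes at once. One way to sketch this, assuming an xargs that supports -0 and -P (GNU and BSD both do); parallel_line_count and its optional arguments are hypothetical:

```shell
# parallel_line_count FILE [JOBS] [CHUNK_LINES]
# Split FILE into chunks, count them with up to JOBS concurrent wc processes,
# and print the summed line count.
parallel_line_count() {
    input_file=$1
    job_count=${2:-4}
    chunk_lines=${3:-1000000}
    chunk_dir=$(mktemp -d) || return 1
    split -l "$chunk_lines" "$input_file" "$chunk_dir/chunk_"
    # -n 1 gives each wc one chunk; -P runs up to $job_count of them at once
    find "$chunk_dir" -type f -print0 |
        xargs -0 -n 1 -P "$job_count" wc -l |
        awk '{sum += $1} END {print sum + 0}'
    rm -rf "$chunk_dir"
}
```

Line counting is usually limited by disk throughput, and split itself reads the whole file once, so this is unlikely to beat a plain wc -l unless the chunks already exist.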
Batch Processing Optimization
# Process multiple files efficiently
wc *.txt # Better than individual wc commands
# Parallel processing with xargs
find . -name "*.log" -print0 | xargs -0 -P 4 wc -l
# Recursive globbing (bash: shopt -s globstar); very large sets can exceed the argument-length limit, so prefer find with xargs there
wc **/*.csv 2>/dev/null | tail -1
# Combine counts from different sources
{ wc -l access.log; wc -l error.log; } | awk '{sum+=$1} END {print sum}'
Pipeline Optimization
Reducing Process Overhead
# Avoid unnecessary cat
wc -l file.txt # Good
cat file.txt | wc -l # Bad (extra process)
# Use direct file access
wc -w < file.txt # Efficient
head -1000 file.txt | wc -l # For partial files
# Count every occurrence rather than every matching line
grep -o "pattern" file.txt | wc -l # grep -c counts matching lines; -o with wc -l counts each match
Memory Usage Patterns
# wc reads input in fixed-size blocks rather than loading the whole file (memory efficient)
# Suitable for files larger than available RAM
# Monitor memory usage
/usr/bin/time -v wc -l huge_file.txt
# Count in compressed files without extraction
zcat compressed.gz | wc -l
bzcat file.bz2 | wc -l
Integration with Other Tools
Text Processing Pipelines
Complex Text Analysis
# Extract and count specific patterns
grep -o 'https://[^"]*' webpage.html | sort | uniq | wc -l
# Count words by length
awk '{for(i=1;i<=NF;i++) print length($i)}' text.txt | sort -n | uniq -c
# Count email addresses
grep -oE '\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b' file.txt | wc -l
# Count distinct words (vocabulary size)
tr -s '[:space:]' '\n' < document.txt | sort -u | wc -l
Data Validation and Cleaning
# Count malformed lines (expected_fields=5 is an example value; set it to the real column count)
awk -F, -v expected_fields=5 'NF != expected_fields' data.csv | wc -l
# Validate file completeness
expected_lines=1000
actual_lines=$(wc -l < data.txt)
if [ "$actual_lines" -eq "$expected_lines" ]; then
    echo "File is complete"
else
    echo "Line count differs from expected by $((expected_lines - actual_lines))"
fi
# Count duplicate lines
sort file.txt | uniq -d | wc -l
# Count unique entries
cut -d, -f1 data.csv | sort | uniq | wc -l
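The completeness check earlier in this section can be packaged as a reusable function that distinguishes missing lines from extra ones. A sketch only; check_line_count and its messages are hypothetical:

```shell
# check_line_count FILE EXPECTED
# Prints "complete", "missing N" or "extra N"; returns 0 only when the counts match.
check_line_count() {
    actual=$(wc -l < "$1") || return 2
    if [ "$actual" -eq "$2" ]; then
        echo "complete"
    elif [ "$actual" -lt "$2" ]; then
        echo "missing $(($2 - actual))"
        return 1
    else
        echo "extra $((actual - $2))"
        return 1
    fi
}

# Usage: check_line_count data.txt 1000 || echo "data.txt needs attention"
```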
Shell Scripting Integration
Comprehensive File Analysis Script
#!/bin/bash
# analyze_files.sh - Detailed file statistics
analyze_file() {
    local file="$1"
    if [ ! -f "$file" ]; then
        echo "Error: File $file not found" >&2
        return 1
    fi
    echo "=== Analysis for $file ==="
    echo "Lines: $(wc -l < "$file")"
    echo "Words: $(wc -w < "$file")"
    echo "Bytes: $(wc -c < "$file")"
    echo "Characters: $(wc -m < "$file")"
    echo "Longest line: $(wc -L < "$file")"
    echo "Non-empty lines: $(grep -c . "$file")"
    echo "Empty lines: $(grep -c '^$' "$file")"
    echo "File size: $(du -h "$file" | cut -f1)"
    echo "---"
}
# Analyze multiple files
for file in "$@"; do
analyze_file "$file"
done
# Summary statistics (awk keeps only the count from the last line of wc output)
echo "=== Summary ==="
echo "Total files: $#"
echo "Total lines: $(wc -l "$@" | tail -1 | awk '{print $1}')"
echo "Total bytes: $(wc -c "$@" | tail -1 | awk '{print $1}')"
Project Statistics Generator
#!/bin/bash
# project_stats.sh - Generate comprehensive project statistics
echo "# Project Statistics Report"
echo "Generated on: $(date)"
echo ""
# File type analysis
echo "## File Type Analysis"
for ext in py js html css md json yaml yml; do
    files=$(find . -name "*.$ext" -type f | wc -l)
    if [ "$files" -gt 0 ]; then
        lines=$(find . -name "*.$ext" -type f -exec wc -l {} + 2>/dev/null | tail -1 | awk '{print $1}')
        echo "- $ext: $files files, $lines lines"
    fi
done
echo ""
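echo "## Largest Files"
# Drop the "total" rows that wc appends so only real files are ranked
find . -type f -exec wc -c {} + | grep -v ' total$' | sort -nr | head -10 | \
    awk '{printf "%-15s %s\n", $1, substr($0, index($0,$2))}'
echo ""
echo "## Files by Line Count"
# Group the -name tests and pass NUL-terminated names so unusual filenames are safe
find . \( -name "*.py" -o -name "*.js" -o -name "*.html" \) -print0 | \
    xargs -0 wc -l 2>/dev/null | grep -v ' total$' | sort -nr | head -10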
Troubleshooting and Common Issues
Character Encoding Issues
Unicode vs Byte Counting
# Problem: Different byte and character counts
# Solution: Understand encoding differences
text="café"
printf '%s' "$text" | wc -c # 5 bytes (UTF-8)
printf '%s' "$text" | wc -m # 4 characters (in a UTF-8 locale)
# Check file encoding
file -bi document.txt
# Convert from Latin-1 to UTF-8 before counting characters
iconv -f ISO-8859-1 -t UTF-8 file.txt | wc -m
# Handle Windows line endings (wc -l is unaffected, but each \r adds one to the -c and -m counts)
dos2unix windows_file.txt
wc -c windows_file.txt # Byte count no longer includes the \r characters
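Before converting a file, it can help to know how many lines actually carry a carriage return. A possible check, written for a POSIX shell; count_crlf_lines is a hypothetical helper name:

```shell
# count_crlf_lines FILE : print the number of lines that end in \r\n.
count_crlf_lines() {
    # Command substitution strips trailing newlines only, so the \r survives
    carriage_return=$(printf '\r')
    grep -c "${carriage_return}\$" "$1"
}

# Usage: count_crlf_lines windows_file.txt
```

A result of 0 suggests the file already uses Unix line endings, while a result equal to wc -l suggests every line is CRLF-terminated.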
Binary File Handling
# Counting bytes in binary files
wc -c program_binary # Works fine
wc -m program_binary # May give unexpected results
# Safe character counting for mixed content
if file -bi "$filename" | grep -q "binary"; then
echo "Binary file, using byte count"
wc -c "$filename"
else
echo "Text file, using character count"
wc -m "$filename"
fi
Performance Issues
Large File Processing
# Problem: wc seems slow on huge files
# Solution: Use appropriate options and monitoring
# Count only what you need
wc -l huge_file.txt # Lines only is faster
# Monitor progress
pv huge_file.txt | wc -l
# Check if file is actually being processed
strace -e read wc -l file.txt 2>&1 | grep -c "read("
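# Sparse files: compare the logical size with the space actually allocated
du --apparent-size sparse_file.txt # Apparent (logical) size
du sparse_file.txt # Disk space actually allocated
wc -c sparse_file.txt # Logical byte count, holes included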
Pipeline Bottlenecks
# Problem: Pipeline is slow
# Solution: Identify bottleneck component
# Test each component separately
time cat huge_file.txt > /dev/null
time cat huge_file.txt | wc -l
time wc -l huge_file.txt
# Use faster alternatives
grep -c "pattern" file.txt # Faster than grep | wc -l
awk 'END{print NR}' file.txt # Alternative for line counting
File Access Issues
Permission and File Problems
# Handle permission denied gracefully
find . -name "*.log" -exec wc -l {} + 2>/dev/null | tail -1
# Check file accessibility
if [ -r "$file" ]; then
wc -l "$file"
else
echo "Cannot read $file"
fi
# Process files with special characters
find . -name "*'*" -exec wc -l {} + # Handle single quotes
find . -name '* *' -exec wc -l {} + # Handle spaces
# Use null terminators for safety
find . -type f -print0 | xargs -0 wc -l
Best Practices
Performance Optimization
- Count only what you need: Use -l, -w, or -c instead of full wc when possible
- Avoid unnecessary piping: Use wc file.txt instead of cat file.txt | wc
- Process files directly: Let wc handle file reading for better efficiency
- Use appropriate options: -c for bytes, -m for Unicode characters
- Batch process multiple files: wc *.txt is more efficient than individual calls
- Monitor large operations: Use pv or progress indicators for huge files
- Consider memory constraints: wc is memory-efficient but be aware of system limits
Accuracy and Reliability
- Use -m for Unicode text: Character counting vs. byte counting for international text
- Handle line endings: Understand how different line endings affect line counts
- Validate input: Check file existence and readability before processing
- Test with sample data: Verify counts match expectations before processing large datasets
- Consider encoding: Be aware of character encoding effects on character counts
- Use null separators: Safe processing of filenames with special characters
- Error handling: Gracefully handle permission issues and missing files
Integration and Scripting
- Process substitution: Use wc < <(command) for command output
- Pipeline optimization: Place wc at pipeline end for final counts
- Combine with other tools: Use with grep, awk, find for complex analysis
- Shell redirection: Use wc < file to suppress filename in output
- Variable capture: Store counts in variables for script logic
- Error checking: Verify command success before using results
- Consistent formatting: Use consistent options for comparable results
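Several of these practices (redirection to suppress the filename, variable capture, and error checking) can be combined in a few lines. A minimal sketch, assuming a POSIX shell; line_count is a hypothetical helper name:

```shell
# line_count FILE : print only the number of lines, or fail if FILE is unreadable.
line_count() {
    [ -r "$1" ] || { echo "cannot read: $1" >&2; return 1; }
    # Redirection keeps the filename out of the output; tr strips any padding
    wc -l < "$1" | tr -d ' '
}

# Capture the count and check for success before using it
sample=$(mktemp)
printf 'a\nb\nc\n' > "$sample"
if n=$(line_count "$sample"); then
    echo "sample has $n lines"
else
    echo "count failed" >&2
fi
rm -f "$sample"
```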
Related Commands
- cat - Concatenate and display files
- head - Display first lines of files
- tail - Display last lines of files
- grep - Search for patterns in files
- awk - Pattern scanning and processing
- sed - Stream editor for text processing
- sort - Sort lines in files
- uniq - Remove duplicate lines
- cut - Remove sections from lines
- find - Search for files and directories
- xargs - Build and execute command lines
- du - Estimate file space usage
Performance Tips
Optimization Strategies
- Single metric counting: -l, -w, or -c is faster than full wc
- Direct file access: Avoid the cat file | wc pattern
- Batch processing: Process multiple files in a single wc call
- Appropriate tool selection: Use grep -c instead of grep | wc -l
- Memory efficiency: wc reads input in fixed-size blocks, so it is suitable for huge files
- Unicode awareness: Use -m for character counting with international text
- Pipeline placement: Use wc as the final command in pipelines
Performance Comparison
# Fast: Direct file access
wc -l file.txt
# Slower: Unnecessary cat
cat file.txt | wc -l
# Fast: Counting patterns
grep -c "pattern" file.txt
# Slower: Pipeline for counting
grep "pattern" file.txt | wc -l
# Fast: Batch processing
wc *.txt
# Slower: Individual calls
for f in *.txt; do wc -l "$f"; done
The wc command is a versatile and efficient tool for text analysis and file statistics. Its simple interface masks powerful capabilities for processing files of any size, making it essential for system administration, development workflows, data analysis, and automation tasks. By understanding its various options and optimization techniques, users can leverage wc effectively in countless scenarios requiring text and file analysis.