This command reference is sourced from the linux-command documentation project.

wc - Word Count

The wc (word count) command is a fundamental Unix/Linux utility that counts lines, words, characters, and bytes in files or from standard input. It provides essential text analysis capabilities for system administration, development workflows, data processing, and quick file statistics gathering. With support for multiple counting modes, Unicode handling, and efficient processing of large files, wc is an indispensable tool for text file analysis, log monitoring, code metrics, and data validation tasks.

Basic Syntax

wc [OPTIONS] [FILE]...

Common Options

Counting Options

Primary Count Modes

  • -c, --bytes - Print only the byte counts
  • -m, --chars - Print only the character counts (Unicode-aware)
  • -l, --lines - Print only the newline counts
  • -w, --words - Print only the word counts
  • -L, --max-line-length - Print the length of the longest line

Default Behavior

  • (no options) - Print lines, words, and bytes, in that order (see the example below)
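
A quick sanity check of the default output order, using printf so the arithmetic is easy to verify:

# Two lines, three words, fourteen bytes (newlines included in the byte count)
printf 'one two\nthree\n' | wc
# Output: 2 3 14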

Input Options

  • --files0-from=FILE - Read input from NUL-terminated filenames listed in FILE (use - for standard input)

Output Format Options

  • --total=WHEN - Control when a total line is printed for multiple files
    • auto - Print a total only when more than one file is given (default)
    • never - Never print a total line
    • always - Always print a total line, even for a single file
    • only - Print only the total counts, suppressing per-file lines

Help and Version

  • --help - Display help information
  • --version - Output version information

Usage Examples

Basic Counting Operations

Default Output (Lines, Words, Bytes)

# Count all metrics for a single file
wc document.txt
# Output: 150 750 4500 document.txt
# Format: lines words bytes filename

# Count for multiple files with total
wc file1.txt file2.txt file3.txt
# 50 250 1500 file1.txt
# 100 500 3000 file2.txt
# 75 375 2250 file3.txt
# 225 1125 6750 total

# Count all text files in directory
wc *.txt

# Count with recursive pattern (bash: requires shopt -s globstar)
wc **/*.md

Individual Count Modes

# Count only lines
wc -l access.log
# Output: 1250 access.log

# Count only words
wc -w essay.txt
# Output: 2500 essay.txt

# Count only bytes
wc -c data.bin
# Output: 1048576 data.bin

# Count only characters (Unicode-aware)
wc -m unicode_file.txt
# Output: 1000 unicode_file.txt

# Find longest line length
wc -L source_code.py
# Output: 120 source_code.py

# Multiple specific counts
wc -l -w document.txt
# Output: 100 500 document.txt

Standard Input Processing

Reading from stdin

# Count lines from command output
ls -la | wc -l
# Output: 25

# Count words from user input
echo "Hello world this is a test" | wc -w
# Output: 6

# Count characters from multiple commands
{ date; who; pwd; } | wc -c

# Process pipeline output
find . -name "*.py" | wc -l

# Count from here document
wc -l <<EOF
line 1
line 2
line 3
EOF
# Output: 3

Process Substitution

# Count output lines via process substitution
wc -l < <(find . -type f)

# Count from multiple commands
wc -w < <(git log --oneline)

# Compare line counts of two files
diff <(wc -l file1.txt) <(wc -l file2.txt)

Advanced File Processing

Null-separated Input

# Process files with null separators
find . -type f -print0 | wc -l --files0-from=-

# Count in files with spaces in names
find . -name "*.txt" -print0 | xargs -0 wc -l

# Safe processing of unusual filenames
printf "file1.txt\0file2.txt\0" | wc --files0-from=/dev/stdin

Total Control Options

# Always show total (even single file)
wc --total=always file.txt

# Never show total
wc --total=never *.txt

# Only show total counts (per-file lines and the "total" label are suppressed)
wc --total=only *.txt
# Output: 1000

# Grab just the total line (filenames in filelist.txt must be NUL-terminated)
wc -l --files0-from=filelist.txt | tail -1

Specialized Counting Scenarios

Text and Document Analysis

Document Statistics

# Get comprehensive document stats (lines, words, bytes, longest line)
wc -l -w -c -L document.txt
# Output: 50 250 1250 80 document.txt

# Count code lines excluding comments
grep -v '^[[:space:]]*#' script.py | grep -v '^[[:space:]]*$' | wc -l

# Count effective lines (non-empty)
grep . config.txt | wc -l

# Count empty lines
grep -c '^$' data.txt

# Count lines with specific content
grep -c "ERROR" logfile.txt

Multilingual Text Processing

# Unicode character counting
wc -m unicode_text.txt

# Byte counting for encoding analysis
wc -c utf8_file.txt

# Compare byte vs character counts (encoding efficiency)
echo "Byte count: $(wc -c < file.txt), Char count: $(wc -m < file.txt)"

# Handle different line endings
wc -l windows_file.txt # Counts \r\n as one line

# Count words in multilingual text (word splitting is locale-dependent,
# so use a UTF-8 locale, e.g.:)
LC_ALL=en_US.UTF-8 wc -w multilingual.txt

Development and Code Analysis

Project Statistics

# Count lines in all source files (note: with very many files, find may
# invoke wc more than once, producing multiple "total" lines)
find . -name "*.py" -exec wc -l {} + | tail -1

# Detailed file-by-file analysis (bash: requires shopt -s globstar)
wc -l src/**/*.py | sort -nr | head -10

# Count files by type and total lines
for ext in py js html css md; do
    count=$(find . -name "*.$ext" | wc -l)
    lines=$(find . -name "*.$ext" -exec wc -l {} + 2>/dev/null | tail -1 | awk '{print $1}')
    echo "$ext: $count files, $lines lines"
done

# Function count in code (Python def / JavaScript function)
grep -c '^[[:space:]]*def\|^[[:space:]]*function' *.py *.js

# Comment density analysis
total_lines=$(wc -l < code.py)
comment_lines=$(grep -c '^[[:space:]]*#' code.py)
echo "Comment density: $((comment_lines * 100 / total_lines))%"

Code Quality Metrics

# Find longest lines (potential style issues)
wc -L *.py | sort -nr | head -5

# Count blank lines across files (-h suppresses filename prefixes so bc can sum)
find . -name "*.py" -exec grep -hc '^$' {} + | paste -sd+ | bc

# Count TODO/FIXME comments
grep -r "TODO\|FIXME" . --include="*.py" | wc -l

# Analyze file sizes
find . -name "*.js" -exec wc -c {} + | sort -nr

System Administration

Log File Analysis

# Count log entries by severity
grep "ERROR" /var/log/syslog | wc -l
grep "WARNING" /var/log/syslog | wc -l
grep "INFO" /var/log/syslog | wc -l

# Monitor log growth in real-time
watch "wc -l /var/log/syslog"

# Count unique IP addresses in access log
awk '{print $1}' access.log | sort | uniq | wc -l

# Log rotation statistics
for log in /var/log/*.log; do
    echo "$(basename "$log"): $(wc -l < "$log") lines"
done

# System call statistics for a running process ("myprocess" is a placeholder)
strace -c -p "$(pidof myprocess)" 2>&1 | tail -1

User and Process Monitoring

# Count logged-in users
who | wc -l

# Count running processes (output includes a header line)
ps aux | wc -l

# Count open files for user
lsof -u username | wc -l

# Network connections count
netstat -an | wc -l

# Count file descriptors
ls /proc/$$/fd | wc -l

Performance Optimization Techniques

Efficient Large File Processing

Memory-Efficient Counting

# Process very large files efficiently
wc -l huge_file.log # wc is already memory-efficient

# Progress monitoring for large files
pv huge_file.txt | wc -l

# Split a massive file into chunks and sum their counts
# (note: wc is usually I/O-bound, so splitting rarely speeds it up)
split -l 1000000 huge_file.txt chunk_
wc -l chunk_* | tail -1
rm chunk_*

# Use appropriate counting mode
wc -l file.txt # Faster than full wc when only lines needed
wc -c file.txt # Faster than -m for byte counting

Batch Processing Optimization

# Process multiple files efficiently
wc *.txt # Better than individual wc commands

# Parallel processing with xargs
find . -name "*.log" -print0 | xargs -0 -P 4 wc -l

# Use shell globbing for large sets
wc **/*.csv 2>/dev/null | tail -1

# Combine counts from different sources
{ wc -l access.log; wc -l error.log; } | awk '{sum+=$1} END {print sum}'

Pipeline Optimization

Reducing Process Overhead

# Avoid unnecessary cat
wc -l file.txt # Good
cat file.txt | wc -l # Bad (extra process)

# Use direct file access
wc -w < file.txt # Efficient
head -1000 file.txt | wc -l # For partial files

# Count every match, not just matching lines (for matching lines, grep -c is faster)
grep -o "pattern" file.txt | wc -l

Memory Usage Patterns

# wc processes files line by line (memory efficient)
# Suitable for files larger than available RAM

# Monitor memory usage
/usr/bin/time -v wc -l huge_file.txt

# Count in compressed files without extraction
zcat compressed.gz | wc -l
bzcat file.bz2 | wc -l

Integration with Other Tools

Text Processing Pipelines

Complex Text Analysis

# Extract and count specific patterns
grep -o 'https://[^"]*' webpage.html | sort | uniq | wc -l

# Count words by length
awk '{for(i=1;i<=NF;i++) print length($i)}' text.txt | sort -n | uniq -c

# Count email addresses
grep -oE '\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b' file.txt | wc -l

# Count distinct words (drop the final wc -l to see the frequency table)
tr -s '[:space:]' '\n' < document.txt | sort | uniq -c | sort -nr | wc -l

Data Validation and Cleaning

# Count malformed lines (example: expecting 5 comma-separated fields)
awk -F, -v expected=5 'NF != expected' data.csv | wc -l

# Validate file completeness
expected_lines=1000
actual_lines=$(wc -l < data.txt)
if [ "$actual_lines" -eq "$expected_lines" ]; then
    echo "File is complete"
else
    echo "Expected $expected_lines lines, found $actual_lines"
fi

# Count duplicate lines
sort file.txt | uniq -d | wc -l

# Count unique entries
cut -d, -f1 data.csv | sort | uniq | wc -l

Shell Scripting Integration

Comprehensive File Analysis Script

#!/bin/bash
# analyze_files.sh - Detailed file statistics

analyze_file() {
    local file="$1"
    if [ ! -f "$file" ]; then
        echo "Error: File $file not found"
        return 1
    fi

    echo "=== Analysis for $file ==="
    echo "Lines: $(wc -l < "$file")"
    echo "Words: $(wc -w < "$file")"
    echo "Bytes: $(wc -c < "$file")"
    echo "Characters: $(wc -m < "$file")"
    echo "Longest line: $(wc -L < "$file")"
    echo "Non-empty lines: $(grep -c . "$file")"
    echo "Empty lines: $(grep -c '^$' "$file")"
    echo "File size: $(du -h "$file" | cut -f1)"
    echo "---"
}

# Analyze multiple files
for file in "$@"; do
    analyze_file "$file"
done

# Summary statistics
echo "=== Summary ==="
echo "Total files: $#"
echo "Total lines: $(wc -l "$@" | tail -1)"
echo "Total bytes: $(wc -c "$@" | tail -1)"

Project Statistics Generator

#!/bin/bash
# project_stats.sh - Generate comprehensive project statistics

echo "# Project Statistics Report"
echo "Generated on: $(date)"
echo ""

# File type analysis
echo "## File Type Analysis"
for ext in py js html css md json yaml yml; do
    files=$(find . -name "*.$ext" -type f | wc -l)
    if [ "$files" -gt 0 ]; then
        lines=$(find . -name "*.$ext" -type f -exec wc -l {} + 2>/dev/null | tail -1 | awk '{print $1}')
        echo "- $ext: $files files, $lines lines"
    fi
done

echo ""
echo "## Largest Files"
find . -type f -exec wc -c {} + | sort -nr | head -10 | \
    awk '{printf "%-15s %s\n", $1, substr($0, index($0,$2))}'

echo ""
echo "## Files by Line Count"
find . \( -name "*.py" -o -name "*.js" -o -name "*.html" \) -print0 | \
    xargs -0 wc -l 2>/dev/null | sort -nr | head -10

Troubleshooting and Common Issues

Character Encoding Issues

Unicode vs Byte Counting

# Problem: Different byte and character counts
# Solution: Understand encoding differences
text="café"
echo -n "$text" | wc -c # 5 bytes (é occupies 2 bytes in UTF-8)
echo -n "$text" | wc -m # 4 characters

# Check file encoding
file -bi document.txt

# Force specific encoding
iconv -f latin1 -t utf8 file.txt | wc -m

# Handle Windows line endings
dos2unix windows_file.txt
wc -l windows_file.txt # Now counts correctly

Binary File Handling

# Counting bytes in binary files
wc -c program_binary # Works fine
wc -m program_binary # May give unexpected results

# Safe character counting for mixed content
if file -bi "$filename" | grep -q "binary"; then
    echo "Binary file, using byte count"
    wc -c "$filename"
else
    echo "Text file, using character count"
    wc -m "$filename"
fi

Performance Issues

Large File Processing

# Problem: wc seems slow on huge files
# Solution: Use appropriate options and monitoring

# Count only what you need
wc -l huge_file.txt # Lines only is faster

# Monitor progress
pv huge_file.txt | wc -l

# Check if file is actually being processed
strace -e read wc -l file.txt 2>&1 | grep -c "read("

# Handle sparse files efficiently
du --apparent-size sparse_file.txt # See apparent size
wc -c sparse_file.txt # Count actual bytes

Pipeline Bottlenecks

# Problem: Pipeline is slow
# Solution: Identify bottleneck component

# Test each component separately
time cat huge_file.txt > /dev/null
time cat huge_file.txt | wc -l
time wc -l huge_file.txt

# Use faster alternatives
grep -c "pattern" file.txt # Faster than grep | wc -l
awk 'END{print NR}' file.txt # Alternative for line counting

File Access Issues

Permission and File Problems

# Handle permission denied gracefully
find . -name "*.log" -exec wc -l {} + 2>/dev/null | tail -1

# Check file accessibility
if [ -r "$file" ]; then
    wc -l "$file"
else
    echo "Cannot read $file"
fi

# Process files with special characters in their names
find . -name "*'*" -exec wc -l {} + # Handle single quotes
find . -name '* *' -exec wc -l {} + # Handle spaces

# Use null terminators for safety
find . -type f -print0 | xargs -0 wc -l

Best Practices

Performance Optimization

  1. Count only what you need: Use -l, -w, or -c instead of full wc when possible
  2. Avoid unnecessary piping: Use wc file.txt instead of cat file.txt | wc
  3. Process files directly: Let wc handle file reading for better efficiency
  4. Use appropriate options: -c for bytes, -m for Unicode characters
  5. Batch process multiple files: wc *.txt is more efficient than individual calls
  6. Monitor large operations: Use pv or progress indicators for huge files
  7. Consider memory constraints: wc is memory-efficient but be aware of system limits

Accuracy and Reliability

  1. Use -m for Unicode text: Character counting vs. byte counting for international text
  2. Handle line endings: Understand how different line endings affect line counts (both points are demonstrated after this list)
  3. Validate input: Check file existence and readability before processing
  4. Test with sample data: Verify counts match expectations before processing large datasets
  5. Consider encoding: Be aware of character encoding effects on character counts
  6. Use null separators: Safe processing of filenames with special characters
  7. Error handling: Gracefully handle permission issues and missing files
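
Points 1 and 2 in a short, verifiable form (wc -m needs a UTF-8 locale to count multibyte characters correctly):

# wc -l counts \n characters, so CRLF-terminated lines still count once each
printf 'a\r\nb\r\n' | wc -l # Output: 2

# Bytes vs. characters for non-ASCII text
printf 'héllo' | wc -c # Output: 6 (é occupies 2 bytes in UTF-8)
printf 'héllo' | wc -m # Output: 5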

Integration and Scripting

  1. Process substitution: Use wc < <(command) for command output
  2. Pipeline optimization: Place wc at pipeline end for final counts
  3. Combine with other tools: Use with grep, awk, find for complex analysis
  4. Shell redirection: Use wc < file to suppress filename in output
  5. Variable capture: Store counts in variables for script logic
  6. Error checking: Verify command success before using results (see the sketch after this list)
  7. Consistent formatting: Use consistent options for comparable results
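
A minimal sketch of points 5 and 6 (data.txt is a placeholder filename):

# Capture a count into a variable, but only use it if the count succeeded
file="data.txt"
if [ -r "$file" ] && lines=$(wc -l < "$file"); then
    echo "Processing $lines lines from $file"
else
    echo "Could not count lines in $file" >&2
fi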

Related Commands

  • cat - Concatenate and display files
  • head - Display first lines of files
  • tail - Display last lines of files
  • grep - Search for patterns in files
  • awk - Pattern scanning and processing
  • sed - Stream editor for text processing
  • sort - Sort lines in files
  • uniq - Remove duplicate lines
  • cut - Remove sections from lines
  • find - Search for files and directories
  • xargs - Build and execute command lines
  • du - Estimate file space usage

Performance Tips

Optimization Strategies

  1. Single metric counting: -l, -w, or -c is faster than full wc
  2. Direct file access: Avoid cat file | wc pattern
  3. Batch processing: Process multiple files in single wc call
  4. Appropriate tool selection: Use grep -c instead of grep | wc -l
  5. Memory efficiency: wc processes line by line, suitable for huge files
  6. Unicode awareness: Use -m for character counting with international text
  7. Pipeline placement: Use wc as final command in pipelines

Performance Comparison

# Fast: Direct file access
wc -l file.txt

# Slower: Unnecessary cat
cat file.txt | wc -l

# Fast: Counting patterns
grep -c "pattern" file.txt

# Slower: Pipeline for counting
grep "pattern" file.txt | wc -l

# Fast: Batch processing
wc *.txt

# Slower: Individual calls
for f in *.txt; do wc -l "$f"; done

The wc command is a versatile and efficient tool for text analysis and file statistics. Its simple interface masks powerful capabilities for processing files of any size, making it essential for system administration, development workflows, data analysis, and automation tasks. By understanding its various options and optimization techniques, users can leverage wc effectively in countless scenarios requiring text and file analysis.