Mastering Text Processing with Grep, Sed, Awk, Cut, and Sort

Text processing is an essential skill for anyone working with data, scripts, or system administration. Linux provides a suite of powerful command-line tools that make it easy to search, modify, and manipulate text. Whether you’re working with log files, configuration files, or data sets, mastering tools like grep, sed, awk, cut, and sort can save you time and improve efficiency.

In this tutorial, we will walk through real-life text processing with these tools, searching, extracting, filtering, and sorting, with mini examples for each.


1. Grep: Searching Text Patterns

grep is used to search files for lines matching a specific pattern. It is very helpful for finding information in log files or any other text files.

Basic Syntax

grep [options] PATTERN [file...]

Common Use Cases

1. Simple String Search

grep "error" logfile.txt

This command prints every line of logfile.txt that contains the string “error”.

2. Case-Insensitive Search

grep -i "warning" logfile.txt

The -i flag makes the search case-insensitive, so it matches “Warning”, “WARNING”, and “warning” alike.

3. Search with Regular Expressions

grep -E "ERROR|WARN" logfile.txt

As the title says, the -E flag enables extended regular expressions. Here the alternation ERROR|WARN matches lines containing either pattern, which is handy when you need to find errors and warnings in one pass.

4. Search Recursively

grep -r "TODO" /path/to/project

This command searches recursively for the keyword TODO in every file under the given directory. A real-life example: finding every occurrence of a single word across an entire source tree.
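As a quick check, the flags above can be tried on a tiny throwaway log (the file name and its contents below are made up for this demo):

```shell
# Create a small sample log (contents are hypothetical).
printf 'ERROR disk full\nINFO service started\nWARN low memory\n' > sample.log

# -E: match lines containing either ERROR or WARN.
grep -E "ERROR|WARN" sample.log
# ERROR disk full
# WARN low memory

# -i: case-insensitive, so "warn" also matches "WARN".
grep -i "warn" sample.log
# WARN low memory
```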


2. Sed: Stream Editor for Modifying Text

sed (short for stream editor) is a non-interactive command for editing and modifying text in a file or a stream. It lets developers make small, scripted edits without opening an editor.

Basic Syntax

sed [options] 'COMMAND' [file...]

Common Use Cases

1. Replace Text

sed 's/foo/bar/g' file.txt

The s/foo/bar/g command replaces all instances of “foo” with “bar” in file.txt.

2. Delete Specific Lines

sed '3d' file.txt

This command deletes the third line from file.txt.

3. Insert Text After a Pattern

sed '/pattern/a\This is new text' file.txt

This command appends the line “This is new text” after every line that matches “pattern”.

4. In-place File Editing

sed -i 's/old/new/g' file.txt

The -i option edits the file in place: instead of printing to standard output, sed writes the changes directly back to the original file.
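A quick sketch of the substitution and line-deletion commands on a throwaway file (the file name and contents are invented for the demo):

```shell
# Sample input for the demo.
printf 'foo one\nfoo two\nfoo three\n' > demo.txt

# Replace every "foo" with "bar".
sed 's/foo/bar/g' demo.txt
# bar one
# bar two
# bar three

# Delete the third line.
sed '3d' demo.txt
# foo one
# foo two
```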


3. Awk: A Pattern-Scanning and Processing Language

awk is a powerful programming language for pattern scanning and processing. It can filter or transform data from various sources based on conditions, and it is especially good at extracting columns of data.

Basic Syntax

awk 'PROGRAM' [file...]

Common Use Cases

1. Print Specific Columns

awk '{print $1, $3}' file.txt

This command prints the first and third columns of every line in file.txt.

2. Filter by Condition

awk '$3 > 100' data.txt

This command prints only the lines where the value in the third column is greater than 100.

3. Field Separator

awk -F, '{print $2}' data.csv

The -F option sets the field separator to a comma (,), which is useful for processing CSV files.

4. Mathematical Operations

awk '{sum += $2} END {print sum}' data.txt

This script sums the values in the second column of data.txt and prints the total at the end.
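The filtering and summing commands above can be sketched on a small sample file (contents are invented for illustration):

```shell
# Two-column sample data: a label and a number.
printf 'a 10\nb 20\nc 30\n' > data.txt

# Print lines whose second column exceeds 15.
awk '$2 > 15' data.txt
# b 20
# c 30

# Sum the second column and print the total.
awk '{sum += $2} END {print sum}' data.txt
# 60
```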


4. Cut: Extract Specific Sections of Text

The cut command extracts specific sections of each line, such as columns or fields.

Basic Syntax

cut [options] [file...]

Common Use Cases

1. Extract Specific Columns

cut -f1,3 file.txt

This extracts the first and third fields from each line (by default, fields are separated by tabs).

2. Specify a Delimiter

cut -d',' -f2 file.csv

The -d option defines the delimiter, allowing you to work with delimited text files like CSVs. This command extracts the second column from a CSV file.
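A minimal demo of the delimiter option, using an invented CSV file:

```shell
# Sample CSV with a header row.
printf 'id,name,age\n1,alice,30\n2,bob,25\n' > demo.csv

# Extract the second comma-separated field from each line.
cut -d',' -f2 demo.csv
# name
# alice
# bob
```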


5. Sort: Sort Lines in a File

sort arranges the lines of a file in alphabetical or numerical order. It also supports more advanced options, such as sorting by column, ignoring case, and more.

Basic Syntax

sort [options] [file...]

Common Use Cases

1. Sort Alphabetically

sort file.txt

In its simplest form, sort orders the lines alphabetically.

2. Sort Numerically

sort -n data.txt

The -n option sorts the file based on numeric values rather than alphabetically.

3. Reverse Sorting

sort -r file.txt

Reverses the sort order of lines.

4. Sort by a Specific Column

sort -k 2 file.txt

The -k 2 option sorts the lines based on the second column.
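The column and numeric options combine naturally; here is a small sketch on an invented file:

```shell
# Fruit names with a numeric count in the second column.
printf 'banana 2\napple 3\ncherry 1\n' > fruits.txt

# Sort numerically by the second column.
sort -k 2 -n fruits.txt
# cherry 1
# banana 2
# apple 3
```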


Combining Tools for Powerful Text Processing

Combining these tools creates powerful workflows. For example, you can search, filter, and sort data in a single command.

Example: Search, Extract, and Sort

grep "error" logfile.txt | cut -d' ' -f1,4 | sort -u

This pipeline searches for lines containing “error” in logfile.txt, extracts the first and fourth fields, and then sorts the results uniquely.
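To see the pipeline end to end, here is a run against invented log lines (field 1 is the level, field 4 a tag; both are made up for the demo):

```shell
# Hypothetical space-separated log lines, with one duplicate.
printf 'error 10:01 app a\nerror 10:02 db b\nerror 10:01 app a\n' > logfile.txt

# Search, extract fields 1 and 4, then sort and deduplicate.
grep "error" logfile.txt | cut -d' ' -f1,4 | sort -u
# error a
# error b
```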

Example: Modify and Filter with Sed and Awk

sed 's/warning/WARNING/g' logfile.txt | awk '$3 == "ERROR" {print $1, $4}'

This command replaces “warning” with “WARNING” in logfile.txt and prints the first and fourth fields of lines where the third field is “ERROR”.
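The same sed-plus-awk pipeline can be traced on two invented log lines (the layout, with the level in field 3, is an assumption for the demo):

```shell
# Hypothetical log: field 3 holds the severity level.
printf '10:01 app ERROR disk\n10:02 db warning mem\n' > logfile.txt

# Uppercase "warning", then print fields 1 and 4 of ERROR lines only.
sed 's/warning/WARNING/g' logfile.txt | awk '$3 == "ERROR" {print $1, $4}'
# 10:01 disk
```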


Conclusion

Learning tools like grep, sed, awk, cut, and sort makes data processing on Linux fast and efficient. These tools are powerful for searching, filtering, and sorting data. That’s it from my side; I hope you enjoyed this short and simple walkthrough. In the upcoming article, we will discuss file permissions and ownership with specific commands.
