Shell Script

14 Introduction to sed and gawk

트리스탄1234 2022. 8. 27. 09:04

1. Manipulate text

Log files and much of the other data on a Linux system are stored as text. sed and gawk are tools that extract just the information you need from that text, process it, and turn it into something useful.

sed Editor

The sed editor is called a stream editor. In an ordinary interactive text editor, you insert, delete, or replace data by typing commands at the keyboard. A stream editor instead edits a stream of data according to a set of rules supplied before it starts processing. The syntax for using the sed editor is as follows.

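$ sed options script file

The options available in sed include the following.

■ -e script : Adds the commands in the script to the commands run while processing the input.

■ -f file : Adds the commands in the file to the commands run while processing the input.

■ -n : Does not print each line automatically; output is produced only by the print command.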

Let's look at one example using the sed command.

$ echo "This is a test" | sed 's/test/big test/'

This is a big test

$

This example passes the output of echo to sed through a pipe and uses the s command to change test to big test.

$ cat data1 ==> Look at the data1 file.
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
$ sed 's/dog/cat/' data1 ==> Converts dog to cat in the data1 file.
The quick brown fox jumps over the lazy cat.
The quick brown fox jumps over the lazy cat.
The quick brown fox jumps over the lazy cat.
The quick brown fox jumps over the lazy cat.
$

Using multiple editor commands on the command line

To run multiple sed commands from one command line, use the -e option and separate the commands with semicolons.

Below is an example of usage.

$ sed -e 's/brown/green/; s/dog/cat/' data1 ==> In the data1 file, changes brown to green and dog to cat.
The quick green fox jumps over the lazy cat.
The quick green fox jumps over the lazy cat.
The quick green fox jumps over the lazy cat.
The quick green fox jumps over the lazy cat.
$

Another option is to use the secondary prompt instead of semicolons.

$ sed -e '
> s/brown/green/
> s/fox/elephant/
> s/dog/cat/' data1
The quick green elephant jumps over the lazy cat.
The quick green elephant jumps over the lazy cat.
The quick green elephant jumps over the lazy cat.
The quick green elephant jumps over the lazy cat.
$
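
As a small sketch of the two styles, the block below feeds one sample line (made up to match data1) through sed twice: once with semicolon-separated commands and once with a separate -e option per command. The variable names are illustrative assumptions, not part of the original examples.

```shell
# Sample line matching the data1 file used above.
line='The quick brown fox jumps over the lazy dog.'

# Style 1: one -e option, commands separated by a semicolon.
with_semicolon=$(printf '%s\n' "$line" | sed -e 's/brown/green/; s/dog/cat/')

# Style 2: one -e option per command.
with_repeated_e=$(printf '%s\n' "$line" | sed -e 's/brown/green/' -e 's/dog/cat/')

printf '%s\n' "$with_semicolon"
printf '%s\n' "$with_repeated_e"
```

Both forms should print the same line, so the choice is mostly a matter of readability.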

Reading editor commands from a file

If there are many commands to run, you can save them in a file and load them with the -f option. Below is an example.

$ cat script1
s/brown/green/
s/fox/elephant/
s/dog/cat/
$
$ sed -f script1 data1 ==> Reads the commands from the script1 file and applies them to the data in the data1 file.
The quick green elephant jumps over the lazy cat.
The quick green elephant jumps over the lazy cat.
The quick green elephant jumps over the lazy cat.
The quick green elephant jumps over the lazy cat.
$
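
The same flow can be sketched end to end in a throwaway directory. The directory, the file name, and the input line here are assumptions made for illustration.

```shell
# Work in a temporary directory so no existing files are touched.
workdir=$(mktemp -d)

# Hypothetical script file with one sed command per line.
cat > "$workdir/script1" <<'EOF'
s/brown/green/
s/fox/elephant/
s/dog/cat/
EOF

# Apply every command in the script file to the input line.
result=$(printf 'The quick brown fox jumps over the lazy dog.\n' | sed -f "$workdir/script1")
printf '%s\n' "$result"

rm -rf "$workdir"
```

Keeping the commands in a file makes long edit sequences easier to reuse and review than a long command line.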

gawk program

You can think of gawk as a more advanced tool than sed. Where sed processes data with editor commands, gawk provides a programming language, so you can process data with programs and use variables. In brief, gawk offers the following:

■ Using variables to store data

■ Arithmetic operations and string manipulation possible

■ Structured programming available

■ Generate formatted reports

gawk command format

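The command format is as follows.

$ gawk options program file

Commonly used options include the following.

■ -F fs : Specifies the field separator that delimits data fields in a line.

■ -f file : Specifies the file from which to read the program.

■ -v var=value : Defines a variable and its default value for use in the program.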

Reading program script from command line

gawk uses curly braces to define a program script on the command line. Below is an example of usage.

$ gawk '{print "Hello John!"}'

The script commands go between the opening and closing curly braces, and the whole script is enclosed in single quotation marks so that the shell passes it to gawk as a single text string.

Because no file name is given, gawk reads data from standard input and runs the script once for each line of text you enter.

$ gawk '{print "Hello World!"}'
this is a test
Hello World! ==> gawk runs the script for the line just entered
HELLO
Hello World!
this is another test
Hello World!
Ctrl+D ==> Signals the end of the input (EOF) so that gawk exits

Using data field variables

gawk automatically assigns a variable to each data field it finds in a line of text.

■ $0 represents the entire line of text.

■ $1 represents the first data field in the line.

■ $2 represents the second data field in the line.

■ $n represents the nth data field in the line.

Data fields are delimited by a field separator character within a line of text. By default, the field separator is any whitespace character, such as a space or a tab. The example below prints the first data field of each line.

$ cat data3
One line of test text.
Two lines of test text.
Three lines of test text.
$ gawk '{print $1}' data3
One
Two
Three
$

Now, let's change the field separator to a colon with the -F option and read the first data field of each line in the /etc/passwd file.

$ gawk -F: '{print $1}' /etc/passwd
root
bin
daemon
adm
lp
sync
shutdown
halt
...
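
To try the field separator without depending on the contents of a real /etc/passwd, here is a minimal sketch on two made-up records. The AWK variable is an assumption: it picks gawk when it is installed and otherwise falls back to the system awk, which should behave the same for this simple case.

```shell
# Use gawk when installed; otherwise fall back to the system awk.
AWK=$(command -v gawk || command -v awk)

# Two made-up records in /etc/passwd layout (seven colon-separated fields).
sample='root:x:0:0:root:/root:/bin/bash
rich:x:500:500:Rich Blum:/home/rich:/bin/bash'

# Print the login name (field 1) and the shell (field 7).
names_and_shells=$(printf '%s\n' "$sample" | "$AWK" -F: '{print $1, $7}')
printf '%s\n' "$names_and_shells"
```

The comma in the print statement joins the two fields with the output field separator, which is a single space by default.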

Using multiple commands in a program script

If you want to use multiple commands in gawk, you can put semicolons between them. Let's look at the example below.

$ echo "My name is Rich" | gawk '{$4="Dave"; print $0}'
My name is Dave
$

The command above changes the fourth data field of the line produced by echo to Dave and then prints the whole line. As with sed, you can also enter the program script one line at a time at the secondary prompt.

$ gawk '{
> $4="testing"
> print $0 }'
This is not a good test.
This is not testing good test.
$
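
One detail worth knowing when assigning to a field: awk rebuilds $0 by joining the fields with the output field separator (OFS, a space by default), so colon-separated input comes back space-separated unless OFS is set. The sketch below uses made-up input, and the AWK variable is an assumption for systems without gawk.

```shell
# Use gawk when installed; otherwise fall back to the system awk.
AWK=$(command -v gawk || command -v awk)

# Assigning to $2 rebuilds $0 with the default OFS (a single space).
default_ofs=$(echo 'a:b:c' | "$AWK" -F: '{$2="X"; print $0}')

# Setting OFS in BEGIN keeps the colon separator in the rebuilt line.
colon_ofs=$(echo 'a:b:c' | "$AWK" -F: 'BEGIN {OFS=":"} {$2="X"; print $0}')

printf '%s\n' "$default_ofs" "$colon_ofs"
```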

Reading the program from a file

As with the sed editor, gawk lets you store a program in a file and refer to it on the command line with the -f option. Below is an example of reading a program from a file.

$ cat script2
{ print $5 "'s userid is " $1 }
$ gawk -F: -f script2 /etc/passwd
root's userid is root
bin's userid is bin
PostgreSQL Server's userid is postgres
FTP User's userid is ftp
GDM User's userid is gdm
HTDIG User's userid is htdig

Multiple commands can be entered into a file and executed. Let's look at the example below.

$ cat script3
{
text="'s userid is "
print $5 text $1
}
$ gawk -F: -f script3 /etc/passwd | more
root's userid is root
bin's userid is bin
PostgreSQL Server's userid is postgres
FTP User's userid is ftp
GDM User's userid is gdm
HTDIG User's userid is htdig
Dhcpd User's userid is dhcpd
Bind User's userid is named
NSCD Daemon's userid is nscd
X Font Server's userid is xfs
MySQL server's userid is mysql
Rich's userid is rich
test account's userid is testing
postfix's userid is postfix
$

Running scripts before processing data

By default, gawk reads a line of text from the input and then runs the program script on it. Sometimes, however, you need to run a script before any data is processed, for example to print a header for a report. The BEGIN keyword is used for this. Below is an example of using BEGIN.

$ gawk 'BEGIN {print "Hello World!"}'
Hello World!
$

This command prints "Hello World!" before any data is read. Because the program contains only a BEGIN section, gawk exits without waiting for input.

Using the END keyword

Unlike BEGIN, the END keyword specifies a script that runs after all the data has been processed. Let's take a look at the example below.

$ gawk 'BEGIN {print "Hello World!"} {print $0} END {print "byebye"}'
Hello World!
This is a test
This is a test
This is another test.
This is another test.
byebye
$

This time, let's look at a more practical example that uses both BEGIN and END.

$ cat script4
BEGIN {
print "The latest list of users and shells"
print " Userid Shell"
print "-------- -------"
FS=":"
}
{print $1 " " $7}
END {print "This concludes the listing"}

Let's run this script.

$ gawk -f script4 /etc/passwd
The latest list of users and shells
Userid Shell
-------- -------
root /bin/bash
daemon /usr/sbin/nologin
bin /usr/sbin/nologin
sys /usr/sbin/nologin
sync /bin/sync
games /usr/sbin/nologin
man /usr/sbin/nologin
lp /usr/sbin/nologin
mail /usr/sbin/nologin
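
The three sections can also be exercised on a couple of made-up records, which makes the order of execution easy to see. This is only a sketch: the sample data, the trailer text, and the use of NR (the number of records read so far) in the END section are additions for illustration, and the AWK variable is an assumption for systems without gawk.

```shell
# Use gawk when installed; otherwise fall back to the system awk.
AWK=$(command -v gawk || command -v awk)

# Made-up records in /etc/passwd layout.
sample='root:x:0:0:root:/root:/bin/bash
sync:x:4:65534:sync:/bin:/bin/sync'

report=$(printf '%s\n' "$sample" | "$AWK" '
BEGIN { FS = ":"; print "Userid Shell" }   # runs once, before any input
      { print $1 " " $7 }                  # runs for every input line
END   { print NR " users listed" }         # runs once, after the last line
')
printf '%s\n' "$report"
```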

2. sed editor basics

The key to using sed well is knowing its many commands and formats. Let's take a look at the basic commands and functions of sed.

Substitution flags

Let's take a look at the example below.

$ cat data4
This is a test of the test script.
This is the second test of the test script.
$ sed 's/test/trial/' data4
This is a trial of the test script.
This is the second trial of the test script.
$

The s/test/trial/ command seems to work fine, but by default it replaces only the first occurrence on each line and leaves the second one alone. Substitution flags give you finer control. The available flags are listed below.

■ A number : Indicates which occurrence of the pattern on the line should be replaced.

■ g : Replaces every occurrence of the pattern on the line.

■ p : Prints the line if a substitution was made on it.

■ w file : Writes the lines on which a substitution was made to the named file.

$ cat data4
This is a test of the test script.
This is the second test of the test script.
$ sed 's/test/trial/2' data4
This is a test of the trial script.
This is the second test of the trial script.
$
$ sed 's/test/trial/g' data4
This is a trial of the trial script.
This is the second trial of the trial script.
$
$ cat data5
This is a test line.
This is a different line.
$ sed -n 's/test/trial/p' data5
This is a trial line.
$
$ sed 's/test/trial/w result' data5
This is a trial line.
This is a different line.
$ cat result
This is a trial line.
$
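
The effect of each flag can be compared side by side on one line. The following is a minimal sketch with sample text taken from the examples above; the variable names are illustrative.

```shell
line='This is a test of the test script.'

# No flag: only the first match on the line is replaced.
first_only=$(printf '%s\n' "$line" | sed 's/test/trial/')

# Numeric flag: only the second match is replaced.
second_only=$(printf '%s\n' "$line" | sed 's/test/trial/2')

# g flag: every match is replaced.
all_matches=$(printf '%s\n' "$line" | sed 's/test/trial/g')

# -n with the p flag: print only the lines where a substitution happened.
changed_lines=$(printf 'This is a test line.\nThis is a different line.\n' | sed -n 's/test/trial/p')

printf '%s\n' "$first_only" "$second_only" "$all_matches" "$changed_lines"
```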

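Replacing characters

File paths contain forward slashes, which is also the default delimiter of the s command, so each slash in a path would need a backslash escape. To avoid this, sed lets you pick a different character as the string delimiter. The example below uses '!' as the delimiter to change a path.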

$ sed 's!/bin/bash!/bin/csh!' /etc/passwd ==> Replaces /bin/bash with /bin/csh in each line of /etc/passwd
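
For comparison, here is the same substitution written both ways on a made-up record. It is only a sketch, but it shows why the alternate delimiter is easier to read.

```shell
# A made-up /etc/passwd style record.
record='rich:x:500:500:Rich Blum:/home/rich:/bin/bash'

# Default delimiter: every slash in the pattern and replacement is escaped.
escaped=$(printf '%s\n' "$record" | sed 's/\/bin\/bash/\/bin\/csh/')

# Alternate delimiter: the same substitution without any escapes.
alternate=$(printf '%s\n' "$record" | sed 's!/bin/bash!/bin/csh!')

printf '%s\n' "$escaped" "$alternate"
```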

Using Addressing

By default, sed applies its commands to every line of the text. To apply a command only to a specific line or group of lines, use line addressing. There are two forms of addressing: a numeric range of lines, and a text pattern that filters lines.

Below is an example using numbers.

$ sed '2s/dog/cat/' data1 ==> Applied only to the 2nd line of the data1 file
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy cat.
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
$
$ sed '2,3s/dog/cat/' data1 ==> Applied to lines 2 through 3
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy cat.
The quick brown fox jumps over the lazy cat.
The quick brown fox jumps over the lazy dog.
$
$ sed '2,$s/dog/cat/' data1 ==> Applied from line 2 to the last line
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy cat.
The quick brown fox jumps over the lazy cat.
The quick brown fox jumps over the lazy cat.
$
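
A quick way to check which lines an address covers is to count the changed lines. The helper function count_cats below is a hypothetical name introduced for this sketch, and the four-line input is made up.

```shell
# Four identical lines, similar to the data1 file above.
input=$(printf 'The lazy dog\nThe lazy dog\nThe lazy dog\nThe lazy dog\n')

# Run one sed command on the input and count how many lines now contain "cat".
count_cats() { printf '%s\n' "$input" | sed "$1" | grep -c 'cat'; }

single_line=$(count_cats '2s/dog/cat/')     # line 2 only
line_range=$(count_cats '2,3s/dog/cat/')    # lines 2 through 3
to_the_end=$(count_cats '2,$s/dog/cat/')    # line 2 through the last line

echo "$single_line $line_range $to_the_end"
```

Note that the sed commands are passed in single quotes so that the shell does not try to expand the $ address.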

Using Text Pattern Filters

A text pattern filter restricts a command to the lines that contain a given pattern. Its usage is shown in the example below.

$ sed '/rich/s/bash/csh/' /etc/passwd ==> Changes the shell of the user rich in the /etc/passwd file
rich:x:500:500:Rich Blum:/home/rich:/bin/csh
barbara:x:501:501:Barbara:/home/barbara:/bin/bash
katie:x:502:502:Katie:/home/katie:/bin/bash
jessica:x:503:503:Jessica:/home/jessica:/bin/bash
test:x:504:504:Ima test:/home/test:/bin/bash
$
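
Because the pattern matches anywhere in the line, a loose pattern can select more lines than intended. The sketch below uses made-up records, including a hypothetical user ulrich, and shows one possible way to tighten the filter by anchoring it to the start of the line with a regular expression.

```shell
# Made-up /etc/passwd style records; "ulrich" also contains the text "rich".
records='rich:x:500:500:Rich Blum:/home/rich:/bin/bash
ulrich:x:503:503:Ulrich:/home/ulrich:/bin/bash
katie:x:502:502:Katie:/home/katie:/bin/bash'

# Loose pattern: matches both rich and ulrich.
loose=$(printf '%s\n' "$records" | sed '/rich/s/bash/csh/' | grep -c '/bin/csh$')

# Anchored pattern: matches only the record whose login name is exactly rich.
anchored=$(printf '%s\n' "$records" | sed '/^rich:/s/bash/csh/' | grep -c '/bin/csh$')

echo "$loose $anchored"
```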

Using command grouping

If you need to run more than one command on the same line, group the commands with curly braces.

$ sed '2{ ==> applied to the second line
> s/fox/elephant/ ==> from fox to elephant
> s/dog/cat/ ==> from dog to cat
> }' data1
The quick brown fox jumps over the lazy dog.
The quick brown elephant jumps over the lazy cat.
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
$
$ sed '3,${ ==> applied from the 3rd line to the end
> s/brown/green/ ==> from brown to green
> s/lazy/active/ ==> from lazy to active
> }' data1
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
The quick green fox jumps over the active dog.
The quick green fox jumps over the active dog.
$

Deleting lines

In addition to replacing text, sed can delete specific lines. The command to delete is 'd'.

$ sed 'd' data1 ==> With no address, deletes every line of data1 from the stream, so nothing is printed
$ sed '3d' data6 ==> Deletes the 3rd line of the data6 file
This is line number 1.
This is line number 2.
This is line number 4.
$
$ sed '2,3d' data6 ==> Deletes the 2nd and 3rd lines
This is line number 1.
This is line number 4.
$
$ sed '3,$d' data6 ==> Deletes from the 3rd line to the end
This is line number 1.
This is line number 2.
$
$ sed '/number 1/d' data6 ==> Deletes the lines that match the pattern number 1
This is line number 2.
This is line number 3.
This is line number 4.
$

Keep in mind that the d command deletes lines only from the stream that sed outputs. The original file is not changed.
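
This can be checked with a short sketch in a temporary directory. The file names are assumptions, and redirecting the output to a new file is shown as one simple way to keep the edited text.

```shell
workdir=$(mktemp -d)
printf 'This is line number 1.\nThis is line number 2.\nThis is line number 3.\n' > "$workdir/data6"

# Delete line 2 from the stream; the result goes to standard output only.
stream_output=$(sed '2d' "$workdir/data6")

# The file on disk still has all three lines.
lines_on_disk=$(wc -l < "$workdir/data6" | tr -d ' ')

# To keep the edited text, redirect the output to a new file.
sed '2d' "$workdir/data6" > "$workdir/data6.new"
lines_in_copy=$(wc -l < "$workdir/data6.new" | tr -d ' ')

echo "$lines_on_disk $lines_in_copy"
rm -rf "$workdir"
```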

Deleting using 2 data patterns

You can also delete a range of lines with two text patterns: the first pattern turns deletion on, and the second turns it off.

$ sed '/1/,/3/d' data6
This is line number 4.
$
$ cat data7
This is line number 1.
This is line number 2.
This is line number 3.
This is line number 4.
This is line number 1 again.
This is text you want to keep.
This is the last line in the file.
$ sed '/1/,/3/d' data7
This is line number 4.
$

Only line number 4 remains because the line "This is line number 1 again." matches the first pattern and turns deletion on a second time. No later line contains a 3 to turn it off, so sed deletes everything from there to the end of the file.
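
The behavior is easy to reproduce on a small made-up input. The second command in this sketch is only one possible workaround, under the assumption that the start line can be matched exactly with a regular expression.

```shell
# Made-up input: the start pattern (1) appears again after the range has closed.
input='line 1
line 2
line 3
line 4
line 1 again
keep me'

# Deletion starts at each line matching /1/ and stops at the next line matching /3/.
# The second start never finds a closing match, so it runs to the end of the input.
open_ended=$(printf '%s\n' "$input" | sed '/1/,/3/d')

# Anchoring the start pattern to the exact line avoids the second start.
anchored=$(printf '%s\n' "$input" | sed '/^line 1$/,/3/d')

printf '%s\n' "$open_ended" '--' "$anchored"
```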

Inserting and adding text

Like other editors, sed can insert and append lines of text. The commands are as follows.

■ i : Inserts a new line before the specified line.

■ a : Appends a new line after the specified line.

The command format is as follows.

sed '[address]command\
new line'

$ echo "testing" | sed 'i\ ==> Inserts the new line before testing.
> This is a test'
This is a test
testing
$
$ echo "testing" | sed 'a\ ==> Appends the new line after testing.
> This is a test'
testing
This is a test
$
$ sed '3i\ ==> Inserts the following line before the 3rd line.
> This is an inserted line.' data6
This is line number 1.
This is line number 2.
This is an inserted line.
This is line number 3.
This is line number 4.
$
$ sed '3a\ ==> Appends the following line after the 3rd line.
> This is an inserted line.' data6
This is line number 1.
This is line number 2.
This is line number 3.
This is an inserted line.
This is line number 4.
$
$ sed '$a\ ==> The $ address refers to the last line. Appends the line below after the last line.
> This is a new line of text.' data6
This is line number 1.
This is line number 2.
This is line number 3.
This is line number 4.
This is a new line of text.
$
$ sed '1i\ ==> Inserts the following 2 lines before the first line.
> This is one line of new text.\
> This is another line of new text.' data6
This is one line of new text.
This is another line of new text.
This is line number 1.
This is line number 2.
This is line number 3.
This is line number 4.
$
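
Inside a script, the same commands are written with a real line break after the backslash. The block below is a minimal sketch on a made-up three-line input.

```shell
input=$(printf 'line 1\nline 2\nline 3\n')

# i\ places the new text before the addressed line (line 2 here).
inserted=$(printf '%s\n' "$input" | sed '2i\
inserted before line 2')

# a\ places the new text after the addressed line ($ is the last line).
appended=$(printf '%s\n' "$input" | sed '$a\
appended after the last line')

printf '%s\n' "$inserted" '--' "$appended"
```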

Changing lines

The command to change a line is 'c'. It replaces the entire contents of a line with a new line. Let's look at an example below.

$ sed '3c\ ==> Changes the 3rd line to the following line.
> This is a changed line of text.' data6
This is line number 1.
This is line number 2.
This is a changed line of text.
This is line number 4.
$
$ sed '/number 3/c\ ==> Changes the line that matches the pattern number 3 to the following line.
> This is a changed line of text.' data6
This is line number 1.
This is line number 2.
This is a changed line of text.
This is line number 4.
$
$ sed '2,3c\ ==> Replaces the 2nd and 3rd lines together with a single new line.
> This is a new line of text.' data6
This is line number 1.
This is a new line of text.
This is line number 4.
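$

Transforming characters

The transform command 'y' maps each character in the first set to the character in the same position in the second set, and it does so for every occurrence in the line.

$ sed 'y/123/789/' data7 ==> 1 to 7, 2 to 8, 3 to 9.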
This is line number 7.
This is line number 8.
This is line number 9.
This is line number 4.
This is line number 7 again.
This is text you want to keep.
This is the last line in the file.
$
$ echo "This 1 is a test of 1 try." | sed 'y/123/456/' ==> Change 1 to 4, 2 to 5, and 3 to 6.
This 4 is a test of 4 try.
$
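
The y command is easy to confuse with s, so a short contrast may help. In this sketch with a made-up line, y maps single characters while s looks for the whole string 123.

```shell
line='This 1 is a test of 1 try, 2 tries, 3 tries.'

# y maps characters by position (1 to 4, 2 to 5, 3 to 6) for every occurrence.
transformed=$(printf '%s\n' "$line" | sed 'y/123/456/')

# s treats 123 as one string, which does not appear in the line, so nothing changes.
substituted=$(printf '%s\n' "$line" | sed 's/123/456/g')

printf '%s\n' "$transformed" "$substituted"
```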

Printing lines

sed has three commands for printing information from the data stream.

■ p : Prints a line of text.

■ = : Prints the line number.

■ l : Lists a line, showing nonprintable characters as well.

$ sed -n '/number 3/p' data6 ==> Only the line matching number 3 is printed.
This is line number 3.
$
$ sed -n '2,3p' data6 ==> Lines 2 and 3 are printed.
This is line number 2.
This is line number 3.
$
$ sed -n '/3/{ ==> For a line that contains 3
> p ==> print the line before the change
> s/line/test/p }' data6 ==> Change line to test and print it again.
This is line number 3.
This is test number 3.
$

Printing line numbers

The '=' command prints the line number of each line before the line itself. Let's look at an example below.

$ sed '=' data1
1
The quick brown fox jumps over the lazy dog.
2
The quick brown fox jumps over the lazy dog.
3
The quick brown fox jumps over the lazy dog.
4
The quick brown fox jumps over the lazy dog.
$
$ sed -n '/number 4/{ ==> For the line matching the pattern number 4
> = ==> print the line number
> p ==> print the line
> }' data6
4
This is line number 4.
$
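
Combined with -n, the = command can report just the line numbers of the matching lines, a bit like a simple search. A minimal sketch on made-up input:

```shell
input=$(printf 'This is line number 1.\nThis is line number 2.\nThis is line number 3.\n')

# -n suppresses normal output, so only the line numbers of matching lines are printed.
match_numbers=$(printf '%s\n' "$input" | sed -n '/number [23]/=')

printf '%s\n' "$match_numbers"
```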

Printing nonprintable characters

The 'l' (list) command prints both the text and the nonprintable characters in the data stream. Let's take a look at the example below.

$ cat data8
This line contains tabs.
$ sed -n 'l' data8
This\tline\tcontains\ttabs.$
$

\t represents a tab character, and the $ at the end marks the end of the line.

$ cat data9
This line contains an escape character
$ sed -n 'l' data9
This line contains an escape character \033[44m$
$

\033[44m is the escape sequence; the ESC character itself is shown as the octal code \033.

Accessing files from sed

The 'w' command writes lines to a file. The syntax is as follows.

[address]w filename ==> filename can be a relative or an absolute path.

$ sed '1,2w test' data6 ==> Saves lines 1 and 2 of the data6 file in the test file.
This is line number 1.
This is line number 2.
This is line number 3.
This is line number 4.
$ cat test
This is line number 1.
This is line number 2.
$
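
The w command also works with a pattern address, and adding -n keeps the lines from being echoed to the screen. The sketch below writes to a temporary file; the matches file name is an assumption.

```shell
workdir=$(mktemp -d)
input=$(printf 'This is line number 1.\nThis is line number 2.\nThis is line number 3.\n')

# -n keeps the screen quiet; lines containing "number 3" go to the matches file.
printf '%s\n' "$input" | sed -n "/number 3/w $workdir/matches"

saved=$(cat "$workdir/matches")
printf '%s\n' "$saved"
rm -rf "$workdir"
```

The sed script is in double quotes here so that the shell expands $workdir before sed sees the file name.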

Reading data from a file

You can also read lines from another file and add them to the data stream you are editing. The command for this is 'r'. The syntax is shown below. The r command accepts only a single line number or a text pattern as its address; you cannot specify a range of lines.

[address]r filename

$ cat data11
This is an added line.
This is the second added line.
$ sed '3r data11' data6 ==> Adds the contents of the data11 file after the 3rd line of the data6 file.
This is line number 1.
This is line number 2.
This is line number 3.
This is an added line.
This is the second added line.
This is line number 4.
$
$ sed '/number 2/r data11' data6 ==> Adds the contents of data11 after the line in data6 that matches the pattern number 2.
This is line number 1.
This is line number 2.
This is an added line.
This is the second added line.
This is line number 3.
This is line number 4.
$
$ sed '$r data11' data6 ==> Adds the contents of data11 after the last line of the data6 file.
This is line number 1.
This is line number 2.
This is line number 3.
This is line number 4.
This is an added line.
This is the second added line.
$
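
The r command combined with the d command can replace a placeholder in one file with data from another. Let's look at the contents of the letter file.
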
$ cat letter
Would the following people:
LIST
please report to the office.
$
The data10 file contains the list of people.
$ cat data10
Blum, Katie Chicago, IL
Mullen, Riley West Lafayette, IN
Snell, Haley Ft. Wayne, IN
Woenker, Matthew Springfield, IL
Wisecarver, Emma Grant Park, IL
$
Here, the LIST line in the letter file serves as a placeholder that is replaced with the list of people in data10.
$ sed '/LIST/{
> r data10 ==> Reads the data10 file, which contains the list of people.
> d ==> Deletes the LIST placeholder line from the letter.
> }' letter
Would the following people:
Blum, Katie Chicago, IL
Mullen, Riley West Lafayette, IN
Snell, Haley Ft. Wayne, IN
Woenker, Matthew Springfield, IL
Wisecarver, Emma Grant Park, IL
please report to the office.
$
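
The placeholder technique can be sketched as a small self-contained script. The file names letter and names and their contents are made up for this example, and everything is created in a temporary directory.

```shell
workdir=$(mktemp -d)

# Made-up form letter with a LIST placeholder on its own line.
printf 'Would the following people:\nLIST\nplease report to the office.\n' > "$workdir/letter"

# Made-up list of names to drop in.
printf 'Blum, Katie\nMullen, Riley\n' > "$workdir/names"

# On the LIST line: queue the names file for output, then delete the placeholder.
merged=$(sed "/LIST/{
r $workdir/names
d
}" "$workdir/letter")

printf '%s\n' "$merged"
rm -rf "$workdir"
```

The names still appear even though d removes the placeholder line, because the file queued by r is written out at the end of the cycle for that line.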
