Operating systems 1, Lecture 4

Sanda-Maria AVRAM, Ph. D.

Unix filters

A filter is any command that reads text from the standard input, transforms it in some way, and writes the result to the standard output.

sed
grep
awk
sort
uniq
wc
head
tail

sed (stream editor = noninteractive text editor)

Processes text files line by line using a temporary buffer and writes the temporary buffer to the standard output after each line (unless the -n (--quiet, --silent) option is used).

sed [-n] [-e script ] [-f script_file ] [ file_list ]

        
script:
          condition   instruction
        
      

sed. Conditions

no condition true for all lines of the input
n true for the n-th line (lines are numbered cumulatively across the file list)
$ true for the last line of the input
/regular expression/ true for lines containing at least one substring that matches the regular expression
expr1,expr2 true for all lines from the line that matches expr1 through the line that matches expr2, inclusive

sed. Conditions. Examples
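A minimal sketch of the four kinds of conditions, each combined with the p instruction and the -n option so that only the selected lines are written. The five-line sample file and the variable names are made up for illustration.

```shell
# Hypothetical sample input: five short lines in a temporary file.
tmp=$(mktemp)
printf 'one\ntwo\nthree\nfour\nfive\n' > "$tmp"

# Line-number condition: print only line 2.
second=$(sed -n '2p' "$tmp")

# $ condition: print only the last line.
last=$(sed -n '$p' "$tmp")

# Regular-expression condition: print the lines that contain "o".
with_o=$(sed -n '/o/p' "$tmp")

# Range condition: from the line matching "two" through the line matching "four".
range=$(sed -n '/two/,/four/p' "$tmp")

printf '%s\n' "$second" "$last" "$with_o" "$range"
rm -f "$tmp"
```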

sed. Instructions

p writes the temporary buffer to the standard output
d deletes the temporary buffer and continues with the next input line
i\<ENTER> takes a text as parameter (given on the following lines in the script file) and writes it to the standard output before the processed line
a\<ENTER> analogous to i\, but writes the text after the processed line
y/str1/str2/ (where str1 and str2 have equal lengths) performs a translation: each character of the line that occurs in str1 is replaced with the character at the same position in str2

sed. Instructions. Examples
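A short sketch of the p, d, i\, a\ and y instructions on a made-up three-line file. The i\ and a\ scripts are written across two lines because the text parameter starts on the line after the backslash. File contents and variable names are illustrative only.

```shell
# Hypothetical sample input.
tmp=$(mktemp)
printf 'alpha\nbeta\ngamma\n' > "$tmp"

# p without -n: line 1 is written twice (once by p, once by the default output).
doubled=$(sed '1p' "$tmp")

# d: drop line 2.
deleted=$(sed '2d' "$tmp")

# i\ : write a text before line 1.
inserted=$(sed '1i\
--- start ---' "$tmp")

# a\ : write a text after the last line.
appended=$(sed '$a\
--- end ---' "$tmp")

# y: translate a->A, b->B, g->G on every line.
upper=$(sed 'y/abg/ABG/' "$tmp")

printf '%s\n' "$doubled" "$deleted" "$inserted" "$appended" "$upper"
rm -f "$tmp"
```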

sed. Instructions (cont.)

s/regular_expression/str/[flags]
replaces the substring that matches the regular expression with str; the flags decide which occurrences are replaced

flags:
no flag replaces only the first occurrence in each line
n (n is a number between 1 and 512) replaces only the n-th occurrence in each line
g replaces all occurrences in each line
p writes the buffer to the standard output if a substitution was made in that line

sed. Instructions (cont.) Examples
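The effect of each flag of the s instruction can be sketched on a single made-up line. The input strings and variable names are assumptions chosen for illustration.

```shell
# No flag: only the first "-" becomes "+".
first=$(printf 'a-b-c-d\n' | sed 's/-/+/')

# Numeric flag: only the second occurrence is replaced.
second=$(printf 'a-b-c-d\n' | sed 's/-/+/2')

# g flag: every occurrence is replaced.
all=$(printf 'a-b-c-d\n' | sed 's/-/+/g')

# p flag together with -n: only the lines where a substitution happened are written.
changed=$(printf 'a-b\nxyz\n' | sed -n 's/-/+/p')

printf '%s\n' "$first" "$second" "$all" "$changed"
```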

grep

Searches one or more files for lines that match a regular expression and writes the matching lines to the standard output. The name comes from the ed editor command g/re/p ("global / regular expression / print").

grep [-chilnqsvw] [ [-e] regular_expression | -f script_file ] [ file_list ]

grep. Options

-c (count) displays only the number of lines that match the regular expression
-h (hide) does not display the file names
-i (ignore case) does not distinguish between upper and lower case letters
-l displays only the names of the files that contain at least one match
-n displays the lines that match the regular expression, each preceded by its line number relative to the beginning of its file
-q (quiet) displays nothing; the exit status shows whether or not there was at least one match
-s suppresses error messages about nonexistent or unreadable files
-v displays the lines that do not match the regular expression
-w displays the lines where the match is an entire word
-e introduces the regular expression explicitly; needed if the regular expression begins with "-"

grep. Options. Examples
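A minimal sketch of the most common options on a made-up four-line file. The file contents and variable names are assumptions for illustration.

```shell
# Hypothetical sample input.
tmp=$(mktemp)
printf 'Unix is fun\nlinux too\nunix filters\nwindows\n' > "$tmp"

# -c: number of lines containing "unix" (case-sensitive).
count=$(grep -c 'unix' "$tmp")

# -c with -i: the same count, ignoring case.
icount=$(grep -c -i 'unix' "$tmp")

# -n: matching lines preceded by their line number.
numbered=$(grep -n 'unix' "$tmp")

# -v with -i: lines that do NOT contain "unix" in any case.
inverted=$(grep -v -i 'unix' "$tmp")

# -w: "fun" must appear as a whole word.
word=$(grep -w 'fun' "$tmp")

# -l: only the names of the files that contain a match.
names=$(grep -l 'unix' "$tmp")

# -q: no output, only the exit status is used.
if grep -q 'windows' "$tmp"; then found=yes; else found=no; fi

printf '%s\n' "$count" "$icount" "$numbered" "$inverted" "$word" "$found"
rm -f "$tmp"
```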

Extended regular expression (ERE)

^ the beginning of the line (if ^ is the first character in the regular expression)
$ the end of the line (if $ is the last character in the regular expression)
. any single character
[list] any single character in list
[c1-c2] any single character between c1 and c2 in lexicographic order
[^list] any single character not in list
* repeats the previous regular expression zero or more times, matching as many times as possible
+ like *, but repeats one or more times
? like *, but repeats zero or one time

Extended regular expression (ERE) (cont.)

{n} (where n is a number between 0 and 255) repeats the previous regular expression exactly n times
{n,} repeats the previous regular expression at least n times
{n,m} repeats the previous regular expression at least n times and at most m times
(regular_exp) groups several characters into a single expression
\n (where n is a digit between 1 and 9) stands for the string matched by the n-th regular expression found in the brackets ( ); used, for example, in the replacement part of the sed s instruction
regexp1 | regexp2 matches either regexp1 or regexp2

Extended regular expression (ERE) (cont.)

\ escape, changes the meaning of the character following it,
between normal and special
\< beginning of word
\> end of word

sed. Examples with regular expressions
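A short sketch that combines sed with extended regular expressions. It assumes a sed that accepts the -E option to enable the extended syntax (GNU and BSD sed both do); the input strings and variable names are made up.

```shell
# Groups and \n references: swap two lowercase words.
swapped=$(printf 'hello world\n' | sed -E 's/^([a-z]+) ([a-z]+)$/\2 \1/')

# ^ and $ anchors: delete empty lines.
no_blank=$(printf 'a\n\nb\n' | sed '/^$/d')

# Interval {n,}: replace a run of three or more digits with "N".
masked=$(printf 'id 12345\n' | sed -E 's/[0-9]{3,}/N/')

# Bracket list with *: strip leading spaces and tabs.
trimmed=$(printf '   indented\n' | sed 's/^[ 	]*//')

printf '%s\n' "$swapped" "$no_blank" "$masked" "$trimmed"
```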

grep. Examples with regular expressions
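A minimal sketch of grep with extended regular expressions, using the -E option (not listed in the synopsis above) to enable them. The sample file and variable names are assumptions for illustration.

```shell
# Hypothetical sample input.
tmp=$(mktemp)
printf 'cat\ncart\ncaat\ndog 42\n' > "$tmp"

# + : "c", one or more "a", then "t", spanning the whole line.
plus=$(grep -E '^ca+t$' "$tmp")

# . and ? : "c", any character, an optional "r", then "t".
optional=$(grep -E '^c.r?t$' "$tmp")

# | : lines containing "cat" or "dog".
either=$(grep -E 'cat|dog' "$tmp")

# [c1-c2] with {n}: lines containing two consecutive digits.
digits=$(grep -E '[0-9]{2}' "$tmp")

printf '%s\n' "$plus" "$optional" "$either" "$digits"
rm -f "$tmp"
```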

awk

Processes text files by selecting the lines that satisfy the conditions given by a list of patterns (for example, regular expressions) and running the associated instructions on them. Its name comes from its three designers and implementers: A. Aho, P. Weinberger and B. Kernighan.

awk [-Fc] [-v variable=value ...] [ -f script_file | script ] [ file_list ]

awk. Script

script describes the filtering actions by lines of the form:
          condition { instructions }
      
The awk utility handles the input files one line at a time and executes instructions when the condition is true. If condition is missing, then instructions are run for all lines in the files.

awk. Condition

- is a logical expression built with the C operators ||, &&, ! and ( ). Operands can be arithmetic expressions, relational expressions, constants and variables. Variables need not be declared; they are initialized automatically and their type is deduced from the context. For strings there is a concatenation operator (a space between the operands) as well as several string functions. Arrays can be used, and their indices can be numerical values or strings.

Predefined conditions:
BEGIN it is true before the first line of the first file
END it is true after the last line of the last file

awk. Instructions

awk. Predefined variables

NF (Number of Fields) the number of words/fields in the current line
NR (Number of Records) the number of the line currently being processed (counting starts at 1); line 1 is the first line of the first file
FNR (File Number of Records) like NR, but restarts from 1 at the beginning of each file
FS (Field Separator) word/field separator at input
FILENAME the name of the file currently being processed
OFS field separator at output (default is a space)
ORS record separator at output (default is a newline)
ARGV array of command line parameters
ARGC the number of command line parameters

awk. Accessing fields

$0 the whole line that is being processed
$1 the first word/field on the current line being processed
$2 the second ...
...
$NF The last word/field on the current line

awk. Predefined functions

length(str) the length of str; length without an argument is equivalent to length($0)
substr(s,p,n) the substring of s which begins at position p (positions start at 1) and has length n
index(s1,s2) returns the position at which s2 first appears in s1, or 0 if it does not appear
sprintf(format, arg1, ...) returns the string that printf would print in C
split(s,a,c) where s is a string, a is an array and c is a character.
Splits the string s into fields, using the character c as separator;
if c is missing, the default separator FS is used.
The resulting strings are stored in the elements of the array a, and their number is returned.

awk. Examples
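A minimal sketch of condition { instructions } scripts that use fields and the predefined variables. The three-column sample file (name, age, city) and the variable names are made up for illustration.

```shell
# Hypothetical sample input: name, age, city.
tmp=$(mktemp)
printf 'alice 30 paris\nbob 25 rome\ncarol 35 paris\n' > "$tmp"

# No condition: print the first field of every line.
names=$(awk '{ print $1 }' "$tmp")

# Relational condition: names of the people older than 28.
older=$(awk '$2 > 28 { print $1 }' "$tmp")

# END condition: sum the second field over all lines, print it once at the end.
total=$(awk '{ sum += $2 } END { print sum }' "$tmp")

# NR and $NF, joined with the concatenation operator (a space).
numbered=$(awk '{ print NR ": " $NF }' "$tmp")

printf '%s\n' "$names" "$older" "$total" "$numbered"
rm -f "$tmp"
```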

awk. Examples (cont.)
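A further sketch covering the -F and -v options, arrays indexed by strings, and the predefined string functions. All input data and names (limit, count, parts) are assumptions for illustration.

```shell
# -F: use ":" as the field separator; print the first field and its length.
lengths=$(printf 'root:x:0\nuser:x:1000\n' | awk -F: '{ print $1, length($1) }')

# -v: pass a value into the script; keep only the first "limit" lines.
limited=$(printf 'a\nb\nc\n' | awk -v limit=2 'NR <= limit { print }')

# Array indexed by strings: count how many times each word appears.
counts=$(printf 'paris\nrome\nparis\n' |
  awk '{ count[$1]++ } END { print count["paris"], count["rome"] }')

# substr and index inside a BEGIN block (no input is read).
pieces=$(awk 'BEGIN { s = "filters"; print substr(s, 1, 4), index(s, "ter") }')

# split: break a string on "," and report the number of parts and the second one.
parts=$(awk 'BEGIN { n = split("a,b,c", parts, ","); print n, parts[2] }')

printf '%s\n' "$lengths" "$limited" "$counts" "$pieces" "$parts"
```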

sort

sorts the lines of text files lexicographically

sort [-cmudMnr] [ -o output ] [ file_list ]

sort. Options

-c (check) checks whether the input is already sorted, without sorting it
-m (merge) merges input files that are already sorted
-u (unique) removes duplicate output lines
-d (dictionary) takes into account only letters, digits and blanks when comparing
-M (month) compares the names of months; e.g.: “JAN” < “FEB”
-n (numeric) compares lines by their numerical value
-r (reverse) sorts in reverse order

sort. Examples (cont.)
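A short sketch of the difference between lexicographic and numeric sorting, plus the -r, -u and -c options. The numbers and variable names are made up for illustration.

```shell
# Hypothetical sample input: one number per line, with a duplicate.
tmp=$(mktemp)
printf '10\n9\n100\n9\n' > "$tmp"

# Default: lexicographic order, so "100" comes before "9".
lexical=$(sort "$tmp")

# -n: order by numerical value.
numeric=$(sort -n "$tmp")

# -n with -r: numerical value, largest first.
reversed=$(sort -n -r "$tmp")

# -n with -u: numerical order with duplicates removed.
unique=$(sort -n -u "$tmp")

# -c: no output, the exit status tells whether the input is already sorted.
if printf 'a\nb\n' | sort -c; then sorted=yes; else sorted=no; fi

printf '%s\n' "$lexical" "$numeric" "$reversed" "$unique" "$sorted"
rm -f "$tmp"
```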

uniq

- reports or omits repeated lines. Warning: repeated lines are detected only if they are adjacent, so sort the input first.

uniq [-cdiu] [ input_file [ output_file ] ]

-c (count) prefix lines by the number of occurrences;
-d (duplicate) only print duplicate lines;
-i (ignore-case) ignore differences between upper and lowercase when comparing;
-u (unique) only print unique lines.

uniq. Examples
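A minimal sketch showing why the input is usually sorted first, followed by the -c, -d and -u options. The sample lines and variable names are assumptions; the output of uniq -c is passed through awk only to remove the padding in front of the counts, which varies between implementations.

```shell
# Hypothetical sample input: "b" appears three times, but not all adjacent.
tmp=$(mktemp)
printf 'b\na\nb\nb\nc\n' > "$tmp"

# Without sorting, only the adjacent pair of "b" lines is collapsed.
unsorted=$(uniq "$tmp")

# After sorting, every repeated line is adjacent and is collapsed.
distinct=$(sort "$tmp" | uniq)

# -c: prefix each line with its number of occurrences.
counted=$(sort "$tmp" | uniq -c | awk '{ print $1, $2 }')

# -d: only the lines that are repeated.
repeated=$(sort "$tmp" | uniq -d)

# -u: only the lines that are not repeated.
single=$(sort "$tmp" | uniq -u)

printf '%s\n' "$unsorted" "$distinct" "$counted" "$repeated" "$single"
rm -f "$tmp"
```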

wc

(word count) - count the characters, lines, or words in the input files

wc [-clw] [ file_list ]

-c (chars) returns the number of bytes (the same as the number of characters for single-byte encodings)
-l (lines) returns the number of lines
-w (words) returns the number of words

wc. Examples
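A small sketch of the three counters on a made-up two-line file. Reading the file through a redirection keeps the file name out of the output, and the arithmetic expansion is used only to strip the leading blanks that some implementations print; both choices are assumptions of this example.

```shell
# Hypothetical sample input: 2 lines, 3 words, 14 bytes.
tmp=$(mktemp)
printf 'one two\nthree\n' > "$tmp"

lines=$(( $(wc -l < "$tmp") ))   # number of lines
words=$(( $(wc -w < "$tmp") ))   # number of words
bytes=$(( $(wc -c < "$tmp") ))   # number of bytes

# Without options, wc prints lines, words and bytes followed by the file name.
wc "$tmp"

echo "$lines $words $bytes"
rm -f "$tmp"
```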

head, tail

- list the first lines, and the last lines in a file, respectively

head [ -n [-]number_of_lines | -c [-]number_of_bytes ] [ file_list ]
tail [ -n [+]number_of_lines | -c [+]number_of_bytes ] [ file_list ]

head, tail. Examples
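A minimal sketch of head and tail on a made-up six-line file. It assumes implementations that support -c for head, as GNU and BSD head do; variable names are illustrative.

```shell
# Hypothetical sample input: the numbers 1 to 6, one per line.
tmp=$(mktemp)
printf '%s\n' 1 2 3 4 5 6 > "$tmp"

# First two lines.
first=$(head -n 2 "$tmp")

# Last two lines.
last=$(tail -n 2 "$tmp")

# With a "+" sign, tail starts at the given line number instead of counting from the end.
from_fifth=$(tail -n +5 "$tmp")

# -c counts bytes instead of lines.
prefix=$(printf 'abcdef' | head -c 3)

printf '%s\n' "$first" "$last" "$from_fifth" "$prefix"
rm -f "$tmp"
```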

External process management commands



ps

- displays information about active processes in the system

ps [-efl] [ -t terminals ] [ -U user_list ] [ -u user_list ]

-e (same as -A) selects all active processes in the system (generic Unix/Linux format)
-f (full) displays detailed information about the selected processes
-l (long) long display format
-t selects only the processes launched from the given terminals
-U selects the processes whose real user ID is in user_list
-u selects the processes whose effective user ID is in user_list

ps. Output fields of the command

F (flags) flags specified in <sys/proc.h> (for example P_PPWAIT=10 means that the parent is waiting for the child to finish)
S or STAT process state: R runnable/ready, S sleeping, I idle (sleeping > 20s), T stopped, Z zombie
UID the ID of the process owner
PID the process ID
PPID the parent process ID
C processor utilization, used for scheduling
PRI the priority of the process

ps. Output fields of the command (cont.)

TTY the terminal from which the process was launched
TIME the cumulative CPU time used by the process
NICE or NI the nice value of the process, which adjusts its priority (set with the nice command)
ADDR the memory address of the process
SZ the size of the process
START or STIME the starting time of the process
CMD the command that launched the process

ps. Examples
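A short sketch of typical ps invocations. The output depends on the system and on the processes running at that moment, so only the header line is captured; piping through head to shorten the listings and the use of id -un to obtain the current user name are choices made for this example.

```shell
# Every process in the system, in the default short format (first lines only).
ps -e | head -n 5

# Every process in the full format: UID, PID, PPID, C, STIME, TTY, TIME, CMD.
ps -e -f | head -n 5

# Only the processes owned by the current user, in the full format.
ps -f -u "$(id -un)" | head -n 5

# The header line of the full format names the output fields.
header=$(ps -f | head -n 1)
echo "$header"
```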

kill

- sends a signal to the process specified by its Process Identifier (PID)

kill [ -signal ] PID

Some of the most used signals:
1 HUP (hang up)
2 INT (interrupt)
3 QUIT (quit)
6 ABRT (abort)
9 KILL (non-ignorable kill)
14 ALRM (alarm clock)
15 TERM (software termination)

kill. Examples
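A minimal sketch that starts a throwaway background process and terminates it, first with the default signal and then with signal 9. The sleep command stands in for any long-running process. The exit statuses noted in the comments (128 plus the signal number) are what shells such as bash and dash report for a process ended by a signal; treat that as an assumption about the shell in use.

```shell
# Start a hypothetical long-running process in the background and remember its PID.
sleep 60 &
pid=$!

# Without a signal argument, kill sends 15 (TERM).
kill "$pid"
term_status=0
wait "$pid" || term_status=$?    # typically 143 = 128 + 15

# Same again, this time with the non-ignorable signal 9 (KILL).
sleep 60 &
pid=$!
kill -9 "$pid"
kill_status=0
wait "$pid" || kill_status=$?    # typically 137 = 128 + 9

echo "$term_status $kill_status"
```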

The End