The general idea of awk is that it is a language for processing text
files whose lines are usually split into columns (fields) by some
separator. It tests each line against a pattern and, when the line
matches, performs an action. (This is what vim does when the range given
to a command is a pattern!)
It can also work on non-columnar text files and do more sophisticated
things.
See dary.awk
AWK patterns may be one of the following:
BEGIN
END
/regular expression/
relational expression
pattern && pattern
pattern || pattern
pattern ? pattern : pattern
(pattern)
! pattern
pattern1, pattern2
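As a rough illustration of several of the pattern kinds listed above, here is a minimal sketch. The function name demo_patterns and the sample input are made up for this example; only standard awk features are used.

```shell
# demo_patterns (hypothetical name): run a small awk program over stdin
# that exercises BEGIN, a regular expression, a relational expression,
# a range pattern and END.
demo_patterns() {
  awk '
    BEGIN        { print "start" }        # runs before any input is read
    /^#/         { next }                 # regular expression: skip comment lines
    $2 > 10      { print "big:", $1 }     # relational expression
    /on/, /off/  { print "range:", $1 }   # range pattern, both ends inclusive
    END          { print "lines:", NR }   # runs after the last line
  '
}

# Made-up sample input: two columns, one comment line.
printf 'a 5\n# note\non 20\nb 1\noff 3\nc 30\n' | demo_patterns
```

Note that NR in the END rule counts every input line, including the comment line that the regex rule skipped with next.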
Invocation syntax (summary):
awk 'program text' archivoVictima.txt
awk -f programa.awk archivoVictima.txt
Examples:
awk '$1<$2 {print $0, $1/$2}' file1 >file2
Reads file1 line by line.
If on a line the value of column 1 is less than that of column 2,
the action prints the whole line followed by the result of
dividing column 1 by column 2.
The output is written to file2.
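To make the behaviour concrete, the same one-liner can be tried on a few made-up rows. The wrapper name ratio_rows and the sample numbers are assumptions for this sketch; it reads stdin instead of file1 and writes to the terminal instead of file2.

```shell
# ratio_rows (hypothetical name): same program as above, reading stdin.
ratio_rows() {
  awk '$1 < $2 { print $0, $1/$2 }'
}

# Only the rows where column 1 < column 2 are printed,
# each followed by the quotient column1/column2.
printf '1 4\n5 2\n3 6\n' | ratio_rows
```

With this input the row "5 2" is dropped because 5 is not less than 2.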
#! /bin/awk -f
#1.awk
#
# awk program that reads a file with columns separated by ":"
# and prints the first column, sorted.
# Invoke it like this:
# awk -f 1.awk archivoTabular.txt
# or make it executable with chmod and run it directly.
# Before reading the first line, set the field separator:
BEGIN {FS=":"}
# Now the program proper:
# for each line, print field 1, piping ALL the output to sort
{print $1|"sort"}
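The script above can be tried without a file by feeding it a few made-up colon-separated records. This is only a sketch: the function name first_field_sorted and the sample data are hypothetical, and the explicit close("sort") in END is an addition that makes awk wait for sort to finish before exiting.

```shell
# first_field_sorted (hypothetical name): print field 1 of each
# ":"-separated line, sorted, by piping awk's output through sort.
first_field_sorted() {
  awk '
    BEGIN { FS = ":" }          # set the separator before the first line
          { print $1 | "sort" } # every print goes to the same sort process
    END   { close("sort") }     # flush and wait for sort
  '
}

printf 'root:x:0\nalice:x:1000\nbob:x:1001\n' | first_field_sorted
```

All the print statements share one sort process, which is why the output appears only after the last input line has been read.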
# My first real awk program
# The idea is to read a *.SPS file, which has the professor's name on
# line 16 and, on every line that contains a date, certain numbers of interest.
# This program is called by todos.bash
# Although the semicolon at the end of a line is not required, it helps emacs
# format the code better.
# Code that runs once, before any line of the input file is read:
BEGIN {
j=0;
for (i=9; i<=29; i++) {
cuantos[i]=0;
suma[i]=0;
promedio[i]=0.;
max[i]=0;
min[i]=1000;
elem[j,i]=0;
desvEst[i]=0.;
sumaRaiz[i]=0;
}
}
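# THE FOLLOWING CODE RUNS ONCE FOR EACH LINE:
# If FNR (the record number within the current file) is 16...
{if (FNR==16)
printf " profesor:%s\n", $0;
}
# An empty field separator makes every character its own field, so each
# line is handled byte by byte. This is a gawk/mawk extension (POSIX leaves
# an empty FS undefined), and it is set in a BEGIN block so that it applies
# from the first record on.
BEGIN { FS="" }
# If the line contains the pattern /05, act on it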
/\/05/ {
# j is the row index
for (i=9; i<=29; i++) {
# The data start in column 9 and end in column 29
printf "%d,",$i;
elem[j,i]=$i;
if ($i!=0) {
cuantos[i]++;
suma[i]+=$i;
if($i>max[i])
max[i]=$i;
if ($i<min[i])
min[i]=$i;
}
}
printf "\n";
j++;
}
# This code runs at the end of the input file
END {
numFilas=j;
# Mean of the nonzero values in each column (0 when there are none)
for (i=9; i<=29; i++) {
promedio[i]=(cuantos[i]>0) ? suma[i]/cuantos[i] : 0;
printf "%.1f,", promedio[i];
}
printf "\n";
# Sample standard deviation:
# desvEst[i]=sqrt(sum((elem[j,i]-promedio[i])^2)/(cuantos[i]-1))
# Print the matrix to check its contents (debugging aid):
# for (j=0; j<numFilas; j++) {
# for (i=9; i<=29;i++) {
# printf "%d,",elem[j,i];
# }
# printf "\n";
# }
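for (i=9; i<=29; i++) {
sumaRad=0;
for (j=0;j<numFilas; j++) {
# Zeros were left out of the mean, so leave them out here too
if (elem[j,i]==0) continue;
temp=elem[j,i]-promedio[i];
sumaRad+=temp*temp;
}
# At least two values are needed for a sample standard deviation
desvEst[i]=(cuantos[i]>1) ? sqrt(sumaRad/(cuantos[i]-1)) : 0;
printf "%.1f,",desvEst[i];
}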
printf "\n";
# min
for (i=9; i<=29; i++)
if (min[i]!=0)
printf "%s,", min[i];
printf "\n";
# max
for (i=9; i<=29; i++)
printf "%d,",max[i];
printf "\n";
# cuantos
for (i=9; i<=29; i++)
printf "%d,",cuantos[i];
printf "\n";
}
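The statistics part of the program above can be reduced to a single column to see the idea in isolation. What follows is a simplified sketch, not the original script: the function name col_stats, the whitespace-separated input and the sample numbers are assumptions. Like the program above, it treats 0 as "no data" and divides by n-1.

```shell
# col_stats (hypothetical name): count, mean and sample standard
# deviation of the nonzero values in column $1 (default: column 1).
col_stats() {
  awk -v col="${1:-1}" '
    $col != 0 { n++; v[n] = $col; sum += $col }   # keep nonzero values only
    END {
      if (n == 0) { print "no data"; exit }
      mean = sum / n
      for (k = 1; k <= n; k++) {                  # second pass over stored values
        d = v[k] - mean
        ss += d * d
      }
      sd = (n > 1) ? sqrt(ss / (n - 1)) : 0       # needs at least two values
      printf "n=%d mean=%.1f sd=%.1f\n", n, mean, sd
    }
  '
}

# Made-up data; the single 0 is ignored.
printf '2\n4\n0\n4\n4\n5\n5\n7\n9\n' | col_stats 1
```

Storing the values and making a second pass mirrors the elem[] matrix used above; it costs memory but avoids the rounding problems of the one-pass sum-of-squares formula.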
More examples taken from:
http://sparky.rice.edu/~hartigan/awk.html
por: Patrick Hartigan
hartigan@sparky.rice.edu
EXAMPLES   # is the comment character for awk. 'field' means 'column'.
# Print first two fields in opposite order:
awk '{ print $2, $1 }' file
# Print lines longer than 72 characters:
awk 'length > 72' file
# Print length of string in 2nd column
awk '{print length($2)}' file
# Add up first column, print sum and average:
{ s += $1 }
END { print "sum is", s, " average is", s/NR }
# Print fields in reverse order:
awk '{ for (i = NF; i > 0; --i) print $i }' file
# Print the last line
{line = $0}
END {print line}
# Print the total number of lines that contain the word Pat
/Pat/ {nlines = nlines + 1}
END {print nlines + 0}
# Print all lines between start/stop pairs:
awk '/start/, /stop/' file
# Print all lines whose first field is different from previous one:
awk '$1 != prev { print; prev = $1 }' file
# Print column 3 if column 1 > column 2:
awk '$1 > $2 {print $3}' file
# Print line if column 3 > column 2:
awk '$3 > $2' file
# Count number of lines where col 3 > col 1
awk '$3 > $1 {print i + "1"; i++}' file
# Print sequence number and then column 1 of file:
awk '{print NR, $1}' file
# Print every line after erasing the 2nd field
awk '{$2 = ""; print}' file
# Print hi 28 times
yes | head -28 | awk '{ print "hi" }'
# Print hi.0010 to hi.0099 (NOTE IRAF USERS!)
yes | head -90 | awk '{printf("hi.%04d\n", NR+9)}'
# Replace every field by its absolute value
{ for (i = 1; i <= NF; i=i+1) if ($i < 0) $i = -$i; print }
# If you have another character that delimits fields, use the -F option
# For example, to print out the phone number for Jones in the following file,
# 000902|Beavis|Theodore|333-242-2222|149092
# 000901|Jones|Bill|532-382-0342|234023
# ...
# type
awk -F"|" '$2=="Jones"{print $4}' filename
# Some looping for printouts
BEGIN{
for (i=875;i>833;i--){
printf "lprm -Plw %d\n", i
} exit
}
Formatted printouts are of the form printf("format\n", value1, value2, ..., valueN)
e.g. printf("howdy %-8s What it is bro. %.2f\n", $1, $2*$3)
%s = string
%-8s = 8 character string left justified
%.2f = number with 2 places after .
%6.2f = field 6 chars with 2 chars after .
\n is newline
\t is a tab
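The format specifiers listed above can be checked quickly from a BEGIN block, which needs no input file. This is a small sketch; the function name printf_demo and the sample values are made up, and the square brackets are only there to make the padding visible.

```shell
# printf_demo (hypothetical name): show the effect of each format.
printf_demo() {
  awk 'BEGIN {
    printf("[%s]\n",    "abc")      # plain string
    printf("[%-8s]\n",  "abc")      # 8 characters wide, left justified
    printf("[%.2f]\n",  3.14159)    # 2 places after the point
    printf("[%6.2f]\n", 3.14159)    # 6 characters wide, 2 after the point
    printf("a\tb\n")                # \t is a tab, \n a newline
  }'
}

printf_demo
```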
# Print frequency histogram of column of numbers
$2 <= 0.1 {na=na+1}
($2 > 0.1) && ($2 <= 0.2) {nb = nb+1}
($2 > 0.2) && ($2 <= 0.3) {nc = nc+1}
($2 > 0.3) && ($2 <= 0.4) {nd = nd+1}
($2 > 0.4) && ($2 <= 0.5) {ne = ne+1}
($2 > 0.5) && ($2 <= 0.6) {nf = nf+1}
($2 > 0.6) && ($2 <= 0.7) {ng = ng+1}
($2 > 0.7) && ($2 <= 0.8) {nh = nh+1}
($2 > 0.8) && ($2 <= 0.9) {ni = ni+1}
($2 > 0.9) {nj = nj+1}
END {print na, nb, nc, nd, ne, nf, ng, nh, ni, nj, NR}
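The ten counters in the histogram above can also be kept in an array indexed by bin number. The following is an alternative sketch, not the original author's code: the function name histogram and the sample data are assumptions. It keeps the same bin edges (bin k holds values in the interval ((k-1)/10, k/10], bin 10 holds everything above 0.9), but prints 0 for empty bins instead of an empty string.

```shell
# histogram (hypothetical name): count column 2 into ten bins of width 0.1.
histogram() {
  awk '
    {
      # Find the first upper edge k/10 that the value does not exceed;
      # if none is found the loop ends with k == 10 (value > 0.9).
      for (k = 1; k <= 9; k++)
        if ($2 <= k / 10) break
      count[k]++
    }
    END {
      for (k = 1; k <= 10; k++) printf "%d ", count[k]
      print NR                      # total number of lines, as above
    }
  '
}

# Made-up two-column data; only column 2 is binned.
printf 'x 0.05\nx 0.15\nx 0.15\nx 0.95\nx 0.5\n' | histogram
```

Comparing against k/10 rather than computing the bin with int($2*10) keeps the same closed upper edges as the ten explicit rules.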
# Find maximum and minimum values present in column 1
NR == 1 {m=$1 ; p=$1}
$1 >= m {m = $1}
$1 <= p {p = $1}
END { print "Max = " m, " Min = " p }
# Example of defining variables, multiple commands on one line
NR == 1 {prev=$4; preva = $1; prevb = $2; n=0; sum=0}
$4 != prev {print preva, prevb, prev, sum/n; n=0; sum=0; prev = $4;
preva = $1; prevb = $2}
$4 == prev {n++; sum=sum+$5/$6}
END {print preva, prevb, prev, sum/n}
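The example above averages over runs of consecutive lines that share the same key. When the lines of a group are not adjacent, an associative array keyed by the group is a common alternative. This is a simplified sketch under assumed input (key in column 1, value in column 2); the function name group_avg is hypothetical, and the trailing sort is there only because the order of "for (g in n)" is unspecified.

```shell
# group_avg (hypothetical name): average of column 2 for each key in column 1.
group_avg() {
  awk '
        { n[$1]++; sum[$1] += $2 }                  # accumulate per key
    END { for (g in n) print g, sum[g] / n[g] }     # one line per key
  ' | sort
}

# The two "a" lines are not adjacent but are still averaged together.
printf 'a 2\nb 10\na 4\n' | group_avg
```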
# Example of using substrings
# substr($2,9,7) picks out characters 9 thru 15 of column 2
{print "imarith", substr($2,1,7) " - " $3, "out."substr($2,5,3)}
{print "imarith", substr($2,9,7) " - " $3, "out."substr($2,13,3)}
{print "imarith", substr($2,17,7) " - " $3, "out."substr($2,21,3)}
{print "imarith", substr($2,25,7) " - " $3, "out."substr($2,29,3)}
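A quick way to check which characters substr picks is to run it on the alphabet. This is a minimal sketch; the function name substr_demo is made up. Positions in awk strings start at 1, and the third argument is a length, not an end position.

```shell
# substr_demo (hypothetical name): substr(s, start, length), 1-based.
substr_demo() {
  awk 'BEGIN {
    s = "abcdefghijklmnopqrstuvwxyz"
    print substr(s, 1, 7)    # characters 1 through 7
    print substr(s, 9, 7)    # characters 9 through 15
  }'
}

substr_demo
```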
***********************
Examples taken from:
http://www.softpanorama.org/Tools/awk.shtml
* Print the length of the longest input line:
awk '{ if (length($0) > max) max = length($0) }
END { print max }' data
* Print every line that is longer than 80 characters:
awk 'length($0) > 80' data
The sole rule has a relational expression as its pattern and it has no
action, so the default action, printing the record, is used.
* Print the length of the longest line in data:
expand data | awk '{ if (x < length()) x = length() }
END { print "maximum line length is " x }'
The input is processed by the expand utility to change tabs into spaces,
so the widths compared are actually the right-margin columns.
* Print every line that has at least one field:
awk 'NF > 0' data
This is an easy way to delete blank lines from a file (or rather, to
create a new file similar to the old file but from which the blank lines
have been removed).
* Print seven random numbers from 0 to 100, inclusive:
awk 'BEGIN { for (i = 1; i <= 7; i++)
print int(101 * rand()) }'
* Print the total number of bytes used by files:
ls -l files | awk '{ x += $5 }
END { print "total bytes: " x }'
* Print the total number of kilobytes used by files:
ls -l files | awk '{ x += $5 }
END { print "total K-bytes: " (x + 1023)/1024 }'
* Print a sorted list of the login names of all users:
awk -F: '{ print $1 }' /etc/passwd | sort
* Count the lines in a file:
awk 'END { print NR }' data
* Print the even-numbered lines in the data file:
awk 'NR % 2 == 0' data
If you use the expression `NR % 2 == 1' instead, the program would print
the odd-numbered lines.
Dr. Nikolai Bezroukov
Updated 2008-01-03