Select lines where the first two words are identical
by danielbmartin from LinuxQuestions.org on (#6FHV4)
Have: a file in which each line contains two or more blank-delimited words.
Want: an OutFile which contains lines from the InFile where the first two words are identical.
This is a learning exercise, nothing more.
With this InFile ...
Code:
Daniel George
Henry Frank
Linda Carol Mary Debbie Michelle
Samuel Samuel (my Uncle Sam)
Irving Simon Simon
Harold Harold
Edward Edward
Robert Richard Robert
David David
Davi David
David Davi
avid David
David avid
... the desired OutFile is ...
Code:
Samuel Samuel (my Uncle Sam)
Harold Harold
Edward Edward
David David
Note that the irregular spacing is preserved.
This awk works.
Code:
awk '{if ($1==$2) print}' $InFile >$OutFile
This concise awk also works.
Code:
awk '$1==$2' $InFile >$OutFile
This sed works.
Code:
sed -rn '/^(.+) *\b\1\b/p' $InFile >$OutFile
This grep works.
Code:
egrep '^(.+) *\b\1\b' $InFile >$OutFile
This bash almost works...
Code:
while read InLine # Read one line from the InFile.
do
  arr=($InLine)
  # first two words
  W1="${arr[@]:0:1}"
  W2="${arr[@]:1:1}"
  if [ "$W1" == "$W2" ]
  then
    echo $InLine
  fi
done <$InFile # End of bash loop
... but the irregular spacing is lost.
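A likely cause of the lost spacing in the loop above: the unquoted `$InLine` in `echo $InLine` is word-split by the shell, so echo receives separate words and rejoins them with single blanks. A plain `read` also strips leading and trailing blanks. The following is a minimal sketch of one way around both, not a definitive answer. The function name `select_same_first_two` and the variables `line`, `w1`, `w2`, `rest` are hypothetical names chosen for illustration.

```shell
#!/usr/bin/env bash
# Print lines whose first two blank-delimited words are identical,
# keeping each line's original spacing.
# select_same_first_two is a hypothetical helper name, not from the post.
select_same_first_two() {
    local line w1 w2 rest
    # IFS= and -r stop read from trimming blanks or interpreting backslashes.
    # The || test also handles a last line that lacks a trailing newline.
    while IFS= read -r line || [ -n "$line" ]; do
        # Split a copy of the line into words; "$line" itself is untouched.
        read -r w1 w2 rest <<< "$line"
        # Skip blank lines, then compare the first two words.
        if [ -n "$w1" ] && [ "$w1" = "$w2" ]; then
            printf '%s\n' "$line"   # quoted, so the spacing survives
        fi
    done
}
```

It could be used as `select_same_first_two <"$InFile" >"$OutFile"`. Unlike `arr=($InLine)`, splitting with `read` does not expose the words to pathname expansion, so a word such as `*` is compared literally.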
1) Corrections and suggested improvements are welcome.
2) Please show how the bash solution could be changed
to preserve the irregular spacing.
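On point 1, one possible correction to the sed and grep pattern, offered as a sketch rather than a definitive fix: in `^(.+) *\b\1\b` the group `(.+)` can span blanks, so with GNU grep a line such as `a b a b c` appears to be selected even though its first two words differ. Restricting the group to non-blank characters and requiring a blank or end of line after the repeat avoids that. The variable name `same_first_two_re` is hypothetical, and the pattern assumes a grep whose `-E` mode supports back-references (GNU grep does).

```shell
#!/usr/bin/env bash
# Hypothetical pattern variable, not from the original post.
# Group 1 is exactly the first word: it may not contain blanks and
# must be followed by at least one blank, then by itself, then by
# either a blank or the end of the line.
same_first_two_re='^[[:blank:]]*([^[:blank:]]+)[[:blank:]]+\1([[:blank:]]|$)'

# Small demonstration on made-up lines; only "Harold   Harold" qualifies.
printf '%s\n' 'a b a b c' 'Harold   Harold' 'Davi David' 'David Davi' |
    grep -E "$same_first_two_re"
```

The same expression should also work with `sed -rn "/$same_first_two_re/p"`, since it contains no slash.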
Thank you.
Daniel B. Martin