Article 5DNGW Comparing a reference word to file contents

by danielbmartin from LinuxQuestions.org on (#5DNGW)

Have: a reference word. This is not (necessarily) an English word.
It is merely a string of characters.

Have: a file of words, one word per line. Again, these may or may not be English words.

Want: the input file with an indication of which letters in each word are NOT in the reference word.

Example: reference word = etaoinshrldu
Example: the file contains this ...
Code:
roosevelt
truman
eisenhower
kennedy
johnson
nixon
ford

All three of these solutions ...
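Code:
RW='etaoinshrldu' # RW = Reference Word
# InFile and OutFile are assumed to hold the input and output paths.

# Method 1.
# tr takes a plain character set; brackets would be deleted as literals.
tr -d "$RW" <"$InFile" \
|paste -d' ' "$InFile" - \
>"$OutFile"

# Method 2.
sed "s/[$RW]//g" <"$InFile" \
|paste -d' ' "$InFile" - \
>"$OutFile"

# Method 3. (gensub is a gawk extension.)
awk -v rw="$RW" '{a=gensub("[" rw "]","","g",$0);
print $0,a}' "$InFile" >"$OutFile"

... produce the desired OutFile ...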
Code:
roosevelt v
truman m
eisenhower w
kennedy ky
johnson j
nixon x
ford f
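
A note on portability: Method 3 depends on gensub, which only gawk provides. As a rough sketch, the same result can be had with the standard gsub applied to a copy of the line. The function name non_ref_letters is made up for this illustration, and the sketch assumes the reference word contains no characters that are special inside a bracket expression.

```shell
#!/bin/sh
# Hypothetical helper: reads words on stdin, prints each word followed by
# the letters of that word that are NOT in the reference word ($1).
non_ref_letters() {
    # gsub edits its target in place, so work on a copy (a) and keep $0 intact.
    awk -v rw="$1" '{ a = $0; gsub("[" rw "]", "", a); print $0, a }'
}

printf '%s\n' roosevelt kennedy ford | non_ref_letters 'etaoinshrldu'
```

With the three sample words above this prints "roosevelt v", "kennedy ky" and "ford f", matching the OutFile shown earlier.
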
Now consider the mirror-image problem ...

Have: a reference word. This is not (necessarily) an English word.
It is merely a string of characters.

Have: a file of words, one word per line. Again, these may or may not be English words.

Want: the input file with an indication of which letters in the reference word are NOT in each input word.

This solution ...
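Code:
# Method 6.
# split with an empty separator yields one character per element
# (supported by gawk and most other awks, though not required by POSIX).
# index does a literal search; match would treat a[j] as a regex.
awk -v rw="$RW" 'BEGIN{n=split(rw,a,"")}
{NoMatch=""
for (j=1;j<=n;j++)
if (!index($0,a[j])) NoMatch=NoMatch a[j]
print $0,NoMatch}' \
"$InFile" >"$OutFile"

... produces the desired OutFile ...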
Code:
roosevelt ainhdu
truman eoishld
eisenhower taldu
kennedy taoishrlu
johnson etairldu
nixon etashrldu
ford etainshlu

As a matter of personal coding style, I strive for concise solutions without explicit loops. Gurus, please offer constructive criticism and better solutions.

Daniel B. Martin
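
One possible loop-free angle on the mirror-image problem, offered as a sketch and not a definitive answer: instead of testing each reference letter against the word, delete the word's letters from a copy of the reference word. The function name missing_ref_letters is hypothetical, and the sketch assumes every word consists of plain letters only (characters such as ], ^, - or \ would change the meaning of the bracket expression).

```shell
#!/bin/sh
# Hypothetical helper: reads words on stdin, prints each word followed by
# the letters of the reference word ($1) that are NOT in that word.
missing_ref_letters() {
    awk -v rw="$1" '
        # An empty line would build the invalid regex "[]", so handle it first:
        # nothing matches, hence the whole reference word is reported.
        length($0) == 0 { print $0, rw; next }
        # Use the word itself as a bracket expression and strip it from a copy of rw.
        { r = rw; gsub("[" $0 "]", "", r); print $0, r }'
}

printf '%s\n' roosevelt truman ford | missing_ref_letters 'etaoinshrldu'
```

For the three sample words this prints "roosevelt ainhdu", "truman eoishld" and "ford etainshlu", the same lines Method 6 produces, with the surviving letters kept in reference-word order.
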
