Finding The Complete SQL statement Using PDFGREP Or Grep
by metallica1973 from LinuxQuestions.org on (#4YSQM)
Linux Gods,
I am trying to parse SQL statements out of a PDF document so that I can build a base SQL script later, but for the life of me I am having a tough time extracting the data. This exact string worked perfectly a couple of months ago and now it doesn't. Below is an example of the data structure.
PHP Code:
show parameter os_authent_prefix
SHOW PARAMETER log_archive_dest;
Audit:
To assess this recommendation, execute the following SQL statement.
SELECT AUD.POLICY_NAME, AUD.AUDIT_OPTION, AUD.AUDIT_OPTION_TYPE
FROM AUDIT_UNIFIED_POLICIES AUD, AUDIT_UNIFIED_ENABLED_POLICIES ENABLED
WHERE AUD.POLICY_NAME = ENABLED.POLICY_NAME
AND AUD.AUDIT_OPTION = 'DROP DATABASE LINK'
AND AUD.AUDIT_OPTION_TYPE = 'STANDARD ACTION'
AND ENABLED.SUCCESS = 'YES'
AND ENABLED.FAILURE = 'YES'
AND ENABLED.ENABLED_OPT = 'BY'
AND ENABLED.USER_NAME = 'ALL USERS';
Audit:
SELECT AUD.POLICY_NAME, AUD.AUDIT_OPTION, AUD.AUDIT_OPTION_TYPE
FROM AUDIT_UNIFIED_POLICIES AUD, AUDIT_UNIFIED_ENABLED_POLICIES ENABLED
WHERE AUD.POLICY_NAME = ENABLED.POLICY_NAME
AND AUD.AUDIT_OPTION = 'CREATE TRIGGER'
AND AUD.AUDIT_OPTION_TYPE = 'STANDARD ACTION'
AND ENABLED.SUCCESS = 'YES'
AND ENABLED.FAILURE = 'YES'
AND ENABLED.ENABLED_OPT = 'BY'
AND ENABLED.USER_NAME = 'ALL USERS';
Other variations I have tried:
PHP Code:
pdfgrep -i -PB 20 -A 20 "audit\:" ./Oracle-12.pdf | gawk '{IGNORECASE=1;} /show.*\;/ || /select.*\;/ {print "Here is the data \n\n",$0, "\n"}'
gawk: cmd. line:1: warning: regexp escape sequence `\;' is not a known regexp operator
Here is the data
REVOKE SELECT_ANY_DICTIONARY FROM <grantee>;
Here is the data
REVOKE SELECT ANY TABLE FROM <grantee>;
Here is the data
REVOKE SELECT_CATALOG_ROLE FROM <grantee>;
Here is the data
AUDIT SELECT ANY DICTIONARY;
I suspect one or two of the binaries changed during the upgrade. To get past this, I have tried several regex variations:
PHP Code:
pdfgrep -i -PB 20 -A 20 "audit\:" ./Oracle-12.pdf | gawk '{IGNORECASE=1;} /show.*|;/ || /select.*|;/ {print "Here is the data \n\n",$0, "\n"}'
pdftotext ./Oracle-12.pdf - | grep -i "select.*\; | show.*\;"
gawk '{IGNORECASE=1;} /show.*|;/ || /select.*|;/ {print "The Goodies \n\n",$0, "\n"}' ./Oracle-12.pdf.txt
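For reference, here is a minimal sketch of the kind of extraction I am after. It is not a confirmed fix. As far as I can tell, the commands above test one line at a time, so they only print lines where `select` and `;` appear together, and a SELECT spread over several lines is never matched as a whole. In the later variations `|;` is an alternation, so those patterns match any line that contains a semicolon. The helper below, with the hypothetical name `extract_sql`, starts printing at a line that begins with SELECT or SHOW and stops at the first line that ends in a semicolon. It assumes the PDF text arrives one statement line per input line, and it uses only portable awk features.

```shell
# Hypothetical helper (the name is an assumption, not from any tool):
# reads text on stdin and prints each statement that starts with
# SELECT or SHOW, up to and including the first line ending in ';'.
extract_sql() {
  awk '
    # Start collecting at a line that begins with SELECT or SHOW (any case).
    !collecting && tolower($0) ~ /^[ \t]*(select|show)[ \t]/ { collecting = 1 }

    collecting {
      print
      # A trailing semicolon closes the statement; print a blank separator.
      if ($0 ~ /;[ \t]*$/) { collecting = 0; print "" }
    }
  '
}
```

Assuming `pdftotext` lays the statements out line by line, it could be used as `pdftotext -layout ./Oracle-12.pdf - | extract_sql`. One known limitation: a SHOW line without a trailing semicolon, like the first sample line above, runs on into the next statement until a semicolon is found.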
Can someone shed some light? I am using Kali 2020.1, which I upgraded from 2019.4, and now the original string doesn't work. Thanks.

