pdfgrep vertical text
by fdq09eca from LinuxQuestions.org on (#54NJY)
Hi, I am very new to Linux, so please bear with me if I am asking a stupid question.
I have a bunch of PDFs that may not be in a consistent format (randomly downloaded from Google Scholar). I am trying to check through their data sources, so I used pdfgrep for the task.
It was successful most of the time, until I found out that some tables are placed vertically. I attempted to rotate those pages before grepping them:
Code:
pdftk pdfs/25770596.pdf cat 8east output pdfs/25770596_r.pdf
but it makes no difference.
Then I tried
Code:
pdftotext -f 1 -l 8 pdfs/25770596_r.pdf pdfs/25770596r_txt.txt
which actually served my purpose. It rendered some caption lines on the vertical page, but the vertical table is messed up: the numbers are scrambled and some of them are missing.
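For reference, the extract-then-search step I am doing could be wrapped up roughly like this. It is only a sketch: `search_pdf_pages` is a name I made up (it is not part of pdfgrep or poppler), and I am assuming that pdftotext's `-layout` option and `-` for standard output behave as documented. I have not confirmed that `-layout` fixes the rotated table.

```shell
#!/bin/sh
# Sketch: extract a page range to text, then grep it.
# search_pdf_pages is a hypothetical helper name, not an existing tool.
search_pdf_pages() {
    pattern=$1   # text to look for (case-insensitive)
    pdf=$2       # path to the PDF
    first=$3     # first page to extract
    last=$4      # last page to extract

    # -layout asks pdftotext to keep the physical page layout;
    # "-" as the output file sends the extracted text to stdout.
    # grep -n prefixes each match with its line number in the extracted text.
    pdftotext -layout -f "$first" -l "$last" "$pdf" - | grep -n -i -- "$pattern"
}
```

It would be called as `search_pdf_pages "data source" pdfs/25770596_r.pdf 1 8`, which avoids writing an intermediate .txt file.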
I would like to know if there is a more elegant way to complete the task.
The .pdf is here.
Thank you.