Parallelizing perl scripts
by Turbocapitalist from LinuxQuestions.org on (#4Y1NG)
I have some perl scripts which sequentially collect specific data files from the net over HTTPS. The sequence is fetch, parse, append to a central log, repeat. Since each data file comes from a separate site, fetching in parallel would not put any extra load on the individual source machines, and the aggregator itself has enough resources to do a lot more at the same time.
The catch is that all the parsed data must eventually end up in the same destination file. The order of the entries, by source site, does not matter, so timing is not really an issue apart from the occasional site timing out. It therefore seems like a good candidate for converting from sequential to parallel.
What would be the approximate workflow to do it in parallel? Would it be reasonable to just run everything independently and then use a lock file for the centrally placed log?
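
Something like the rough sketch below is what I have in mind: one child per site, each appending its output under an exclusive flock. Here fetch_and_parse() just stands in for my existing per-site fetch/parse code and central.log for the shared destination; Parallel::ForkManager from CPAN would wrap the same fork/wait pattern with a cap on concurrent children.

    #!/usr/bin/perl
    use strict;
    use warnings;
    use Fcntl qw(:flock SEEK_END);

    # Hypothetical list of source sites; the real list is longer.
    my @sites = qw(https://example.org/a https://example.net/b https://example.com/c);

    my $logfile = 'central.log';    # shared destination file

    my @pids;
    for my $site (@sites) {
        my $pid = fork();
        die "fork failed: $!" unless defined $pid;

        if ($pid == 0) {
            # Child: fetch and parse one site, then append under an exclusive lock.
            my $parsed = fetch_and_parse($site);

            open my $fh, '>>', $logfile or die "open $logfile: $!";
            flock($fh, LOCK_EX)         or die "flock $logfile: $!";
            seek($fh, 0, SEEK_END);     # re-seek after acquiring the lock
            print {$fh} $parsed;
            close $fh;                  # closing the handle releases the lock
            exit 0;
        }
        push @pids, $pid;
    }

    # Parent: wait for all children so failures or timeouts can be reported.
    waitpid($_, 0) for @pids;

    sub fetch_and_parse {
        my ($site) = @_;
        # ... existing per-site fetch/parse logic would go here ...
        return "parsed data from $site\n";
    }
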

