Pacemaker - node fenced when resource failed to move
by weisman from LinuxQuestions.org on (#5HH6V)
We have a three-node pcs cluster that hosts multiple Oracle database resources.
Each resource group consists of a file system, a volume group, the Oracle database, and its listener.
We had to move a few of the groups from one cluster node to another for maintenance purposes.
A few times, the file system or volume group could not be unmounted (due to some locking or "in use" issues).
Unfortunately, this causes the entire node to be fenced, which means that all the Oracle databases on that node go down because of a specific issue on one file system.
From the documentation:
Failure to start or stop will immediately set the failcount for a resource to INFINITY, forcing it to move to a different node. If fencing is enabled, a node that failed to stop a resource will also be fenced.
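To check that I am reading that sentence correctly, here is how I understand the rule, written as a small Python sketch. It is only a model of the quoted sentence, not Pacemaker's actual code; the function name `handle_failure` and the dictionary keys are made up for illustration.

```python
import math

# Stand-in for Pacemaker's INFINITY score (an assumption for this sketch).
INFINITY = math.inf


def handle_failure(operation: str, fencing_enabled: bool) -> dict:
    """Model what the quoted documentation says happens when an operation fails."""
    if operation not in ("start", "stop"):
        # The quoted sentence only covers start and stop failures.
        raise ValueError(f"operation not covered by the quoted rule: {operation}")

    return {
        # A failed start or stop sets the failcount to INFINITY ...
        "failcount": INFINITY,
        # ... which forces the resource to move to a different node.
        "move_resource": True,
        # Only a failed stop, with fencing enabled, also fences the node.
        "fence_node": operation == "stop" and fencing_enabled,
    }


if __name__ == "__main__":
    # Our case: a file system fails to unmount (a stop failure) with fencing on.
    print(handle_failure("stop", fencing_enabled=True))
```

In this reading, the failed unmount is the `stop` case with fencing enabled, which is why the whole node gets fenced and not just the one resource group.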
My question is:
Other than increasing the timeout parameter, is there a way to avoid fencing an entire node if a file system or volume group fails to unmount?
I cannot risk multiple databases on that node going down if one resource fails to stop properly.
That looks like a poor design choice.
Why not allow pcs to mark the resource as FAILED instead?

