Hi All,
I am looking for a possible solution to a problem in a customer's environment.
I have read a lot of articles and posts but have not found a solution. None of the suggested fixes (e.g. keepalive timeout) work for us.
We have a lot of servers behind a firewall. These servers host Oracle databases.
The problem is that the parent job terminates after about 2 hours and the backup ends with error 636 (read from input socket failed). All child jobs end with status 0.
The cause is the firewall session timeout, which is set to 14400 half-seconds (2 hours). The terminated session is the one established between the client and the media server.
I know the recommended solution is to increase the inactive-session timeout on the firewall, but the LAN administrators don't want to do this.
So here is my question: is there any way to keep this TCP session "active" for the parent job during the whole backup session? Maybe some output from the RMAN script could be redirected to the media server?
The media server is running AIX, and below are its tcp_* settings:
tcp_keepcnt = 8
tcp_keepidle = 28800
tcp_keepinit = 150
tcp_keepintvl = 150
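
To sanity-check the units, here is a minimal Python sketch. It assumes that the AIX tcp_keepidle tunable is expressed in half-seconds, like the firewall timeout quoted above, and that keepalive is actually enabled on the socket in question (which I have not verified for the NetBackup processes). The function and variable names are only for illustration.

```python
# Assumption: AIX tcp_keepidle is expressed in half-seconds,
# the same unit as the firewall session timeout quoted in this post.
HALF_SECONDS_PER_HOUR = 2 * 3600


def half_seconds_to_hours(value):
    """Convert a value given in half-seconds to hours."""
    return value / HALF_SECONDS_PER_HOUR


firewall_idle_timeout = 14400  # firewall session timeout (half-seconds)
tcp_keepidle = 28800           # media server tcp_keepidle (half-seconds)

firewall_hours = half_seconds_to_hours(firewall_idle_timeout)
keepidle_hours = half_seconds_to_hours(tcp_keepidle)

# True only if the first keepalive probe would be sent
# before the firewall drops the idle session.
probe_before_timeout = tcp_keepidle < firewall_idle_timeout

print(f"firewall idle timeout: {firewall_hours} h")
print(f"first keepalive probe after: {keepidle_hours} h")
print(f"probe sent before firewall timeout: {probe_before_timeout}")
```

Under these assumptions the first keepalive probe would be due after 4 hours of idle time, which is later than the 2-hour firewall timeout.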
Other timeout settings on the master and the client are also set above 2 hours.
We have tested on servers that are not behind the firewall, and there the parent job runs longer than 2 hours. So we are sure that the problem is the firewall.
Master server: RHEL, NBU 7.6.1.1
Media server: AIX 7.1, NBU 7.5.0.3
Clients: various versions, 7.5.0.4 to 7.6.1.1; most of them are on AIX.
Any suggestions would be appreciated.
Regards
Madej