Clusterware doesn't start with Jumbo Frames enabled [message #282256] Wed, 21 November 2007 05:00
thomas.vogt@sec101.ch
Messages: 5
Registered: November 2007
Location: Switzerland
Junior Member
Hi all

I'm running a 4 Node RAC Cluster (Enterprise Linux 4 U5, Oracle 10.2.0.2, Clusterware 10.2.0.2) with a Gb-Ethernet Cluster Interconnect.
Everything works perfectly.

But when I enable Jumbo Frames on the interconnect interface (setting MTU to 9000 in /etc/sysconfig/network-scripts/ifcfg-eth1) and restart all nodes, the cluster cannot form: the Clusterware starts only on the first node.

In the log files I can see, for example (on the second node):

$ tail crsd.log

2007-11-21 11:56:50.087: [ CSSCLNT][2541194816]clsssInitNative: connect failed, rc 9
2007-11-21 11:56:50.087: [ CRSRTI][2541194816]0CSS is not ready. Received status 3 from CSS. Waiting for good status ..
2007-11-21 11:56:51.488: [ COMMCRS][1084229984]clsc_connect: (0xb7fb40) no listener at (ADDRESS=(PROTOCOL=ipc)(KEY=OCSSD_LL_xen4_crs))
2007-11-21 11:56:51.488: [ CSSCLNT][2541194816]clsssInitNative: connect failed, rc 9
2007-11-21 11:56:51.488: [ CRSRTI][2541194816]0CSS is not ready. Received status 3 from CSS. Waiting for good status ..
2007-11-21 11:56:52.890: [ COMMCRS][1084229984]clsc_connect: (0xb7fb40) no listener at (ADDRESS=(PROTOCOL=ipc)(KEY=OCSSD_LL_xen4_crs))
2007-11-21 11:56:52.890: [ CSSCLNT][2541194816]clsssInitNative: connect failed, rc 9
2007-11-21 11:56:52.890: [ CRSRTI][2541194816]0CSS is not ready. Received status 3 from CSS. Waiting for good status ..


If I set the MTU back to the default and restart the nodes, everything works fine again.
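One way to check whether jumbo-sized frames actually make it across the interconnect end to end (a sketch; the interface name and peer address here match my setup, adjust them for yours) is to ping with the don't-fragment flag set and a jumbo-sized payload:

```shell
# Confirm the interface MTU as the OS sees it
ifconfig eth1 | grep MTU

# A 9000-byte MTU leaves 9000 - 20 (IP header) - 8 (ICMP header) = 8972 bytes of payload.
# With -M do the packet must not be fragmented, so it only succeeds if the
# whole path (NIC, driver, switch) passes 9000-byte frames.
ping -M do -s 8972 -c 3 10.0.0.2

# For comparison, a payload that fits in a standard 1500-byte frame
ping -M do -s 1472 -c 3 10.0.0.2
```

If the 1472-byte ping succeeds but the 8972-byte one fails, something in the path is dropping the jumbo frames even though the hosts are configured for MTU 9000.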


Thanks for every hint.


cheers
- thomas
Re: Clusterware doesn't start with Jumbo Frames enabled [message #282486 is a reply to message #282256] Thu, 22 November 2007 02:35
A few more details:

Node 1:
# ifconfig eth1
eth1 Link encap:Ethernet HWaddr 00:13:21:78:63:99
inet addr:10.0.0.1 Bcast:10.0.0.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:9000 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:15 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:0 (0.0 b) TX bytes:960 (960.0 b)
Base address:0x5040 Memory:fdd60000-fdd80000

# cat /etc/sysconfig/network-scripts/ifcfg-eth1
DEVICE=eth1
BOOTPROTO=static
HWADDR=00:13:21:78:63:99
ONBOOT=yes
TYPE=Ethernet
IPADDR=10.0.0.1
NETMASK=255.255.255.0
MTU=9000

# cat /var/log/messages | grep eth1
Nov 22 08:42:23 xen1 kernel: e1000: eth1: e1000_probe: Intel(R) PRO/1000 Network Connection
Nov 22 08:42:23 xen1 kernel: e1000: eth1: e1000_watchdog_task: NIC Link is Up 1000 Mbps Full Duplex
Nov 22 08:42:21 xen1 network: Bringing up interface eth1: succeeded
Nov 22 08:42:41 xen1 ntpd[6374]: Listening on interface eth1, 10.0.0.1#123

# ping 10.0.0.2
PING 10.0.0.2 (10.0.0.2) 56(84) bytes of data.
64 bytes from 10.0.0.2: icmp_seq=0 ttl=64 time=0.261 ms
64 bytes from 10.0.0.2: icmp_seq=1 ttl=64 time=0.231 ms
64 bytes from 10.0.0.2: icmp_seq=2 ttl=64 time=0.274 ms

--- 10.0.0.2 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2001ms
rtt min/avg/max/mdev = 0.231/0.255/0.274/0.022 ms, pipe 2


Node 2:
# ifconfig eth1
eth1 Link encap:Ethernet HWaddr 00:13:21:78:17:F5
inet addr:10.0.0.2 Bcast:10.0.0.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:9000 Metric:1
RX packets:7778 errors:0 dropped:0 overruns:0 frame:0
TX packets:8319 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:950350 (928.0 KiB) TX bytes:944232 (922.1 KiB)
Base address:0x5040 Memory:fde60000-fde80000

# cat /etc/sysconfig/network-scripts/ifcfg-eth1
DEVICE=eth1
BOOTPROTO=static
HWADDR=00:13:21:78:17:F5
ONBOOT=yes
TYPE=Ethernet
IPADDR=10.0.0.2
NETMASK=255.255.255.0
MTU=9000

# cat /var/log/messages | grep eth1
Nov 22 08:53:00 xen2 kernel: e1000: eth1: e1000_probe: Intel(R) PRO/1000 Network Connection
Nov 22 08:53:01 xen2 kernel: e1000: eth1: e1000_watchdog_task: NIC Link is Up 1000 Mbps Full Duplex
Nov 22 08:52:59 xen2 network: Bringing up interface eth1: succeeded
Nov 22 08:53:20 xen2 ntpd[6052]: Listening on interface eth1, 10.0.0.2#123

# ping 10.0.0.1
PING 10.0.0.1 (10.0.0.1) 56(84) bytes of data.
64 bytes from 10.0.0.1: icmp_seq=0 ttl=64 time=0.157 ms
64 bytes from 10.0.0.1: icmp_seq=1 ttl=64 time=0.140 ms
64 bytes from 10.0.0.1: icmp_seq=2 ttl=64 time=0.190 ms

--- 10.0.0.1 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2001ms
rtt min/avg/max/mdev = 0.140/0.162/0.190/0.023 ms, pipe 2


Another thing: I'm using the OCFS2 filesystem, which uses the same network interconnect as the Clusterware, and OCFS2 also works with Jumbo Frames enabled.

Could it be a problem that I installed the Clusterware with MTU=1500 and only later changed it to MTU=9000? Is it possible that the MTU is stored in a configuration file used by the Clusterware?
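As far as I know, 10.2 Clusterware records the interconnect interface name and subnet in the OCR, but not the MTU. A way to check what it has registered (assuming $ORA_CRS_HOME points to the Clusterware home on your nodes):

```shell
# List the interfaces registered with Clusterware; output format is: name subnet type
$ORA_CRS_HOME/bin/oifcfg getif
# e.g. eth1  10.0.0.0  global  cluster_interconnect
```

Since only the interface and subnet are stored, changing the MTU after installation should not by itself invalidate the Clusterware configuration.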

Just an idea.


Thank you in advance for your hints.

- thomas
Re: Clusterware doesn't start with Jumbo Frames enabled [message #286750 is a reply to message #282486] Mon, 10 December 2007 01:25
OK, I have solved the problem.

The cluster interconnect switch I'm using is an HP ProCurve 1800-24G. The specs only say that the switch supports Jumbo Frames up to 9K in size.

The problem is that the switch doesn't have Jumbo Frames enabled by default. I had to log in to the switch's web configuration interface, where there is an option "Enable Jumbo Frames".

Now it works.

cheers
- thomas