PCIe Network Express Module User Guide
Copyright © 2007 Sun Microsystems, Inc., 4150 Network Circle, Santa Clara, California 95054, U.S.A. All rights reserved.
Sun Microsystems, Inc. holds the intellectual property rights relating to the technology embodied in the product that is described in this
document. In particular, and without limitation, these intellectual property rights may include one or more of the U.S. patents listed at
http://www.sun.com/patents and one or more additional patents or pending patent applications in the U.S. and in other
countries.
Parts of this product may be derived from Berkeley BSD systems, licensed from the University of California. UNIX is a registered
trademark in the U.S. and in other countries, exclusively licensed through X/Open Company, Ltd.
Sun, Sun Microsystems, the Sun logo, Solaris, and Sun Blade are trademarks or registered trademarks of Sun Microsystems, Inc. in the
U.S. and in other countries.
The OPEN LOOK and Sun(TM) Graphical User Interface was developed by Sun Microsystems, Inc. for its users and licensees. Sun
acknowledges the pioneering efforts of Xerox in researching and developing the concept of visual or graphical user interfaces
for the computer industry. Sun holds a non-exclusive license from Xerox to the Xerox Graphical User Interface, which license
also covers Sun's licensees who implement the OPEN LOOK graphical user interface and otherwise comply with Sun's
written license agreements.
Use of spare or replacement CPUs is limited to repair or one-for-one replacement of CPUs in products exported in compliance with
U.S. export laws. Use of CPUs as product upgrades, unless authorized by the U.S. Government, is strictly
prohibited.
DOCUMENTATION IS PROVIDED "AS IS" AND ALL OTHER EXPRESS OR IMPLIED CONDITIONS, REPRESENTATIONS, AND WARRANTIES
ARE EXPRESSLY EXCLUDED TO THE EXTENT PERMITTED BY APPLICABLE LAW, INCLUDING IN PARTICULAR
ANY IMPLIED WARRANTY OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, OR
NON-INFRINGEMENT.
Contents
Preface vii
▼ Troubleshoot Using ILOM CLI 22
Preface
The Sun 10-Port 4X DDR InfiniBand Network Express Module User’s Guide describes
how to install and configure the Sun™ InfiniBand 10-Port Network Express Module
(NEM) in a powered-on Sun Blade™ 8000 Series Modular System.
These instructions are designed for enterprise system administrators with experience
installing network hardware and software.
Chapter 2 describes how to replace or install the IB NEM and verify that it has been
installed correctly. It also describes how to remove the NEM.
Using UNIX Commands
This document might not contain information about basic UNIX® commands and
procedures such as shutting down the system, booting the system, and configuring
devices. Refer to the following for this information:
■ Software documentation that you received with your system
■ Solaris™ Operating System documentation, which is at:
http://docs.sun.com
Shell Prompts
Shell                                    Prompt
C shell                                  machine-name%
C shell superuser                        machine-name#
Bourne shell and Korn shell              $
Bourne shell and Korn shell superuser    #
Typographic Conventions
Typeface* Meaning Examples
viii Sun 10-Port 4X DDR InfiniBand Network Express Module User’s Guide • March 2007
Related Documentation
The documents listed as online are available at:
http://www.sun.com/documentation/
Documentation http://www.sun.com/documentation/
Support http://www.sun.com/support/
Training http://www.sun.com/training/
Sun Welcomes Your Comments
Sun is interested in improving its documentation and welcomes your comments and
suggestions. You can submit your comments by going to:
http://www.sun.com/hwdocs/feedback
Please include the title and part number of your document with your feedback:
Sun 10-Port 4X DDR InfiniBand Network Express Module User’s Guide, Sun part
number: 820-0810-10.
CHAPTER 1
This chapter provides an overview of the Sun 10-Port 4X DDR InfiniBand (IB) PCI
Express (PCIe) Network Express Module (NEM). It also lists the host platforms and
operating systems that support the IB NEM.
You can order additional Sun 10-Port IB NEMs from Sun Microsystems using the
following Marketing part number: X1289A-Z.
Product Features
The Sun IB NEM is a 10-port Network Express Module (NEM) that provides one
InfiniBand DDR 4X connection to each Sun Blade Server Module. Each 4X DDR port
supports a single x8 PCI Express (PCIe) connection to its Server Module, providing
full-duplex data transfers of up to 16 Gbps.
The NEM supports 10 independent InfiniBand Host Channel Adapter (HCA) ports
operating at up to 20-Gbps (DDR) speed. It is also backward compatible with
10-Gbps (SDR) devices.
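The rates quoted above can be checked with a short calculation. The following is a minimal sketch, assuming the standard InfiniBand per-lane signaling rates (2.5 Gbps for SDR, 5 Gbps for DDR) and 8b/10b line encoding, in which 8 of every 10 signaled bits carry data. The function and constant names are illustrative and do not come from this guide.

```python
# Per-lane signaling rates in Gbps for the two InfiniBand speeds named above.
LANE_SIGNALING_GBPS = {"SDR": 2.5, "DDR": 5.0}


def link_rates(speed, lanes=4):
    """Return (signaling_gbps, data_gbps) for an InfiniBand link.

    With 8b/10b encoding, only 8 of every 10 signaled bits carry data.
    """
    signaling = LANE_SIGNALING_GBPS[speed] * lanes
    data = signaling * 8 / 10
    return signaling, data


for speed in ("SDR", "DDR"):
    signaling, data = link_rates(speed)
    print(f"4X {speed}: {signaling:g} Gbps signaling, {data:g} Gbps data")
```

Under these assumptions, a 4X DDR port signals at 20 Gbps and carries 16 Gbps of data, which agrees with the 16-Gbps transfer figure given for each Server Module connection.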
FIGURE 1-1 IB NEM Back Panel
Feature Description
Platform and Operating System Support
This section provides information about selected platforms that are compatible with
the heterogeneous InfiniBand network design.
TABLE 1-2 lists operating system support for the IB NEM on the Sun Blade 8000 Series
Modular System.
Red Hat Enterprise Linux
■ Advanced Server v.4, Update 3 (RHEL AS 4-U3) or later (32-bit and 64-bit)
SUSE Linux
■ Enterprise Server 9 Service Pack 3 (SLES9 SP3) or later (64-bit)
■ Enterprise Server 10 (SLES10) (64-bit)
Microsoft Windows
■ Microsoft Windows Server 2003 Standard Edition with Service Pack 1
■ Microsoft Windows Server 2003 Enterprise Edition with Service Pack 1
■ Microsoft Windows Server 2003 Standard Edition R2
■ Microsoft Windows Server 2003 Enterprise Edition R2
■ Microsoft Windows Server 2003 Standard Edition x64
■ Microsoft Windows Server 2003 Enterprise Edition x64
■ Microsoft Windows Server 2003 Standard Edition x64 R2
■ Microsoft Windows Server 2003 Enterprise Edition x64 R2
Note – Currently, Solaris™ Operating System support for the IB NEM is not
available. For the latest support information, see the Sun Blade 8000 Series Product
Notes (Sun Part number: 819-5651) or the following Sun web site:
http://www.sun.com/documentation/
TABLE 1-3 provides additional details on the operating systems that provide support
for the IB NEM.
For more information about the InfiniBand software stack and related topics, see
Chapter 3.
FIGURE 1-2 Sun IB NEM Back Panel Indicators, Buttons, and Ports
1  Ejector lever
   To remove the NEM, you open this ejector lever.
2  Ready-to-Remove indicator (blue)
   Provides the following indications:
   • Steady On – Lights when it is safe to remove the associated NEM from the
     NEM slot in the chassis.
   • Off – The NEM is not ready for removal.
   This indicator is normally lit from the ILOM web interface through the
   Chassis Monitoring Module (CMM) or by pressing the Attention button.
3  Service Action Required indicator (amber)
   Provides the following indications:
   • Steady On – Lights when there is a fault associated with the NEM.
   • Off – The NEM has no fault condition.
Each InfiniBand port has two LED indicators, as shown in FIGURE 1-3. TABLE 1-5 lists
and describes the LED indicators.
1 Green This physical link LED illuminates when the port is electrically
active, that is, when a driver is attached and a physical link to a
remote switch (or, possibly an HCA) has been established. This
indicator is on the left above the cable connector.
2 Yellow This logical link LED illuminates when the InfiniBand port is in
the UP or active state and data can flow through the connection.
This indicator is on the right above the cable connector.
CHAPTER 2
This chapter describes how to replace an InfiniBand (IB) NEM in a powered-on Sun
Blade 8000 Series Chassis. It also includes instructions to verify that the replacement
NEM has been installed correctly.
Caution – Damage to the IB NEM can occur as the result of careless handling or
electrostatic discharge (ESD). Always handle the NEM with care to avoid damage to
electrostatic sensitive components. To minimize the possibility of ESD-related
damage, Sun strongly recommends using both a workstation antistatic mat and an
ESD wrist strap. You can get an ESD wrist strap from any reputable electronics store
or from Sun as part number 250-1007.
You can install the IB NEM in the following Sun Blade 8000 Series Chassis:
■ Sun Blade 8000 Chassis
■ Sun Blade 8000 P Chassis
When you remove a NEM, you must replace it within two minutes to prevent
adjacent modules from overheating. If you are removing but not replacing the NEM,
you must install a NEM filler panel to meet FCC limits for electromagnetic
interference (EMI) and to ensure proper airflow and cooling.
Note – If installing an IB NEM in a Sun Blade 8000 Series Chassis that has not been
powered on, see the Sun Blade 8000 Series Installation Guide (Sun Part number:
819-5647).
2. Prepare the IB NEM for a hot-plug procedure. Use either of these methods:
■ Press the Attention button on the IB NEM to initiate the hot-plug removal.
The green OK LED will blink for up to one minute, indicating that the IB NEM
is being prepared for removal.
To abort the operation, press the Attention button again within five seconds.
Once the green LED goes dark and the blue LED is illuminated, you can safely
remove the NEM.
■ Use the ILOM web interface or the command-line interface (CLI) to initiate
the hot-plug removal.
If the IB NEM fails the hot-plug preparation and its Ready-to-Remove
indicator does not light, see “Troubleshooting a Hot-Remove Operation” on
page 20.
3. When the blue Ready-to-Remove LED is lit, physically remove the IB NEM as
follows:
b. Press the latch on both ejector levers inward at the same time.
d. Slide the NEM out of its slot.
Support the weight of the IB NEM with one hand at the bottom of the NEM.
Note – Before installing the replacement IB NEM, locate its GUID (Globally Unique
Identifier) printed on the rear of the NEM. Record the GUID and the number of the
slot into which you are about to install the NEM and keep this data for future
reference.
2. Align the replacement IB NEM with the chassis guidance system and slide the
NEM into its slot until the ejector levers engage and start to close.
Ensure that the NEM engages with the chassis guidance system. Failure to align
the NEM correctly can result in damage to the NEM's internal connections to the
chassis midplane.
3. Close the levers to secure the IB NEM in its slot. They will click when locked.
Ensure that the back plate on the module mounts flush with the chassis panel
opening.
The green OK indicator on the NEM should be in Standby Blink mode.
Caution – Avoid putting unnecessary stress on the connection. Do not bend or twist
the cable near the connectors and avoid sharp cable bends of more than
90 degrees.
6. If you have not already done so, connect the other end of the InfiniBand
cable(s) to the appropriate port(s) on an InfiniBand switch.
Note – If you are replacing an IB NEM, you will not need to install the InfiniBand
software packages. The appropriate software package will have been installed and
configured as part of the initial IB NEM installation.
Verifying Installation
If you have not installed the IB NEM in the chassis and connected it to an
operational InfiniBand switch, do so before you attempt to verify the installation.
The InfiniBand switch should automatically recognize InfiniBand servers when they
are connected to the fabric.
▼ Verify Hardware Installation
1. Once you have physically installed the IB NEM and ensured that the cables are
connected to the IB NEM and switch(es), ensure that an IB subnet manager is
running on the connected InfiniBand fabric (network).
If the green port LED is illuminated, you have successfully completed the
hardware installation and you can proceed to verification through the ILOM
interface(s). The green LED indicates that the port is enabled, that is, that a
physical link to a remote switch (or, possibly an HCA) has been established.
If the port LEDs are not illuminated, one possible cause might be that the
InfiniBand drivers are not installed. You cannot verify a complete installation on
Linux or Windows until you install these drivers.
2. You can now examine hardware status through one of the ILOM (Integrated
Lights Out Manager) interfaces. Use one of the following procedures:
■ “Verify Installation Using the ILOM Web Interface” on page 13
■ “Verify Installation Using the ILOM CLI” on page 16
For a description of the possible states of the NEM LEDs, see “Verify Component
Status Using the LEDs” on page 18.
3. Select the System Information tab and then select the Components tab.
The Component Management page appears.
4. Select the IB NEM component name. (You may need to scroll down in the
Component Management Status page.)
The ILOM page showing the NEM status details appears.
5. If you are physically near the IB NEM, you can examine its LEDs to verify that
it has returned the expected feedback.
See “Verify Component Status Using the LEDs” on page 18.
3. To verify that the IB NEM is installed, that is, that the Ready To Remove status
is Not Ready, enter:
Properties:
type = Network Express Module FRU
board_part_number = 501-7460-04
board_serial_number = 0060HSV-0649123404
board_product_name = ASSY,ANDY,NEM,IB_PASS_THROUGH_MODULE
product_name = SUN BLADE 8000 NEM IB DDR 10PT UNSWITCHED
product_manufacturer = SUN MICROSYSTEMS
product_version = (none)
product_part_number = (none)
product_serial_number = (none)
fault_state = OK
clear_fault_action = (none)
prepare_to_remove_status = NotReady
prepare_to_remove_action = (none)
return_to_service_action = (none)
blade0_link_status = Connected
blade2_link_status = Connected
blade4_link_status = Connected
Note that the NEM prepare_to_remove status is NotReady and that the
bladen_link_status is Connected, indicating successful hardware
installation.
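If you capture the ILOM CLI output to a file or script, the same check can be automated. The following is a minimal sketch that parses the "key = value" property lines shown above. The function names are illustrative, and treating every listed blade link as required is an assumption; adjust the test if some blade slots in your chassis are intentionally empty.

```python
def parse_properties(text):
    """Parse 'key = value' lines from ILOM CLI output into a dict."""
    props = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:  # lines without '=' (such as 'Properties:') are skipped
            props[key.strip()] = value.strip()
    return props


def nem_installed_ok(props):
    """True when the NEM is in service and every listed blade link is up."""
    links = [value for key, value in props.items()
             if key.startswith("blade") and key.endswith("_link_status")]
    return (props.get("fault_state") == "OK"
            and props.get("prepare_to_remove_status") == "NotReady"
            and bool(links)
            and all(value == "Connected" for value in links))


# Excerpt of the sample output shown above.
SAMPLE = """\
Properties:
fault_state = OK
prepare_to_remove_status = NotReady
blade0_link_status = Connected
blade2_link_status = Connected
blade4_link_status = Connected
"""

print(nem_installed_ok(parse_properties(SAMPLE)))
```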
4. If you are physically near the IB NEM, you can examine its LEDs to verify that
it has returned the expected feedback.
See “Verify Component Status Using the LEDs” on page 18.
To verify successful IB NEM insertion, check the following LEDs and the ILOM
interface status:
2. When the Attention button is pressed to activate the NEM, the green OK indicator
transitions to Slow Blink.
3. When all links to active blades have been made, the green OK indicator
transitions to Steady On.
4. In the ILOM interfaces, the Ready to Remove status shows Not Ready.
▼ Verify Installation on Linux
● To determine whether the IB NEM is visible to the Linux OS, enter the lspci
command.
Output similar to the following appears.
> lspci
00:00.0 Memory controller: nVidia Corporation CK804 Memory
Controller (rev a3)
00:01.0 ISA bridge: nVidia Corporation CK804 ISA Bridge (rev a3)
...
01:01.0 SCSI storage controller: LSI Logic / Symbios Logic SAS1064
PCI-X Fusion-MPT SAS (rev 02)
02:00.0 Ethernet controller: Intel Corporation 82571EB Gigabit
Ethernet Controller (rev 06)
02:00.1 Ethernet controller: Intel Corporation 82571EB Gigabit
Ethernet Controller (rev 06)
80:00.0 Memory controller: nVidia Corporation CK804 Memory
Controller (rev a3)
80:01.0 Memory controller: nVidia Corporation CK804 Memory
Controller (rev a3)
80:0e.0 PCI bridge: nVidia Corporation CK804 PCIE Bridge (rev a3)
81:00.0 InfiniBand: Mellanox Technologies MT25204 [InfiniHost III
Lx HCA] (rev 20)
The last entry in the sample output (InfiniHost III Lx HCA) verifies the hardware
installation and confirms the NEM’s availability to the Linux host.
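On a blade with many PCI devices, you might prefer to filter the lspci output rather than read it by eye. The following is a minimal sketch that extracts InfiniBand entries from lspci text. It assumes each entry is on a single unwrapped line and begins with the address in bus:dev.fun form; the function name is illustrative.

```python
import re

# An lspci entry starts with the PCI address in bus:dev.fun form, such as 81:00.0.
PCI_LINE = re.compile(r"^([0-9a-f]{2}:[0-9a-f]{2}\.[0-7])\s+(.*)$", re.IGNORECASE)


def find_infiniband_devices(lspci_output):
    """Return (address, description) pairs for InfiniBand devices."""
    devices = []
    for line in lspci_output.splitlines():
        match = PCI_LINE.match(line.strip())
        if match and match.group(2).startswith("InfiniBand"):
            devices.append((match.group(1), match.group(2)))
    return devices


# Two entries from the sample output above, joined onto single lines.
SAMPLE = """\
80:0e.0 PCI bridge: nVidia Corporation CK804 PCIE Bridge (rev a3)
81:00.0 InfiniBand: Mellanox Technologies MT25204 [InfiniHost III Lx HCA] (rev 20)
"""

for address, description in find_infiniband_devices(SAMPLE):
    print(address, description)
```

To run the check against a live system, you could pass the text returned by `subprocess.run(["lspci"], capture_output=True, text=True).stdout` to the same function.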
● Click Cancel to close the Found New Hardware window and exit this instance
of the Found New Hardware Wizard.
Troubleshooting a Hot-Remove Operation
Because IB NEMs are shared resources, all Sun Blade Server Modules must respond
favorably to the PCI hot-remove request. However, a blade might not relinquish the
link to a NEM if, for instance, there are busy NFS mounted volumes, file transfers,
and so on.
To determine the state of the NEM-to-blade connections, you can use the ILOM web
interface or the ILOM command-line interface, as described in the following
procedures.
3. Select the System Information tab and then select the Components tab.
The Component Management page appears.
5. Perform the appropriate host OS procedure for releasing the NEM from the
blade.
2. Enter the following command, where n is the number of the NEM in question.
Properties:
type = Network Express Module FRU
board_part_number = 501-7460-04
board_serial_number = 0060HSV-0649123404
board_product_name = ASSY,ANDY,NEM,IB_PASS_THROUGH_MODULE
product_name = SUN BLADE 8000 NEM IB DDR 10PT UNSWITCHED
product_manufacturer = SUN MICROSYSTEMS
product_version = (none)
product_part_number = (none)
product_serial_number = (none)
fault_state = OK
clear_fault_action = (none)
prepare_to_remove_status = Ready
prepare_to_remove_action = (none)
return_to_service_action = (none)
blade0_link_status = Not_present
blade2_link_status = Not_present
blade4_link_status = Not_present
3. Perform the appropriate host OS procedure for releasing the NEM from the
blade.
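When a hot-remove request stalls, it helps to know which blades still hold a link. The following is a minimal sketch that scans captured ILOM CLI output for blade link properties. It assumes that a status of Connected means the blade has not yet released the NEM; the sample text and function name are illustrative.

```python
import re

# Matches lines such as "blade2_link_status = Connected".
LINK_LINE = re.compile(r"^\s*blade(\d+)_link_status\s*=\s*(\S+)", re.MULTILINE)


def blades_still_connected(ilom_output):
    """Return the numbers of blades whose link status is still Connected."""
    return [int(number) for number, status in LINK_LINE.findall(ilom_output)
            if status == "Connected"]


# Illustrative output in which blade 2 has not released its link.
SAMPLE = """\
prepare_to_remove_status = NotReady
blade0_link_status = Not_present
blade2_link_status = Connected
blade4_link_status = Not_present
"""

print(blades_still_connected(SAMPLE))
```

Each blade number reported is a candidate for the host OS release procedure described in this section.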
CHAPTER 3
This chapter provides an overview and installation instructions for the InfiniBand
software stack for the Linux and Windows operating systems.
Consult the Sun Blade 8000 Series Product Notes for the most recent information about
supported operating systems, firmware and software updates, and other issues not
covered in the main product documentation.
Specifically, RHEL AS 4-U4 contains support in the kernel for HCA hardware
produced by Mellanox (mthca driver). The kernel also includes core InfiniBand
modules, which provide the interface between the lower-level hardware driver and
the upper-layer InfiniBand protocol drivers and provide user space access to
InfiniBand hardware.
The kernel also includes the Sockets Direct Protocol (SDP) driver, the IP over
InfiniBand (IPoIB) driver, and the SCSI RDMA Protocol (SRP) driver.
Note – These package names can change, depending on the Linux OS.
The packages selected to support any given configuration will vary. TABLE 3-1 lists
the packages considered the absolute minimum needed to support the environment
described in this guide.
If you elected not to install these packages when installing the Linux OS or if you
want to upgrade your drivers, you can install these packages at any time from the
OS distribution source or by downloading the required files from OpenFabrics.org.
For information on both of these procedures, see “Installing the InfiniBand Drivers
on Linux” on page 26.
RHEL AS 4-U3 or later: For RHEL AS 4-U3, Sun has tested OFED Release 1.0 of the
OpenFabrics stack. RHEL AS 4-U4 includes a built-in version of Release 1.0.
SLES9 SP3 or later, SLES10: Sun has tested OFED Release 1.0 for the SLES10 platform.
If you need to determine whether or not the drivers are already installed, see “Verify
Driver Installation on Linux” on page 32.
▼ Install IB Drivers From Linux Distribution Source
To install the InfiniBand drivers, you need access to the Red Hat Package Manager
(RPM) files. Access to these files is dependent on your individual installation
configuration (net boot, CD/DVD boot, .iso files, and so on). When you decide on
the appropriate access method and package selection, you can add the packages to
the KickStart configuration file (on RHEL) for automatic inclusion in future
installations.
Note – On a 32-bit RHEL4 system, all packages have a .i386.rpm extension (as
shown in the following procedure). On a 64-bit RHEL4 system, all packages have a
.x86_64.rpm extension instead.
1. Enter the rpm -ivh command for each InfiniBand package that you need to
install.
Packages must be installed in the following order:
■ libibcommon
■ libibumad
■ libibmad
■ openib-diags
■ mstflint
■ perftest
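Because the order matters, you might generate the rpm -ivh commands from the list above instead of typing them individually. The following is a minimal sketch. The RPM file names and version numbers are hypothetical, and the function name is illustrative; substitute the file names found on your distribution source.

```python
import re

# Required installation order, as listed above.
INSTALL_ORDER = ["libibcommon", "libibumad", "libibmad",
                 "openib-diags", "mstflint", "perftest"]


def rpm_install_commands(rpm_files, arch="i386"):
    """Return 'rpm -ivh' command lines in the required order.

    rpm_files: RPM file names available from the distribution source.
    arch: "i386" on a 32-bit RHEL4 system, "x86_64" on a 64-bit one.
    """
    commands = []
    for package in INSTALL_ORDER:
        # Require a digit after the name so that, for example, a "-devel"
        # package is not mistaken for the base package.
        pattern = re.compile(
            rf"^{re.escape(package)}-\d.*\.{re.escape(arch)}\.rpm$")
        matches = sorted(name for name in rpm_files if pattern.match(name))
        if not matches:
            raise FileNotFoundError(f"no {arch} RPM found for {package}")
        commands.append(f"rpm -ivh {matches[0]}")
    return commands


# Hypothetical file names, deliberately out of order.
AVAILABLE = [
    "perftest-1.0-1.i386.rpm",
    "libibmad-1.0-1.i386.rpm",
    "mstflint-1.0-1.i386.rpm",
    "libibcommon-1.0-1.i386.rpm",
    "openib-diags-1.0-1.i386.rpm",
    "libibumad-1.0-1.i386.rpm",
]

for command in rpm_install_commands(AVAILABLE):
    print(command)
```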
The following example shows the installation of one package (libibcommon)
and the resulting dialog on an RHEL AS 4-U4 32-bit system:
2. If you are running the CSH or TCSH shell, enter the rehash command to
rebuild the shell’s view of available executables.
> ibstat
CA 'mthca0'
CA type: MT25204
Number of ports: 1
Firmware version: 1.1.0
Hardware version: a0
Node GUID: 0x001b00000ca72640
System image GUID: 0x001b00000ca72643
Port 1
State: Active
Physical state: LinkUp
Rate: 20
Base lid: 71
LMC: 0
SM lid: 2
Capability mask: 0x02510a68
Port GUID: 0x001b00000ca72641
4. (Optional) You can enter the ibnetdiscover command to verify the presence
of an operational IB fabric.
For an example of the output of this command, see “Verify Driver Installation on
Linux” on page 32.
5. (Optional) You can check the status of the ib0 network interface to determine
whether the ib_ipoib driver is installed.
For details on this step, see “Install IPoIB Driver” on page 46.
Note – You will need write access to the files to execute the install script.
3. From the OFED-1.0 directory, initiate the installation process by entering the
following command:
> ./install.sh
4. When the InfiniBand OFED Distribution Software Installation Menu appears,
enter option 2 (Install OFED Software).
5. When the Select OFED Software menu appears, enter option 3 (All packages).
6. When you are asked if you wish to create/install an MPI RPM with gcc,
enter n.
7. Next, you are asked if you wish to create/install an openmpi RPM with gcc.
Again, enter n.
8. The installation script then lists the OFED packages that it will build. See the
following sample output.
Following is the list of OFED packages that you have chosen (some
may have been added by the installation program due to package
dependencies):
ib_ipath
ib_ipoib
...
mpitests
ibutils
9. Enter Y to continue.
10. Next, you are prompted to configure InfiniBand IP support. Enter Y when
asked if you want to include IPoIB configuration files.
Do you want to include IPoIB configuration files (ifcfg-ib*)? [Y/n]:
11. Press Enter to accept the default when prompted to enter a temporary directory
for OFED.
15. The system next displays the current configuration. When asked if you want to
change the configuration as displayed, enter y.
16. The configuration script guides you through the changes one at a time. See the
following as an example.
Enter an IP Address:10.0.0.52
Enter the Netmask: 255.255.255.0
Enter the Network:10.0.0.0
Enter the Broadcast Address:10.0.0.255
Start Device On Boot? [Y/n]:Y
Selected configuration:
IPADDR=10.0.0.52
NETMASK=255.255.255.0
NETWORK=10.0.0.0
BROADCAST=10.0.0.255
ONBOOT=yes
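The network and broadcast values in the sample above follow directly from the IP address and netmask, so you can compute them before answering the prompts. The following is a minimal sketch using the Python standard ipaddress module; the function name is illustrative.

```python
import ipaddress


def ipoib_settings(ip, netmask):
    """Derive the NETWORK and BROADCAST values from an address and netmask."""
    interface = ipaddress.ip_interface(f"{ip}/{netmask}")
    network = interface.network
    return {
        "IPADDR": str(interface.ip),
        "NETMASK": str(network.netmask),
        "NETWORK": str(network.network_address),
        "BROADCAST": str(network.broadcast_address),
    }


# Values taken from the sample configuration above.
for key, value in ipoib_settings("10.0.0.52", "255.255.255.0").items():
    print(f"{key}={value}")
```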
19. Once all IPoIB interfaces have been configured, you are prompted as follows to
configure OpenSM for the blade.
Do you want to configure OpenSM [Y/n]?
20. Enter n to complete this part of the installation. You should see a message like
the following:
The Sun Blade Server Module is configured now to start up the InfiniBand software
on reboot (ONBOOT=yes).
After successful installation, it is recommended that you reboot the Server Module.
After reboot, the Server Module should come up as a functional member of the
InfiniBand fabric.
Note – When using the openibd command, enter the entire path as shown in the
example.
The following example shows the IB driver installed, running and presenting one
IB HCA channel or network device (ibn) to the OS. In the example, the Linux
network device appears as ib0.
2. To view details of operational status, enter the ibstat command.
The following example shows one operational IB channel into the IB fabric (or
network). The LinkUp physical state indicates active participation in an IB
fabric. The port is present as LID 69 and is managed by the subnet manager at LID 2.
> ibstat
CA 'mthca0'
CA type: MT25204
Number of ports: 1
Firmware version: 1.1.0
Hardware version: a0
Node GUID: 0x001b00000ca72620
System image GUID: 0x001b00000ca72623
Port 1
State: Active
Physical state: LinkUp
Rate: 20
Base lid: 69
LMC: 0
SM lid: 2
Capability mask: 0x02510a68
Port GUID: 0x001b00000ca72621
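To check several blades, you might script the same reading of the ibstat output. The following is a minimal sketch that parses captured ibstat text into per-port fields and tests for the Active and LinkUp states described above. It accepts a port header written with or without a trailing colon; the function names are illustrative.

```python
import re

# Port header line, written as "Port 1" or "Port 1:".
PORT_HEADER = re.compile(r"^Port (\d+):?$")


def parse_ibstat_ports(text):
    """Parse ibstat output into {port_number: {field: value}}."""
    ports, current = {}, None
    for raw in text.splitlines():
        line = raw.strip()
        header = PORT_HEADER.match(line)
        if header:
            current = ports.setdefault(int(header.group(1)), {})
        elif current is not None and ":" in line:
            key, _, value = line.partition(":")
            current[key.strip()] = value.strip()
    return ports


def port_is_up(fields):
    """True when the port is logically Active and physically LinkUp."""
    return (fields.get("State") == "Active"
            and fields.get("Physical state") == "LinkUp")


# Excerpt of the sample output shown above.
SAMPLE = """\
CA 'mthca0'
CA type: MT25204
Number of ports: 1
Port 1
State: Active
Physical state: LinkUp
Rate: 20
Base lid: 69
SM lid: 2
Port GUID: 0x001b00000ca72621
"""

for number, fields in parse_ibstat_ports(SAMPLE).items():
    print(number, port_is_up(fields), fields["Base lid"], fields["SM lid"])
```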
You can also verify that the InfiniBand fabric is operational by entering the
ibnetdiscover command. The output from this command will list all the
nodes, as shown in the following sample output.
> ibnetdiscover
#
# Topology file: generated on Thu Jan 11 15:19:59 2007
#
# Max of 4 hops discovered
# Initiated from node 001b00000ca72620 port 001b00000ca72621
vendid=0x8f1
devid=0x5a31
sysimgguid=0x8f10400411ef9
switchguid=0x8f10400411ef8
■ Support for Microsoft Windows Server 2003 and Windows Compute Cluster
Server (CCS) 2003
TABLE 3-3 provides details about the InfiniBand software components in the WinIB
Collection version 1.3.0.1347.
■ IBAL (InfiniBand Access Layer) driver, version 1.3.0.1347: Core – IB Access
Layer, lower-level Access Layer, Upper Layer Protocols drivers.
■ IPoIB protocol, version 1.3.0.1347: IP communications over IB fabric.
■ OpenSM for Windows, version 1.8.1: IB-compliant subnet manager (SM) and
administrator running on top of OpenIB.
■ Performance test (perf_main): Application reporting latency and bandwidth
HCA performance.
■ Documentation: IBAL API HTML documentation, license file, code examples.
Hardware and software requirements for the Sun Blade Server Module are as
follows:
■ Installed IB NEM
■ Windows Server 2003 (32-bit or 64-bit)
■ Administrator privileges on your Server Module
■ Disk space for installation: 10 Mbytes
After you have met these requirements, download and install one of the WinIB
packages for Windows from the Sun Blade 8000 Series Resource CD.
TABLE 3-4 summarizes Windows platforms and the required InfiniBand drivers. For
additional platform details, see TABLE 1-2.
2. When the Found New Hardware Wizard appears, select No, not this time and
click Next to continue.
4. The License Agreement screen appears. Read the license agreement carefully.
If a printer has been configured on the blade, you can click Print to send the
agreement to the default printer. You must accept the license agreement to
continue. After selecting the acceptance radio button, click Next to continue.
5. The Customer Information screen is displayed. Enter your User Name and
Company information and click Next to continue.
7. The Setup Type screen appears. Select the Complete radio button and click
Next to continue.
9. The setup process begins. A cmd window opens in the background and
minimizes to the taskbar. Setup pauses at driver installation because the driver
is not digitally signed and a Security Alert is displayed on top of the Found
New Hardware screen. Click Yes to acknowledge the warning and continue
with the installation.
10. Setup displays another Security Alert when installing the InfiniBand Fabric
driver because that driver is also unsigned. Click Yes to acknowledge the
warning and continue. A long pause follows (approximately 40 seconds). Be
patient and take NO action.
11. After the pause, the Found New Hardware Wizard appears again. Do not close
this screen.
Caution – Do not close this screen or respond to it in any way except to move it
aside.
12. The Hardware Installation alert for the OpenIB IPoIB Adapter is displayed.
Click Continue Anyway.
Caution – Do not attempt to close or minimize any of the visible screens. If the
Hardware Installation alert is not visible, use the mouse to drag the InstallShield
screen aside to reveal it.
Once you click Continue Anyway, all screens will close and the InstallShield
Wizard Completed screen is displayed.
13. In the InstallShield Wizard Completed screen, select the desired protocol
depending on the requirements of your application (either WSD or SDP for
InfiniBand). Leave Show the readme file selected and click Finish to exit the
wizard.
14. The README.txt file is displayed. Read this file for important last-minute
updates.
It is recommended that you reboot the Sun Blade Server Module at the end of the
installation.
2. Double-click System.
The System Properties screen opens.
> vstat
1 HCA found:
hca_id=InfiniHost_III_Lx0
pci_location={BUS=0x81,DEV/FUNC=0x00}
vendor_id=0x02C9
vendor_part_id=0x6274
hw_ver=0x20
fw_ver=1.2.0
PSID=SUN0010010001
num_phys_ports=1
port=1
port_state=PORT_ACTIVE
sm_lid=0x0002
port_lid=0x0046
port_lmc=0x00
max_mtu=2048
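The vstat command reports LIDs in hexadecimal, whereas the Linux ibstat examples in this guide show them in decimal. The following is a minimal sketch that parses captured vstat text and converts the LIDs to decimal for comparison across platforms; the function name is illustrative.

```python
def parse_vstat(text):
    """Parse 'key=value' lines from vstat output into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition("=")
        if sep:  # lines without '=' (such as '1 HCA found:') are skipped
            fields[key] = value
    return fields


# Excerpt of the sample output shown above.
SAMPLE = """\
1 HCA found:
hca_id=InfiniHost_III_Lx0
port=1
port_state=PORT_ACTIVE
sm_lid=0x0002
port_lid=0x0046
"""

fields = parse_vstat(SAMPLE)
port_lid = int(fields["port_lid"], 16)
sm_lid = int(fields["sm_lid"], 16)
print(fields["port_state"], port_lid, sm_lid)
```

In the sample, port_lid 0x0046 corresponds to decimal LID 70, and sm_lid 0x0002 is the same subnet manager LID (2) that appears in the Linux examples.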
Running OpenSM
In the InfiniBand architecture, a subnet manager (SM) is required for the InfiniBand
fabric to function properly. The SM discovers all the nodes on the fabric and assigns
the local identifiers (LIDs) in the HCA(s). It also sets up the routing tables in the
switches to support routing packets between nodes.
To meet these needs, OFED and WinIB supply OpenSM, an open source subnet
manager. OpenSM can initialize and configure the subnet as well as keep the subnet
operational when the network topology and nodes change. OpenSM runs as a
system daemon on at least one of the host machines in the InfiniBand fabric.
The OpenSM application also contains the subnet administrator (SA), an associated
component that acts like a database and can be affected by end node requests. It
supports querying as well as event forwarding. Applications send queries to the SA
to discover the path records for remote nodes, which are needed to establish
connections between endpoints on the fabric.
Note – Two instances of OpenSM running concurrently on the same port will result
in a system crash.
For more information on OpenSM, see the ReadMe file for your OS (OFED for Linux
and WinIB for Windows).
This chapter describes configuration aspects of running the Internet Protocol over
InfiniBand (IPoIB) and contains the following section:
■ “Configuring IPoIB on Linux” on page 45
▼ Install IPoIB Driver
1. Determine whether the IPoIB driver is already installed by entering the
lsmod | grep ib command.
The output from this command shows all the IB drivers.
In the following sample output, note that the driver, ib_ipoib, is not listed.
3. Enter the lsmod | grep ib command again and note that ib_ipoib is now
listed.
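The before-and-after comparison in this procedure can also be scripted. The following is a minimal sketch that checks captured lsmod text for the ib_ipoib module. The sample sizes and use counts are hypothetical, and the function name is illustrative.

```python
def loaded_modules(lsmod_output):
    """Return the set of module names found in lsmod output."""
    names = set()
    for line in lsmod_output.splitlines():
        columns = line.split()
        # Skip blank lines and the "Module  Size  Used by" header row,
        # which is absent when the output is piped through grep.
        if columns and columns[0] != "Module":
            names.add(columns[0])
    return names


# Hypothetical lsmod output; sizes and use counts are for illustration only.
SAMPLE = """\
Module                  Size  Used by
ib_ipoib               51600  0
ib_mthca              123304  0
ib_core                47360  2 ib_ipoib,ib_mthca
"""

print("ib_ipoib" in loaded_modules(SAMPLE))
```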
4. Enter the ifconfig command to check for network interface ib0.
Note that network interface ib0 is present but has no valid IP address.
To assign an address, see “Change IPoIB Configuration Without Rebooting” on
page 47.
> route
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
10.0.0.0        *               255.255.255.0   U     0      0        0 ib0
10.8.134.0      *               255.255.255.0         0      0        0 eth0
169.254.0.0     *               255.255.0.0     U     0      0        0 eth0
default         ban3rtr0d0      0.0.0.0         UG    0      0        0 eth0
3. As shown in the following example, you can enter the ping command to see
another IPoIB node on the 10.0.0.0 subnet:
At this point, the IPoIB network is active and properly configured without
rebooting.
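To confirm from a script that the IPoIB subnet is routed through ib0, the route listing can be filtered by its last column. This is a minimal sketch: routes_for is a hypothetical helper, and the sample is an abbreviated copy of the routing table shown in this procedure.

```sh
# routes_for IFACE: read `route` output on stdin and print the destination
# of every entry whose last column is IFACE. The first two lines (title
# and column headings) are skipped.
routes_for() {
    awk -v ifc="$1" 'NR > 2 && $NF == ifc { print $1 }'
}

routes_for ib0 <<'EOF'
Kernel IP routing table
Destination Gateway    Genmask       Flags Metric Ref Use Iface
10.0.0.0    *          255.255.255.0 U     0      0   0   ib0
169.254.0.0 *          255.255.0.0   U     0      0   0   eth0
default     ban3rtr0d0 0.0.0.0       UG    0      0   0   eth0
EOF
```

For the sample above, the helper prints 10.0.0.0. On a live system, use route | routes_for ib0; an empty result means that no route currently uses the interface.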
In the following example, openib.conf specifies that whenever the system boots,
the InfiniBand services, IPoIB, and the SDP IP service are to start up automatically
(ONBOOT=yes, IPOIB_LOAD=yes, SDP_LOAD=yes). However, openib.conf
specifies that the SRP service is NOT to start up automatically (SRP_LOAD=no). You
can alter any and all of these parameters.
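The four settings named above are plain KEY=value lines, so they are easy to inspect from a script. The sketch below assumes only that format; conf_value is a hypothetical helper, and the sample text contains just the parameters mentioned in this section, not a complete openib.conf.

```sh
# conf_value KEY: read a KEY=value file on stdin and print the value of
# KEY. Commented-out lines do not match; if KEY is set more than once,
# the last setting wins.
conf_value() {
    sed -n "s/^[[:space:]]*$1=//p" | tail -n 1
}

# Only the parameters named in this section; a real file has more.
sample='ONBOOT=yes
IPOIB_LOAD=yes
SDP_LOAD=yes
SRP_LOAD=no'

for key in ONBOOT IPOIB_LOAD SDP_LOAD SRP_LOAD; do
    printf '%s=%s\n' "$key" "$(printf '%s\n' "$sample" | conf_value "$key")"
done
```

On a live system, redirect the helper's input from your openib.conf file instead of the sample text.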
1. Edit ipoib.conf.
2. Create (or edit) the ifcfg-ibn file to configure an individual network interface.
For each InfiniBand network interface, you need a corresponding startup file
in /etc/sysconfig/network-scripts/ (for example, ifcfg-ib0). The startup
file for ib0 might look like the following.
> more /etc/sysconfig/network-scripts/ifcfg-ib0
DEVICE=ib0
BOOTPROTO=static
IPADDR=10.0.0.50
NETMASK=255.255.255.0
NETWORK=10.0.0.0
BROADCAST=10.0.0.255
ONBOOT=yes
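The NETWORK and BROADCAST values in a startup file must agree with IPADDR and NETMASK. The network address is the bitwise AND of the address and the mask, and the broadcast address sets every host bit to 1. The following sketch checks that arithmetic; the helper names are hypothetical and not part of any OFED tool.

```sh
# ip_to_int A.B.C.D: print the dotted-quad address as a 32-bit integer.
ip_to_int() {
    old_ifs=$IFS; IFS=.
    set -- $1            # split the address into four octets
    IFS=$old_ifs
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# int_to_ip N: print a 32-bit integer as a dotted-quad address.
int_to_ip() {
    echo "$(( ($1 >> 24) & 255 )).$(( ($1 >> 16) & 255 )).$(( ($1 >> 8) & 255 )).$(( $1 & 255 ))"
}

# network_of IP MASK: address AND mask.
network_of() {
    ip=$(ip_to_int "$1"); mask=$(ip_to_int "$2")
    int_to_ip $(( $ip & $mask ))
}

# broadcast_of IP MASK: network address with all host bits set.
broadcast_of() {
    ip=$(ip_to_int "$1"); mask=$(ip_to_int "$2")
    int_to_ip $(( ($ip & $mask) | (~$mask & 4294967295) ))
}

network_of   10.0.0.50 255.255.255.0
broadcast_of 10.0.0.50 255.255.255.0
```

For the ib0 example, the two calls print 10.0.0.0 and 10.0.0.255, which match the NETWORK and BROADCAST lines in the file.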
This chapter provides information on updating the IB NEM firmware on Linux and
Windows.
Consult the Sun Blade 8000 Series Product Notes for the most recent information about
the availability of firmware updates.
For Linux (RHEL AS 4-U3 or later, SLES9 SP3 or later, and SLES10), use the OFED
mstflint tool to load new IB NEM firmware. The tool, mstflint, is available both
as part of the bundled software and from the standard OFED stack.
Installed by default, mstflint is similar to the Mellanox flint tool, with one
exception: you must identify the IB NEM in the PCI bus:dev.fun format
to satisfy the mstflint command -d device syntax requirement.
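The bus:dev.fun identifier is the first field of each lspci line, so it can be extracted rather than copied by hand. The sketch below is illustrative: ib_pci_addrs is a hypothetical helper, and the loop only prints a candidate command. The query subcommand is an assumption to verify against the mstflint help output on your system.

```sh
# ib_pci_addrs: read lspci output on stdin and print the bus:dev.fun
# address (first field) of each line that mentions InfiniBand.
ib_pci_addrs() {
    awk '/InfiniBand/ { print $1 }'
}

sample='80:0e.0 PCI bridge: nVidia Corporation CK804 PCIE Bridge (rev a3)
81:00.0 InfiniBand: Mellanox Technologies MT25204 [InfiniHost III Lx HCA] (rev 20)'

for addr in $(printf '%s\n' "$sample" | ib_pci_addrs); do
    # Print, rather than run, a command for each device found.
    echo "mstflint -d $addr query"
done
```

On a live system, replace the sample with real output: lspci | ib_pci_addrs. If more than one address is printed, confirm which device is the IB NEM before burning firmware.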
▼ Update IB NEM Firmware for Linux
1. Enter the lspci command to identify the IB NEM.
In the following example, the IB NEM (Mellanox InfiniHost III Lx HCA) is
configured as PCI bus number 81, device 00, function 0 (81:00.0), which is
NEM slot 1 in a Sun Blade 8000 P Series Modular System. On your system, you
might see a different designation for the NEM.
> lspci
...
80:0e.0 PCI bridge: nVidia Corporation CK804 PCIE Bridge (rev a3)
81:00.0 InfiniBand: Mellanox Technologies MT25204 [InfiniHost III Lx HCA] (rev 20)
Note – The GUIDs you will see (Node, Port1, and Sys. Image) during the burn
process will differ from those shown in the example.
4. As with any IB NEM FLASHRAM update, you must reset the Server Module
(or at least the IB NEM) to load and execute the new firmware image.
5. After resetting the Server Module (or the NEM), enter the ibstat command to
verify the new firmware version.
> ibstat
CA 'mthca0'
CA type: MT25204
Number of ports: 1
Firmware version: 1.1.0
Hardware version: a0
Node GUID: 0x001b00000ca72600
System image GUID: 0x001b00000ca72603
Port 1:
State: Active
Physical state: LinkUp
Rate: 20
Base lid: 70
LMC: 0
SM lid: 2
Capability mask: 0x02510a68
Port GUID: 0x001b00000ca72601
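The firmware check in the last step can be automated by reading the Firmware version line from the ibstat output. This is a minimal sketch: ibstat_field is a hypothetical helper, the sample is an excerpt of the listing above, and the expected version is a placeholder for whatever image you burned.

```sh
# ibstat_field NAME: read ibstat output on stdin and print the value of
# the first line of the form "NAME: value".
ibstat_field() {
    sed -n "s/^[[:space:]]*$1:[[:space:]]*//p" | head -n 1
}

sample="CA 'mthca0'
    CA type: MT25204
    Firmware version: 1.1.0
    Port 1:
        State: Active"

expected=1.1.0   # placeholder: the version of the image you burned
actual=$(printf '%s\n' "$sample" | ibstat_field 'Firmware version')
if [ "$actual" = "$expected" ]; then
    echo "firmware is at $expected"
else
    echo "firmware is $actual, expected $expected"
fi
```

On a live system, use ibstat | ibstat_field 'Firmware version'. The same helper reads other fields, such as State or Rate.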
This appendix provides information to help you select the appropriate cables to
support expected performance.
According to the InfiniBand specification, the link must transmit data with a bit
error rate (BER) no worse than 10^-12. This BER can be guaranteed at DDR speed only
when recommended cables are used for InfiniBand connections.
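To put the BER figure in perspective, a rate of 10^-12 means one expected bit error per 10^12 bits transferred. The short sketch below converts that into a mean time between errors for a given link rate; seconds_per_error is a hypothetical helper, and the calculation assumes the link is continuously busy at its raw signalling rate.

```sh
# seconds_per_error RATE_GBPS: at a BER of 10^-12, one error is expected
# per 10^12 bits, so the mean time between errors is
# 10^12 / (RATE_GBPS * 10^9) = 1000 / RATE_GBPS seconds (integer division).
seconds_per_error() {
    echo $(( 1000 / $1 ))
}

seconds_per_error 20   # 4X DDR link, reported by ibstat as "Rate: 20"
```

For a 20 Gb/s link, the result is 50: a fully loaded 4X DDR link operating exactly at the limit would see, on average, one bit error every 50 seconds. A marginal cable with a worse BER shortens that interval proportionally.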
Vendor Length (Meters) Gauge Equalization Type Part #
Active Optical Cables
An active optical cable consists of one optical cable assembly and two Optical Media
Converters (OMCs).
In turn, an optical cable assembly consists of one parallel optical cable and two MPO
(Multi-fiber Push On) male connectors, one at each end of the optical cable.
Because the optical port on the QTR3500 OMC is a male MPO connector, it is
essential that any optical cable plugged into the OMC optical port be an MPO female
connector plug as specified in the FOCIS 5 P-12-1-0-2-2 industry specification.