How can I get a Mellanox ConnectX-2 to work on this ASUS desktop motherboard?

People seem to be happy with second-hand Mellanox ConnectX-2s on Linux, so I grabbed a pair. Both cards show up as a network interface in one computer, but neither shows up in the other.

Working computer:

  • EVGA 120-LF-E650 desktop board
  • Ubuntu 16.04 LTS
  • Linux 4.4.0

Not working computer:

  • ASUS Z87-PLUS UEFI desktop board
  • ArchLinux
  • Linux 4.4.5

Cards:

Part Number:      666172-001
Description:      HP ConnectX-2 Lx EN network interface card; single-port SFP+; PCIe2.0 5.0GT/s; mem-free; RoHS R6
PSID:             HP_0F60000010
FW             2.9.1000

Attempt 1

After the ASUS splash logo, a blank screen with only a blinking cursor appears and it never gets to GRUB. At this point the other computer shows "Press some key to enter the Mellanox network boot manager". (I wish I could disable this screen altogether because I'm never going to PXE boot.)

Attempt 2

I reset the box and it booted Linux this time but the kernel reports:

pci 0000:01:00.0: BAR 0: error updating (high 0x000000 != 0xffffffff)
mlx4_core: Mellanox ConnectX core driver v2.2-1 (Feb, 2014)
mlx4_core: Initializing 0000:01:00.0
mlx4_core 0000:01:00.0: enabling device (0000 -> 0002)
mlx4_core 0000:01:00.0: Multiple PFs not yet supported - Skipping PF
mlx4_core: probe of 0000:01:00.0 failed with error -22

My onboard Intel no longer works:

e1000e 0000:00:19.0: can't find IRQ for PCI INT A; probably buggy MP table
e1000e 0000:00:19.0: Interrupt Throttling Rate (ints/sec) set to dynamic conservative mode
e1000e: probe of 0000:00:19.0 failed with error -2

Removing the Mellanox card doesn't bring the Intel card back. The Intel returns only after I cut power to the motherboard and powered it back on.

Attempt 3

I disabled all PCIe power-saving options in the UEFI setup, tried a different PCIe slot, and passed acpi=off or pcie_aspm=off to Linux.

mlx4_core: Mellanox ConnectX core driver v2.2-1 (Feb, 2014)
mlx4_core: Initializing 0000:02:00.0
mlx4_core 0000:02:00.0: Missing DCS, aborting (driver_data: 0x2, pci_resource_flags(pdev, 0):0x0)

According to the driver source, this means the "PCIe BAR" was 4 MB but it was expecting 1 MB? Maybe I need to disable SR-IOV on the card, but I don't know how; for ConnectX-3 it can be done through mlxconfig. I don't need SR-IOV at all, since I'm not planning on using VFs.
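To check the BAR guess rather than infer it from the driver message, the kernel's view of each BAR can be read from sysfs. The following is a minimal sketch, assuming the card still sits at 0000:02:00.0 as in the log above; the function name `parse_pci_resources` is made up for illustration. It parses the per-device `resource` file (one `start end flags` hex triple per line) and reports each region's size and whether the memory bit is set. A flags value of 0x0, as printed in the "Missing DCS" line, has neither the memory nor the I/O bit set, which would suggest BAR 0 was never assigned at all.

```python
from pathlib import Path

# Resource type bits from the kernel's include/linux/ioport.h.
IORESOURCE_IO = 0x00000100
IORESOURCE_MEM = 0x00000200


def parse_pci_resources(text):
    """Parse a sysfs PCI 'resource' file: one 'start end flags' hex triple per line."""
    regions = []
    for index, line in enumerate(text.splitlines()):
        fields = line.split()
        if len(fields) != 3:
            continue  # skip anything that is not a start/end/flags triple
        start, end, flags = (int(field, 16) for field in fields)
        # An all-zero line means the region is unused or was never assigned.
        size = end - start + 1 if end > start else 0
        regions.append({
            "index": index,  # 0-5 are BARs, 6 is the expansion ROM
            "start": start,
            "size": size,
            "flags": flags,
            "is_mem": bool(flags & IORESOURCE_MEM),
            "is_io": bool(flags & IORESOURCE_IO),
        })
    return regions


if __name__ == "__main__":
    # Assumption: bus address taken from the Attempt 3 log; adjust as needed.
    path = Path("/sys/bus/pci/devices/0000:02:00.0/resource")
    if path.exists():
        for region in parse_pci_resources(path.read_text()):
            print(
                f"region {region['index']}: start={region['start']:#x} "
                f"size={region['size'] // 1024} KiB flags={region['flags']:#x} "
                f"mem={region['is_mem']}"
            )
    else:
        print(f"{path} not found; adjust the bus address")
```

Running this on both machines and comparing region 0 would show whether the ASUS firmware assigns the BAR differently from the EVGA board.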

Attempt 4

I downloaded a non-HP-branded firmware image from Mellanox's website, backed up the current image, and flashed one of the cards using:

sudo flint -d /dev/mst/mt26448_pci_cr0 -i fw-ConnectX2-rel-2_9_1200-MNPA19_A1-A3-FlexBoot-3.3.400.bin -allow_psid_change burn

Now it looks like this:

Part Number:      MNPA19_A1-A3
Description:      ConnectX-2 Lx EN network interface card; single-port SFP+; PCIe2.0 5.0GT/s; mem-free; RoHS R6
PSID:             MT_0F60110010
FW             2.9.1200

Now, when I boot it with pcie_aspm=off, I get this:

mlx4_core 0000:02:00.0: command 0xff6 timed out (go bit not cleared)
mlx4_core 0000:02:00.0: device is going to be reset
mlx4_core 0000:02:00.0: PCI can't be accessed to read vendor id
mlx4_core 0000:02:00.0: device was reset successfully
mlx4_core 0000:02:00.0: RUN_FW command failed, aborting
mlx4_core 0000:02:00.0: Failed to start FW, aborting
mlx4_core 0000:02:00.0: Failed to init fw, aborting.
mlx4_core: probe of 0000:02:00.0 failed with error -5

According to an OFED FAQ, "The error message above indicates that the device's hardware capabilities do not match the firmware configuration (.ini) file parameter settings," but it still works in the other machine.
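For reference, the "probe ... failed with error N" lines in the logs above carry negated errno values (-22, -2 and -5 across the attempts). A small sketch for turning them into symbolic names, assuming it runs on Linux so that Python's `errno` table matches the kernel's numbering; the helper name `describe_kernel_error` is hypothetical:

```python
import errno
import os


def describe_kernel_error(code):
    """Map a negated errno from a kernel log line to its name and message."""
    number = abs(code)
    name = errno.errorcode.get(number, "UNKNOWN")
    return f"{name}: {os.strerror(number)}"


if __name__ == "__main__":
    # Codes taken from the mlx4_core and e1000e probe failures above.
    for code in (-22, -2, -5):
        print(code, describe_kernel_error(code))
```

The names only say which class of failure the driver reported (invalid argument, no such entry, I/O error); they do not by themselves explain why the card behaves differently on this board.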

Can I get this card to work with this motherboard? (Virtual functions not needed)

linux
networking
network-adapter
asked on Super User Apr 23, 2016 by Easter Sunshine • edited Apr 23, 2016 by Easter Sunshine

0 Answers

Nobody has answered this question yet.


User contributions licensed under CC BY-SA 3.0