Today there are a number of cheap QSFP network interface cards available on the secondary market, such as the Mellanox ConnectX-3 Pro. I decided to give it a try and establish a link between a server and a workstation for the NAS role. I have reasonably fast flash storage on both machines, with SAS and NVMe disks. Working with gigabytes of high-resolution RAW files from a modern mirrorless camera requires plenty of storage, and waiting for transfers over a 1 GbE link isn't much fun. Let's see what we can do with two PCIe 3.0 ×8 fiber NICs and a QSFP+ AOC cable.
For the experiment I got:
- Mellanox ConnectX-3 Pro FDR InfiniBand + 40 Gigabit Ethernet card (MCX354A-FCCT) – 2 pcs, for $25.33 total
- Arista AOC-Q-Q-40G-10M QSFP+ to QSFP+ 40 GbE 10 m AOC-00002-02 optical cable – 1 pc, for $31.88 total
- Finisar 40 Gbps QSFP+ AOC FCBG410QB1C50-FC 50 m cable – 1 pc, for $37.32 total
I got the 10 meter cable for testing and future use, but the actual deployment will run over the 50 meter fiber. That's a total of $94.53 USD with taxes for everything. Here's the kit with the 10 m AOC:
An Active Optical Cable (AOC) consists of two identical transceivers with an OM2 fiber permanently attached between them. These are branded by Arista and carry model number AOC-Q-Q-40G-10M, which decodes as a QSFP+ cable with bandwidth up to 40 Gbps and a length of 10 meters. The cable is heavily used and discolored in a few spots, but hopefully still workable.
Now let's look at the cards. Both of them are identical and were manufactured by Mellanox back in 2016. The marketing name for these cards is ConnectX-3 Pro FDR, Model No: CX354A.
There are some power regulators, a small serial memory chip for the firmware, a 156.250 MHz oscillator (U3), and that's about it.
Each card has a PCI Express x8 Gen3 interface to the host and provides two QSFP+ ports for 56 Gbps InfiniBand or 40 Gbps Ethernet links. The chip has a passive heatsink, which gets quite warm during operation.
Both cards carry the same part number, MCX354A-FCCT. The letters and numbers are important, as some more basic variants offer only 40 Gbps InfiniBand and 10 GbE Ethernet.
The QSFP+ cages just have card-edge connectors; all the magical optics and lasers are integrated inside the metal QSFP+ transceivers, either built into a single assembly in the case of an AOC or provided separately for MTP/MPO fiber ports.
I did not take a photo of the 50 meter cable, as it looks nearly identical, with OM2 fiber and two Dell-branded QSFP+ transceivers.
Initial check with Windows 2008 R2 machine
I plugged both cards into the same machine and connected them with the cable to see if there is actually an active link between the ports. Never assume that an item from the secondary market is good and working (even if the seller promises so) without an actual test and verification in your own machine. It could have been damaged in shipping or turn out to be incompatible with your components, so never assume, always verify.
Here I was a little concerned whether the Arista AOC would work in a Mellanox card, but no issues turned up after re-configuring both cards into Ethernet mode (a sketch of the port-mode change follows the manuals below) and installing the MLNX_VPI_WinOF 5.35 driver for Windows 2008 R2 x64, so we can continue with real-use-case testing next. There are some manuals available as well:
Mellanox VPI WinOF user manual for v5.35
Mellanox Windows Network Adapter Management, Rev 4.80
Mellanox VPI WinOF release notes for v5.35
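Switching ConnectX-3 ports between InfiniBand and Ethernet is normally done with the mlxconfig utility from Mellanox Firmware Tools (MFT), which is also available for Windows. A minimal sketch, assuming MFT is installed; the device path is only an example, so check the output of mst status on your own system:

mst start                                   # load the mst driver and enumerate devices
mst status                                  # note the device path, e.g. /dev/mst/mt4103_pci_cr0
mlxconfig -d /dev/mst/mt4103_pci_cr0 set LINK_TYPE_P1=2 LINK_TYPE_P2=2   # 2 = Ethernet, 1 = InfiniBand
# reboot for the new port protocol to take effect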
Server used for testing
One card was installed in an HPE DL380 Gen10 server with plenty of flash storage, running two Xeon Gold 6126 processors, a good amount of DDR4 SDRAM, and both SAS and NVMe SSDs. The server runs the FreeBSD 12 operating system, so a few extra steps were needed for the Mellanox card to be detected and operate properly.
Configuration and first tests
First I needed to load the Mellanox kernel module so the card could be brought up and initialized. This is done with the kldload mlx4en command; after it you should see mlxen0 and mlxen1 ports in the ifconfig output. Then we can assign a static IP to the port and see if it works :)
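A minimal sketch of that server-side setup, using the address shown below (adjust the interface name and netmask to your setup):

kldload mlx4en                                           # load the ConnectX-3 Ethernet driver
ifconfig mlxen1 inet 10.20.0.1 netmask 255.255.255.0 up  # assign a static address and bring the port up
ifconfig mlxen1                                          # verify the address and link status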
mlxen1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=ed07bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,LRO,VLAN_HWFILTER,VLAN_HWTSO,LINKSTATE,RXCSUM_IPV6,TXCSUM_IPV6>
        ether 24:8a:07:da:15:82
        inet 10.20.0.1 netmask 0xffffff00 broadcast 10.20.0.255
        media: Ethernet autoselect (40Gbase-CR4 <full-duplex,rxpause,txpause>)
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
And work it does!
@ ping 10.20.0.10
PING 10.20.0.10 (10.20.0.10): 56 data bytes
64 bytes from 10.20.0.10: icmp_seq=0 ttl=128 time=0.150 ms
64 bytes from 10.20.0.10: icmp_seq=1 ttl=128 time=0.150 ms
64 bytes from 10.20.0.10: icmp_seq=2 ttl=128 time=0.112 ms
64 bytes from 10.20.0.10: icmp_seq=3 ttl=128 time=0.131 ms
64 bytes from 10.20.0.10: icmp_seq=4 ttl=128 time=0.110 ms
Pings are coming through just fine, and a 1 GByte test file was sent over the link via FTP in just 1.7 seconds (mostly limited by the slow SATA SSD on the test client machine, not by the network link itself).
To make use of the big juicy link it's recommended to increase the MTU to enable so-called jumbo frames. The settings can be made permanent in /etc/rc.conf on FreeBSD as well. Be sure to check that the MTU is correctly applied with the ifconfig utility.
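A minimal sketch of the persistent configuration, assuming the same address as above; the driver is loaded at boot from /boot/loader.conf and the interface is configured in /etc/rc.conf:

# /boot/loader.conf -- load the ConnectX-3 Ethernet driver at boot
mlx4en_load="YES"

# /etc/rc.conf -- static address with a 9000-byte MTU on the first port
ifconfig_mlxen0="inet 10.20.0.1 netmask 255.255.255.0 mtu 9000"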
mlxen0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 9000
        options=ed07bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,LRO,VLAN_HWFILTER,VLAN_HWTSO,LINKSTATE,RXCSUM_IPV6,TXCSUM_IPV6>
        ether 24:8a:07:da:15:81
        inet 10.20.0.1 netmask 0xffffff00 broadcast 10.20.0.255
        media: Ethernet autoselect (40Gbase-CR4 <full-duplex,rxpause,txpause>)
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
That's about it on the server side.
Client workstation used for testing
The client machine has the following hardware:
- Intel Core i9-7980XE 18-core LGA2066 processor
- EVGA X299 DARK motherboard
- 96 GB DDR4 UDIMM memory
- EVGA GeForce GTX 1080 Ti K|NGP|N Edition
- Mellanox ConnectX-3 card in a PCIe Gen3 ×8 CPU-attached slot
- 50 meter Finisar Dell-branded 40 GbE QSFP+ AOC (FCBG410QB1C50-FC)
- Windows Server 2016 OS
Tests and benchmarks
I used an iperf3 server running on the BSD machine and an iperf3 client on the Windows PC.
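Roughly, the invocation looked like this (a sketch; -P sets the number of parallel streams, and the 10-second run length is iperf3's default):

# on the FreeBSD server
iperf3 -s

# on the Windows client: 4 parallel streams to the server
iperf3 -c 10.20.0.1 -P 4

Results for the 4-stream run with the default MTU of 1500: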
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-10.00  sec  3.50 GBytes  3.01 Gbits/sec   sender
[  5]   0.00-10.00  sec  3.50 GBytes  3.00 Gbits/sec   receiver
[  7]   0.00-10.00  sec  3.49 GBytes  3.00 Gbits/sec   sender
[  7]   0.00-10.00  sec  3.49 GBytes  3.00 Gbits/sec   receiver
[  9]   0.00-10.00  sec  3.49 GBytes  3.00 Gbits/sec   sender
[  9]   0.00-10.00  sec  3.49 GBytes  3.00 Gbits/sec   receiver
[ 11]   0.00-10.00  sec  3.48 GBytes  2.99 Gbits/sec   sender
[ 11]   0.00-10.00  sec  3.48 GBytes  2.99 Gbits/sec   receiver
[SUM]   0.00-10.00  sec  14.0 GBytes  12.0 Gbits/sec   sender
[SUM]   0.00-10.00  sec  14.0 GBytes  12.0 Gbits/sec   receiver
Speed hovers around 12-13 Gbps; perhaps some settings still need tuning to make it faster. But even this is a significant improvement and a real help for my simple workloads, such as copying files between the server and the workstation. For example, an actual copy of a 3.2 GB video file takes just 4 seconds at ~800 MBytes/s; over a 1 GbE connection this would take about half a minute.
Now let's change the MTU to 9000 by executing ifconfig mlxen0 mtu 9000 and test the speed again with jumbo frames enabled.
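Jumbo frames only help if both ends accept them; on the Windows side the corresponding setting is typically the Jumbo Packet entry in the adapter's advanced properties. A quick way to verify the path from the FreeBSD side (a sketch; 8972 bytes = 9000 minus 20 bytes of IP and 8 bytes of ICMP header):

# send a full-size packet with the Don't Fragment bit set;
# if any hop is still limited to a 1500-byte MTU this will fail rather than silently fragment
ping -D -s 8972 10.20.0.10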
iperf3 with 4 streams reports a clear increase in speed, with a total of about 24.5 Gbps.
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-10.00  sec  7.25 GBytes  6.23 Gbits/sec   sender
[  5]   0.00-10.00  sec  7.25 GBytes  6.23 Gbits/sec   receiver
[  7]   0.00-10.00  sec  7.22 GBytes  6.20 Gbits/sec   sender
[  7]   0.00-10.00  sec  7.22 GBytes  6.20 Gbits/sec   receiver
[  9]   0.00-10.00  sec  7.06 GBytes  6.06 Gbits/sec   sender
[  9]   0.00-10.00  sec  7.06 GBytes  6.06 Gbits/sec   receiver
[ 11]   0.00-10.00  sec  7.00 GBytes  6.01 Gbits/sec   sender
[ 11]   0.00-10.00  sec  7.00 GBytes  6.01 Gbits/sec   receiver
[SUM]   0.00-10.00  sec  28.5 GBytes  24.5 Gbits/sec   sender
[SUM]   0.00-10.00  sec  28.5 GBytes  24.5 Gbits/sec   receiver
And copying some videos from the Intel DC P4500 4TB CPU-connected SSD in the client machine shows actual transfer speeds well over 1 GByte/s. I plan to install an array of SAS SSDs in the workstation soon as well, so hopefully we can see data rates over 2 GByte/s between the storage pools next month.
Running the upload in three streams gives good speeds as well, limited only by disk performance.
Conclusion
A cheap high-speed fiber link is indeed very possible and doable today in 2025. Perhaps with a little extra cash one could even get a faster 100 GbE setup with QSFP28 ports, but cables for that start to get pretty expensive as the length grows.