Networking for Post: Exceeding 10 Gigabit

This article wraps up a series where I have been exploring different networking options for exceeding the bandwidth limitations of Gigabit Ethernet.  10GbE has come down in price, and has become more common for post production work over the last few years, but is not quite fast enough for uncompressed 4K at higher framerates, higher resolutions, or multiple streams.  There are technologies that offer bandwidths exceeding 10Gigabits a second, but there is minimal information and familiarity with these options outside the datacenter space.  The original approach was 40GbE, which combines 4 channels of 10GbE in a single connection, while the newer 25GbE standard is based on increasing the signaling frequency of a single connection.
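To put rough numbers on why 10GbE runs out of headroom, here is a small Python sketch of the data rate arithmetic for uncompressed image sequences. It assumes 10-bit RGB frames packed into 4 bytes per pixel (a common DPX layout), ignores file headers and protocol overhead, and the 6K frame size is just an illustrative choice, so treat the results as estimates rather than measurements.

```python
def uncompressed_rate_mb_s(width, height, fps, bytes_per_pixel=4):
    """Approximate sustained data rate in MB/s (10^6 bytes) for an image sequence.

    bytes_per_pixel=4 assumes 10-bit RGB packed into 32 bits per pixel.
    """
    return width * height * bytes_per_pixel * fps / 1e6


# Raw 10GbE line rate in MB/s, before any protocol overhead.
TEN_GBE_MB_S = 10e9 / 8 / 1e6

# Illustrative formats; the 6K frame size is an assumption for this example.
formats = [
    ("UHD 24p", 3840, 2160, 24),
    ("UHD 60p", 3840, 2160, 60),
    ("6K 24p", 6144, 3240, 24),
]

for label, width, height, fps in formats:
    rate = uncompressed_rate_mb_s(width, height, fps)
    verdict = "fits within" if rate < TEN_GBE_MB_S else "exceeds"
    print(f"{label}: {rate:,.0f} MB/s, which {verdict} the 10GbE line rate")
```

A single UHD stream at 24p fits, but raising the frame rate, the resolution, or the stream count quickly pushes past what one 10GbE link can carry.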

40GbE seems like it could be an affordable way to exceed 10GbE for some users, especially for cheap direct links between systems. It should offer 5GB/sec of bandwidth, which is as much as a large disk array or NVMe SSD can serve up. 40GbE hardware was originally developed not long after 10GbE, and was primarily used for aggregate links between switches. Then it was used to connect servers to the switches, replacing the teamed NICs that had been used to increase bandwidth up to that point, but it was never widely used to connect individual client systems to a network. 40GbE uses a single Quad Small Form-factor Pluggable (QSFP) port, which allows connection of direct attach cables with four pairs of TwinAx copper, or over multi-line fiber with MTP connectors in QSFP transceivers. QSFP ports can also be adapted to single channel SFP ports via a QSA (Quad to Single Adapter). On the switch side, a single QSFP port can be adapted to support 4 separate 10GbE SFP connections, but NICs apparently do not support this approach. That would, among other things, require the OS to view it as 4 separate adapters, with unique settings and IPs to configure. If there is a way to do that, I would like to know, as it would make for some interesting switchless 10GbE resource sharing options.

Normally when you divide network traffic between multiple channels, the load balancing directs all traffic from a single request over a single link. This isn’t an issue when your storage server is streaming data to a number of clients, where no individual recipient will receive data faster than one of the individual links, but the aggregate total of those connections will exceed an individual link. So in a traditional use case, you have eight workstations connected to a switch at 10Gb, and the media server they are all pulling data from is connected with a 40Gb link. Each system can get up to 10Gb from the server, up to 40Gb total. But what if one of those workstations were connected at 40Gb, and everyone else was off for the weekend? Could a single transfer exceed 10Gb? Or would you have to initiate multiple transfer tasks to utilize the four component links? I asked this a few years ago at NAB, and none of the representatives from the networking companies there had a solid answer for me.
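The per-request behavior of conventional link bonding can be sketched in a few lines of Python. This is a simplified model I am assuming for illustration: real switches and NIC teaming modes use their own vendor-specific hash policies, and the function name, addresses, and ports here are all hypothetical. It only models traditional bonding, not how a 40GbE port handles its four lanes internally.

```python
import zlib


def pick_link(src_ip, dst_ip, src_port, dst_port, num_links=4):
    """Map a flow to one member link by hashing its addresses and ports.

    Hypothetical policy for illustration; actual hash inputs vary by vendor.
    """
    key = f"{src_ip}:{src_port}->{dst_ip}:{dst_port}".encode()
    return zlib.crc32(key) % num_links


# One big file copy is a single flow, so every packet lands on the same link
# and the transfer cannot go faster than that one link.
single_copy_links = {
    pick_link("10.0.0.2", "10.0.0.1", 50000, 445) for _ in range(1000)
}
print("links used by one transfer:", len(single_copy_links))

# Eight clients produce eight different flows, which the hash can spread
# across the member links (the exact spread depends on the hash).
client_links = [
    pick_link(f"10.0.0.{n}", "10.0.0.1", 50000 + n, 445) for n in range(2, 10)
]
print("link chosen per client:", client_links)
```

Under this model a lone workstation would be stuck at the speed of one member link unless it opened several parallel transfers, which is exactly the concern raised above.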

The next approach to exceeding the 1.25GB/sec bandwidth limit of 10GbE was 25GbE, which is designed to eventually replace 10GbE all the way down to the client level when needed. It is much simpler than 40GbE, because they just increased the signal frequency in the SFP connector, and that is basically it. The other simplifying factor is that there is no support for older connection types like CX4, or even RJ-45 over twisted pair (CAT#) cable. All 25GbE connections come in the form of SFP28 ports (with the 28 referring to the 28Gb/s maximum signaling rate, which allows compatibility with InfiniBand protocols that use the same hardware). Similar to SFP+ ports, SFP28 ports can be connected with direct attached TwinAx cables for short distances, or transceivers with fiber for longer runs. It is a single unified link, so there is no question about maximum speeds, and you should be able to transfer 3GB/s over 25GbE connections, if your storage can support that rate. And what’s more, 25GbE links can be combined to create 50GbE and 100GbE connections, using QSFP28 ports (4x25GbE links for 100Gb of aggregate bandwidth, similar to 4x10GbE links for 40Gb of aggregate bandwidth). So there is lots of bandwidth available, and VERY few users need to (or even CAN) move more than 12GB/sec. The only downside is price. 40GbE gear can be found for cheap on eBay because other people are moving from 40GbE to 25GbE based solutions, including 50GbE and 100GbE. But 25GbE gear hasn’t been available long enough to make it into the used market, so 25GbE based hardware still commands a much higher price. And while that is where high end post production solutions are moving eventually, that doesn’t necessarily make it the best choice for everyone at the moment.
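The bandwidth figures quoted throughout this article all come from the same simple conversion, sketched here in Python. The function names are my own, and the results are theoretical line rates that ignore encoding and protocol overhead, so real transfers will come in somewhat lower.

```python
def aggregate_gbit(lanes, gbit_per_lane):
    """Nominal link speed in Gb/s for a port built from identical lanes."""
    return lanes * gbit_per_lane


def line_rate_gb_s(gbit):
    """Convert a nominal link speed in Gb/s to GB/s (8 bits per byte)."""
    return gbit / 8


# (name, lanes, Gb/s per lane) for the link types discussed in this article.
links = [
    ("10GbE", 1, 10),
    ("25GbE", 1, 25),
    ("40GbE", 4, 10),
    ("50GbE", 2, 25),
    ("100GbE", 4, 25),
]

for name, lanes, per_lane in links:
    total = aggregate_gbit(lanes, per_lane)
    print(f"{name}: {lanes} x {per_lane}Gb = {total}Gb/s, "
          f"about {line_rate_gb_s(total):.2f} GB/s theoretical maximum")
```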

I had known about 40GbE for quite a while before I really explored it, because I was familiar with some of the limits of traditional Ethernet bonding. You need a mechanism to break up the request across the channels. My first Fibre Channel SAN used two channels of 4Gb fiber to share two separate storage volumes, which were then combined as a dynamic RAID-0 in Windows Disk Management, and the RAID architecture divided the I/O between the two channels. This worked, but was hard to manage, and the added complexity opens up all sorts of potential problems. So I was only interested in using 40GbE if it wouldn’t require any of these types of workarounds. And no one seemed to know for sure if it would, because similar to bonded Ethernet, 40GbE was normally only used in situations where data from many connections was aggregated, making it easy to divide between separate channels. So it was time to run some tests. Could 40GbE be used to connect high bandwidth workstations at relatively low cost, or would the more expensive 25GbE gear be required?

While 40GbE capable switches aren’t cheap, the 40GbE cards are. I bought two PCIe cards (Mellanox ConnectX-3 MCX354A) for this experiment for under $100 on eBay. I got the cards installed, and connected them with a QSFP TwinAx cable. I installed Mellanox’s WinOF 5.5 drivers, and set the cards to 40GbE mode, instead of 56Gb InfiniBand mode, which I haven’t explored yet. Beyond that, the steps are the same as for configuring a 10GbE direct link, or a 100GbE one, which I detailed in the second article in this series. It is a bit more work to establish a single link network than to connect to an existing network switch and router. In this case I set up my 40GbE direct link to run on its own subnet. I also recommend setting the Jumbo Packet threshold to 9014 for best performance. Usually when establishing a high speed direct connection, all of your systems are already connected to the same gigabit network for internet access and such. And we are going to want to make sure the traffic between those systems is routed over the new link, and not the existing gigabit network. To do that, we use those unique IP addresses that are in a separate subnet. So in Windows, map a network drive with the path: \\[OppositeSystemIP]\[ShareName] like \\\RAID.
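As a quick sanity check on the addressing, here is a minimal Python sketch that confirms two addresses share a subnet and builds the matching share path. Every address below is a hypothetical placeholder chosen for illustration, not the values from my setup, and the /24 prefix length is likewise an assumption.

```python
import ipaddress


def same_subnet(local_ip, remote_ip, prefix_len=24):
    """Return True if remote_ip falls inside local_ip's subnet."""
    network = ipaddress.ip_network(f"{local_ip}/{prefix_len}", strict=False)
    return ipaddress.ip_address(remote_ip) in network


def unc_path(remote_ip, share_name):
    """Build a Windows share path of the form \\\\[IP]\\[ShareName]."""
    return "\\\\{}\\{}".format(remote_ip, share_name)


# Hypothetical addresses: one pair on the direct link, one on the gigabit LAN.
direct_local, direct_remote = "192.168.40.1", "192.168.40.2"
gigabit_remote = "192.168.1.50"

print(same_subnet(direct_local, direct_remote))   # both ends of the direct link
print(same_subnet(direct_local, gigabit_remote))  # gigabit address, other subnet
print(unc_path(direct_remote, "RAID"))
```

Mapping the drive by the direct link address, rather than by hostname, is what keeps the traffic off the slower gigabit network.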

Then you can try copying files to or from that network drive. Assuming your storage volumes themselves on both systems are fast enough (RAIDs or SSDs), you should get around 800-1000MB/s if you have a 10GbE direct link. It would be nice to get 4-5GB/s over 40GbE, but apparently the Windows TCP/IP stack is only able to process about 2GB/s without using third party tools to divide the copy over multiple threads. But anything exceeding 1.25GB/s is enough to prove that we aren’t limited to a single 10Gb link for a discrete copy task. Even if only one of the connected systems has storage that supports that data rate, the connection can be tested by playing back high-res uncompressed video across the network. (Try exporting some lower res content to 4K or 6K DPXs.) If neither system has storage that supports that data rate, then you oversized your network. But for testing purposes you can create RAM drives, share them over the network, and copy between them to test maximum network performance. This approach removes the potential storage bottleneck for any network bandwidth test, as long as you have the available RAM. My home systems both have drives that run over 1000MB/s, so they are optimal candidates for a 10GbE link. The 40GbE link I was testing here was overkill for them, but I was able to get over 2GB/s between RAM drives on the systems. These cards will be optimal for one of my client locations, where they have a number of systems with large RAIDs and SSDs that get 2-3GB/s. So that is probably where they will end up being used long term, connecting the main editor to the largest storage server at maximum speed.
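If you would rather have a number than watch the copy dialog, a timed copy is easy to script. This is a minimal Python sketch, not the tool I used for the tests above: the function names are my own, the file paths in the example are hypothetical, and OS caching can inflate the result for files that were recently read, so use a large file that has not been touched since boot.

```python
import os
import shutil
import time

# Raw line rate of a single 10Gb link in MB/s (10^6 bytes).
SINGLE_10GB_LINK_MB_S = 1250.0


def measure_copy_mb_s(src_path, dst_path):
    """Copy one file and return the observed throughput in MB/s."""
    size_bytes = os.path.getsize(src_path)
    start = time.perf_counter()
    shutil.copyfile(src_path, dst_path)
    # Guard against a zero interval on very small files.
    elapsed = max(time.perf_counter() - start, 1e-9)
    return size_bytes / elapsed / 1e6


def exceeds_single_10gb_link(rate_mb_s):
    """True if the rate is beyond what one 10Gb link could carry."""
    return rate_mb_s > SINGLE_10GB_LINK_MB_S


if __name__ == "__main__":
    # Hypothetical paths: a local RAM drive and a mapped network RAM drive.
    rate = measure_copy_mb_s(r"R:\test.bin", r"Z:\test.bin")
    print(f"{rate:,.0f} MB/s")
    print("beyond one 10Gb link:", exceeds_single_10gb_link(rate))
```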

When I installed one of the 40GbE cards in my Sonnet Thunderbolt PCIe Breakaway Box, I was able to get a 1.5GB/s connection to my laptop, but this is only marginally better than the 1GB/s I can get with their much easier to use Solo10G Thunderbolt NIC. Sonnet offers both SFP and RJ45 versions of the Solo10G adapter, which is the optimal high bandwidth solution for most Thunderbolt laptops, while QNAP offers a USB-C adapter that provides 5Gb NBase-T support for systems without Thunderbolt. Using PCIe expansion boxes allows bandwidths up to the 40Gb limit for Thunderbolt 3, but that is probably not a reasonable solution for most laptop users. Laptop users shouldn’t need 40GbE, but since I had the parts to test it, I gave it a try. I was also able to connect the 40GbE cards to the QSW-2104 switch I reviewed in my previous article, by using a QSA adapter and a direct attached SFP cable, or by using an MTP to 4xLC breakout cable. This allows me to have a triangle of connections using the dual QSFP port cards, linking the systems together at 40Gb, and giving them each a 10Gb connection to the rest of my network. This makes for an interesting application combining all of the technologies I have been exploring, but the main purpose of these dedicated high bandwidth direct link connections is to allow a power user maximum speed access to their files on a shared storage location, or to allow two users to share files that are stored locally on one of the systems.

The end result is that yes, 40GbE can be used to connect workstations, and exceed 10Gb of bandwidth for individual transfers or playback. This will only be practical when all of the systems are in close proximity, but could be scaled to a large number of systems via 40GbE switches that are relatively cheap on eBay. For any other use, or for better future compatibility, investing in 25GbE based products will probably be worth the money, if you need that level of performance and bandwidth. But delaying that upgrade until prices come down, by using 10GbE or 40GbE in the interim until you outgrow those limits, will probably save you lots of money.
