Over-Clocking and Submersion Cooling – The Future of HFT Technology?
Thu, 04 Aug 2011 09:50:00 GMT
An Interview with Chad Attlesey & Chris Butler
Sponsored by Supermicro UK & Hardcore Computer
As high frequency trading firms and quantitative hedge funds look for ever-increasing performance from their technology, vendors have to be more innovative in the solutions they provide. Two such vendors are Hardcore Computer and Supermicro® UK, who are currently blazing a trail through the HFT community because of the performance gains they offer with their unique submersion cooling technology, which enables them to over-clock CPUs to run faster, cooler and longer.
In the first of two interviews, Chad Attlesey, President, CTO and Co-Founder of Hardcore Computer, and Chris Butler, Business Development Manager at Supermicro UK, discuss the work they are doing in the desktop/workstation space. The second interview will focus on submersion-cooled blade servers in the data centre.
HFT Review: Chad, can you give us a quick introduction to Hardcore Computer and Chris, can you tell us where Supermicro fits in? What’s the relationship between the two companies?
Chad Attlesey: Hardcore Computer was started in my basement, where I developed the technology, filed for the first patents, and structured the business. The company was officially incorporated in 2006. We started out by launching our Reactor-branded desktop line back in 2008, and we’ve been shipping submersion-cooled desktops and workstations since then.
We initially announced our line of servers at a recent Blade Server conference in Florida, winning the Best Data Center Innovation Award. Then we publicly launched the server line at Super Compute 10 in New Orleans, where the booth immediately behind us was Boston Limited, so we began a discussion with them. Boston was the major reseller in Europe of the Supermicro X8DTT line of boards we use in our server and so we developed a partnership around that. That’s how it all evolved.
Chris Butler: Supermicro UK provides the marketing and sales support for financial services solutions. We engage with the Tier 1s, hedge funds and various “solutions providers” who do the software side of things, the fast data feeds and so on.
HFTR: So Hardcore Computer owns the patents for the submersion-cooling technology, builds the systems with Supermicro boards, and Supermicro UK helps market those systems to the financial community in Europe, with Boston Limited acting as the major reseller, correct?
CB: That’s right.
HFTR: Chad, when you first started developing these systems in your basement, what made you focus on submersion cooling?
CA: Well, to cut a long story short, when I was in high school, I built robots that ran mazes, won the International Science Fair Award and wound up spending part of a summer at Livermore National Labs working on Cray supercomputers, among other things. One of them was a fluid-submersion-cooled Cray 2. It had a multi-room, highly volatile, two-phase cooling solution that required boiling of the fluid at a low temperature to allow for adequate cooling. The fluid they used was very expensive, evaporated readily and it pretty much ate ozone for breakfast. From that point I always had in the back of my mind that I wanted to build a desktop supercomputer.
The trigger came one day when I arrived home just in time to prevent my house being burned down by a water-cooled workstation I’d built in my basement office. I had bought an off-the-shelf heat sink, which developed a hairline fracture and was dripping onto the USB card in a PCI Express slot, and the card caught fire. If I had been home just 20 minutes later my house would probably have burned down.
So I thought there’s got to be a better way of doing this over-clocking and getting more performance. I decided to look at alternate ways of doing cooling, and started thinking back to the days of working on the supercomputers, which is how I got back into submersion cooling. But there’s a variety of problems with two-phase fluid cooling solutions, which I wanted to avoid. You have to maintain the right temperatures and pressures; if you get over-boiling, processors start to overheat and you start having catastrophic failures; vapors can cause plasma fires, etc. That’s the reason why there was a big red shut-off button.
I started looking at other types of submersion cooling using different types of dielectric, non-conductive liquids that I could use in direct contact with electronics. I experimented with vegetable oils and other things that had been used in the transformer industry for decades. Eventually I arrived at what we call Core Coolant today, which is a clear, odorless, food-grade product that is 1350 times better than air at heat removal by volume. It’s an almost perfect dielectric, and it breaks down in the environment much better even than vegetable oils.
It’s funny, we always joke about salespeople being able to drink it as long as they don’t stray too far from the bathroom afterwards! It’s actually a very clean, safe product, it’s just a few dollars a gallon versus hundreds of dollars a gallon, it doesn’t readily evaporate, and it doesn’t have all of the environmental issues that the other liquids do. But the main benefit is the heat removal.
HFTR: Was your initial focus to build workstations using this technology, or were you looking at the wider application of liquid cooling within data centers?
CA: I immediately saw the benefit for data centers, because once you can pump that heat outside the building, you can remove all the power requirements for air conditioning and air moving equipment, which is significant. The problem is, as a startup company with no track record, you are not going to walk into a data center saying: “Here is my liquid-filled system. Would you please displace your name-brand products, which are entrenched in the industry, with mine?” It’s never going to happen! So, the Trojan horse approach was to create the world’s most powerful desktop gaming system.
With that in mind we built the Reactor desktop systems, based on the submersion cooling concept but applied to being able to do maximum over-clocking and getting the maximum performance possible. That got our name out there, we proved our technology and we had 55 million web hits! The fact that we had our patents out in public and we never received a challenge allowed us to picket-fence that patented technology, prove it out and get this roomful of data saying that submersion cooling is safe.
The Trojan horse approach came into play because something like fifty percent of the people working in IT centers around the world are also gamers, so they now know about us. Which means that now, when we go into the data center, there is already some built-in knowledge of who we are and there’s more acceptance of the technology as a result. There is a lot of stuff that goes on across the LAN after business hours are over, so when we go into these IT meetings, at least one or two people in the group have been exposed to Hardcore Computer already. So there’s a method to the madness!
HFTR: Getting back to your desktop systems and workstations, what do you offer, what kind of applications are your customers running and what sort of gains are they seeing over the hardware they’re replacing?
CA: We currently have a dual socket workstation called the Detonator, and the next generation of single sockets is called Reactor X. They are both top of the heap as far as benchmarks, performance and capabilities. Also we allow things like three PCI Express ports for example, so you can provide up to 16 different displays off a single system if you so choose. The point is you get all that power right there at the desktop, plus it’s virtually silent in operation.
In the HFT environment, it’s all about speed and performance where milliseconds could mean millions, so these systems are being used for a variety of applications.
CB: The usage depends a little bit on which CPU we are putting in there. So if it’s the Intel 5698 (Everest), which we can over-clock to 4.8GHz, then that’s two cores, which tends to be used more for statistical arbitrage and simpler algorithms. If it’s stuff where they need multiple cores then we generally put in the 5690, which we can over-clock to 4.3GHz now. That’s a dual processor of 4.3GHz in a desk-side unit, which is typically about 30% faster than any equivalent air-cooled system could ever be. And it’s stable, which is the important thing.
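As a rough illustration of the arithmetic behind the figures Chris quotes, the sketch below works out what air-cooled clock speed a "30% faster" claim would imply if performance scaled purely with clock frequency. That linear-scaling assumption and the function names are ours, not Hardcore's or Supermicro's; real workloads rarely scale this cleanly.

```python
def implied_baseline_ghz(overclocked_ghz: float, speedup_fraction: float) -> float:
    """Clock speed a baseline system would need for the over-clocked
    system to be `speedup_fraction` faster, ASSUMING performance is
    directly proportional to clock frequency (a simplification)."""
    return overclocked_ghz / (1.0 + speedup_fraction)


def clock_bound_speedup(overclocked_ghz: float, baseline_ghz: float) -> float:
    """Fractional speedup attributable to clock frequency alone,
    under the same linear-scaling assumption."""
    return overclocked_ghz / baseline_ghz - 1.0


if __name__ == "__main__":
    # Figures from the interview: 4.3GHz over-clock, "about 30% faster".
    baseline = implied_baseline_ghz(4.3, 0.30)
    print(f"Implied air-cooled clock: {baseline:.2f} GHz")
```

On these assumptions, a 4.3GHz part being 30% faster corresponds to a baseline of roughly 3.3GHz. Any gain beyond the raw clock ratio would have to come from elsewhere in the system.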
CA: The other thing to remember about this is we are not just over-clocking the processor. Just because the engine is revving at a higher rate doesn’t mean the wheels are spinning. We have, from the ground up, built a motherboard for this product to make sure that we’ve optimized our speed performance, memory, PCI Express, everything. We’ve optimized SATA so that we can do sub-30 second boots using three internal solid-state drives and RAID, for example. We can do much better on the benchmarking because we have addressed the bottlenecks that are typically there: the things people miss when they just go by the standard layouts they are given and don’t really focus on optimizing LAN links and things like that within the board layout. Anytime you have a timing issue, there’s a potential for data collisions, which slow the system down. We are fully optimized so that even if you were to compare us clock speed to clock speed, we are still going to give you better performance, because we are hitting on all cylinders.
HFTR: For firms who want to run these systems, how much work is typically involved in swapping out their existing hardware and replacing it with your submersion-cooled technology?
CB: One of the key benefits of this technology is that all the software and programming that clients have already done on their existing systems can very easily transport across to these machines, for an immediate performance improvement. Unlike replacing core architecture with FPGAs or GPUs or any of the other technologies around, it is just a case of replacing the hardware, then running the same native C code or C++ or whatever it may be. It just runs faster, it is as simple as that.
But coming back to your question about how these systems are being used, if we look at GPUs specifically, we are also seeing a lot of interest from hedge funds in particular who want to do back testing, where they’re taking historical market data for a week or a year or whatever and running their algorithms to test strategies to see what would and wouldn’t work. If they want to do that running GPUs, we can put GPUs in these boxes and we can over-clock those quite significantly too.
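For readers unfamiliar with the term, back testing as described here means replaying historical prices through a strategy and measuring what it would have earned. The following is a minimal sketch only: the moving-average crossover rule, the function names and the window sizes are hypothetical choices for illustration, not anything Hardcore, Supermicro or their clients have described, and it ignores costs, slippage and position sizing.

```python
from typing import List


def moving_average(prices: List[float], window: int, end: int) -> float:
    """Mean of the `window` prices ending at index `end` (inclusive)."""
    start = end - window + 1
    return sum(prices[start:end + 1]) / window


def backtest_crossover(prices: List[float], fast: int = 2, slow: int = 4) -> float:
    """Replay historical prices through a toy rule: hold one unit over
    the next bar whenever the fast moving average is above the slow one.
    Returns the total profit and loss in price units."""
    pnl = 0.0
    # Start once the slow window is full; stop one bar early because
    # each decision is scored against the following bar's price.
    for t in range(slow - 1, len(prices) - 1):
        fast_ma = moving_average(prices, fast, t)
        slow_ma = moving_average(prices, slow, t)
        position = 1 if fast_ma > slow_ma else 0  # 1 = long, 0 = flat
        pnl += position * (prices[t + 1] - prices[t])
    return pnl


if __name__ == "__main__":
    history = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]  # made-up price series
    print(backtest_crossover(history))
```

A production back test runs loops like this over a week or a year of tick data and many parameter combinations, which is why the work parallelizes well across CPU cores or GPUs.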
CA: We currently have the scalability to put three Tesla M2090 GPUs in a box and run them optimally, at roughly half the temperature they would be running in air. In our liquid-cooled systems, temperature ramps up much more slowly and cools off much faster, which means there is a lot less thermal fatigue from thermal cycling of the boards, so they are going to last longer. Plus they are going to be much more reliable in day-to-day operation, because you are not getting the voltage fluctuations you would normally get from temperature fluxes and you are not getting overheating conditions on certain components, which can create voltage regulation issues, which in turn can lead to bit failure. So we are going to be much more reliable millisecond to millisecond. Predictability is much better, as is the longevity of the product and the reliability over a longer period of time.
HFTR: Which GPUs do you work with?
CB: Primarily it’s been NVIDIA to date, and I would say that will probably continue because the difference is that those guys actually have the double precision capability and ECC error checking, which is critical for most trading applications.
CA: We are also partnered with companies like Fusion-IO for high-speed, high-volume flash storage. We support up to their 5-terabyte Octal cards in our systems.
Also, there’s a variety of connectivity options. We work with Endace cards, for example, that allow for up to four Infiniband or 10GigE connections along with hardware time-stamping capabilities. We fully support that technology on both the desktop and on our server products.
HFTR: So what sort of reaction have you been getting from people who have been testing and using these workstations?
CA: Oh, people that have been using them are delighted, and not just in the financial industry.
We worked, for example, with a geological survey group that gave us a simulation of a modeling application where they were looking at rainfall over five years on a stretch of land and calculating things like erosion effects. They gave us this test application and said, “Can you run this and tell us what the result is and how long it took to run?” We called them back two hours later after downloading it and installing it and said, “Yeah, it’s done and here is the result.” There was a silence on the other end of the phone and we said, “What? Is there a problem?” Their response was, “Well, normally this takes over a day for us to run on our current systems!” They bought a system from us the next day!
We’ve seen similar results in the medical industry, where one of our clients is doing things like tissue analysis using visualization applications and AI routines, looking at cell structures to determine what the problem is – the pathogen or whatever is wrong with this tissue, so they can actually diagnose it. We are typically seeing seven to one improvement in test times.
HFTR: That’s interesting because there are a number of high frequency trading firms and quant funds that use artificial intelligence and complex pattern matching across both CPUs and GPUs. Are you seeing particular interest from those guys?
CB: Yes, certainly for the financial services industry. In the short-term we’re seeing the desk-side unit being used both on trading desks and for doing real-time risk across FX, equities and various other asset classes. The firms that have been lucky enough to test it so far have been delighted with the results in terms of capability, power, performance, quietness and upgradeability.
In fact we recently did a road show with a number of events in Paris and London with Mathematica and in collaboration with NVIDIA. At literally the first show, we sold a system to a hedge fund, not just because it’s so quick but also because it’s so quiet. They’ve bought from us, love it, and their goal is now to deploy another ten units within the quant group at the hedge fund for doing real-time risk calculations, because they can do that on a desk-side unit without the whole team having to wear ear-muffs!
CA: The desktops really are designed to be a beautiful beast! I mean they are large pieces of art. It’s a hundred-pound system that sits at hip height when you are in a seated position, with all the I/O connectors and everything right there with easy access. There is a lot of ergonomic design in it and a lot of engineering went into it. How you maintain it and everything else is actually very well thought out.
HFTR: There are a lot of vendors out there that offer various types of spot-cooling. What are the key differences between your submersion-cooled systems and those?
CA: One benefit of direct submersion cooling, i.e. total liquid submersion cooling, is that we use a dielectric (electrically non-conductive) liquid. With spot cooling of the CPU or maybe the GPU using a water-based cold plate solution, every connection is a potential failure point. It is highly risky; it is just too fraught with potential failures. The failures can be catastrophic. You don’t want that in an office environment. Plus, you are only cooling the CPU and GPU. You are only as good as the weakest link in the chain. If you are not cooling all the electronics, every capacitor, resistor, MOSFET, anything in the voltage regulation chain; if you are not cooling all of it, anything can go. And in those systems you are still relying almost completely on air cooling. For us, we have a heat exchanger on the back of our desktop systems for highly effective heat removal from the liquid. The liquid pulls heat away from all of the components.
HFTR: Any final thoughts?
CA: From the desktop perspective, we are one of only five companies licensed by Intel to sell the Everest 5698 processor. We enjoy a very good relationship with them, which means we can do things with their CPUs that nobody else is doing, like providing products that can provide reliable 4.8GHz performance.
CB: One of Supermicro’s goals is to always be first to market with the highest quality and performance platforms supporting a vast range of application and configuration requirements. So as part and parcel of that, Chad’s strategy is to have the latest and greatest over-clocked and submersion-cooled systems on the market, and we are here to support it.
HFTR: Thank you Chad, Chris.