Stand by for light speed: high performance computing in financial services
Most debates about High Performance Computing in financial services quickly turn into conversations about high frequency trading, but there are many more reasons for getting the best out of systems. Electronics and computer technology have always pushed the boundaries of smaller, faster, cheaper (or at least, ‘more affordable’) and financial services firms have always been quick to take advantage of the latest advances.
High frequency trading, co-location and using line-of-sight microwave comms links between tethered balloons to reduce signal delays all make for exciting headlines, but the reality is that understanding the hardware limitations of a system and the way that it interacts with software can make an enormous difference to the efficiency of almost any system.
In a world that is moving rapidly to virtualisation and the cloud, there is every reason to look at the code structure and tweak it as much as possible – it may not matter too much that a few clock cycles are being shaved off routine programs, but they all add up, and if you can get more onto your servers, you’ll be paying less for more.
It’s a highly specialised world, though, and there are many different approaches touted by their evangelists – in-memory processing, solid state drives, DRAM drives, field programmable gate arrays (FPGAs), general purpose graphics processing units (GPGPUs) and so on. Further out, some even see quantum entanglement as a potential game-changer.
Typically, it is about getting the right element of the system doing the right job at the right time, rather than having a hugely powerful CPU waste its resources on trivial tasks or sit around waiting for data to process.
“What we are doing is having a FIX offload engine, in the same way that you have a TCP/IP offload engine,” says Kevin Houstoun, chairman of Rapid Addition, which specialises in the processing of FIX Protocol trading messages. TCP/IP offloading is used in high speed networking such as 10 Gigabit Ethernet, and improves performance by having the TCP/IP packets processed by the network controller rather than the attached computer.
Case study: processing tick data
RSJ, the biggest trader on NYSE Liffe and a large trader on other derivatives exchanges, uses Kx’s kdb+ to support its algorithmic trading.
RSJ is the largest trader of financial derivatives in the Czech Republic and needs to process vast quantities of data at very high speeds. The company collects data on numerous instruments, with over 10 million records per day on Eurodollar futures alone. It uses several months of tick data of the most liquid securities in the world, mostly Eurex, NYSE Liffe and CME futures, to run intra-day trading simulations and what-if scenarios.
During the evaluation process RSJ pushed potential systems to the limit before concluding that Kx Systems came out on top on the key requirements: speed and the ability to process complex requests and large data sets.
Martin Duchácek, head of algorithmic system development at RSJ, says: “We are seeing substantial improvements with Kx’s kdb+. As well as very significant reductions in processing times, where previously a query on a day’s data would take a couple of hours – which is far too slow – with kdb+ we can write a query in a couple of minutes and see the results in seconds. This allows us to react to market situations almost immediately. kdb+ provides us with quick support for brainstorming and allows us to do things we were previously unable to do.”
Simon Garland, chief strategist at Kx Systems, says: “RSJ is the ideal client for making good use of Kx’s combination of language and high-performance database. It collects huge quantities of tick data and needs to access and query it very quickly in order to be able to create and test models, test strategies, identify unusual market situations and react to them.”
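To give a flavour of the kind of query involved – this is a hypothetical sketch in Python with invented field names and figures, not RSJ’s actual kdb+ code, where such an aggregation is typically a one-liner in q – an intra-day volume-weighted average price per symbol and minute might look like this:

```python
# Hypothetical sketch of an intra-day tick aggregation: VWAP per
# (symbol, minute) bucket. Data and field names are invented.
from collections import defaultdict

ticks = [
    # (symbol, minute-of-day, price, size)
    ("GEZ3", 540, 99.25, 10),
    ("GEZ3", 540, 99.26, 5),
    ("GEZ3", 541, 99.24, 20),
]

def minute_vwap(ticks):
    """Volume-weighted average price per (symbol, minute) bucket."""
    notional = defaultdict(float)
    volume = defaultdict(int)
    for sym, minute, price, size in ticks:
        notional[(sym, minute)] += price * size
        volume[(sym, minute)] += size
    return {key: notional[key] / volume[key] for key in volume}

print(minute_vwap(ticks))
```

The point of a columnar, in-memory database such as kdb+ is that this style of bucketed aggregation runs over hundreds of millions of rows in seconds rather than hours.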
“What our FIX offload engine does is handle the FIX packet processing. This is taking a function that is normally always serial and offloading it so that it isn’t being done by the CPU, and that has multiple advantages, including lower latency: in a software-based FIX engine the round trip from tick to trade through a CPU is around 10 microseconds – with this technology it drops to 5 microseconds.”
The second benefit is that, in an Intel CPU environment, the parsing of FIX messages isn’t taking up any cache memory, which can then be used for whatever process you are running, such as a trading algorithm.
Currently the software-based FIX engines from Rapid Addition handle some 100,000 messages a second. “By offloading it – and we have yet to get the final results to see how much the CPU can cope with – we can actually process FIX messages at line rate, and line rate on a 10G card is 6.25 million 200-byte messages a second. It is a step change,” says Houstoun.
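That line-rate figure is straightforward arithmetic: a 10 Gbit/s link divided by a 200-byte message size gives 6.25 million messages a second, before allowing for framing overheads, which in practice reduce it:

```python
# Sanity check of the quoted line-rate figure: how many 200-byte
# messages fit down a 10 Gbit/s link per second (ignoring framing
# overhead, which lowers the real-world number).
line_rate_bps = 10_000_000_000   # 10 Gbit/s
message_bits = 200 * 8           # 200-byte message

messages_per_second = line_rate_bps / message_bits
print(messages_per_second)  # 6250000.0
```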
But more surprisingly, he points to the big benefit as being a dramatic reduction in the number of servers users have to deploy to service their customer bases. “For something as predictable as taking FIX ASCII and turning it into the binary code that the CPU needs, you can make great savings, and the challenge of having to do more with less means that scalability is important,” he says.
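The “predictable” conversion Houstoun describes – ASCII tag=value pairs turned into the native types a CPU works with – can be sketched as follows. This is a simplified illustration, not Rapid Addition’s implementation; the tags shown are standard FIX tags:

```python
# Minimal sketch of the work a FIX engine's parser does: splitting
# SOH-delimited tag=value pairs and converting values to native types.
SOH = "\x01"  # FIX field delimiter

def parse_fix(msg: str) -> dict:
    """Split a raw FIX message into a {tag: value} dictionary."""
    fields = {}
    for pair in msg.rstrip(SOH).split(SOH):
        tag, _, value = pair.partition("=")
        fields[int(tag)] = value
    return fields

# Fragment of a new-order message (35=MsgType, 55=Symbol,
# 38=OrderQty, 44=Price):
raw = SOH.join(["35=D", "55=VOD.L", "38=100", "44=185.5"]) + SOH
parsed = parse_fix(raw)
qty = int(parsed[38])      # OrderQty as a native integer
price = float(parsed[44])  # Price as a native float
print(parsed[35], qty, price)
```

Because the format is this regular, the parsing loop is exactly the sort of fixed-function work that can be moved off the CPU and into dedicated hardware.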
Simon Garland, chief strategist at Kx Systems, a specialist in high performance databases and time series analysis, known for its kdb+ database programming language, agrees. “CPUs today are extremely fast, but it takes a lot to get the data in and out, so there is a big premium on having clever programmers to make sure that everything is running efficiently.”
On the other hand, he cautions, you can go too far in search of perfection in coding. “You may be the first to spot a trade opportunity, but a trading algo can have a lifespan of a couple of weeks – if you’re crafting that in assembler code, you’ll probably be out of date by the time you’ve finished.”
Garland says the key is an understanding of what data needs to be where, and when: the most relevant data should be as close to the CPU as possible. These days that means using dynamic RAM disks up close to the CPU, and then going down a hierarchy through solid state drives – SSDs – to traditional disks and on down to offline, off-site archiving.
In performance terms, SSDs make a huge amount of sense compared to their predecessors, the Winchester hard disk drives (“spinning rust”, as SSD advocates dismissively call them) that have been in use since the early 1980s in various physical sizes and data capacities. The one thing still holding back their adoption is the relative cost.
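To see why the hierarchy matters, consider the commonly quoted order-of-magnitude access latencies for each tier – approximate figures only, as real numbers vary by hardware generation:

```python
# Rough, widely quoted access latencies for the storage hierarchy,
# in nanoseconds. Orders of magnitude only.
latency_ns = {
    "DRAM": 100,                          # ~100 ns
    "SSD (random read)": 100_000,         # ~100 microseconds
    "Spinning disk (seek)": 10_000_000,   # ~10 milliseconds
}

for tier, ns in latency_ns.items():
    print(f"{tier}: {ns / latency_ns['DRAM']:.0f}x DRAM latency")
```

Each step down the hierarchy costs roughly two to three orders of magnitude in latency, which is why keeping the working set in (or near) DRAM dominates the performance of data-hungry trading systems.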
Case study: FPGAs speed up risk controls in Canada
In line with changes introduced in other markets around the world following the so-called ‘Flash Crash’ of 2010 and the collapse of Knight Capital after a near-catastrophic trading error in 2012, the Investment Industry Regulatory Organization of Canada (IIROC) gave its regulated dealers clear supervisory and gatekeeper responsibility to protect against errors related to electronic trading.
The changes were to become effective as of March 1, 2013. However, in recognition of the technology enhancements required to automate the controls, IIROC dealers had until May 31, 2013 to fully test and implement their automated controls and to replace existing systems or introduce new functionality.
One Canadian investment bank was the predominant provider of direct exchange access in the Canadian market and wanted to retain this position by continuing to provide a ‘white-glove’ service to its client base. The conundrum, however, was how to introduce the changes effectively without adding unnecessary latency that would reduce clients’ trading performance.
Fixnetix already had a software solution in place with investment banks and their clients in Europe and was working on further reducing the latency introduced by risk controls by porting to an FPGA-based proprietary hardware solution (iX-eCute), thereby removing the software stack from the in-line controls and minimising latency. Fixnetix had an iX-eCute test rig already established in the US on an equity exchange but had nothing available at that point in Canada. Fixnetix and the bank embarked on an educational program to make clients aware of what was achievable with such a solution and to make sure it was sufficiently adapted to the vagaries of the Canadian marketplace. Clients were encouraged to connect first to the US market, then to a test environment initially set up at the TMX co-location centre and thereafter at the Equinix TR1 data centre.
Fixnetix and the Canadian bank worked to deliver a bespoke solution to meet client needs in terms of functionality, latency, scale, reliability and implementation. The controls were tested in conjunction with clients over a period of weeks, with new functionality drops becoming available periodically, eventually covering over 100 items such as maximum and minimum share quantity, price deviation, order value, trading limits, short sale and pre-borrow rules, GTD, GTC, single cancel, multi-cancel, cancel on disconnect and ‘kill-all’.
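A few of those controls can be sketched in software to show the shape of the problem – an illustrative example with invented limits and field names, not iX-eCute’s actual logic, which runs in FPGA hardware precisely to keep the software stack out of the critical path:

```python
# Illustrative pre-trade risk checks: quantity limits, price
# deviation from a reference price, and maximum order value.
# All limits here are invented defaults.
def check_order(qty, price, reference_price,
                max_qty=10_000, min_qty=1,
                max_deviation=0.05, max_value=1_000_000):
    """Return a list of breached controls; an empty list means pass."""
    breaches = []
    if not (min_qty <= qty <= max_qty):
        breaches.append("quantity")
    if abs(price - reference_price) / reference_price > max_deviation:
        breaches.append("price deviation")
    if qty * price > max_value:
        breaches.append("order value")
    return breaches

print(check_order(qty=500, price=101.0, reference_price=100.0))  # passes
print(check_order(qty=500, price=110.0, reference_price=100.0))  # deviates
```

In hardware, each of these comparisons becomes a parallel fixed-latency circuit applied to every order in flight, which is how an FPGA keeps the added delay to a handful of microseconds regardless of how many controls are switched on.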
Multi-user profile monitoring and intra-day amendments were made possible via a Fixnetix proprietary GUI called iXEye. Client configuration and status was viewable and filterable with various warning alerts configured so that client limit breaches could be highlighted and avoided prior to any breach occurring. Additionally, a full audit trail was required as a matter of course.
The system needed to be robust enough to take high-volume client order flow over multiple sessions, refer to start-of-day files, client configuration files and real-time market data, and provide an order-by-order report of the day’s trading.
Message stores needed to cover millions of messages, with hour-long burst rates of more than 10,000 orders per second. The latency added by the system needed to be imperceptible – lost in the ‘static’ of the marketplace.
The bank was able to launch within the regulatory timeframe, keep its client base happy, protect both clients and the bank, and yet maintain the sophisticated trading regime that clients had enjoyed up to that point. The environment has provided a robust platform for future volume growth and a solid base from which to target further customers.
A recent deployment is at GFI Group, a provider of wholesale brokerage, electronic execution and trading support, which is deploying Flash Memory Arrays from Violin Memory to increase the speed and capacity of its trading platforms across all asset classes.
Replacing disks with solid state storage is part of a larger project GFI is implementing to prepare its electronic trading infrastructure for its planned Futures Exchange and Swap Execution Facility. Jerry Dobner, chief technology officer at GFI Group said: “We looked to increase the speed, capacity, and density of our shared data storage platform. By embracing this new technology, our clients will benefit from faster transaction speeds and a highly scalable electronic trading infrastructure.”
But having all the data in the right place in the right kind of memory on the fast CPUs won’t help that much if the whole IT infrastructure is not performing, which is where people like ITRS Group come into the picture. ITRS’s Geneos is a performance monitoring and management platform, used by some 90 global financial institutions, including investment banks, exchanges and trading venues, hedge funds, brokers and data vendors around the world. It has been most recently deployed to monitor a new multi-asset trading platform being rolled out by one of South Africa’s largest banks.
The rise of African online trading, in preference to phone-based trading, means the bank has extended its FX offering via a new, web-based trading platform. Focussing on FX and precious metals, for corporate clients in particular, the offering provides prices, analysis and trade execution services. The platform covers a range of instruments, including spot, forwards and swaps.
Kevin Covington, chief executive of ITRS, said: “Our client’s new strategy required a scalable and robust solution to monitor the whole backbone of the platform, including the complete flow, from pricing through to execution. ITRS Geneos [maintains] high levels of performance and availability of key components, including everything from FIX to the pricing engine.”
As African traders continue to move towards e-trading, the region’s banks are increasingly adopting diverse infrastructures to meet the needs of both developing and developed markets. Consequently, the South African bank was faced with the challenge of implementing technology new to the African market that meets its need for high-performance multi-asset trading.
Covington likens Geneos to industrial systems. “You can’t run a decent factory without instrumentation,” he said. As institutions increasingly move to offload elements of their infrastructure, these measurements become ever more crucial. “Firms are not going to change their technology stacks overnight, so managing existing assets is important to have availability, performance and scalability.”
And if you are going to measure things at this level, you’ll need a good stopwatch, for which you might turn to the likes of Perseus Telecom, which recently announced its High Precision Time offering, providing deterministic synchronisation with the international atomic clock-based Coordinated Universal Time (UTC) standard.
There are several time platforms in use, generally using a combination of GPS time data and the Network Time Protocol, but both are proving less reliable for high-precision time synchronisation, reckons Perseus, whose service instead uses a 1 pulse-per-second electrical signal to set its customers’ server clocks, with better than 1 nanosecond accuracy guaranteed.
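For comparison, NTP-style synchronisation estimates a client’s clock offset from four timestamps taken during a request/response exchange, assuming a symmetric network path – a simplified sketch of the standard offset formula:

```python
# NTP-style clock-offset estimate from one request/response exchange.
# Assumes the network path is symmetric; asymmetry is the main source
# of error, which is why sub-microsecond work uses PPS signals instead.
def ntp_offset(t1, t2, t3, t4):
    """t1: client send, t2: server receive,
    t3: server send, t4: client receive.
    Returns how far the client clock lags the server clock."""
    return ((t2 - t1) + (t3 - t4)) / 2

# Example: client clock 5 ms behind the server, 10 ms one-way delay.
t1 = 0.000   # client sends (client clock)
t2 = 0.015   # server receives (server clock)
t3 = 0.016   # server replies (server clock)
t4 = 0.021   # client receives (client clock)
print(ntp_offset(t1, t2, t3, t4))  # ~0.005, i.e. client is 5 ms behind
```

A dedicated pulse-per-second feed sidesteps this estimation entirely: the edge of the electrical pulse itself marks the top of each UTC second.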
The atomic clock installation at the Equinix LD4 data centre was built to serve local and international broker-dealers and buy-side participants. Stewart Orrell, managing director, global capital markets at Equinix, said: “We are very pleased to have Perseus provide new critical infrastructure for certifiable time data across the Equinix ecosystem both in the US and now in Europe. Having financial firms able to reference UTC time data across existing trading ecosystems helps meet client requirements as they communicate and trade from market to market.”