Scratching the Itch

A couple of weeks ago I found an article by Ryan Day about implementing the Nasdaq ITCH protocol in Python. I was really excited about this because I had always thought the barriers to entry for writing a program relating to the stock market would be very high. It turns out that writing an application for stock markets is not the barrier. The barrier is getting the live data you need to make money with it.

I wanted to use RabbitMQ because I’ve been hearing a lot of great things about it. It has a really nice web interface you can use to check on the status of your queues and see how many messages are passing through. Also, since I had used MassTransit in the past with MSMQ, and MassTransit now supports RabbitMQ, I figured it’d be a breeze to set up.

RabbitMQ was very easy to get up and running, along with the plugin for the web interface. It turns out I don’t remember as much as I thought I did about MassTransit, but Joshua Arnold was able to help me get set up pretty quickly, and I had messages flowing through RabbitMQ in what seemed like no time at all.

The message rate is very slow, though. I realized that I’m only able to publish about 3,000 messages per second to the message queue. After taking MassTransit out of the mix and publishing directly to RabbitMQ, that number rises to about 10,000 messages per second. If I don’t publish any messages at all and only parse the data from the file, I can parse about 450,000 messages per second. So RabbitMQ and MassTransit are the bottleneck, not the parsing.
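For anyone who wants to run the same kind of comparison, here is a minimal sketch of a timing harness in Python. It is not the code from my project: the comma-separated sample record, parse_only, and parse_and_publish are made-up stand-ins (real ITCH messages look different, and the "publish" here only appends to a list), so the numbers it prints say nothing about RabbitMQ itself. The idea is just to time each stage separately so you can see where the rate drops.

```python
import time


def measure_rate(handle, messages):
    """Return messages handled per second for one pass over `messages`."""
    start = time.perf_counter()
    count = 0
    for message in messages:
        handle(message)
        count += 1
    elapsed = time.perf_counter() - start
    return count / elapsed if elapsed > 0 else float("inf")


def parse_only(message):
    # Hypothetical stand-in for the parser: split a record into fields.
    return message.split(b",")


published = []


def parse_and_publish(message):
    # Hypothetical stand-in for a broker publish: it only appends to a list.
    # Swap in a real publish call here to measure an actual queue.
    published.append(parse_only(message))


if __name__ == "__main__":
    # Made-up sample record, repeated to get a measurable run time.
    sample = [b"A,12345,B,100,MSFT,25.50"] * 100_000
    print(f"parse only:        {measure_rate(parse_only, sample):,.0f} msg/s")
    print(f"parse and publish: {measure_rate(parse_and_publish, sample):,.0f} msg/s")
```

Replacing the body of parse_and_publish with a real publish call would show how much of the drop comes from the queue rather than from the parsing.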

Consuming messages is even slower. I got MassTransit to pull messages off the queue at a top speed of about 1,200 messages per second. That is way too slow to keep up with a trading day, where I think I would need to handle at least 20,000 messages per second to keep up with bursts of traffic.
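To put that gap in perspective, here is a rough back-of-the-envelope calculation in Python. It assumes a burst arrives at my guessed rate of 20,000 messages per second for exactly one second and that traffic then stops completely, which real traffic won't do; backlog_after and drain_time are names I made up for this sketch.

```python
def backlog_after(arrival_rate, consume_rate, seconds):
    """Messages left in the queue after `seconds` of sustained traffic."""
    return max(0, (arrival_rate - consume_rate) * seconds)


def drain_time(backlog, consume_rate):
    """Seconds needed to clear `backlog`, assuming no new messages arrive."""
    return backlog / consume_rate


# One-second burst at the guessed 20,000 msg/s, consumed at 1,200 msg/s.
burst = backlog_after(20_000, 1_200, seconds=1)
print(burst)                               # 18800
print(round(drain_time(burst, 1_200), 1))  # 15.7
```

So, under those assumptions, a single one-second burst leaves the consumer more than 15 seconds behind.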

After doing a little digging, I found a nice comparison of some different queues. It sounds to me like ZeroMQ would be a perfect fit for this project. If I have some free time I’m going to give it a shot and see if it performs as well as they say.

Take a look at the code and let me know what you think: http://github.com/RexMorgan/Itch

  • http://twitter.com/monadic alexis richardson

    Rex, 

    I recommend posting questions about performance to the RabbitMQ mailing list.  We support lots of customers doing high throughput market data.  For example, on normal hardware you should be able to consume at 20,000/sec in Java.  Here is someone in the gaming industry who has 48,000/sec ingress and 48,000/sec egress: http://lists.rabbitmq.com/pipermail/rabbitmq-discuss/2011-April/012321.html  If you don’t want or need a broker, ZeroMQ is definitely worth a look and has good performance.

    alexis

    • http://www.rexflex.net/ Rex Morgan

      Thanks Alexis, I’ll give the mailing list a shot and see if it helps. It’s also possible that MassTransit is adding some overhead to each call.

  • http://www.ryanday.net/ Ryan

    My largest problem is the queue. Of course I’m not using Rabbit, I’m using Python’s built-in multiprocess Queue. It still destroys my performance though. I’m fine tuning things a little, and am finding that it is just quicker to process everything as I read it.

    When I just read and process, I get ~250k messages per second.
    When I use the queue I get almost 20k.

    This is (I’m fairly certain) because there isn’t much done to the data after it’s read. So it will take longer to copy memory between processes over a queue than to just go ahead and append an order to a list and move on. It makes threading difficult to do correctly.

    I’ve made some modifications to my original code that I’ll have to push. I’m going for optimizing operations instead of trying to be distributed. I do have a thread that runs every 5 seconds to give me updates on tickers that I’m following; other than that, I am using a single thread for all processing.

    I’m glad to see that our message rates are pretty similar for similar operations though!

    • http://www.rexflex.net/ Rex Morgan

      Yeah, I saw that you were using a queue, and that’s what really got me thinking about using something like RabbitMQ. I’m going to try to tweak RabbitMQ a bit and see if I can get the speeds up. If I can’t, then I’m going to give ZeroMQ a shot.

      Thanks for the post, it’s really cool to see this stuff in action.

  • veyron

    Put a smile on my face to see this …