Monday 2 December 2013

Announcing the JAQ(Java Alternative Queues) Project


{This post is part of a long running series on lock free queues, checkout the full index to get more context here}
To quote Confucius:
"To learn and then practice it time and again is a pleasure, is it not? To have friends come from afar to share each others learning is a pleasure, is it not? To be unperturbed when not appreciated by others is gentlemanly, is it not?" - Analects 1:1
It is obvious to me the old man was talking about open source software, where we repeat what we learn, share with friends from afar, and try and behave when no one seems to get it. In this spirit I am going to try and apply lessons learnt and put together a concurrent queues library for Java - Java Alternative Queues.
It's early stages, but at this point I would value some feedback on:
  1. Intentions
  2. Interfaces and usability
  3. Project roadmap

Intentions

When concurrent queues are concerned, it is my opinion that the JDK offering has been robust, but too generic to benefit from the performance gains offered by a more explicit declaration of requirements. JAQ would tackle this by providing queues through a requirements focused factory interface allowing the user to specify upfront:
  1. Number of producers/consumers
  2. Growth: Bounded/Unbounded
  3. Ordering (FIFO/other)
  4. Size
  5. Prefer throughput/latency
To see a wider taxonomy of queues see 1024Cores.net excellent analysis. At this point all the queues I plan to implement are non-blocking and lock-free as my main interest lies in that direction, but many of the principals may hold for blocking implementations and those may get added in the future.

Interfaces and Usability

I like the idea of separating several entities here:
  • ConcurrentQueueFactory - Tell me what you need, through a ConcurrentQueueSpec.
  • ConcurrentQueue - The queue data structure, provided by the factory. At the moment it does nothing but hand out producer/consumer instances. This is where pesky methods such as size/contains may end up. I'm not keen on supporting the full Queue interface so feedback on what people consider essential will be good.
  • ConcurrentQueueConsumer - A consumer interface into the queue, provided by the queue. I'm planning to support several consumption styles,.
  • ConcurrentQueueProducer - A producer interface into the queue, provided by the queue.
The producer/consumer instances are thread specific and the appropriate thread should call into the queue provider method. Here is the old QueuePerfTest converted to use the new interface (I cleared out the irrelevant cruft for timing this and that):

I realize this goes against the current Queue interface, but part of the whole point here is that the more we know about the calling code the better performance/correctness we can hope to offer.

Roadmap

I'd like to tackle the work in roughly the following order:
  • Specify/document/settle on high level interfaces (initial cut done)
  • SPSC implementations/tests/benchmarks (good bounded SPSC implementation is done, some benchmarks)
  • MPSC implementations/tests/benchmarks (some bounded MPSC variants are included but not integrated)
  • SPMC implementations/tests/benchmarks (not started)
  • MPMC implementations/tests/benchmarks (not started)
There's more I want to do in this area, but the above will keep me busy for a while so I'll start with that and increase the scope when it reaches a satisfying state.
I'm using JMH (and getting valuable support from @shipilev) for benchmarking the queues and hoping to use JCStress to test multi-threaded correctness. 

Contributors/Interested Parties

I know I'll be using this library in the near future for a few projects, but I hope it will be generally useful so your feedback, comments and observations are very welcome. I've not been involved much in open-source projects before, so any advise on project setup is also welcome. Finally, if you feel like wading in and cutting some code, adding some tests or benchmarks, reporting some bugs or expressing interest in features BRING IT ON  :-) pull requests are very welcome.

6 comments:

  1. That is a great idea! I am really looking forward for low latency MPSC queue :) Have you plans for different delivery modes as well? E.g. queue, that does not guarantee delivery

    ReplyDelete
  2. Hi Nitsan,

    had a look at the API, some comments from a pure library user perspective:

    * in large enterprise projects, you'll find like 5 classes named "ConcurrentQueue" inherited from some mvn dependency. Even if this looks stupid, "JaqConcurrentQueue" could prevent trouble here ...
    * supporting full queue interface certainly isn't necessary. However its nice if one could switch between distinct queue implementations without too many code changes.
    * Will there be support for blocking queues with pluggable Blocking (Spin, yield, ...) strategies ? This would ease use in existing code.
    * In practice the exact number of Producers/Consumers of a MPSC/SPMC is unknown at construction time.


    ReplyDelete
    Replies
    1. Thanks:

      * A quick search for ConcurrentQueue.java has brought up nothing... which dependencies did you have in mind?

      * The factory interface is supposed to allow you to switch the implementation by changing your requirements. The factory hides the mapping between your requirements and the implementation. This mapping can be made configurable if need be.

      * I'm currently not that interested in blocking, but adding a blocking wrapper is easy enough. I'll consider the interface if there is interest, currently thinking that the strategy is applied to the queue consumer: "E e = strategy.poll(consumer);" What do you think?

      * The number of consumers/producers is sometimes known at construction time. Certainly for SPMC/MPSC type relationships. Further more for some thread pool based solutions the size of the pool is often fixed and can be set at construction time. I agree that often 'many' is an open ended number which may change at runtime, but the premise here is "the more you tell me, the more I can do for you"

      Keep the feedback coming :-)

      Delete
    2. 1) Naming: Well, you are right. It still is a very generic name ..
      2) Factory: Chances existing code makes use of your factory is not too high. The easier your stuff can be plugged into existing code, the more users you'll find. If I am about to fumble with like 5 alternative implementations of a common datastructure, I'll be biased to solutions which easily plug into existing code.
      3) But users will be interested in that. Agree its not hard to wrap, its just for convenience of the first 10 minutes, else you get "Shit, no add method, I'll try another ..". 99% of developers have no clue what you are doing inside your queues, they just will test if yours is "faster" (if it doesn't make too much effort). I like the strategy.poll() approach, but I'd predict your code will be easier adapted if you also provide a java.util.Queue'ish blocking add() / remove() wrapper using a default strategy. People love defaults (because they never heard of "spin" and "yield" ;-) ). Additionally it might be more convenient to provide the blocking strategy at construction time, so you do not need 2 object refs to operate a queue.

      Delete
    3. er.. 3) I meant BlockingQueue.put+take not Queue.add/remove

      Delete
    4. 2) If you got existing code, you'll probably want the Queue interface. Not sure if the factory will be your stumbling block. Constructing via a Spec can probably be made more friendly.
      3) I agree allot of people want blocking queues... I could add a wrapper to the consumer/producer to give you that, maybe that is a friendlier option... Something like: "BlockingConsumer bc = new BlockingConsumer(q.consumer(), waitStragegy); E e = bc.take();"
      Thanks :-) it's good to have a discussion about this, I'll try and get you on google+

      Delete

Note: only a member of this blog may post a comment.