I believe I finally managed to create the structure I wanted to. The main problem that I was having is that I was still thinking in C (or sequential languages), so what I wanted to do was build a single binary search tree (bst) that scales well. What I realized though, is that rather than doing that, I should create N bsts where N is proportional to the number of processors available in the system.
So what I'm doing now is spawn a number of threads (servers), each thread is responsible for a bst that covers a range of keys, and whenever an operation is performed, the client sends it to the particular server that's responsible for the tree in that range. If this doesn't make sense, then maybe my code will make it clearer:-
The main problem with this implementation though is that if the distribution of the keys isn't uniform within the expected range, it won't perform well. But I guess that is a problem with non-balanced bsts in general.
That said, this solution turned out to be simpler than other solutions I've attempted, performs better in the single-threaded case, and scales pretty well.
Here's a link to a graph that shows how well this scales on an 8 core machine:-
Compare with my previous attempt:-
Can't really ask for better than that.
What I need to do now is figure out a way to balance the load somehow when it's not uniform.