libstdc++/41975
François Dumont
frs.dumont@gmail.com
Wed Nov 9 20:39:00 GMT 2011
Hi
I just wanted to let you know that I am making progress on this issue.
I have finally removed the begin/end nodes in buckets. However I
now know how to also replace the doubly linked list with a singly linked
one but I need a couple of additional days to finalize this patch. So I
hope to be able to submit a new proposition by Monday. I hope it will be
ok for 4.7.0.
Regards
On 10/21/2011 09:25 PM, François Dumont wrote:
> Here is an other step. This time I have changed the _Bucket type,
> it used to be a simple _Node pointer, it is now a struct containing 2
> nodes, the bucket begin and past-the-end nodes. This way I have been
> able to remove the newly introduced std::fill calls.
>
> I also change the performance test in such a way that it clearly
> show the performance issue reported by 41975 DR. It is now so clear
> that I had to rollback the number of tested element to 200 000, with 5
> 000 000 the current implementation was really too slow. I also add
> tests to compare erase from iterator from erase from key.
>
> Here is the performance results:
>
> - Current implementation:
> 41975.cc first insert
> 2r 2u 0s 8452192mem 0pf
> 41975.cc erase from iterator 2832r
> 2817u 3s -6400096mem 0pf
> 41975.cc second insert
> 2r 1u 0s 6400032mem 0pf
> 41975.cc erase from key
> 1r 1u 0s -6400032mem 0pf
>
> - With patch:
> 41975.cc first insert
> 3r 2u 0s 13704256mem 0pf
> 41975.cc erase from iterator
> 1r 1u 0s -9600064mem 0pf
> 41975.cc second insert
> 2r 2u 0s 9600000mem 0pf
> 41975.cc erase from key
> 1r 1u 0s -9600000mem 0pf
>
> I also started to document a little bit the new data structure, I
> will enhance it if it is finally adopted.
>
> With this implementation there is only one place left were bigs
> holes in buckets will have an impact. It is when inserting a node in a
> bucket for the first time, we need to look for the previous non-empty
> bucket to link the nodes correctly. I will try to find a test showing
> this problem. A good way to avoid it would still be to shrink the
> hashtable which is possible in the insert methods.
>
> I have started to think about having the same performance without
> adding so many memory but I fear that this will have impact on
> operations. I will have a try however.
>
> François
>
>
> On 10/12/2011 10:42 PM, Paolo Carlini wrote:
>> Hi,
>>> Here is an other step forward to fix this issue. In this step nodes
>>> are now doubly linked to each other. It avoids some loops on buckets
>>> but performance are still not ok. I have created a performance test,
>>> see attached 41975.cc.
>> I think we can as well commit it. I can do it.
>>> Before the patch the result was:
>>>
>>> 41975.cc Container generation
>>> 2r 1u 0s 8452192mem 0pf
>>> 41975.cc Container erase
>>> 21r 21u 0s -6400096mem 0pf
>>> 41975.cc Container iteration
>>> 10r 10u 0s 32mem 0pf
>>>
>>> With the attached patch it is:
>>>
>>> 41975.cc Container generation
>>> 3r 2u 0s 11652176mem 0pf
>>> 41975.cc Container erase
>>> 185r 182u 0s -9600080mem 0pf
>>> 41975.cc Container iteration
>>> 0r 0u 0s 48mem 0pf
>>>
>>> Of course there are more memory manipulated. Regarding operations
>>> CPU consumption iteration is much better but erase is much worst. I
>>> think the problem is in the invocation of std::fill that cost a lot
>>> as soon as we start having a lot of buckets to update. This is why I
>>> have already started to work on a last evolution to avoid those calls.
>> Excellent. I'm afraid I still don't have much to say/help and given
>> that I don't want to distract you anyway, but where does the
>> additional complexity of erase come from? Could you describe in a few
>> sentences the steps you go through at erase time?
>>
>> I was also thinking that it would be great if you could add comments
>> to the code, even if some have to be redone when you rework the code
>> in the following drafts. And an introductory comment right before
>> _Hashtable describing the data structures. Like, eg, the one in
>> std::deque, very well done, IMO.
>>
>> Tomorrow, besides committing the performance testcase, I will also
>> add those missing count testcases. Just to be of some small help ;)
>>
>> As soon as I'm done with a couple of pending things, I should be able
>> to assist you a bit.
>>
>> Thanks,
>> Paolo.
>>
>
>
More information about the Libstdc++
mailing list