August 10, 2015

List data optimization experiment.

First and foremost, it is important to say that this test is entirely synthetic: it does not provide any significant information that you should consider when writing your applications.

That said, if you are interested in the use case, I will share it at the end of the post.

Last week I had a discussion with a colleague about memory, garbage collection, and CPU utilization for list-like data structures. Basically, we wanted to check a theoretical postulate deduced from our knowledge of how a JavaScript VM works and some assumptions about the speed at which it operates on different object types.

The riddle was as follows: given a list of objects representing points on a drawing board, consider possible implementations so as to achieve high speed and a low memory footprint.

Two competing ideas were considered initially: plain JavaScript arrays, and a linked list implemented as JavaScript objects with a 'next' property.

Additionally, we wanted to see what the overhead of garbage collection is; for that reason we implemented four versions of the test code (sketched right after the list):


  1. Using an array, recreating the array on each pass and recreating the objects in the array each time.
  2. Using a linked list, recreating the items in the list each time and linking them, and de-referencing the head of the list when cleaning up.
  3. Using an array, but taking the items from a pool and returning them to it before de-referencing the array on cleanup.
  4. Using a linked list, taking items from a pool and returning them to it before de-referencing the first item in the list.
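
Here is a minimal sketch of the four variants; all names (`Point`, `POOL`, the `build*`/`release*` functions) are illustrative, not the original test code.

```javascript
// A minimal sketch of the four test variants.
function Point(x, y) {
  this.x = x;
  this.y = y;
  this.next = null; // used only by the linked-list variants
}

var POOL = [];

function fromPool(x, y) {
  var p = POOL.length > 0 ? POOL.pop() : new Point(0, 0);
  p.x = x;
  p.y = y;
  p.next = null;
  return p;
}

// 1) Array + fresh objects; cleanup: drop the array reference.
function buildArrayFresh(n) {
  var arr = [];
  for (var i = 0; i < n; i++) arr.push(new Point(i, i));
  return arr;
}

// 2) Linked list + fresh nodes; cleanup: drop the head reference.
function buildListFresh(n) {
  var head = null;
  for (var i = 0; i < n; i++) {
    var node = new Point(i, i);
    node.next = head;
    head = node;
  }
  return head;
}

// 3) Array + pooled objects; cleanup: return the items, then drop the array.
function buildArrayPooled(n) {
  var arr = [];
  for (var i = 0; i < n; i++) arr.push(fromPool(i, i));
  return arr;
}

function releaseArrayPooled(arr) {
  for (var i = 0; i < arr.length; i++) POOL.push(arr[i]);
}

// 4) Linked list + pooled nodes; cleanup: walk the list and clear 'next'
// before returning each node to the pool.
function buildListPooled(n) {
  var head = null;
  for (var i = 0; i < n; i++) {
    var node = fromPool(i, i);
    node.next = head;
    head = node;
  }
  return head;
}

function releaseListPooled(head) {
  while (head !== null) {
    var next = head.next;
    head.next = null;
    POOL.push(head);
    head = next;
  }
}
```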


To make it even more interesting, we decided to implement the same thing in JavaScript and in a language compiled to JavaScript, in our case Dart.

Then we had to come up with a value that is way out of the actual use case, but seemed large enough to have measurable performance implications. We first tried 100 billion items - however, this simply crashed Chrome. After some experimentation we settled on 1 million.

To simulate a real-world case we separated the creation and destruction of the data by binding them to buttons in the HTML. This was because we wanted to prevent any hidden optimization that could kick in if the scenario was run without pauses, or with equal pauses, between runs. I am not aware if such an optimization even exists, but I just wanted to be sure. Using setTimeout was not an option, as I wanted to keep the code as clean as possible. Before running each test we did a full refresh and forced a memory clean, then measured 6-10 creations and destructions; we then repeated the test and took the average time and memory of the execution.
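
For illustration, here is a minimal harness in the spirit of the setup described above, reusing `buildListPooled`/`releaseListPooled` from the earlier sketch; the element ids and the logging are assumptions, not the original code.

```javascript
// Hypothetical harness: creation and destruction are wired to two buttons,
// so runs are separated by human-scale pauses rather than timers.
var N = 1000000; // the 1 million items we settled on
var head = null;

document.getElementById('create').addEventListener('click', function () {
  var t0 = performance.now();
  head = buildListPooled(N); // swap in any of the other three variants
  console.log('create: ' + (performance.now() - t0).toFixed(1) + 'ms');
});

document.getElementById('destroy').addEventListener('click', function () {
  var t0 = performance.now();
  releaseListPooled(head);
  head = null;
  console.log('destroy: ' + (performance.now() - t0).toFixed(1) + 'ms');
});
```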

The computer the tests were run on had an Intel(R) Core(TM) i5-3330 CPU @ 3.00GHz and 8GB of memory, running Fedora Linux (with all the latest updates) and Chrome 44.0.2403.130 in an incognito window.

## The results

## JavaScript

1) 210ms total time (75MB fluctuation)
2) 140ms total time (55MB fluctuation)
3) 70ms total time (23MB on 1 fold, 45MB on 2 folds)
4) 17ms total time (2K on 1 fold)


## Dart

1) 290ms total time (74MB fluctuation)
2) 130ms total time (56MB fluctuation)
3) 100ms total time (28MB on 1 fold, 55MB on 2 fold)
4) 29ms total time (1.5K fluctuation)

The first number of each measurement is the total time it took to execute the test function (i.e. without the handler of the click event of the button); the second is the amount of memory the heap had to grow by, and subsequently clean, for each run. Where two folds are noted, the memory was cleaned every two runs, and the first number is the amount it grew in a single run. The last test does not trigger garbage collection for this small number of runs, so the heap only grew.

As a side note: the time to clean things up in the case of a pool is almost the same as the time it takes to get items out of the pool, but it is not included in the total time, as it happens asynchronously.

Because we received feedback on the idea even before deciding to actually implement it, I will share some of it here:

  • arrays are very fast and you can use them everywhere, no need to optimize further
  • garbage collection is very fast for new-generation objects, so you do not need pools
  • arrays do not trigger garbage collection if you clean them with 'array.length = 0'
  • iteration over arrays is as fast as accessing properties on an object, so you do not need a linked list

Because we had never been involved in optimizing for garbage-collected languages, let alone for V8, it was safe to assume that we might be wrong and that any of the above statements could be true.

The first test was no surprise: creating and de-referencing 1 million objects on each run was expected to be costly in memory.

What took us by surprise was that a list with 1 million items on it actually took ~20MB of memory. The second test proved that having an array contain such a big number of items was a significant overhead. At this stage we also discovered that the 'array.length = 0' trick was no better than simply creating a new array: arr = [].

There was simply no difference between the two when comparing them in the context of 1 million items.
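
For clarity, these were the two cleanup approaches compared:

```javascript
// Both make the old elements unreachable; at 1 million items we measured
// no difference in garbage-collection cost between the two.
arr.length = 0; // truncate in place, keeping the same array object
arr = [];       // drop the old array and allocate a fresh one
```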

The third test was interesting because taking the items from a pool limited the memory fluctuation, which let us measure much more accurately how much memory the array itself consumes. The amount of memory used by the pool remains constant, as its contents do not change the whole time.

As you might notice, the time for creating the array and the objects (test 1) matches the sum of the time for using only the objects (the linked list, test 2) and only the array (the pooled array, test 3): in the JavaScript runs, 210ms = 140ms + 70ms. This was logical and expected.

The final test recreates and destroys the linked list entirely from the pool. We made sure to clean the 'next' property correctly when returning items to the pool. At this point we almost made it within the budget of a single rAF (requestAnimationFrame) frame, roughly 16.7ms at 60fps, but not entirely. The code will take longer on a lower-spec computer, of course. At this point I was really surprised; for some reason most people I talked about this with (myself included) expected that adding an item to an array should not be so costly.

I am almost sure that this is not the case for smaller arrays.

## Potential use case

The reason I wanted the lowest-memory solution is the context the data will be used in: drawing and animating on a large canvas. I wanted to be sure that adding to the list and iterating over it are as efficient as possible, so as to avoid any pauses in the animation and drawing.
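
To illustrate, here is a hypothetical draw loop over the linked list from the sketches above (not code from the experiment): walking the 'next' pointers allocates nothing, so the hot path itself cannot trigger a garbage-collection pause mid-frame.

```javascript
var canvas = document.querySelector('canvas');
var ctx = canvas.getContext('2d');

function draw() {
  ctx.clearRect(0, 0, canvas.width, canvas.height);
  // Iterating via 'next' creates no garbage, unlike e.g. forEach with a
  // closure allocated per frame.
  for (var p = head; p !== null; p = p.next) {
    ctx.fillRect(p.x, p.y, 1, 1);
  }
  requestAnimationFrame(draw);
}
requestAnimationFrame(draw);
```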

## Dart

The reason I wanted to implement the same test in Dart is that it is marketed as a solution for large-scale applications that performs just as well as JavaScript, while being more scalable and more suitable for large apps and teams.

As one can see from the results, there is a difference between the two implementations, even though I tried to follow the exact same semantics as closely as possible. One difference that can be noticed in the dev tools is that the call stack is 3 levels deeper; however, the measurements state that all those nested invocations share the same time. I am not sure whether this might explain the slower execution.

A side note on the Dart-to-JS compilation: it was done from the command line with the following $dart2js transformer configuration:

```yaml
- $dart2js:
    checked: false
    minify: true
    commandLineOptions: ['--trust-type-annotations', '--trust-primitives']
```

To my knowledge this should produce the smallest, fastest code possible.

Surely the Dart VM could have beaten these results, but it is no longer under discussion; JavaScript is the way to go. What worries me a little is whether Dart code compiled to JS will ever run as fast as 'native' JavaScript. Of course this test is totally superficial and should be taken with a ton of salt, but still, working with lists is one of the simplest things in computing. Is it possible that small, insignificant slow-downs here and there in Dart could make a whole large application slower than a similar one written in JS? That is for the Dart developers to answer.

## Conclusion

Tests are fun. We had a hypothesis and we wanted to see what would happen. We did. Should we always prefer linked lists over arrays (where possible, given that indexing and random access are much slower)? Of course not, arrays are great!

But from time to time you might be in doubt about how things work and whether you correctly understand the execution of your code and the implications it has. In that case, just write a test. Even if it is flawed, even if it is unrealistic, write a test and try to learn more about your target platform through it!

Co-authored with LexMihaylov
