Various incarnations of XAML, such as WPF, WinRT, and Windows Phone, represent a tremendously powerful, productive, and flexible UI development environment. However, with great power also comes great power to do things incorrectly, which leads to many projects based on XAML technologies that are in trouble in some way. In our position as a magazine (the author works for CODE) as well as a company with a consulting and training arm, not to mention our involvement in various events of all sizes, we here at EPS/CODE Magazine are exposed to a very large number of projects.

I’ve been involved in numerous project-rescue operations for XAML projects. In analyzing what went wrong with each project that’s brought to us, the others on the team and I started to notice that there are several patterns prevalent in these troubled projects, and we’ve started to catalog the most important ones. We’re now turning them into a series of articles that will appear in CODE Magazine.

You may have read about "anti-patterns" in other contexts before. Wikipedia describes the term as "a common response to a recurring problem that is usually ineffective and risks being highly counterproductive." In other words, the term describes goings-on that can easily be recognized as troublesome and will generally lead to horrible outcomes. This description matches the issues I aim to describe quite closely. Where things get a little more complicated is that in XAML projects, things aren't quite as black and white as they might be for other anti-patterns.

For instance, virtualization, the subject of this article, can be a very powerful and positive technology. Yet it can also lead to very serious problems. Whether a XAML pattern is positive or a very problematic anti-pattern is often hard to discern, and it takes careful analysis to understand when a certain pattern is beneficial and when it might turn into an anti-pattern. My aim is to point out potential anti-patterns and help you analyze when a given pattern becomes troublesome.

Whether a XAML pattern is positive or negative often requires further analysis.

In this first installment of this series, I will explore the world of virtualization, a technology Microsoft worked very hard to get right, and which can be powerful and greatly improve the performance of sophisticated UIs. Ironically, it can also turn into a source of very serious performance problems.

Let’s start out with the basics and explore what virtualization is.

What is Virtualization?

Virtualization is a technique that can be used to improve the performance of data-bound item controls that contain a very large list of items. Imagine a ListBox in a WPF application (or any of the other XAML technologies, for that matter) that's data bound to a source that contains 100,000 items. Without virtualization, the UI engine has to iterate over all the items in the data source and create a corresponding UI item for each. This presents a number of problems.

For one, if there are 100,000 data items bound to a ListBox, the system has to create 100,000 UI objects and arrange them on the screen. This is an obvious performance as well as memory consumption issue. It's even a bit worse than it seems at first, because each item is made up of multiple objects. There's a ListBoxItem, which usually contains another UI element, such as a Grid or a StackPanel. This, in turn, contains a number of UI controls, such as TextBlocks, TextBoxes, Images, and more, which make up the appearance of each item. Thus, 100,000 ListBox rows are likely to require well over a million objects that all have to be loaded, arranged, and managed in the UI. Furthermore, the individual controls that make up each ListBoxItem may themselves be templated controls, made up of a multitude of other controls. TextBoxes are a good example of that, since they are made up of border rectangles, a scrolling area (ScrollViewer), triggers, and so forth. It's likely that the object count increases by an order of magnitude or more. It's not unreasonable to expect that the hypothetical ListBox of 100,000 items requires well over 10,000,000 objects to represent itself.

Typically, the process of populating a ListBox with items is aided by item templates, which let developers define a template that is applied to each item within the ListBox. You can think of such a template as an example of what each row should look like. The system then uses that template to create a copy for each actual row as needed. This is a very powerful process, but it's also somewhat expensive in terms of performance.
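For illustration, such an item template might look like the following (the bound properties Photo, Name, and Description are hypothetical):

```xml
<ListBox ItemsSource="{Binding LargeList}">
  <ListBox.ItemTemplate>
    <DataTemplate>
      <!-- Each row stamps out a copy of this tree,
           including every control's own template and bindings. -->
      <StackPanel Orientation="Horizontal">
        <Image Source="{Binding Photo}" Width="32" />
        <TextBlock Text="{Binding Name}" Margin="5,0" />
        <TextBlock Text="{Binding Description}" />
      </StackPanel>
    </DataTemplate>
  </ListBox.ItemTemplate>
</ListBox>
```

Even this simple three-control row means an Image, two TextBlocks, a StackPanel, and a ListBoxItem per data item, plus three binding objects, before any control templates are counted.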

A further problem in this scenario is that, by definition, it involves data binding. Data binding in itself is not problematic, and it's even quite fast. However, 100,000 items add up to a lot of bindings, especially when each row in the ListBox shows multiple contained controls, all of which have to be data bound to display something useful. Oh, and the templates that apply to each of these controls also use bindings internally (template bindings), and so on. This means that you easily face millions of binding objects that have to be created, kept in memory, and updated as they observe changes in the data source to refresh the UI.

In short, loading 100,000 items into a ListBox is likely to be slow and consume oodles of memory. Not a solution that promises to be particularly workable. A better approach is needed for such large lists, which is why Microsoft introduced virtualization.

Virtualization allows binding the same ListBox to the same 100,000 items, but the setup of the ListBox is smart enough to determine that only a small sub-set of the 100,000 items (perhaps 30-50 or so) are visible at any given time. Therefore, the system only creates ListBoxItems for the items that are actually visible, and it pretends that the non-visible objects are there so that things like scroll bars work correctly (or close enough so nobody notices…scroll bars in ListBoxes can be off when each item has a flexible height). Voila! The 100,000 item ListBox now becomes feasible. It won’t take hugely long to load it up, and it won’t consume those oodles of memory.

How to Use Virtualization

Using virtualization is easy, since its use is the default (in WPF and other XAML environments). Therefore, if you were to load a ListBox bound to 100,000 items, it automatically virtualizes the items. Here is an example of a virtualized ListBox:

<ListBox ItemsSource="{Binding LargeList}" />

You would have to go out of your way to disable virtualization. You could do it like this:

<ListBox ItemsSource="{Binding LargeList}"
  VirtualizingStackPanel.IsVirtualizing="false" />

This is an odd property name. In fact, this isn't a normal ListBox property at all. Instead, it's an attached property (see my article about attached properties in the May/June 2014 issue of CODE Magazine) provided by a class called VirtualizingStackPanel. As it turns out, ListBoxes need to lay out the items they contain (as a stack from top to bottom by default, but you could customize this to be completely different; for more, see my article about customizing ListBoxes in the Jan/Feb 2011 issue of CODE Magazine). By default, this layout is done—wait for it—by the VirtualizingStackPanel class, which, in turn, respects the IsVirtualizing property to turn virtualization behavior on and off.

You could have also changed the layout template for the ListBox like this:

<ListBox ItemsSource="{Binding LargeList}">
  <ListBox.ItemsPanel>
    <ItemsPanelTemplate>
      <VirtualizingStackPanel IsVirtualizing="false"/>
    </ItemsPanelTemplate>
  </ListBox.ItemsPanel>
</ListBox>

This replaces the default item layout panel with a VirtualizingStackPanel that has virtualization turned off. The earlier attached-property syntax is really just a shorthand for this longer version, so there isn't much reason to use the longer syntax as long as you want a virtualizing panel. However, you could also argue that it doesn't make much sense to use a virtualizing stack panel with virtualization turned off. If that's what you want, you might as well use a plain old StackPanel, like this:

<ListBox ItemsSource="{Binding LargeList}">
  <ListBox.ItemsPanel>
    <ItemsPanelTemplate>
      <StackPanel />
    </ItemsPanelTemplate>
  </ListBox.ItemsPanel>
</ListBox>

This will most certainly turn virtualization off, since the standard StackPanel class doesn’t support any kind of virtualization.

When Virtualization Goes Bad

At this point, you might wonder what could possibly be bad about virtualization. It seems that all of this makes a lot of sense and is highly desirable. And you would be right! In many scenarios, this is indeed a very good thing. But there are scenarios where virtualization leads to horrible performance, and the source of these performance problems usually goes unrecognized, because virtualization is what's supposed to make performance better.

Performance problems with virtualization are caused by the need to create and destroy objects all the time. The example with 100,000 ListBoxItems, 50 of which may be visible at any given time, leads to a ListBox that initially creates 50 items rather than the whole list. If you then scroll down a page in that list, 50 more items are created, and the original 50 are kicked out of memory, keeping memory consumption low. Scrolling further down causes more objects to be created and existing ones to be released. Scrolling back up to the top causes the system to create another 50 items and kick the ones you were just viewing further down the list out of memory. It doesn't matter that you had already seen the first 50 items. They have to be re-created, because they, too, had been kicked out of memory.

You can easily observe this process yourself by giving your ListBox an ItemTemplate such as this:

<ListBox ItemsSource="{Binding LargeList}">
  <ListBox.ItemTemplate>
    <DataTemplate>
      <l:CountGrid />
    </DataTemplate>
  </ListBox.ItemTemplate>
</ListBox>

This template uses a custom class as its item template. Here’s an example of what that class could be like:

using System;
using System.Threading;
using System.Windows.Controls;

public class CountGrid : Grid
{
  // Running total of instances created so far.
  public static int Count;

  public CountGrid()
  {
    Count++;
    Console.WriteLine("Instance # " + Count);

    // Simulate an expensive item template so the effect is easy to observe.
    Thread.Sleep(100);
  }
}

This specialized Grid keeps an instance count (in a static field) and echoes the count every time an instance is created (this is a WPF example—WinRT has no Console). There's also a Thread.Sleep(100) in the constructor to simulate slow loading of each template so you can observe things a bit more easily.

When you run this, you can watch the Output window and see the counter go up to the number of items that fit into your current ListBox (say, the 50 items that might fit on screen, even if the total list is much longer). If you turn virtualization off, you see the counter go up to the total item count (which potentially takes a long time due to the thread sleep). With virtualization turned on, you can also observe how additional items are instantiated as you scroll up and down. For instance, you can mouse-wheel to the end of the list and see how all items are instantiated one by one, and then mouse-wheel back to the top and observe the same re-creation process run all over again. You can continue this ad nauseam, and objects will continue to be created and destroyed.

In many scenarios, all this is fine. ListBox items may be small, the system can create and destroy them very quickly, and the user never notices any slow-down. The problem starts when this last statement isn't true. I've observed that developers create increasingly advanced data templates to build sophisticated visualizations (for example, our own multi-column ListBoxes in CODE Framework, which turn ListBoxes into data grids). In that case, loading an item template is more time consuming. A single instantiation probably isn't slow, but repeated continuously, it becomes quite noticeable. The example in the last code snippet simulates this by adding a Sleep() to each item creation. If you run this example, you'll notice that scrolling up and down is extremely sluggish and a very bad experience. Even if you reduce the sleep duration quite a bit, the impression of poor performance remains severe, since any minor slowdown is very noticeable during scrolling. The system can even fall behind as users try to scroll up and down, and the problem grows bigger and bigger.

How big a problem is virtualization? It can be huge! Applications can become unusable.

How big a problem is this scrolling delay? Well, it can be huge! As XAML-based applications have grown more sophisticated and data display requirements have increased, many lists contain increasingly complex items, shifting more and more scenarios from beneficiaries of virtualization to victims of it.

I have recently performed a project rescue on a WPF application that displayed very complex and highly customizable lists of data that were completely unusable, because scrolling from the top to the bottom of the list took well over half a minute. Even scrolling past a few items took several seconds. If the user grabbed the scroll bar with the mouse and started moving it down, the system quickly went into a state where it appeared completely hung for several minutes, trying to catch up with the user’s interactions. It comes as no surprise that users, and even developers, thought the system was literally hung and unresponsive and wouldn’t ever return to responsive status, and thus terminated the process. Only after I did some extensive testing did I discover that the system returned to a responsive state after several minutes. It was busy catching up with loading and unloading all the items. And to add insult to injury, the list never contained more than perhaps 100 or 150 items! All of this caused management to consider the project to be a failure, until I was asked to help at the last minute, and we managed to turn it around just in time for the looming release date.

How to Solve the Problem

As it turns out, in many scenarios, the simple solution is to not use virtualization. Just turn it off! If your system's main bottleneck is not the large number of items in the list, but the complexity of each item, then it may well be feasible to take a one-time hit during startup and then have everything in memory and ready to go. In the real-world example I described above, that was part of the answer. Loading 150 items during startup may take a few seconds, but users didn't perceive that as a particular problem, since they were aware they were opening a complex form. (Besides, they're used to going to Web pages, and loading a page is generally slower than loading that particular form in its final, optimized state.) To the users, it was much more important that once they had the UI open and started to interact with it (in this case, potentially for several hours at a time), it was responsive and a joy to use.

Modern apps tend to pull a small subset of data and display it very richly. That is when virtualization turns into a nasty anti-pattern!

Turning virtualization off was not the complete answer to the problem. The templates for each item in the list were so complex and customizable that they cumulatively took quite a bit of time to load, well over half a minute for 150 items in their original version. I had to pay specific attention to optimizing the load performance of each row. I paid particular attention to the number of objects needed for each row and thereby managed to eliminate quite a few containers, generally found ways to construct a UI that didn’t need as many UI elements and associated control templates, and so on. The original version made use of dynamic elements quite a bit. It even did some text-based XAML parsing for customization options (and that code ran for each item every time the user scrolled a tiny bit). I also paid attention to the number of data bindings needed and optimized those. Rather than relying on virtualization to magically make things fast, I did some old-fashioned optimization and managed to get the load time from more than half a minute down to just two or three seconds.

Have a small(ish) list of complex items? Turn virtualization off!

So perhaps not using virtualization is a good solution. Do you have a small(ish) list of complex items? Turn virtualization off! In many recent apps, systems pull a small subset of data and display it very richly, in which case virtualization truly turns into a nasty anti-pattern.

But what if your scenario falls somewhere in the middle? Perhaps you have relatively complex items and a list that is also relatively large (perhaps more than 1000 items). In that case, not using virtualization may lead to unacceptable launch performance. Now what?

As it turns out, you have the ability to further tweak virtualization. Rather than creating and destroying objects all the time, objects can be recycled. Using this approach, the system creates the maximum number of items that may appear on the screen and then keeps reusing them. Here’s how you can turn this recycling behavior on:

<ListBox ItemsSource="{Binding LargeList}"
  VirtualizingStackPanel.VirtualizationMode="Recycling" />

Or, alternatively, you can set that property directly in your items panel template, just like you were able to turn virtualization on and off.
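For completeness, that panel-template equivalent looks like this:

```xml
<ListBox ItemsSource="{Binding LargeList}">
  <ListBox.ItemsPanel>
    <ItemsPanelTemplate>
      <VirtualizingStackPanel VirtualizationMode="Recycling" />
    </ItemsPanelTemplate>
  </ListBox.ItemsPanel>
</ListBox>
```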

VirtualizationMode can be set to Standard (which, not surprisingly, is the default) or Recycling. You have already observed WPF's Standard mode, which creates and destroys elements all the time. Switching to Recycling mode, you can observe that once the first set of items is created, they are reused to display other elements. Objects are no longer created and destroyed, which essentially solves our expensive object and template creation problem. The only scenario that forces the system to create new instances is a change in the list's size, so that more items become visible at once.

This may be a very good solution. Sadly, it’s not all puppies and rainbows. One issue with this approach is that although UI objects can be reused and expensive object creation is thus avoided, each item has to be assigned a new data context whenever it is used to represent a different row. In other words: If you scroll through your list, the objects remain the same, but the data context has to be updated on each object. This causes all the data bindings to be re-evaluated. In many scenarios, this is fine, as binding isn’t usually a big performance problem. But it is something to be aware of that can cause performance problems in very complex scenarios.

Another potential issue with this approach is that it only works well if the template for each element is predictably the same. In the real-world example described above, this was a deal-breaker, since each item was highly customizable and adjusted in many details to the underlying data row it represented. Especially due to user customization options, the variations from row to row were literally limitless. (This was much beyond typical template-selector scenarios where rows may be based on a few different templates.) For this reason, object recycling was not usable. It’s fair to say that the benefit to object recycling varies greatly from scenario to scenario. (I imagine that this is the reason the WPF team didn’t make recycling behavior the default.) But where it’s applicable, it works well.

Note that default behavior varies across systems. Although WPF doesn't recycle unless you turn that behavior on, WinRT recycles by default. In fact, WinRT chooses a bit of an odd path: The default property setting for VirtualizationMode is still Standard, but the standard behavior in WinRT is to recycle. Setting this property to Recycling has no real effect as far as I can tell. I can see that recycling as a default makes sense in WinRT (where item templates tend not to be as sophisticated—yet—as they are in WPF), but not being able to turn it off means that if you use any virtualization, you always have to be prepared to handle the downsides of recycling.

What if you have a large list of items, each of which is complex, and recycling is not an option? In that case, there’s no out-of-the-box solution that acts as a magic bullet. You can create custom layout panels to handle the loading of the items. For instance, we’ve been working on a "one time virtualization panel" or "lazy loading panel" that performs a delayed loading operation on the items so that initial load is fast and other objects are gradually loaded as needed but never removed (at the expense of memory consumption). This will probably ship with CODE Framework as a standard component at some point. But the details of such a solution are much beyond the scope of this article.
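The details of such a panel are beyond this article's scope, but the general idea of one-time, gradual loading can be sketched at the data level instead of the panel level. The following is a hypothetical helper (LazyListLoader is not an actual CODE Framework class), assuming the list is bound to an ObservableCollection: it adds items in small batches and yields between batches so that, on a UI thread, the dispatcher can render the rows added so far. Items are never removed, trading memory for smooth scrolling later.

```csharp
using System.Collections.Generic;
using System.Collections.ObjectModel;
using System.Threading.Tasks;

// Hypothetical sketch of "lazy loading": fill a data-bound collection
// incrementally rather than all at once during startup.
public static class LazyListLoader
{
    public static async Task LoadInBatchesAsync<T>(
        IEnumerable<T> source,
        ObservableCollection<T> target,
        int batchSize = 25)
    {
        var added = 0;
        foreach (var item in source)
        {
            target.Add(item);

            // On a UI thread, Task.Yield posts the continuation back through
            // the SynchronizationContext, letting pending renders run.
            if (++added % batchSize == 0)
                await Task.Yield();
        }
    }
}
```

Because the collection raises change notifications as each batch arrives, the ListBox shows the first rows almost immediately while the rest trickle in.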

Anti-Pattern or Not?

So is virtualization an anti-pattern? In quite a few scenarios, yes it is. It's an anti-pattern in the sense that when you're presented with a XAML app that has performance problems with lists of items, virtualization is the prime suspect. It's the first thing to evaluate: Is the list in question characterized mainly by a very large number of items (in which case virtualization is probably a positive), or by high per-item complexity (in which case virtualization should be turned off, or set to recycle if that's a viable option)? As with so many things in programming, it depends.