Question

I implemented a data virtualization solution using some ideas from CodePlex, the blog of Bea Stollnitz, and Vincent Van Den Berghe's paper (same link). However, I needed a different approach, so I decided to write my own solution.

I am using a DataGrid to display about a million rows with this solution, with UI virtualization enabled as well. The solution works, but in certain situations the DataGrid requests data from its source in ways I did not expect.

About the solution

I ended up writing a list that does all the heavy work. It is a generic class named VirtualList<T>. It implements the ICollectionViewFactory interface, so the collection view creation mechanism can create a VirtualListCollectionView<T> instance to wrap it. The view class inherits from ListCollectionView. I did not follow the suggestions to write my own ICollectionView implementation; inheriting seems to work fine as well.

The VirtualList<T> splits the data into pages. It gets the total item count, and every time the DataGrid requests a row via the list indexer, it loads the appropriate page or returns it from the cache. The pages are recycled internally, and a DispatcherTimer disposes of unused pages during idle time.
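The paging idea can be sketched without any WPF types. This is only a minimal illustration, not the actual VirtualList<T>: the names PagedCache, fetchPage, maxPages and FetchCount are made up for the example, and it evicts the least recently used page at access time instead of using a DispatcherTimer to clean up when idle.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch of an index-to-page cache; not the original VirtualList<T>.
public sealed class PagedCache<T>
{
    private readonly int _pageSize;
    private readonly int _maxPages;
    // Assumed data source callback: (startIndex, rowCount) -> rows of that page.
    private readonly Func<int, int, IList<T>> _fetchPage;
    private readonly Dictionary<int, IList<T>> _pages = new Dictionary<int, IList<T>>();
    // Page numbers ordered from most to least recently used.
    private readonly LinkedList<int> _recency = new LinkedList<int>();

    public PagedCache(int count, int pageSize, int maxPages, Func<int, int, IList<T>> fetchPage)
    {
        if (count < 0) throw new ArgumentOutOfRangeException(nameof(count));
        if (pageSize <= 0) throw new ArgumentOutOfRangeException(nameof(pageSize));
        if (maxPages <= 0) throw new ArgumentOutOfRangeException(nameof(maxPages));
        Count = count;
        _pageSize = pageSize;
        _maxPages = maxPages;
        _fetchPage = fetchPage ?? throw new ArgumentNullException(nameof(fetchPage));
    }

    // Total number of rows, known up front.
    public int Count { get; }

    // Number of page loads so far, exposed only to make the caching visible.
    public int FetchCount { get; private set; }

    public T this[int index]
    {
        get
        {
            if (index < 0 || index >= Count) throw new ArgumentOutOfRangeException(nameof(index));
            int pageNumber = index / _pageSize;
            if (_pages.TryGetValue(pageNumber, out IList<T> page))
            {
                // Cache hit: unlink the page so it can be re-added as most recent.
                _recency.Remove(pageNumber);
            }
            else
            {
                // Cache miss: load the page; the last page may be shorter.
                int start = pageNumber * _pageSize;
                int length = Math.Min(_pageSize, Count - start);
                page = _fetchPage(start, length);
                FetchCount++;
                _pages[pageNumber] = page;
                if (_pages.Count > _maxPages)
                {
                    // Evict the least recently used page.
                    int oldest = _recency.Last.Value;
                    _recency.RemoveLast();
                    _pages.Remove(oldest);
                }
            }
            _recency.AddFirst(pageNumber);
            return page[index - pageNumber * _pageSize];
        }
    }
}
```

With a page size of 100 and at most two cached pages, reading rows 5 and 99 costs one page load, while reading rows 150 and 250 afterwards loads two more pages and evicts the first one.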

Data request patterns

  • The first thing I learned is that VirtualList<T> should implement the non-generic IList. Otherwise the ItemsControl treats it as an IEnumerable and enumerates all the rows. This is logical: the DataGrid is not generic, so it cannot use the IList<T> interface.

  • The DataGrid frequently asks for the row at index 0. According to the call stack, it seems to be used for measuring visual items, so I simply cache this row.

  • The caching mechanism inside the DataGrid uses a predictable pattern to query the rows it shows. First it asks for the visible rows from top to bottom (twice for every row). Then it queries a few rows before the visible area (how many depends on the size of the visible area), starting with the first visible row and going in descending order, that is, from bottom to top. After that it requests the same number of rows after the visible area, starting with the last visible row and going from top to bottom.

    If the visible row indexes are 4, 5, 6, the data requests would be: 4,4,5,5,6,6,4,3,2,1,6,7,8,9.

    If my page size is properly set, I can serve all these requests from the current and previously loaded page.

  • If CanSelectMultipleItems is True and the user selects multiple items using the SHIFT key or a mouse drag, the DataGrid enumerates all the rows from the beginning of the list to the end of the selection. This enumeration happens via the IEnumerable interface regardless of whether IList is implemented.

  • If the selected row is not visible and the currently visible area is "far" from it, the DataGrid sometimes starts requesting all the items from the selected row to the end of the visible area, including all the rows in between, which are not even visible. I could not figure out the exact pattern of this behavior. Maybe my implementation is the reason for it.
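To check a page size against the predictable request pattern described in the list above, the pattern can be reproduced in a few lines. This is a rough sketch under assumptions: RequestPattern, Simulate, PagesTouched and the extra parameter (the number of rows requested beyond each edge of the visible area) are illustrative names, and the pattern itself is only what was observed, not documented DataGrid behavior.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper that replays the observed request order for one viewport.
public static class RequestPattern
{
    // firstVisible/lastVisible: visible row indexes; extra: rows requested past
    // each edge (an assumed parameter); count: total number of rows in the list.
    public static List<int> Simulate(int firstVisible, int lastVisible, int extra, int count)
    {
        var requests = new List<int>();

        // 1. Visible rows, top to bottom, each requested twice.
        for (int i = firstVisible; i <= lastVisible; i++)
        {
            requests.Add(i);
            requests.Add(i);
        }

        // 2. From the first visible row upwards, in descending order.
        for (int i = firstVisible; i >= Math.Max(0, firstVisible - extra); i--)
        {
            requests.Add(i);
        }

        // 3. From the last visible row downwards, in ascending order.
        for (int i = lastVisible; i <= Math.Min(count - 1, lastVisible + extra); i++)
        {
            requests.Add(i);
        }

        return requests;
    }

    // Distinct page numbers needed to serve the given requests.
    public static SortedSet<int> PagesTouched(IEnumerable<int> requests, int pageSize)
    {
        var pages = new SortedSet<int>();
        foreach (int index in requests)
        {
            pages.Add(index / pageSize);
        }
        return pages;
    }
}
```

Simulate(4, 6, 3, 1000000) reproduces the sequence 4,4,5,5,6,6,4,3,2,1,6,7,8,9 from the example. The requested rows form one contiguous window, and a contiguous window no longer than one page can span at most two adjacent pages, so a page size of at least the visible row count plus twice the extra rows keeps every such burst within two pages.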

My questions

  • Why does the DataGrid request non-visible rows, given that those rows will be requested again when they become visible?

  • Why is it necessary to request every row two or three times?

  • Is there a way to stop the DataGrid from using IEnumerable, other than turning off multiple item selection?

Solution

I found at least one way to fool the VirtualList. You can read it here.

If you have found another solution (that is even better than mine), please tell me!

Licensed under: CC-BY-SA with attribution