TVirtualStringTree - resetting non-visual nodes and memory consumption

https://stackoverflow.com/questions/2814565

26-09-2019
|

Question

I have an app that loads records from a binary log file and displays them in a virtual TListView. There are potentially millions of records in a file, and the display can be filtered by the user, so I do not load all of the records in memory at one time, and the ListView item indexes are not a 1-to-1 relation with the file record offsets (List item 1 may be file record 100, for instance). I use the ListView's OnDataHint event to load records for just the items the ListView is actually interested in. As the user scrolls around, the range specified by OnDataHint changes, allowing me to free records that are not in the new range, and allocate new records as needed.

This works fine, speed is tolerable, and the memory footprint is very low.

I am currently evaluating TVirtualStringTree as a replacement for the TListView, mainly because I want to add the ability to expand/collapse records that span multiple lines (I can fudge it with the TListView by incrementing/decrementing the item count dynamically, but this is not as straight forward as using a real tree).

For the most part, I have been able to port the TListView logic and have everything work as I need. I notice that TVirtualStringTree's virtual paradigm is vastly different, though. It does not have the same kind of OnDataHint functionality that TListView does (I can use the OnScroll event to fake it, which allows my memory buffer logic to continue working), and I can use the OnInitializeNode event to associate nodes with records that are allocated.

However, once a tree node is initialized, it sees that it remains initialized for the lifetime of the tree. That is not good for me. As the user scrolls around and I remove records from memory, I need to reset those non-visual nodes without removing them from the tree completely, or losing their expand/collapse states. When the user scrolls them back into view, I can re-allocate the records and re-initialize the nodes. Basically, I want to make TVirtualStringTree act as much like TListView as possible, as far as its virtualization is concerned.

I have seen that TVirtualStringTree has a ResetNode() method, but I encounter various errors whenever I try to use it. I must be using it wrong. I also thought of just storing a data pointer inside each node to my record buffers, and I allocate and free memory, update those pointers accordingly. The end effect does not work so well, either.

Worse, my largest test log file has ~5 million records in it. If I initialize the TVirtualStringTree with that many nodes at one time (when the log display is unfiltered), the tree's internal overhead for its nodes takes up a whopping 260MB of memory (without any records being allocated yet). Whereas with the TListView, loading the same log file and all the memory logic behind it, I can get away with using just a few MBs.

Any ideas?

Solution

You probably shouldn't switch to VST unless you have a use for at least some of the nice features of VST that a standard listbox / listview don't have. But there is of course a large memory overhead compared to a flat list of items.

I don't see a real benefit in using TVirtualStringTree only to be able to expand and collapse items that span multiple lines. You write

mainly because I want to add the ability to expand/collapse records that span multiple lines (I can fudge it with the TListView by incrementing/decrementing the item count dynamically, but this is not as straight forward as using a real tree).

but you can implement that easily without changing the item count. If you set the Style of the listbox to lbOwnerDrawVariable and implement the OnMeasureItem event you can adjust the height as required to draw either only the first or all lines. Drawing the expander triangle or the little plus symbol of a tree view manually should be easy. The Windows API functions DrawText() or DrawTextEx() can be used both to measure and draw the (optionally word-wrapped) text.

Edit:

Sorry, I completely missed the fact that you are using a listview right now, not a listbox. Indeed, there is no way to have rows with different heights in a listview, so that's no option. You could still use a listbox with a standard header control on top, but that may not support everything you are using now from listview functionality, and it may itself be as much or even more work to get right than dynamically showing and hiding listview rows to simulate collapsing and expanding.

OTHER TIPS

If I understand it correctly, the memory requirement of TVirtualStringTree should be:

nodecount * (SizeOf(TVirtualNode) + YourNodeDataSize + DWORD-align-padding)

To minimize the memory footprint, you could perhaps initialize the nodes with only pointers to offsets to a memory-mapped file. Resetting nodes which have already been initialized doesn't seem necessary in this case - the memory footprint should be nodecount * (44 + 4 + 0) - for 5 million records, about 230 MB.

IMHO you can't get any better with the tree but using a memory-mapped file would allow you to read the data directly from the file without allocating even more memory and copying the data to it.

You could also consider using a tree structure instead of a flat view to present the data. That way you could initialize child nodes of a parent node on demand (when the parent node is expanded) and resetting the parent node when it's collapsed (therefore freeing all its child nodes). In other words, try not to have too many nodes at the same level.

To meet your requirement "to expand/collapse records that span multiple lines", I'd simply use a drawgrid. To check it out, drag a drawgrid onto a form, then plug in the following Delphi 6 code. You can collapse and expand 5,000,000 multiline records (or whatever quantity you want) with essentially no overhead. It's a simple technique, doesn't require much code, and works surprisingly well.


unit Unit1;

interface

uses
  Windows, Messages, SysUtils, Variants, Classes, Graphics, Controls, Forms, Dialogs, Grids, StdCtrls;

type
  TForm1 = class(TForm)
    DrawGrid1: TDrawGrid;
    procedure DrawGrid1DrawCell(Sender: TObject; ACol, ARow: Integer; Rect: TRect; State: TGridDrawState);
    procedure DrawGrid1SelectCell(Sender: TObject; ACol, ARow: Integer; var CanSelect: Boolean);
    procedure DrawGrid1TopLeftChanged(Sender: TObject);
    procedure DrawGrid1DblClick(Sender: TObject);
    procedure FormCreate(Sender: TObject);
  private
    procedure AdjustGrid;
  end;

var
  Form1: TForm1;

implementation

{$R *.dfm}

// Display a large number of multi-line records that can be expanded or collapsed, using minimal overhead.
// LinesInThisRecord() and RecordContents() are faked; change them to return actual data.

const TOTALRECORDS = 5000000; // arbitrary; a production implementation would probably determine this at run time

// keep track of whether each record is expanded or collapsed
var isExpanded: packed array[1..TOTALRECORDS] of boolean; // initially all FALSE

function LinesInThisRecord(const RecNum: integer): integer;
begin // how many lines (rows) does the record need to display when expanded?
result := (RecNum mod 10) + 1; // make something up, so we don't have to use real data just for this demo
end;

function LinesDisplayedForRecord(const RecNum: integer): integer;
begin // how many lines (rows) of info are we currently displaying for the given record?
if isExpanded[RecNum] then result := LinesInThisRecord(RecNum) // all lines show when expanded
else result := 1; // show only 1 row when collapsed
end;

procedure GridRowToRecordAndLine(const RowNum: integer; var RecNum, LineNum: integer);
var LinesAbove: integer;
begin // for a given row number in the drawgrid, return the record and line numbers that appear in that row
RecNum := Form1.DrawGrid1.TopRow; // for simplicity, TopRow always displays the record with that same number
if RecNum > TOTALRECORDS then RecNum := 0; // avoid overflow
LinesAbove := 0;
while (RecNum > 0) and ((LinesDisplayedForRecord(RecNum) + LinesAbove) < (RowNum - Form1.DrawGrid1.TopRow + 1)) do
  begin // accumulate the tally of lines in expanded or collapsed records until we reach the row of interest
  inc(LinesAbove, LinesDisplayedForRecord(RecNum));
  inc(RecNum); if RecNum > TOTALRECORDS then RecNum := 0; // avoid overflow
  end;
LineNum := RowNum - Form1.DrawGrid1.TopRow + 1 - LinesAbove;
end;

function RecordContents(const RowNum: integer): string;
var RecNum, LineNum: integer;
begin // display the data that goes in the grid row.  for now, fake it
GridRowToRecordAndLine(RowNum, RecNum, LineNum); // convert row number to record and line numbers
if RecNum = 0 then result := '' // out of range
else
  begin
  result := 'Record ' + IntToStr(RecNum);
  if isExpanded[RecNum] then // show line counts too
    result := result + ' line ' + IntToStr(LineNum) + ' of ' + IntToStr(LinesInThisRecord(RecNum));
  end;
end;

procedure TForm1.AdjustGrid;
begin // don't allow scrolling past last record
if DrawGrid1.TopRow > TOTALRECORDS then DrawGrid1.TopRow := TOTALRECORDS;
if RecordContents(DrawGrid1.Selection.Top) = '' then // move selection back on to a valid cell
  DrawGrid1.Selection := TGridRect(Rect(0, TOTALRECORDS, 0, TOTALRECORDS));
DrawGrid1.Refresh;
end;

procedure TForm1.DrawGrid1DrawCell(Sender: TObject; ACol, ARow: Integer; Rect: TRect; State: TGridDrawState);
var s: string;
begin // time to draw one of the grid cells
if ARow = 0 then s := 'Data' // we're in the top row, get the heading for the column
else s := RecordContents(ARow); // painting a record, get the data for this cell from the appropriate record
// draw the data in the cell
ExtTextOut(DrawGrid1.Canvas.Handle, Rect.Left, Rect.Top, ETO_CLIPPED or ETO_OPAQUE, @Rect, pchar(s), length(s), nil);
end;

procedure TForm1.DrawGrid1SelectCell(Sender: TObject; ACol, ARow: Integer; var CanSelect: Boolean);
var RecNum, ignore: integer;
begin
GridRowToRecordAndLine(ARow, RecNum, ignore); // convert selected row number to record number
CanSelect := RecNum <> 0; // don't select unoccupied rows
end;

procedure TForm1.DrawGrid1TopLeftChanged(Sender: TObject);
begin
AdjustGrid; // keep last page looking good
end;

procedure TForm1.DrawGrid1DblClick(Sender: TObject);
var RecNum, ignore, delta: integer;
begin // expand or collapse the currently selected record
GridRowToRecordAndLine(DrawGrid1.Selection.Top, RecNum, ignore); // convert selected row number to record number
isExpanded[RecNum] := not isExpanded[RecNum]; // mark record as expanded or collapsed; subsequent records might change their position in the grid
delta := LinesInThisRecord(RecNum) - 1; // amount we grew or shrank (-1 since record already occupied 1 line)
if isExpanded[RecNum] then // just grew
else delta := -delta; // just shrank
DrawGrid1.RowCount := DrawGrid1.RowCount + delta; // keep rowcount in sync
AdjustGrid; // keep last page looking good
end;

procedure TForm1.FormCreate(Sender: TObject);
begin
Caption := FormatFloat('#,##0 records', TOTALRECORDS);
DrawGrid1.RowCount := TOTALRECORDS + 1; // +1 for column heading
DrawGrid1.ColCount := 1;
DrawGrid1.DefaultColWidth := 300; // arbitrary
DrawGrid1.DefaultRowHeight := 12; // arbitrary
DrawGrid1.Options := DrawGrid1.Options - [goVertLine, goHorzLine, goRangeSelect] + [goDrawFocusSelected, goThumbTracking]; // change some defaults
end;

end.

You shouldn't use ResetNode because this method invokes InvalidateNode and initializes node again, leading to opposite effect than expected. I don't know if it's possible to induce VST to free memory size specified in NodeDataSize without actually removing node. But why not set NodeDataSize to size of Pointer ( Delphi, VirtualStringTree - classes (objects) instead of records ) and manage data yourself? Just an idea...

Give "DeleteChildren" a try. Here's what this procedure's comment says:

// Removes all children and their children from memory without changing the vsHasChildren style by default.

Never used it, but as I read it, you can use that in the OnCollapsed event to free the memory allocated to nodes that just became invisible. And then re-generate those nodes in OnExpading so the user never knows the node went away from memory.

But I can't be sure, I never had a need for such behaviour.

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow