Question

I am building a web part that enumerates all document libraries in the current web, finds all checked-out files, and lists them in a table so that a user can apply some required metadata and check in many at once (50).

This webpart must operate on lists over the list view threshold (currently set at 15,000 in production), and also on sites that are VERY large (50,000 - 100,000+ documents).

I'm following the MS Best practices for handling large lists here: http://msdn.microsoft.com/en-us/library/ee557257.aspx

I'm using an SPQuery (with no CAML defined at all), retrieving pages of 2,000 items and parsing them that way. The problem is that the web part times out on those very large (50k+) sites. So I'm trying to be a bit smarter with my CAML, pulling in only items that are checked out:

spQuery.Query =
    "<Where><IsNotNull>" +
    "<FieldRef Name=\"CheckoutUser\" LookupId=\"TRUE\"/>" +
    "</IsNotNull></Where>";
spQuery.RowLimit = 2000;
spQuery.ViewAttributes = "Scope=\"Recursive\"";

I'm using a CrossListQueryInfo query to query the whole web with this same CAML, and it works wonderfully when no list is over the LVT. If one is, I catch that exception and retry with the 'slower' SPQuery on each individual library.

From everything I am reading, as long as my CAML returns fewer items than the LVT, it should work. But the CAML above causes the error "The attempted operation is prohibited because it exceeds the list view threshold enforced by the administrator" to be thrown when SPList.GetItems(spQuery) is called. Since I'm setting a RowLimit of 2,000, shouldn't that never happen? MS suggests executing an SPQuery with no CAML defined at all, basically grabbing all items from the library in pages of 2,000. So I can't make any sense of why my CAML fails only on lists over the view threshold.

Edit: Upon further research, I am attempting to utilize the ContentIterator class to meet my needs (http://msdn.microsoft.com/en-us/library/microsoft.office.server.utilities.contentiterator.aspx). Using examples from this post: http://extreme-sharepoint.com/2012/07/17/data-access-via-caml-queries/

With ContentIterator I still get the same LVT error.

Does the 'CheckoutUser' field need to be indexed on every list we want to execute this query against?

Update 2: This comes down to the 'CheckoutUser' field not being indexed, and trying to query against it. Unfortunately it is not an option for us to go out and force this index on every library in the farm. I believe my only option at this point is to implement some sort of paging scheme on very large sites to process items in batches.

Final Update: As a solution, I've decided to enforce indexing the 'CheckoutUser' column for libraries. This should vastly improve the overall performance of the webpart, and allow support for very large sites. There will be a bit of headache immediately following deployment, as we will need to manually set the column index on lists over the list view threshold, but in the long run this will be best.

Solution

My understanding of how the List View Threshold works is limited. However, I suspect your CAML query fails because it filters on a non-indexed field, and the RowLimit is applied only "after" that filter has been evaluated, so the filter itself still has to scan more rows than the threshold allows.

I think that indexing the CheckoutUser field would solve your problem; however, it sounds like you have a lot of sites.

Allow me to suggest an alternate approach - paging. Since the ID column on every list is indexed, implement a CAML query where you are filtering FIRST (very important) on the ID column less than, say, 2000, and secondly on the CheckoutUser field. If that query does not return the desired number of results, then increase 2000 to 4000 and repeat.

I have not implemented this solution myself, it is just a thought.
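
As a minimal, untested sketch of that idea, the helper below only builds the CAML strings, so it runs without a SharePoint server. It uses fixed ID windows rather than a growing upper bound, which is a variation on the suggestion above. The class and method names, the window size, and the maxId parameter are all hypothetical; how to obtain the highest item ID in a library is left open here.

```csharp
using System;
using System.Collections.Generic;

static class CheckedOutPaging
{
    // Builds the <Where> clause for one ID window [lowId, highId).
    // The two ID comparisons come FIRST so that the indexed ID column
    // narrows the rows before the non-indexed CheckoutUser test runs.
    public static string BuildWindowQuery(int lowId, int highId)
    {
        return
            "<Where><And><And>" +
            "<Geq><FieldRef Name=\"ID\"/><Value Type=\"Counter\">" + lowId + "</Value></Geq>" +
            "<Lt><FieldRef Name=\"ID\"/><Value Type=\"Counter\">" + highId + "</Value></Lt>" +
            "</And>" +
            "<IsNotNull><FieldRef Name=\"CheckoutUser\" LookupId=\"TRUE\"/></IsNotNull>" +
            "</And></Where>";
    }

    // Yields one query per window until maxId is covered.
    // maxId is an assumption: the caller has to supply the highest item ID.
    // windowSize is assumed to be smaller than the list view threshold.
    public static IEnumerable<string> BuildAllWindows(int maxId, int windowSize)
    {
        if (windowSize <= 0) throw new ArgumentOutOfRangeException("windowSize");
        for (int low = 1; low <= maxId; low += windowSize)
            yield return BuildWindowQuery(low, low + windowSize);
    }

    static void Main()
    {
        // Illustrative numbers only: IDs up to 5000 in windows of 2000
        // produce three queries (1-2000, 2001-4000, 4001-6000).
        foreach (string caml in BuildAllWindows(5000, 2000))
            Console.WriteLine(caml);
    }
}
```

Each generated string would be assigned to spQuery.Query, with RowLimit and ViewAttributes set as in the question, before calling SPList.GetItems(spQuery) once per window. Whether this actually stays under the threshold on a given farm would need to be verified.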

OTHER TIPS

If you are querying large lists, you should definitely index the fields you filter on. When the filter is on an indexed field and narrows the result to fewer items than the threshold, throttling does not occur, because the indexed values are stored in a separate table in the database.

This blog post of mine should be of some help: http://vrdmn.blogspot.in/2012/11/sharepoint-list-indexes-under-hood.html

Licensed under: CC-BY-SA with attribution
Not affiliated with sharepoint.stackexchange