This problem can be solved using the Priority-Flood algorithm. It's been discovered and published a number of times over the past few decades (and again by other people answering this question), though the specific variant you're looking for is not, to my knowledge, in the literature.
You can find a review paper of the algorithm and its variants here. Since that paper was published an even faster variant has been discovered (link), as well as methods to perform this calculation on datasets of trillions of cells (link). A method for selectively breaching low/narrow divides is discussed here. Contact me if you'd like copies of any of these papers.
I have a repository here with many of the above variants; additional implementations can be found here.
A simple script to calculate volume using the RichDEM library is as follows:
#include "richdem/common/version.hpp"
#include "richdem/common/router.hpp"
#include "richdem/depressions/Lindsay2016.hpp"
#include "richdem/common/Array2D.hpp"
/**
@brief Calculates the volume of depressions in a DEM
@author Richard Barnes (rbarnes@umn.edu)
Priority-Flood starts on the edges of the DEM and then works its way inwards
using a priority queue to determine the lowest cell which has a path to the
edge. The neighbours of this cell are added to the priority queue if they
are higher. If they are lower, then they are members of a depression and the
elevation of the flooding minus the elevation of the DEM times the cell area
is the flooded volume of the cell. The cell is flooded, total volume
tracked, and the neighbors are then added to a "depressions" queue which is
used to flood depressions. Cells which are higher than a depression being
filled are added to the priority queue. In this way, depressions are filled
without incurring the expense of the priority queue.
@param[in,out] &elevations A grid of cell elevations
@pre
1. **elevations** contains the elevations of every cell or a value _NoData_
for cells not part of the DEM. Note that the _NoData_ value is assumed to
be a negative number less than any actual data value.
@return
Returns the total volume of the flooded depressions.
@correctness
The correctness of this command is determined by inspection. (TODO)
*/
template <class elev_t>
double improved_priority_flood_volume(const Array2D<elev_t> &elevations){
GridCellZ_pq<elev_t> open;
std::queue<GridCellZ<elev_t> > pit;
uint64_t processed_cells = 0;
uint64_t pitc = 0;
ProgressBar progress;
std::cerr<<"\nPriority-Flood (Improved) Volume"<<std::endl;
std::cerr<<"\nC Barnes, R., Lehman, C., Mulla, D., 2014. Priority-flood: An optimal depression-filling and watershed-labeling algorithm for digital elevation models. Computers & Geosciences 62, 117–127. doi:10.1016/j.cageo.2013.04.024"<<std::endl;
std::cerr<<"p Setting up boolean flood array matrix..."<<std::endl;
//Used to keep track of which cells have already been considered
Array2D<int8_t> closed(elevations.width(),elevations.height(),false);
std::cerr<<"The priority queue will require approximately "
<<(elevations.width()*2+elevations.height()*2)*((long)sizeof(GridCellZ<elev_t>))/1024/1024
<<"MB of RAM."
<<std::endl;
std::cerr<<"p Adding cells to the priority queue..."<<std::endl;
//Add all cells on the edge of the DEM to the priority queue
for(int x=0;x<elevations.width();x++){
open.emplace(x,0,elevations(x,0) );
open.emplace(x,elevations.height()-1,elevations(x,elevations.height()-1) );
closed(x,0)=true;
closed(x,elevations.height()-1)=true;
}
for(int y=1;y<elevations.height()-1;y++){
open.emplace(0,y,elevations(0,y) );
open.emplace(elevations.width()-1,y,elevations(elevations.width()-1,y) );
closed(0,y)=true;
closed(elevations.width()-1,y)=true;
}
double volume = 0;
std::cerr<<"p Performing the improved Priority-Flood..."<<std::endl;
progress.start( elevations.size() );
while(open.size()>0 || pit.size()>0){
GridCellZ<elev_t> c;
if(pit.size()>0){
c=pit.front();
pit.pop();
} else {
c=open.top();
open.pop();
}
processed_cells++;
for(int n=1;n<=8;n++){
int nx=c.x+dx[n];
int ny=c.y+dy[n];
if(!elevations.inGrid(nx,ny)) continue;
if(closed(nx,ny))
continue;
closed(nx,ny)=true;
if(elevations(nx,ny)<=c.z){
if(elevations(nx,ny)<c.z){
++pitc;
volume += (c.z-elevations(nx,ny))*std::abs(elevations.getCellArea());
}
pit.emplace(nx,ny,c.z);
} else
open.emplace(nx,ny,elevations(nx,ny));
}
progress.update(processed_cells);
}
std::cerr<<"t Succeeded in "<<std::fixed<<std::setprecision(1)<<progress.stop()<<" s"<<std::endl;
std::cerr<<"m Cells processed = "<<processed_cells<<std::endl;
std::cerr<<"m Cells in pits = " <<pitc <<std::endl;
return volume;
}
template<class T>
int PerformAlgorithm(std::string analysis, Array2D<T> elevations){
elevations.loadData();
std::cout<<"Volume: "<<improved_priority_flood_volume(elevations)<<std::endl;
return 0;
}
int main(int argc, char **argv){
std::string analysis = PrintRichdemHeader(argc,argv);
if(argc!=2){
std::cerr<<argv[0]<<" <Input>"<<std::endl;
return -1;
}
return PerformAlgorithm(argv[1],analysis);
}
It should be straight-forward to adapt this to whatever 2d array format you are using
In pseudocode, the following is equivalent to the foregoing:
Let PQ be a priority-queue which always pops the cell of lowest elevation
Let Closed be a boolean array initially set to False
Let Volume = 0
Add all the border cells to PQ.
For each border cell, set the cell's entry in Closed to True.
While PQ is not empty:
Select the top cell from PQ, call it C.
Pop the top cell from PQ.
For each neighbor N of C:
If Closed(N):
Continue
If Elevation(N)<Elevation(C):
Volume += (Elevation(C)-Elevation(N))*Area
Add N to PQ, but with Elevation(C)
Else:
Add N to PQ with Elevation(N)
Set Closed(N)=True