Question

I'm investigating improvements in the new GridGain release and wanted to know how GridGain 6 handles tasks with many jobs.

Consider a situation where tasks spawn a large number of jobs (hundreds of thousands). In GridGain 4, we observed that the jobs were queued up in memory on the nodes, potentially causing "out of memory" errors. We worked around this by throttling job submission: we created a disk-based queue and submitted the queued jobs as running jobs finished.

Can GridGain 6 handle this situation, and if so, how? Are there any specific recommendations? I see that there is a Streaming API available, but can it handle our situation?

Thanks


Solution

I think you should take advantage of GridComputeTaskContinuousMapper, which lets a task keep a constant number of outstanding jobs and emit new jobs as earlier ones complete.

Take a look at ComputeContinuousMapperExample shipped with GridGain (also available on GitHub).
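
To make the idea concrete, here is a minimal sketch of the "constant number of outstanding jobs" pattern in plain Java. It deliberately does not use the GridGain API: it swaps in a java.util.concurrent.Semaphore and a local thread pool to show the same throttling idea, and the names BoundedJobWindow, runAll and maxOutstanding are hypothetical. Jobs are pulled lazily from an iterator, so only the window of in-flight jobs is ever held in memory.

```java
import java.util.Iterator;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;
import java.util.stream.IntStream;

// Hypothetical illustration only; not GridGain code.
final class BoundedJobWindow {

    /**
     * Runs every job from the iterator while keeping at most maxOutstanding
     * jobs submitted but not yet finished. Returns the peak number of
     * outstanding jobs that was observed.
     */
    static int runAll(Iterator<Runnable> jobs, int maxOutstanding, ExecutorService pool)
            throws InterruptedException {
        Semaphore window = new Semaphore(maxOutstanding);
        AtomicInteger outstanding = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();

        while (jobs.hasNext()) {
            window.acquire();               // block until a slot in the window is free
            Runnable job = jobs.next();     // the job object is created only now
            peak.accumulateAndGet(outstanding.incrementAndGet(), Math::max);
            pool.execute(() -> {
                try {
                    job.run();
                } finally {
                    outstanding.decrementAndGet();
                    window.release();       // a finished job frees a slot for the next one
                }
            });
        }

        window.acquire(maxOutstanding);     // wait until every in-flight job has finished
        return peak.get();
    }

    public static void main(String[] args) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        AtomicLong sum = new AtomicLong();

        // 100,000 trivial jobs, generated lazily rather than built up front.
        Iterator<Runnable> jobs = IntStream.range(0, 100_000)
                .mapToObj(i -> (Runnable) () -> sum.addAndGet(i))
                .iterator();

        int peak = runAll(jobs, 16, pool);
        pool.shutdown();

        System.out.println("sum=" + sum.get() + ", peak outstanding=" + peak);
    }
}
```

In a real task the same shape applies, with the continuous mapper sending the next job when a result arrives instead of a semaphore being released; check the shipped example for the actual calls.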

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow