There are several architectural issues that I can see in your code:
First, if you want to execute something at a fixed rate, use ScheduledThreadPoolExecutor.scheduleAtFixedRate(...). It makes your entire delay code obsolete, and because each run is scheduled relative to the initial start time rather than the end of the previous run, small sleep and OS timing inaccuracies do not accumulate into drift.
Then, to make things faster, you need to take your code apart a bit. As far as I can see, you have three tasks: the capture, the mouse drawing/conversion, and the stream writing. If you put the capture in a scheduled Runnable, run the conversions in parallel as Callables submitted to an executor, and have a third thread take the results from a queue and write them to the stream, you can make full use of multiple cores.
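As a rough illustration of the fixed-rate scheduling, here is a minimal sketch. The class name FixedRateDemo, the method runTicks, and the counter that stands in for the real capture call are all made up for this example; only ScheduledThreadPoolExecutor and scheduleAtFixedRate come from the JDK.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class FixedRateDemo {

    // Runs a stand-in "capture" task at a fixed rate until it has fired
    // `frames` times, then returns how many captures were counted.
    static int runTicks(int frames, long periodMillis) throws InterruptedException {
        ScheduledThreadPoolExecutor scheduler = new ScheduledThreadPoolExecutor(1);
        AtomicInteger captured = new AtomicInteger();
        CountDownLatch done = new CountDownLatch(frames);

        scheduler.scheduleAtFixedRate(() -> {
            // Placeholder for the real capture; keep this body short so that
            // a slow run does not delay the following ticks.
            if (done.getCount() > 0) {
                captured.incrementAndGet();
                done.countDown();
            }
        }, 0, periodMillis, TimeUnit.MILLISECONDS);

        done.await();            // wait until the requested number of ticks has run
        scheduler.shutdownNow(); // stop scheduling further ticks
        return captured.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("captured " + runTicks(5, 40) + " frames");
    }
}
```

Note that if the task throws an exception, scheduleAtFixedRate suppresses all subsequent executions, so a real capture task should catch and handle its own errors.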
Pseudocode:
Global declarations (or hand them over to the various classes):
final static ExecutorService converterExecutor = Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());
final static LinkedBlockingQueue<Future<IVideoPicture>> imageQueue = new LinkedBlockingQueue<>();
// ...
Capture Runnable (scheduled at fixed rate):
capture = captureScreen();
final Converter converter = new Converter(capture);
final Future<IVideoPicture> conversionResult = converterExecutor.submit(converter);
imageQueue.offer(conversionResult); // returns false if queue is full
Conversion Callable:
class Converter implements Callable<IVideoPicture> {
    // ... variables and constructor
    public IVideoPicture call() {
        return convert(this.image);
    }
}
Writer Runnable:
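IVideoPicture frame;
while (!this.done) {
    // take() blocks until a Future is queued; get() blocks until that conversion has finished
    // (handle InterruptedException and ExecutionException around this call)
    frame = imageQueue.take().get();
    writer.encodeVideo(0, frame);
}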
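One detail of the writer loop is worth spelling out: because the Futures are queued in capture order, the writer receives frames in the right order even when a later conversion finishes first. The sketch below shows this with integers standing in for IVideoPicture; the class name OrderedPipelineDemo, the pool size of 4, and the artificial sleep are assumptions made only for the demonstration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.LinkedBlockingQueue;

public class OrderedPipelineDemo {

    // Submits `frames` conversions in capture order and drains them
    // the way the writer thread would.
    static List<Integer> run(int frames) throws Exception {
        ExecutorService converters = Executors.newFixedThreadPool(4);
        BlockingQueue<Future<Integer>> queue = new LinkedBlockingQueue<>();

        for (int i = 0; i < frames; i++) {
            final int frameNo = i;
            // Stand-in for Converter: earlier frames sleep longer,
            // so they tend to finish after later ones.
            queue.put(converters.submit(() -> {
                Thread.sleep((frames - frameNo) * 5L);
                return frameNo;
            }));
        }

        List<Integer> written = new ArrayList<>();
        for (int i = 0; i < frames; i++) {
            // take() yields Futures in submission order;
            // get() blocks until that particular frame is ready.
            written.add(queue.take().get());
        }
        converters.shutdown();
        return written;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(run(6)); // prints [0, 1, 2, 3, 4, 5]
    }
}
```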
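If the CPU is too slow, you can keep imageQueue from overflowing with frames waiting to be rendered by limiting its size; see the LinkedBlockingQueue constructor that takes a capacity. With a bounded queue, offer(...) returns false when the queue is full, so the capture task can simply drop that frame.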
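To make the bounded-queue behaviour concrete, here is a small sketch with no consumer running, so every offer beyond the capacity is rejected. The class name BoundedQueueDemo, the method countDropped, and the numbers used are made up for illustration.

```java
import java.util.concurrent.LinkedBlockingQueue;

public class BoundedQueueDemo {

    // Offers `frames` items to a queue with the given capacity while nothing
    // consumes from it, and returns how many offers were rejected.
    static int countDropped(int capacity, int frames) {
        LinkedBlockingQueue<Integer> queue = new LinkedBlockingQueue<>(capacity);
        int dropped = 0;
        for (int i = 0; i < frames; i++) {
            // offer() never blocks: it returns false once the queue is full.
            if (!queue.offer(i)) {
                dropped++;
            }
        }
        return dropped;
    }

    public static void main(String[] args) {
        System.out.println("dropped " + countDropped(3, 10) + " of 10 frames"); // dropped 7 of 10 frames
    }
}
```

Using offer() rather than put() suits the scheduled capture task, since put() would block the task and delay the following ticks. In the pseudocode above the conversion is submitted before the offer, so when offer returns false it is probably worth calling conversionResult.cancel(false) as well, so that a frame nobody will write is not converted needlessly.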