Two threads reading from the same table: how do I keep both threads from reading the same set of data from the TASKS table?

StackOverflow https://stackoverflow.com/questions/8134733

Question

I have a task thread running in each of two separate Tomcat instances. The task threads concurrently read the TASKS table (using SELECT) with a certain WHERE condition and then do some processing.

The issue is that sometimes both threads pick the same task, so the task is executed twice. My question is: how do I keep both threads from reading the same set of data from the TASKS table?

Solution

I think you need a column that stores the last-modified date of each row. Your threads can then read the same set of data using the same modified-date limit.

Edit: I did not see "not to read"

In that case you need another table, TaskExecutor (taskId, executorId). When a thread runs a task, it inserts a row into TaskExecutor; when another thread starts, it first checks whether the task is already executing (SELECT ... FROM TaskExecutor WHERE taskId = ...). You also need to take care of the isolation level for transactions.
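A minimal sketch of the claim-table idea, assuming TaskExecutor has a primary key (or unique constraint) on taskId so that the database itself rejects a second claim. Rather than SELECT-then-INSERT, which leaves a window between the check and the write, this sketch inserts directly and treats a duplicate-key error as "already taken". The class name TaskExecutorClaim, the method tryClaim, and the column types are hypothetical; only the table and column names come from the answer above. It also assumes the connection is in autocommit mode and that the driver reports the duplicate key with an SQLState in class 23 (integrity constraint violation).

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

final class TaskExecutorClaim {
    // Assumed schema (hypothetical types):
    //   CREATE TABLE TaskExecutor (taskId BIGINT PRIMARY KEY, executorId VARCHAR(64))
    static final String INSERT_SQL =
        "INSERT INTO TaskExecutor (taskId, executorId) VALUES (?, ?)";

    /** Returns true if this executor claimed the task, false if another one already had. */
    static boolean tryClaim(Connection conn, long taskId, String executorId) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(INSERT_SQL)) {
            ps.setLong(1, taskId);
            ps.setString(2, executorId);
            ps.executeUpdate();
            return true; // the row is ours, so the task is ours
        } catch (SQLException e) {
            String state = e.getSQLState();
            // SQLState class 23 = integrity constraint violation (here: duplicate taskId)
            if (state != null && state.startsWith("23")) {
                return false;
            }
            throw e; // anything else is a real failure
        }
    }
}
```

Each Tomcat instance would call tryClaim before processing a task and skip the task when it returns false.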

Other tips

This happens because the DAO function in your code that accesses the database is not synchronized. Make it synchronized, and I think your problem will be solved.

If the TASKS table you mention is a database table, then I would use transaction isolation.

As a suggestion: within a transaction, set an attribute of the TASKS row to some uniquely identifiable value if it is not already set, then commit the transaction. If all is OK, the task has been selected by that thread.
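One way this suggestion could look in JDBC is a single conditional UPDATE, sketched below. The column name claimed_by, the id column, and the class and method names are hypothetical, since the question does not show the TASKS schema. The sketch assumes the connection is in autocommit mode, so the UPDATE is its own transaction; the "if not set" test lives in the WHERE clause, and the update count tells the caller whether it won.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

final class TaskClaimer {
    // Hypothetical columns: TASKS(id, claimed_by); claimed_by is NULL while unclaimed.
    static final String CLAIM_SQL =
        "UPDATE TASKS SET claimed_by = ? WHERE id = ? AND claimed_by IS NULL";

    /** Returns true only for the one worker whose UPDATE actually changed the row. */
    static boolean tryClaim(Connection conn, long taskId, String workerId) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(CLAIM_SQL)) {
            ps.setString(1, workerId);
            ps.setLong(2, taskId);
            // 1 row updated: we set the marker. 0 rows: someone else already had.
            return ps.executeUpdate() == 1;
        }
    }
}
```

A worker would first SELECT candidate tasks as before, then call tryClaim on each one and process only those for which it returns true.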

I haven't come across this use case, so treat my suggestion with caution.

I think you should look at how an enterprise job scheduler, for example Quartz, handles this.

For your use case there is a better tool for the job - and that's messaging. You are persisting items that need to be worked on, and then attempting to synchronise access between workers. There are a number of issues that you would need to resolve in making this work - in general updating a table and selecting from it should not be mixed (it locks), so storing state there doesn't work; neither would synchronization in your Java code, as that wouldn't survive a server restart.

Using the JMS API with a message broker like ActiveMQ, you would publish a message to a queue. This message would contain the details of the task to be executed. The message broker would persist this somewhere (either in its own message store, or a database). Worker threads would then subscribe to the queue on the message broker, and each message would only be handed off to one of them. This is quite a powerful model, as you can have hundreds of message consumers all acting on tasks so it scales nicely. You can also make this as resilient as it needs to be, so tasks can survive both Tomcat and broker restarts.
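To illustrate the "each message is handed to only one consumer" property on its own, here is a small in-process stand-in that uses java.util.concurrent.BlockingQueue in place of a broker. It is not JMS code and it does not cross JVM boundaries; with two Tomcat instances, a broker such as ActiveMQ would play the role of the queue. The class name CompetingConsumersDemo and the method run are made up for this sketch.

```java
import java.util.Set;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class CompetingConsumersDemo {
    /** Publishes taskCount task ids, drains them with several workers, returns duplicate deliveries. */
    static int run(int taskCount, int workers) throws InterruptedException {
        BlockingQueue<Integer> queue = new LinkedBlockingQueue<>();
        for (int id = 0; id < taskCount; id++) {
            queue.add(id); // "publish" one message per task
        }

        Set<Integer> processed = ConcurrentHashMap.newKeySet();
        AtomicInteger duplicates = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        for (int w = 0; w < workers; w++) {
            pool.execute(() -> {
                Integer id;
                // poll() removes the head atomically, so no two workers receive the same id
                while ((id = queue.poll()) != null) {
                    if (!processed.add(id)) {
                        duplicates.incrementAndGet();
                    }
                }
            });
        }
        pool.shutdown();
        if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
            throw new IllegalStateException("workers did not finish in time");
        }
        if (processed.size() != taskCount) {
            throw new IllegalStateException("some tasks were never processed");
        }
        return duplicates.get();
    }
}
```

Calling CompetingConsumersDemo.run(1000, 4) should return 0: every task id is taken by exactly one worker, which is the guarantee the queue-based design relies on.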

Whether the database can provide graceful management of this will depend largely on whether it is using strict two-phase locking (S2PL) or multi-version concurrency control (MVCC) techniques to manage concurrency. Under MVCC reads don't block writes, and vice versa, so it is very possible to manage this with relatively simple logic. Under S2PL you would spend too much time blocking for the database to be a good mechanism for managing this, so you would probably want to look at external mechanisms. Of course, an external mechanism can work regardless of the database, it's just not really necessary with MVCC.

Databases using MVCC are PostgreSQL, Oracle, MS SQL Server (in certain configurations), InnoDB (except at the SERIALIZABLE isolation level), and probably many others. (These are the ones I know of off-hand.)

I didn't pick up any clues in the question as to which database product you are using, but if it is PostgreSQL you might want to consider using advisory locks: http://www.postgresql.org/docs/current/interactive/explicit-locking.html#ADVISORY-LOCKS

I suspect many of the other products have some similar mechanism.
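For PostgreSQL specifically, a rough sketch of the advisory-lock approach from JDBC might look like this. pg_try_advisory_lock and pg_advisory_unlock are PostgreSQL functions that take a bigint key; using the task id as that key, and the class and method names, are assumptions made for illustration. These locks are session-level, so the lock and the unlock have to go through the same connection, which matters when a connection pool is involved.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

final class AdvisoryLockClaim {
    static final String TRY_LOCK_SQL = "SELECT pg_try_advisory_lock(?)";
    static final String UNLOCK_SQL = "SELECT pg_advisory_unlock(?)";

    /** Non-blocking: returns true if this session now holds the lock for taskId. */
    static boolean tryLock(Connection conn, long taskId) throws SQLException {
        return callBoolean(conn, TRY_LOCK_SQL, taskId);
    }

    /** Releases the lock; returns false if this session did not hold it. */
    static boolean unlock(Connection conn, long taskId) throws SQLException {
        return callBoolean(conn, UNLOCK_SQL, taskId);
    }

    // Runs a one-parameter function call and reads its single boolean result.
    private static boolean callBoolean(Connection conn, String sql, long key) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setLong(1, key);
            try (ResultSet rs = ps.executeQuery()) {
                return rs.next() && rs.getBoolean(1);
            }
        }
    }
}
```

A worker would call tryLock before processing a task, skip the task if it returns false, and call unlock on the same connection in a finally block once the work is done.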

License: CC-BY-SA with attribution
Not affiliated with StackOverflow