Question

I am a Computer Science student and one of my first classes was Object Oriented Design (Java). Unfortunately, we never interacted with a database during that class. I am currently working on a CRUD app as a side project and I am confused about the relationship between objects and DB tables.

Let's say I am making a persistent browser war game in Python (such as Travian), where each user has one or more villages, and each village has different buildings and troops.

Using MySQL, I could represent a village like this:

CREATE TABLE `village` (
  `village_id` int NOT NULL AUTO_INCREMENT,
  `user_id` int NOT NULL,
  `city_name` varchar(100) NOT NULL,
  `location-x` int(3),
  `location-y` int(3),
  `population` int(5),
  KEY `user_id` (`user_id`),
  CONSTRAINT `user_id` FOREIGN KEY (`user_id`) REFERENCES `user` (`user_id`),
  PRIMARY KEY (`village_id`)
)

I could also make a class to represent a village using Python. But if my Class components are the same as my Table columns, isn't it a waste of resources? Like rather than updating the object, why not directly interact with the database table and directly read, write, update values directly on the table rather than the object?


Solution

These are two different things:

  • DB data is passive: it just stays there available for any processing that could use or misuse it.
  • Objects are active: they are not just data but also behaviour, consistency and privacy. They hold data as well, but it should in principle be encapsulated.

Both can be related. Like food that is dehydrated for longer preservation, you can store the data of an object in a database for persistence. You then need some way to recreate the object from that data. There are plenty of ways to do this, but if you're using a relational database, look up Object/Relational Mapping (ORM) to find well-known techniques for doing it well.
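The dehydrate/rehydrate idea can be sketched by hand before reaching for an ORM. This is only an illustration under stated assumptions: in-memory SQLite stands in for MySQL so that it runs anywhere, the table is a cut-down version of the one in the question, and the helper names save_village and load_village are made up for this example.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Village:
    village_id: int
    user_id: int
    city_name: str
    population: int


def save_village(conn: sqlite3.Connection, v: Village) -> None:
    """'Dehydrate': write the object's data into a table row."""
    conn.execute(
        "INSERT OR REPLACE INTO village (village_id, user_id, city_name, population) "
        "VALUES (?, ?, ?, ?)",
        (v.village_id, v.user_id, v.city_name, v.population),
    )


def load_village(conn: sqlite3.Connection, village_id: int) -> Village:
    """'Rehydrate': rebuild an object from the stored row."""
    row = conn.execute(
        "SELECT village_id, user_id, city_name, population "
        "FROM village WHERE village_id = ?",
        (village_id,),
    ).fetchone()
    if row is None:
        raise LookupError(f"no village with id {village_id}")
    return Village(*row)


# Assumption: in-memory SQLite instead of MySQL, simplified columns.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE village (village_id INTEGER PRIMARY KEY, user_id INTEGER NOT NULL, "
    "city_name TEXT NOT NULL, population INTEGER)"
)
save_village(conn, Village(1, 42, "Rome", 120))
restored = load_village(conn, 1)
```

An ORM library would typically generate this kind of load/save code for you; the sketch only shows what such code has to do.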

OTHER TIPS

I could also make a class to represent a village using Python. But if my Class components are the same as my Table columns, isn't it a waste of resources?

No, it will not waste resources. On the contrary, it will:

  1. make your code cleaner
  2. make your design/architecture more comprehensible
  3. keep your classes encapsulated (OOP)
  4. let you reuse some or all of your classes in the next project/version

You can extend the properties (such as Age, State, CurrentValue, NextMove, IsFighting, ...) or the functionality (such as load(), save(), toXml(), toJSON(), ...) of each type (Village, User, Troop, ...) without re-engineering or changing the code of all the other types.

Imagine you used the same database table to save car data (id, name, age) and person data (id, name, age). One day you might want to add "manufacturer", "HorsePower" or "TotalCylinders" to the car. Your persons would then have those same properties: manufacturer, HorsePower and TotalCylinders! Or, if you defined a function "toVehicle()", every person would have that function too, and one could convert the person "John Doe" to a vehicle!
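As a rough sketch of that point, here are two separate types that each carry their own properties and behaviour. The class layout, the grow() rule and the field names are assumptions made for illustration, not something taken from the game.

```python
import json


class Village:
    """Hypothetical village type; the rules here are illustrative only."""

    def __init__(self, city_name: str, population: int = 0) -> None:
        self.city_name = city_name
        self._population = population  # private: changed only through methods

    @property
    def population(self) -> int:
        return self._population

    def grow(self, settlers: int) -> None:
        # The object protects its own consistency.
        if settlers < 0:
            raise ValueError("settlers must not be negative")
        self._population += settlers

    def to_json(self) -> str:
        return json.dumps(
            {"city_name": self.city_name, "population": self._population}
        )


class Troop:
    """A separate type: it does not inherit village-only behaviour."""

    def __init__(self, kind: str, count: int) -> None:
        self.kind = kind
        self.count = count

    def to_json(self) -> str:
        return json.dumps({"kind": self.kind, "count": self.count})


village = Village("Rome", 100)
village.grow(20)
archers = Troop("archer", 10)
```

Adding grow() to Village changes nothing about Troop, which is the opposite of the shared car/person table described above.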

Like rather than updating the object, why not directly interact with the database table and directly read, write, update values directly on the table rather than the object?

Because of "side effects" (state changes at runtime, notifications/events and listeners/event-handling methods, etc.) and separation of responsibilities (the single responsibility principle).

You mentioned you are a computer-science student, so I recommend you to search, study and use following concepts/paradigms:

Designing architectures and data models takes time and practice. I also recommend that you look up and study the "Entity Relationship Model".

I would suggest that database tables are optimized for persistence and retrieval; your data exists entirely in the form of tables of fields and rows.

Modeling your data using classes allows your program to be more expressive.

For example, part of your village class could be a reference to a live user object, with its own properties. Admittedly, this information can be determined in the database via the village.user_id <-> user.user_id connection, but it's not exposed as part of the village row; the user details are external to the village.
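A minimal sketch of such a reference, assuming two simple classes whose field names are invented for this example:

```python
from dataclasses import dataclass


@dataclass
class User:
    user_id: int
    name: str


@dataclass
class Village:
    village_id: int
    city_name: str
    owner: User  # a live object, not just a user_id number


alice = User(user_id=42, name="alice")
rome = Village(village_id=1, city_name="Rome", owner=alice)
capua = Village(village_id=2, city_name="Capua", owner=alice)

# Both villages point at the same User object, so a change made to the
# user is visible through either village without any extra lookup.
alice.name = "Alice the Great"
owner_names = {rome.owner.name, capua.owner.name}
```

In the table, reaching the owner's name from a village row takes a join or a second query; in the object model it is just rome.owner.name.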

You wrote

Like rather than updating the object, why not directly interact with the database table and directly read, write, update values directly on the table

which gives me the impression what is missing here is "the bigger picture".

Indeed, for very small programs, just using bare SQL operations on database tables can be perfectly sufficient to create a working program. If all you want to implement is a program for fixing a single value in a single column of one table, once, using something beyond a simple SQL script would probably be overengineered.

Unfortunately, this approach does not really scale well as soon as real-world requirements have to be implemented. For example, it is quite common within a use case to query some data from the database, do some operations and calculations in-memory, validate the results, show them to a user, and let them (or some algorithm) decide whether those results or changes shall be persisted in the database or not.

Even for such a simple use case, one will need some local storage for the intermediate values and results; "directly interacting with a database table" would not be suitable:

  • the intermediate results might not even fit to some database table
  • the data would be visible intermediately to other accessors of the database even if the data will be thrown away
  • the operations could easily become pretty slow and the coding very cumbersome. For example, where in Java you write x++ for a variable, using SQL you would have to write something along the lines of ExecSql("UPDATE MyTable SET X = X + 1 WHERE ID=?", myIdVal)

But how do you organize local variables for values queried from a database table, or values which shall be written back to a database? A common way is indeed to map classes and objects (for example, in Java) directly to database tables and records within the table in a 1:1 fashion. The "boring" read/write/update SQL code for such a mapping may be generated by some OR mapper.
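To give a feel for what an OR mapper automates, here is a toy sketch that derives the "boring" INSERT statement from a class definition. It assumes the table is named after the class and the columns after the fields; insert_statement is a hypothetical helper, and real mappers do far more than this.

```python
from dataclasses import astuple, dataclass, fields
from typing import Tuple


@dataclass
class Village:
    village_id: int
    user_id: int
    city_name: str
    population: int


def insert_statement(obj) -> Tuple[str, tuple]:
    """Build a parameterized INSERT from a dataclass instance.

    Assumption: table name == lowercased class name, column names == field names.
    """
    table = type(obj).__name__.lower()
    columns = [f.name for f in fields(obj)]
    placeholders = ", ".join("?" for _ in columns)
    sql = f"INSERT INTO {table} ({', '.join(columns)}) VALUES ({placeholders})"
    return sql, astuple(obj)


sql, params = insert_statement(Village(1, 42, "Rome", 120))
```

The same idea extends to SELECT and UPDATE statements, which is why the mapping code rarely has to be written by hand.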

Of course, this is not really Object Oriented, but it will already take you a lot further than operating directly on database tables. These classes allow you to use the native elements of your programming language, and they allow you to separate the persistence code from the operating code. You may also consider using these classes as a starting point for a more sophisticated design. I don't know the game you mentioned, but I would expect this kind of design to be far more suitable for implementing most games than one that relies on using SQL directly.

why not directly interact with the database table and directly read, write, update values directly on the table rather than the object?

Because in any real-world scenario (or even a moderately sized browser game), you're going to be dealing with many data operations, and you're going to be dealing with a database that's only available over the network.

Note: even if the database is still on the same machine, it's still going to be a performance issue, and it's fairly uncommon to deploy databases and applications to the same machine.

Small data operations are much faster done in-memory, and then you save it to the database once in the end, instead of sending multiple requests.
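The difference in the number of database requests can be sketched as follows. The assumptions: in-memory SQLite stands in for a networked database, CountingDb is a hypothetical wrapper that merely counts statements, and concurrent writers are ignored.

```python
import sqlite3


class CountingDb:
    """Hypothetical wrapper that counts how many statements are sent."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn
        self.round_trips = 0

    def execute(self, sql: str, params: tuple = ()):
        self.round_trips += 1
        return self.conn.execute(sql, params)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE village (village_id INTEGER PRIMARY KEY, population INTEGER)")
conn.execute("INSERT INTO village VALUES (1, 100)")
conn.execute("INSERT INTO village VALUES (2, 100)")
db = CountingDb(conn)

# Approach 1: every small change goes straight to the table.
for _ in range(50):
    db.execute(
        "UPDATE village SET population = population + 1 WHERE village_id = ?", (1,)
    )
direct_trips = db.round_trips

# Approach 2: load once, work in memory, save once at the end.
db.round_trips = 0
(population,) = db.execute(
    "SELECT population FROM village WHERE village_id = ?", (2,)
).fetchone()
for _ in range(50):
    population += 1
db.execute("UPDATE village SET population = ? WHERE village_id = ?", (population, 2))
batched_trips = db.round_trips
```

Both villages end up with the same population, but the first approach issues one statement per change while the second issues two in total. Over a real network, each of those statements would also pay the connection latency.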

Let's use an analogous situation. When you wrote this question, you wrote the text locally (in the textbox, in the browser, on your machine), and you only posted it to StackExchange when you were finished.
If we use your "work with the database directly" approach here, instead of having a textbox in your own browser, whenever you press a key you would have to connect to StackExchange, have it register the key you pressed, and then refresh your page (or page content) to reflect that change.

I'm not sure how fast you type, but even a novice typist is going to get stuck on the performance drag that this new system would bring with it.

But if my Class components are the same as my Table columns, isn't it a waste of resources?

It's not a waste of resources; it's a minor amount of extra development time that yields massive performance gains.

The classes exist specifically so that the database data can be pulled into memory once, operated on, and then sent back to the database. It's of course also possible to only fetch from, or only write to, the database.

Think of it like this:

Corporeal humans (code) and ethereal spirits (database) live on a different plane of existence. Crossing that plane is hard (network performance, query formatting and parsing). If you want to have many and frequent interactions with a spirit, you would have to hold seances (run queries) all the time, which is going to cost you heaps of time and effort (performance).

A smarter idea would be to hold a longer seance once, and use it to bind the spirit (database entry) to a corporeal body (class component). From that point on, you can just interact with this corporeal body (class), no seance (network call and query) required. And after your many interactions (data operations), when you're done, you just perform another seance to release the spirit back to its own realm (store the data in the database).

In short, it's not that your approach is impossible, but it does entail prohibitively bad performance.


This thought pattern is very common for beginners.

You are focused more on the effort of developing your code. To you, you'd prefer only writing one thing, not both a table and a class, and since it's technically possible to only write one, you're wondering why you'd ever write two things. That's more work, after all.

But you're missing the bigger picture. In the real world, you tend to be on the hook for supporting the software you release, i.e. what we call ownership. This means that any effort you saved in the beginning, could spell doom for you if it ends up costing you more time during the maintenance phase of the application.
Your shortcut "works" when you only think about the development phase, but it's actually counterproductive when you think about the bigger picture.

Because I like analogies: we're going to have a race. We have to take our car, inflate its tires, and drive to the finish line. Start!

I inflate all four tires, get in the car and take the highway to the finish.

You, on the other hand, decide that inflating three tires takes less effort than inflating four, and therefore why would you ever inflate the fourth tire? After all, it would just mean that the tire inflation process would take even longer. It'd be a waste of resources.

But now you're on the road, and you realize that a car with three inflated tires is not as stable as a car that has four inflated tires. You have to brake more often, you take corners slower, and you can't go on the highway because you can't control your car at those speeds.

In the end, I win because I took the extra time and effort to inflate the fourth tire.

The reason I mention this is because it'd be very good if you learned this lesson early in your career: always remember the bigger picture. A corner cut today may end up biting you tomorrow, and likely more than it benefited you the day before.

Most if not all good development practices ask you to invest extra effort up front (which people instinctively try to avoid) because, perhaps surprisingly, it saves time in the end, and it's the end that matters.

Licensed under: CC-BY-SA with attribution