Question

I come from a strong experience in MySQL, and I am now starting with Oracle. But I find really difficult to understand what a DATABASE is in Oracle, given that they use similar concepts which I am struggling to differentiate. In mysql, there is a simple concept of "database" instead of a mixture of

  • SCHEMA concept (User's woprkspace logically divided by TABLESPACES)
  • TNS and SID/SERVICES concept
  • CONNECTION concept (in both ODBC definition and SQLDeveloper)

I won't ask a pure definition of them as I am still reading, but just some guidance on how can I map a mysql database in the closest Oracle concept.

Was it helpful?

Solution

This is the information I can give you coming from the perspective of a developer. I don't know huge amounts about Oracle, but I have done some fairly significant work with deploying to it for some applications that are now in production.

Database

A database, in Oracle terms, is a group of files that live on disk and are managed as a cohesive unit. The database contains almost everything: logins, roles, tables, indexes, temporary space, transaction logs, and so on. Creating one is a nontrivial task in Oracle. It basically requires direct access (as in SSH or Windows Remote Desktop) to the machine. It's common for a DBA to create one during installation and for that to be the only one the server ever hosts. Unlike in MySQL, PostgreSQL, and SQL Server, you can't really use this level for basic grouping. E.g., giving each developer their own database is uncommon because of the overhead in recreating it.

Schema

Oracle schemas conflate two purposes: users and namespaces.

  • Each schema is a user, and it can be associated with credentials (a password, a user in Active Directory). Note that all accounts are database specific; there is no way to create a user that can log into multiple databases (aside from pointing both databases at the same LDAP server or otherwise involving some external service).

  • The schema also acts as a namespace that contains objects (e.g., tables, views, procedures, and indexes), and the schema name can be used explicitly to qualify exactly which object you're trying to refer to. For example, if I say MYOWNER.MYTABLE, Oracle will look for the MYTABLE table owned by MYOWNER. If you need multiple copies of all the same objects, this is the easiest level to group them in, which makes them the best level for having per developer copies of the database.

It is common to divide the two concepts manually: a schema can be locked out of logging in, and permissions can be granted to another user on its objects. This is something of a hassle, though, since there's no way to grant permissions across an entire schema; each object must be granted explicitly to some user or role. There's also no way to force users to create objects in a specific schema besides their own; permissions can only be granted to either create objects in the user's own schema or globally in any schema.

Complete aside: in PostgreSQL and SQL Server, schemas are only namespaces, not users.

Tablespace

Tablespaces are sets of files on disk that contain everything you need to store, including both data and the metadata (such table definitions). A single database can use multiple tablespaces, and different objects within a schema can even be on different tablespaces. A tablespace can be one or many files, but they're managed as one logical unit. Each schema has a default tablespace for its objects if a tablespace isn't specified when creating the object. Sharing them between databases is somewhere between impossible and unheard of.

In practice, it's common to not even bother with tablespaces and just leave the default configuration alone. The default is one tablespace named USERS with one file, and it's the default tablespace for all schemas in the database. If you change these at all, you usually set a default for each schema and then never think about it again until disk space becomes an issue.

Instance

You didn't ask about these specifically, but you'll need to understand them before we can talk about connecting to the database.

An instance is the actual process running on the server that listens for connections. Like databases, these require direct access to the database server to set up. You can have multiple or a single one on the server. It's common to have one per database.

An instance can be identified two ways: an SID or a service name. The SID identifies a single instance, while the service name is an alias that can refer to several instances. The details of how that works are usually unimportant; just know that you need to know of them to connect.

Connecting

To connect from a client, you need a connect descriptor. This is a jumbled string containing the host, port, and either SID or service name. They look like this, for example: (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=myoracleserver)(PORT=1521))(CONNECT_DATA=(SERVER=DEDICATED)(SERVICE_NAME=orclservice))). They can get more complicated, but that's the basic form. To use an SID instead of a service name, you would replace SERVICE_NAME=orclservice with SID=orclinstance. There's also a newer, more compact format called "EZ connect" that looks like this instead: myoracleserver:1521/orclservice; it only supports the basic parameters.

TNS is short for "Transparant Network Substrate," and it consists of the entire networking stack that is used to communicate with the database. You virtually never need to concern yourself with it as a whole.

What you encounter often is TNS names. TNS names are aliases to the connect descriptors. They're stored in a plain text file on the client machine, and they're typically global to the entire machine. Here's an example mapping that you mind find in the file: mydatabase=(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=myoracleserver)(PORT=1521))(CONNECT_DATA=(SERVER=DEDICATED)(SID=orcl))). In my experience, most of the time you can actually avoid bothering with TNS names entirely and just use the connect descriptor directly.

A connect identifier is anything that can stand in for a connect descriptor. It can be a full connect descriptor, an EZ connect descriptor, a TNS name, or several other things. But generally, they identify a server and the particular database on it that you want to connect to.

With all that in mind, connections become a little more straightforward. Conceptually, they're pretty much the same as in other database. The thing that might be confusing about them is that you connect as a schema, as described earlier. The "username" is the schema name, and the schema can have a password or some other form of authentication associated with it. The connection string differs according to the client software, much like in any other database. For SQL*Plus (Oracle's command line client), connection strings look like this: [USERNAME]/[PASSWORD]@[connect identifier]. So if your user is MY_SCHEMA, the password is PASS, and the server is like above, it might look like

MY_SCHEMA/PASS@(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=myoracleserver)(PORT=1521))(CONNECT_DATA=(SERVER=DEDICATED)(SID=orclinstance)))

For a .NET application, it might look like

Data Source=(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=myoracleserver)(PORT=1521))(CONNECT_DATA=(SERVER=DEDICATED)(SID=orcl)));User Id=MY_SCHEMA;Password=PASS

which is pretty similar to any other database. Note that anywhere you see that nasty server information, you could replace that with any connect identifer (such as a TNS name).

As far as SQL Developer is concerned, a "connection" is really just a saved connection string. ODBC connects like any other database; you just need the right connection string and drivers.

Drivers

The drivers can be a pain point in Oracle, depending on language. I believe Java has some decent stand alone clients, but other languages generally depend on the binary version. The binary version does have an installer that puts the binaries on PATH, but the installer is pretty difficult to use and best avoided. When I can, I avoid installing the client and make use of what's called "instant client". Usually, if you can get the instant client binaries in a place where the app can find them, they just work. If not, then it's preferable to just prepend PATH in memory for your application than to modify it globally for your machine.

If you happen to be developing using .NET, use the ODP.NET provider on NuGet from Oracle. It's written in full .NET, eliminating the need to deal with native binaries.

Summary

So in short:

  • A database is part of the server set up
  • A schema is both a user and how you divide your database
  • A tablespace is the physical files that hold the database
  • TNS names are just a naming convenience on the client side
  • SID/Service Name are just names used when connecting

I find this arrangement far too complex, personally.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top