Question

I am building an app for a company that needs to control who sees which reports, by project and by role. A report belongs to exactly one project and can be visible to many (employee) roles.

When a report is submitted it is tagged with a project and a set of roles, e.g. "project1" and {"manager","seller"}, so that, for example, employees who work on project1 and are managers can see it. The way I do it now depends heavily on arrays; this is what I have:

reports table:
project (string)
roles (array of strings)

employees table:
projects (array of strings) // all the projects the employee is working or has worked on
roles (array of strings) // an employee can have many roles
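
In actual DDL this is roughly the following (a sketch; the id columns are illustrative):

create table employees (
    id       serial primary key,
    projects text[] not null default '{}', -- e.g. '{project1,project2}'
    roles    text[] not null default '{}'  -- e.g. '{manager,seller}'
);

create table reports (
    id      serial primary key,
    project text   not null,
    roles   text[] not null -- roles allowed to see this report
);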

When querying the reports an employee can see, I do something like this:

select *
from reports
where roles && :employee_roles            -- role overlap
  and project = any (:employee_projects); -- project membership

where :employee_roles and :employee_projects are the current employee's arrays.

I use PostgreSQL.

The problem is that I suspect this will not perform well (I am not sure).
The only way I know of to speed this query up is a GIN index on the reports.roles column, to make the overlap test faster.
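
That index would look something like this (assuming the tables sketched above):

create index reports_roles_gin on reports using gin (roles);
create index reports_project_idx on reports (project); -- plain btree, if the project filter is selective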

Besides performance, this tip in the PostgreSQL documentation made me worry:

Tip: Arrays are not sets; searching for specific array elements can be a sign of database misdesign. Consider using a separate table with a row for each item that would be an array element. This will be easier to search, and is likely to scale better for a large number of elements.

So, is there a much better design for this, or will it just work fine?


Solution

Short answer: what you're doing is reasonably sane, but consider using int arrays rather than strings, as they're faster to compare, and mind the caveats below.

Personally, I'd normalize it: add a roles table, along with role2report and user2role join tables. Performance-wise, the optimal approach in my own experience is to pre-compute the current user's role_ids in your app, and then query with an IN (or = ANY) clause on those ids. This means:

select from reports join role2report ...
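
Fleshed out, that might look like the following (a sketch; the table and column names are illustrative, and $1 is the list of role_ids you pre-computed for the current user):

create table roles (
    id   serial primary key,
    name text not null unique
);

create table user2role (
    employee_id int not null references employees (id),
    role_id     int not null references roles (id),
    primary key (employee_id, role_id)
);

create table role2report (
    role_id   int not null references roles (id),
    report_id int not null references reports (id),
    primary key (role_id, report_id)
);

-- distinct, because a report can match several of the user's roles
select distinct r.*
from reports r
join role2report rr on rr.report_id = r.id
where rr.role_id = any ($1::int[]);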

The same goes for triggers and such: the key is to compute the role_ids (or perm_ids) first, and then query. What you do NOT, under any circumstance, want is:

select from reports join role2report join crazy_user2role_role2role_rec_view

The biggest optimization from there is caching a user's roles, using an int array, memcached, or whatever is convenient. This avoids constantly hitting a crazy user2role joined with a recursive role2role view definition, and whatever other craziness your spec's edge cases lead you to. Mind cache invalidation.
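
One possible shape for that cache, kept inside Postgres (a sketch; the cached_role_ids column and the trigger are hypothetical, and assume the user2role table above):

alter table employees
    add column cached_role_ids int[] not null default '{}';

create or replace function refresh_cached_roles() returns trigger as $$
declare
    emp int;
begin
    -- old is the only row available on DELETE; an UPDATE that moves a
    -- row to another employee would need both ids refreshed.
    if tg_op = 'DELETE' then
        emp := old.employee_id;
    else
        emp := new.employee_id;
    end if;
    update employees e
       set cached_role_ids = coalesce(
           (select array_agg(role_id)
              from user2role
             where employee_id = emp), '{}')
     where e.id = emp;
    return null;
end;
$$ language plpgsql;

-- Postgres 11+; on older versions write "execute procedure"
create trigger user2role_refresh_cache
after insert or update or delete on user2role
for each row execute function refresh_cached_roles();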

Caching the access lists is much trickier in my experience: should you cache who can read? Who can write? Both? Are some objects public? Can non-logged-in guests access them too? It's a deluge of questions.

If you do cache that, use an int array as well. Toss in, e.g., -1 to stand for public/guest access and 0 to stand for registered/user access, and then use array overlaps in your queries (with registered users getting the 0 and -1 rows automatically). Optimize your arrays accordingly to keep them small: if one contains -1, that should be its only value; else the same for 0; else list the role ids with read access.
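
In query form that could look like this (a sketch; read_acl is a hypothetical int[] column on reports, filled by the convention just described):

-- guests only see public rows
select * from reports where read_acl && array[-1];

-- registered users get 0 and -1 for free, plus their cached role ids ($1)
select *
from reports
where read_acl && ($1::int[] || array[0, -1])
order by id desc
limit 10;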

One caveat of using arrays, by the way: until PostgreSQL 9.2 (which added per-element statistics for arrays), no stats were collected on an array's contents. Without them, arrays are sub-optimal for data sets where some role_id grants access to most rows: the planner should ignore the GIN index for such a common role_id, but with no element stats it assumes the overlap is selective. That's a real performance killer, because it means PG will basically fetch the entire table through the GIN index to return the top-10 rows with appropriate perms, instead of walking an ordering index and filtering as it goes.
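
If you want to check what your version collects, the element-level statistics show up in pg_stats (read_acl being the hypothetical column from above):

select most_common_elems, most_common_elem_freqs
from pg_stats
where tablename = 'reports'
  and attname = 'read_acl';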

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow