Find rows with duplicate values in a column

Question 1

I suggest a window function in a subquery:

SELECT author_id, author_name  -- omit the name here if you just need ids
FROM (
   SELECT author_id, author_name
        , count(*) OVER (PARTITION BY author_name) AS ct
   FROM   author_data
   ) sub
WHERE  ct > 1;

You will recognize the basic aggregate function count(). It can be turned into a window function by appending an OVER clause - just like any other aggregate function.

This way it counts rows per partition. Voilà.

The window function has to go in a subquery because its result cannot be referenced in the WHERE clause of the same SELECT: window functions are evaluated after WHERE. See:

Best way to get result count before LIMIT was applied
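To see the evaluation-order point in action, here is a minimal sketch that runs the subquery version against an in-memory SQLite database from Python. SQLite (3.25 or newer, for window function support) stands in for the real server here, and the sample rows are made up for illustration; only the table and column names come from the answer.

```python
import sqlite3

# Hypothetical sample data; only the table/column names come from the answer.
ROWS = [
    (9, "ernest jordan"),
    (14, "k moribe"),
    (15, "ernest jordan"),
    (20, "a smith"),
    (36, "k moribe"),
]


def make_db():
    """Create an in-memory database holding the sample rows."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE author_data (author_id INTEGER PRIMARY KEY, author_name TEXT)"
    )
    con.executemany("INSERT INTO author_data VALUES (?, ?)", ROWS)
    return con


def duplicates_via_subquery(con):
    """Count rows per name in a subquery, then filter on the count outside."""
    sql = """
        SELECT author_id, author_name
        FROM (
           SELECT author_id, author_name
                , count(*) OVER (PARTITION BY author_name) AS ct
           FROM   author_data
           ) sub
        WHERE  ct > 1
        ORDER  BY author_id
    """
    return con.execute(sql).fetchall()


def window_in_where_fails(con):
    """Return True if a window function directly in WHERE is rejected."""
    sql = """
        SELECT author_id
        FROM   author_data
        WHERE  count(*) OVER (PARTITION BY author_name) > 1
    """
    try:
        con.execute(sql).fetchall()
    except sqlite3.Error:
        return True
    return False


con = make_db()
print(duplicates_via_subquery(con))
print(window_in_where_fails(con))
```

The second function is only there to show why the subquery is needed: the engine refuses the window function in WHERE, so the filter has to move one level out.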

In older versions without window functions (v.8.3 or older) - or generally - this alternative performs pretty fast:

SELECT author_id, author_name  -- omit name, if you just need ids
FROM   author_data a
WHERE  EXISTS (
   SELECT 1 FROM author_data a2  -- an empty SELECT list needs PostgreSQL 9.4+
   WHERE  a2.author_name = a.author_name
   AND    a2.author_id <> a.author_id
   );

If you are concerned with performance, add an index on author_name.
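A rough sketch of that setup, again using an in-memory SQLite database from Python as a stand-in. The index name author_data_author_name_idx and the sample rows are assumptions for illustration, not something from the question.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE author_data (author_id INTEGER PRIMARY KEY, author_name TEXT)"
)
# Hypothetical sample rows.
con.executemany(
    "INSERT INTO author_data VALUES (?, ?)",
    [
        (9, "ernest jordan"),
        (14, "k moribe"),
        (15, "ernest jordan"),
        (20, "a smith"),
        (36, "k moribe"),
    ],
)

# Plain index on the column the EXISTS subquery compares on.
# The index name is an arbitrary choice.
con.execute(
    "CREATE INDEX author_data_author_name_idx ON author_data (author_name)"
)

# Same EXISTS query as in the answer, with an ORDER BY for stable output.
dupes = con.execute(
    """
    SELECT author_id, author_name
    FROM   author_data a
    WHERE  EXISTS (
       SELECT 1 FROM author_data a2
       WHERE  a2.author_name = a.author_name
       AND    a2.author_id <> a.author_id
       )
    ORDER  BY author_id
    """
).fetchall()

# Second column of PRAGMA index_list is the index name.
index_names = [row[1] for row in con.execute("PRAGMA index_list('author_data')")]

print(dupes)
print(index_names)
```

Whether the planner actually uses the index depends on the database and the data distribution, so check the query plan on your own system.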

Question 2

You are halfway there already. Use the duplicated author names you identified to fetch the rest of the data. (Filter on author_name rather than author_id: author_id cannot be selected from a query that groups by author_name.)

Try this:

SELECT author_id, author_name
FROM author_data
WHERE author_name IN (SELECT author_name
        FROM author_data
        GROUP BY author_name
        HAVING count(*) > 1);

Question 3

You could join the table to itself, which is achievable with either of the following queries:

SELECT a1.author_id, a1.author_name
FROM authors a1
CROSS JOIN authors a2  -- CROSS JOIN takes no ON clause in standard SQL, so filter in WHERE
WHERE a1.author_id <> a2.author_id
  AND a1.author_name = a2.author_name;

-- 9 |ernest jordan
-- 15|ernest jordan
-- 14|k moribe
-- 36|k moribe

--OR

SELECT a1.author_id, a1.author_name
FROM authors a1
INNER JOIN authors a2  -- INNER JOIN needs its condition in ON
  ON a1.author_id <> a2.author_id
  AND a1.author_name = a2.author_name;

-- 9 |ernest jordan
-- 15|ernest jordan
-- 14|k moribe
-- 36|k moribe
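One thing to watch with the self-join: when a name occurs more than twice, each row matches several partners and shows up once per partner. A small sketch with made-up rows (in-memory SQLite from Python, used here only as a convenient test bed) shows the effect and how DISTINCT removes the repeats.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE authors (author_id INTEGER PRIMARY KEY, author_name TEXT)"
)
# Hypothetical data: one name appears three times.
con.executemany(
    "INSERT INTO authors VALUES (?, ?)",
    [(1, "k moribe"), (2, "k moribe"), (3, "k moribe"), (4, "a smith")],
)

# Self-join from the answer; {distinct} is filled in as "" or "DISTINCT".
JOIN_SQL = """
    SELECT {distinct} a1.author_id, a1.author_name
    FROM   authors a1
    INNER  JOIN authors a2
      ON   a1.author_id <> a2.author_id
     AND   a1.author_name = a2.author_name
    ORDER  BY a1.author_id
"""

# Each of the three rows pairs with the two others.
plain = con.execute(JOIN_SQL.format(distinct="")).fetchall()
# DISTINCT collapses the repeated (id, name) pairs.
deduped = con.execute(JOIN_SQL.format(distinct="DISTINCT")).fetchall()

print(len(plain), plain)
print(len(deduped), deduped)
```

If names never repeat more than twice, as in the output shown, the plain join is enough.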

Question 4

The full query below returns the rows you asked for in the question. If you only need the duplicated names, the inner query on its own is enough. You could also use window functions such as ROW_NUMBER() or DENSE_RANK() to get the same answer.

SELECT a.author_id, a.author_name
FROM authors a
JOIN (
    SELECT author_name
    FROM authors
    GROUP BY author_name
    HAVING count(author_name) > 1
   ) AS temp
  ON a.author_name = temp.author_name;
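To make the difference between the inner query and the full query concrete, here is a small sketch on made-up rows, run through an in-memory SQLite database from Python. The derived-table alias dup is my own choice, nothing special about it.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE authors (author_id INTEGER PRIMARY KEY, author_name TEXT)"
)
# Hypothetical sample rows.
con.executemany(
    "INSERT INTO authors VALUES (?, ?)",
    [
        (9, "ernest jordan"),
        (14, "k moribe"),
        (15, "ernest jordan"),
        (20, "a smith"),
        (36, "k moribe"),
    ],
)

# Inner query: one row per duplicated name.
INNER_SQL = """
    SELECT author_name
    FROM   authors
    GROUP  BY author_name
    HAVING count(author_name) > 1
"""

# Full query: join back to the table to get every row carrying such a name.
FULL_SQL = f"""
    SELECT a.author_id, a.author_name
    FROM   authors a
    JOIN   ({INNER_SQL}) AS dup
      ON   a.author_name = dup.author_name
    ORDER  BY a.author_id
"""

names = con.execute(INNER_SQL + " ORDER BY author_name").fetchall()
full = con.execute(FULL_SQL).fetchall()

print(names)
print(full)
```

The inner query gives you just the names; the join is what brings the ids back.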