get count of words in column sql

https://stackoverflow.com/questions/22665362

21-06-2023
Question

After running the following queries:

SELECT * FROM table;

SELECT REGEXP_REPLACE(description || '!', '[^[:punct:]]') 
    FROM table;

SELECT REGEXP_REPLACE ( description, '[' ||  REGEXP_REPLACE ( description || '!', '[^[:punct:]]')  || ']') test 
    FROM table;

SELECT REGEXP_REPLACE(UPPER(TEST), ' ', '#') test 
    FROM (SELECT REGEXP_REPLACE (description, '[' ||  REGEXP_REPLACE (description || '!', '[^[:punct:]]')  || ']') test 
    FROM table);

I have a column in an Oracle SQL result that looks like this:

TEST
 ---------------------------------------------
 SPOKE#WITH#MR#SMITHS#ASSISTANT
 EMAILED#FOR#VISIT
 SCHEDULING#OFFICE#LM#FOR#VISIT
 LM#FOR#VISIT
 LM#FOR#VISIT
 PHONE#CALL
 ---------------------------------------------

All of the words are separated by # characters. I would like to count the occurrences of each word, for example:

word | count
------------
LM   |  3
FOR  |  4
VISIT|  4
PHONE|  1

And so on. I'm new to Oracle SQL and am only familiar with rudimentary MySQL commands. Any help or pointers to tutorials would also be appreciated. Thank you.

Edit: there are approximately 1500 rows with about 250 unique responses that I'm trying to account for.

Solution

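-- Sample data. The id column gives every row a unique key, so that duplicate
-- strings (such as the two 'LM#FOR#VISIT' rows) are split independently.
-- Matching on PRIOR str = str alone would cross-join the duplicates and
-- inflate the counts.
WITH mydata AS
  ( SELECT 1 AS id, 'SPOKE#WITH#MR#SMITHS#ASSISTANT' AS str FROM dual
    UNION ALL
    SELECT 2, 'EMAILED#FOR#VISIT' FROM dual
    UNION ALL
    SELECT 3, 'SCHEDULING#OFFICE#LM#FOR#VISIT' FROM dual
    UNION ALL
    SELECT 4, 'LM#FOR#VISIT' FROM dual
    UNION ALL
    SELECT 5, 'LM#FOR#VISIT' FROM dual
    UNION ALL
    SELECT 6, 'PHONE#CALL' FROM dual
  ),
  splitted_words AS
  (
    -- One output row per word: level walks the word positions 1..n,
    -- where n is the number of # separators plus one.
    SELECT REGEXP_SUBSTR(str,'[^#]+', 1, level) AS word
    FROM mydata
      CONNECT BY level   <= LENGTH(regexp_replace(str,'[^#]')) + 1
    AND PRIOR id          = id          -- stay on the same source row
    AND PRIOR sys_guid() IS NOT NULL    -- avoids the CONNECT BY loop error
  )
SELECT word,
      COUNT(*) AS word_count
FROM splitted_words
GROUP BY word;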
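
To see what the split step does in isolation, here is a minimal sketch that runs it on a single literal string. It assumes Oracle 11g or later, because it uses REGEXP_COUNT to count the separators; the position alias is only there for illustration.

```sql
-- Split one string into one row per word.
-- dual supplies a single row; CONNECT BY level <= n repeats it n times,
-- and level is used as the occurrence number passed to REGEXP_SUBSTR.
SELECT level AS position,
       REGEXP_SUBSTR('LM#FOR#VISIT', '[^#]+', 1, level) AS word
FROM dual
CONNECT BY level <= REGEXP_COUNT('LM#FOR#VISIT', '#') + 1;
```

This should return three rows: (1, LM), (2, FOR) and (3, VISIT). The full query applies the same idea to every row of the table and then groups the resulting words.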

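If your table is YOUR_TABLE and the column is YOUR_COLUMN:

  WITH splitted_words AS
  (
    SELECT REGEXP_SUBSTR(YOUR_COLUMN,'[^#]+', 1, level) AS word
    FROM YOUR_TABLE
      CONNECT BY level   <= LENGTH(regexp_replace(YOUR_COLUMN,'[^#]')) + 1
    -- Match on rowid rather than on the column value, so that rows holding
    -- the same text are not joined to each other. This assumes YOUR_TABLE is
    -- an ordinary table; use its primary key instead if rowid is unavailable.
    AND PRIOR rowid       = rowid
    AND PRIOR sys_guid() IS NOT NULL    -- avoids the CONNECT BY loop error
  )
SELECT word,
      COUNT(*) AS word_count
FROM splitted_words
GROUP BY word;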
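
As an alternative, the split can be written per row with CROSS APPLY, which avoids the PRIOR conditions altogether. This is only a sketch and rests on some assumptions: Oracle 12c or later (for CROSS APPLY), the same placeholder names YOUR_TABLE and YOUR_COLUMN, and hypothetical aliases t, w and word_count.

```sql
-- For each row of YOUR_TABLE, generate one row per word, then count.
SELECT w.word,
       COUNT(*) AS word_count
FROM YOUR_TABLE t
CROSS APPLY (
    -- Row generator correlated to the current row of t:
    -- one row per run of non-# characters in the column.
    SELECT REGEXP_SUBSTR(t.YOUR_COLUMN, '[^#]+', 1, level) AS word
    FROM dual
    CONNECT BY level <= REGEXP_COUNT(t.YOUR_COLUMN, '[^#]+')
) w
WHERE w.word IS NOT NULL          -- skip rows where the column is empty
GROUP BY w.word
ORDER BY word_count DESC, w.word;
```

Because each source row is expanded on its own, repeated responses are counted once per row they appear in, which matters when about 1500 rows hold only about 250 distinct values.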

Licensed under: CC-BY-SA with attribution