Display a list of repeated words in Haskell

https://stackoverflow.com/questions/402391

03-07-2019

Question

I need to write a function that finds the repeated words in a string and returns them as a list of strings in order of occurrence, ignoring non-letter characters.

e.g. at the Hugs prompt:

repetitions :: String -> [String]

repetitions > "My bag is is action packed packed."
output> ["is","packed"]
repetitions > "My name  name name is Sean ."
output> ["name","name"]
repetitions > "Ade is into into technical drawing drawing ."
output> ["into","drawing"]

Solution

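To split the string into words, use the words function (in the Prelude). To strip non-word characters, filter with Data.Char.isAlphaNum. Zip the list with its tail to get adjacent pairs (x, y). Then fold over the list, building a new list that contains every x where x == y.

Something like: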

repetitions s = map fst . filter (uncurry (==)) . zip l $ tail l
  where l = map (filter isAlphaNum) (words s)

I'm not sure whether that works, but it should give you the rough idea.
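For anyone who wants to try the idea above, here is a minimal self-contained sketch. The import line, the helper name cleanWords, the use of drop 1 in place of tail, and the extra step that discards tokens left empty after filtering (such as a lone ".") are assumptions added for illustration, not part of the original answer.

```haskell
import Data.Char (isAlphaNum)

-- Split into words, strip non-alphanumeric characters from each word,
-- then discard tokens that became empty (assumption: a lone "." should
-- not count as a word).
cleanWords :: String -> [String]
cleanWords = filter (not . null) . map (filter isAlphaNum) . words

-- Pair every word with its successor and keep the first component of
-- each pair whose two words are equal.
repetitions :: String -> [String]
repetitions s = map fst . filter (uncurry (==)) $ zip ws (drop 1 ws)
  where ws = cleanWords s
```

With these definitions, repetitions "My bag is is action packed packed." gives ["is","packed"], and "My name  name name is Sean ." gives ["name","name"], matching the examples in the question.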

Other answers

I'm new to the language, so my solution may look rather ugly to a Haskell veteran, but anyway:

let repetitions x = concat (map tail (filter (\x -> (length x) > 1) (List.group (words (filter (\c -> (c >= 'a' && c <= 'z') || (c>='A' && c <= 'Z') ||  c==' ') x)))))

This part removes everything that is not a letter or a space from the string s:

filter (\c -> (c >= 'a' && c <= 'z') || (c>='A' && c <= 'Z') ||  c==' ') s

This part splits the string s into words and groups equal adjacent words into a list of lists:

List.group (words s)

This part removes all lists with fewer than two elements:

filter (\x -> (length x) > 1) s

After that, we drop one element from each list and concatenate all the lists into one:

concat (map tail s)

This might be inelegant, but it is conceptually very simple. I'm assuming it should look for consecutive duplicate words, as in the examples.
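The stages described in the group-based answer can also be written as separate named functions, which makes each intermediate result easy to inspect. This is only a sketch: the names lettersAndSpaces and grouped are hypothetical, Data.List is assumed as the modern home of group (the answer uses the older List module), and drop 1 stands in for tail.

```haskell
import Data.List (group)

-- Keep only ASCII letters and spaces, as in the answer's filter.
lettersAndSpaces :: String -> String
lettersAndSpaces =
  filter (\c -> (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || c == ' ')

-- Split into words and group equal adjacent words.
grouped :: String -> [[String]]
grouped = group . words . lettersAndSpaces

-- Keep groups with at least two elements, drop one element from each,
-- and concatenate what remains.
repetitions :: String -> [String]
repetitions = concat . map (drop 1) . filter (\g -> length g > 1) . grouped
```

For example, grouped "My name  name name is Sean ." yields [["My"],["name","name","name"],["is"],["Sean"]], and repetitions on the same input yields ["name","name"].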

-- a wrapper that allows you to give the input as a String
repititions :: String -> [String]
repititions s = repititionsLogic (words s)
-- does the real work
repititionsLogic :: [String] -> [String]
repititionsLogic [] = []
repititionsLogic [a] = []
repititionsLogic (a:as) 
    | ((==) a (head as)) = a : repititionsLogic as
    | otherwise = repititionsLogic as
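The same recursion can be expressed by pattern matching on the first two elements, which avoids the call to head. A sketch under that assumption; the name adjacentRepeats is hypothetical, and like the code above it does no punctuation stripping.

```haskell
-- Walk the list, emitting a word whenever it equals its successor.
adjacentRepeats :: [String] -> [String]
adjacentRepeats (a : rest@(b : _))
  | a == b    = a : adjacentRepeats rest
  | otherwise = adjacentRepeats rest
adjacentRepeats _ = []  -- empty or single-element list: nothing to compare

-- Wrapper that accepts the input as a String.
repetitions :: String -> [String]
repetitions = adjacentRepeats . words
```

Because punctuation is not removed, "packed packed." is not treated as a repetition: repetitions "My bag is is action packed packed." gives ["is"], while "My bag is is action packed packed ." gives ["is","packed"].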

Building on Alexander Prokofyev's answer:

repetitions x = concat (map tail (filter (\x -> (length x) > 1) (List.group (words (filter (\c -> (c >= 'a' && c <= 'z') || (c>='A' && c <= 'Z') || c==' ') x)))))

Remove unnecessary parentheses:

repetitions x = concat (map tail (filter (\x -> length x > 1) (List.group (words (filter (\c -> c >= 'a' && c <= 'z' || c>='A' && c <= 'Z' || c==' ') x)))))

Use $ to remove more parentheses (each $ can replace an opening parenthesis if the closing parenthesis is at the end of the expression):

repetitions x = concat $ map tail $ filter (\x -> length x > 1) $ List.group $ words $ filter (\c -> c >= 'a' && c <= 'z' || c>='A' && c <= 'Z' || c==' ') x

Replace the character ranges with functions from Data.Char, and merge concat and map:

repetitions x = concatMap tail $ filter (\x -> length x > 1) $ List.group $ words $ filter (\c -> isAlpha c || isSeparator c) x

Use a section and currying in point-free style to simplify (\x -> length x > 1) to ((>1) . length). This combines length with (>1) (a partially applied operator, or section) in a right-to-left pipeline.

repetitions x = concatMap tail $ filter ((>1) . length) $ List.group $ words $ filter (\c -> isAlpha c || isSeparator c) x
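To see that the lambda and the section-based form are the same predicate, the two can be written side by side. The names below are hypothetical and exist only for this comparison.

```haskell
-- Explicit lambda: take the length, then compare with 1.
longerThanOneLambda :: [a] -> Bool
longerThanOneLambda = \x -> length x > 1

-- Point-free: (> 1) is a section, composed after length.
longerThanOneSection :: [a] -> Bool
longerThanOneSection = (> 1) . length
```

Both return False for "" and "a", and True for "ab".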

Remove the explicit "x" variable to make the overall expression point-free:

repetitions = concatMap tail . filter ((>1) . length) . List.group . words . filter (\c -> isAlpha c || isSeparator c)

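Read from right to left, the whole function is a pipeline that keeps only alphabetic and separator characters, splits the result into words, groups equal adjacent words, keeps the groups with more than one element, and then drops the first element of each remaining group and concatenates what is left.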
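For completeness, here is a sketch of the final point-free pipeline as a standalone module. The imports are assumptions (Data.List in place of the older List module), and drop 1 stands in for tail; otherwise it follows the steps described above.

```haskell
import Data.Char (isAlpha, isSeparator)
import Data.List (group)

repetitions :: String -> [String]
repetitions =
  concatMap (drop 1)                              -- drop one element per group, then concatenate
    . filter ((> 1) . length)                     -- keep groups with more than one element
    . group                                       -- group equal adjacent words
    . words                                       -- split into words
    . filter (\c -> isAlpha c || isSeparator c)   -- keep letters and separators only
```

With this definition, repetitions "Ade is into into technical drawing drawing ." gives ["into","drawing"]. Note that isSeparator accepts spaces but not tabs or newlines, so this sketch assumes the words are separated by spaces.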

License: CC-BY-SA with attribution
