Removing a variety of lines from a text file

https://stackoverflow.com/questions/1617568

06-07-2019

Question

I have been trying to implement a bash script that reads from WordNet's online database, and I was wondering whether it is possible to remove a variety of lines from a text file with a single command.

Example file dump:

**** Noun ****
(n)hello, hullo, hi, howdy, how-do-you-do (an expression of greeting) "every morning they exchanged polite hellos"
**** Verb ****
(v)run (move fast by using one's feet, with one foot off the ground at any given time) "Don't run--you'll be out of breath"; "The children ran to the store"
**** Adjective ****
(adj)running ((of fluids) moving or issuing in a stream) "as mountain stream with freely running water"; "hovels without running water"

I just need to remove the lines that name the part of speech, for example:

**** Noun ****
**** Verb ****
**** Adjective ****

That way I end up with a clean file containing only the word definitions:

(n)hello, hullo, hi, howdy, how-do-you-do (an expression of greeting) "every morning they exchanged polite hellos"
(v)run (move fast by using one's feet, with one foot off the ground at any given time) "Don't run--you'll be out of breath"; "The children ran to the store"
(adj)running ((of fluids) moving or issuing in a stream) "as mountain stream with freely running water"; "hovels without running water"

The * symbols around the grammatical terms are tripping me up.

Solution

If you want to select whole lines from a file based solely on the content of those lines, grep is probably the most suitable tool available. However, some characters, such as your asterisks, have a special meaning for grep, so they must be "escaped" with a backslash. This prints only the lines that start with four asterisks and a space:

grep "^\*\*\*\* " textfile

However, you want to keep the lines that do not match, so you need grep's -v option, which does exactly that: it prints the lines that do not match the pattern.

grep -v "^\*\*\*\* " textfile

That should give you what you want.
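As a minimal sketch of the approach above, the following script builds a small sample file and filters it. The temporary file and the variable names sample and cleaned are assumptions made for illustration, and the definitions are shortened versions of the ones in the question.

```shell
#!/bin/sh
# Hypothetical sample input, shortened from the dump in the question.
sample=$(mktemp)
cat > "$sample" <<'EOF'
**** Noun ****
(n)hello, hullo, hi, howdy, how-do-you-do (an expression of greeting)
**** Verb ****
(v)run (move fast by using one's feet)
EOF

# -v inverts the match: keep every line that does NOT start with
# four literal asterisks followed by a space.
# Single quotes pass the backslashes to grep untouched.
cleaned=$(grep -v '^\*\*\*\* ' "$sample")

printf '%s\n' "$cleaned"
rm -f "$sample"
```

If escaping feels error-prone, grep -vF '**** ' treats the pattern as a fixed string, so no backslashes are needed. The trade-off is that a fixed string cannot be anchored to the start of the line, so it would also drop any definition that happens to contain four asterisks and a space.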

Other tips

sed '/^\*\{4\} .* \*\{4\}$/d'

or, a bit looser:

sed '/^*\{4\}/d'

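Another option blanks out every line that starts with an asterisk and then lets grep drop the empty lines (this also removes lines that were already blank):

sed 's/^\*.*//' test | grep .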

$ awk '!/^\*\*+/' file
(n)hello, hullo, hi, howdy, how-do-you-do (an expression of greeting) "every morning they exchanged polite hellos"
(v)run (move fast by using one's feet, with one foot off the ground at any given time) "Don't run--you'll be out of breath"; "The children ran to the store"
(adj)running ((of fluids) moving or issuing in a stream) "as mountain stream with freely running water"; "hovels without running water"
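To check that the sed and awk variants above agree, they can be run side by side on the same input. This is only a sketch: the temporary file and the variable names input, via_sed and via_awk are assumptions for illustration, and the sample lines are shortened from the question.

```shell
#!/bin/sh
# Hypothetical sample input with two header lines and two definitions.
input=$(mktemp)
printf '%s\n' \
    '**** Noun ****' \
    '(n)hello (an expression of greeting)' \
    '**** Adjective ****' \
    '(adj)running ((of fluids) moving or issuing in a stream)' > "$input"

# sed: delete lines that begin and end with four asterisks.
via_sed=$(sed '/^\*\{4\} .* \*\{4\}$/d' "$input")

# awk: print only lines that do not start with two or more asterisks.
via_awk=$(awk '!/^\*\*+/' "$input")

if [ "$via_sed" = "$via_awk" ]; then
    echo "sed and awk produce the same output"
fi
printf '%s\n' "$via_sed"
rm -f "$input"
```

Both commands write to standard output and leave the original file untouched, so redirect the result to a new file if you want to keep it.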

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow