Remove a variety of lines from a text file

https://stackoverflow.com/questions/1617568

06-07-2019

Question

I have been trying to implement a bash script that reads from the online WordNet database, and I was wondering whether there is a way to remove a variety of lines from a text file with a single command.

Example file dump:

**** Noun ****
(n)hello, hullo, hi, howdy, how-do-you-do (an expression of greeting) "every morning they exchanged polite hellos"
**** Verb ****
(v)run (move fast by using one's feet, with one foot off the ground at any given time) "Don't run--you'll be out of breath"; "The children ran to the store"
**** Adjective ****
(adj)running ((of fluids) moving or issuing in a stream) "as mountain stream with freely running water"; "hovels without running water"

I just need to remove the lines that describe the part of speech, e.g.

**** Noun ****
**** Verb ****
**** Adjective ****

So that I end up with a clean file containing only the word definitions:

(n)hello, hullo, hi, howdy, how-do-you-do (an expression of greeting) "every morning they exchanged polite hellos"
(v)run (move fast by using one's feet, with one foot off the ground at any given time) "Don't run--you'll be out of breath"; "The children ran to the store"
(adj)running ((of fluids) moving or issuing in a stream) "as mountain stream with freely running water"; "hovels without running water"

The * symbols around the grammar terms are tripping me up in sed.

Solution

If you want to select whole lines from a file based only on the contents of those lines, grep is probably the most suitable tool available. However, some characters, such as your stars, have a special meaning to grep, so they need to be "escaped" with a backslash. This will print only the lines that begin with four stars and a space:

grep "^\*\*\*\* " textfile

However, you want to keep the lines that do not match, so you need grep's -v option, which does just that: it prints the lines that do not match the pattern.

grep -v "^\*\*\*\* " textfile

This should give you what you want.

Other suggestions
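As a quick check, the sketch below recreates the sample dump from the question in a scratch directory and applies the anchored grep -v filter to it. The file names dump.txt and clean.txt and the use of mktemp are illustrative assumptions, not part of the original question.

```shell
#!/bin/sh
# Scratch directory and file names are hypothetical, chosen for this example.
workdir=$(mktemp -d)

# Recreate the sample dump from the question (quoted heredoc: no expansion).
cat > "$workdir/dump.txt" <<'EOF'
**** Noun ****
(n)hello, hullo, hi, howdy, how-do-you-do (an expression of greeting) "every morning they exchanged polite hellos"
**** Verb ****
(v)run (move fast by using one's feet, with one foot off the ground at any given time) "Don't run--you'll be out of breath"; "The children ran to the store"
**** Adjective ****
(adj)running ((of fluids) moving or issuing in a stream) "as mountain stream with freely running water"; "hovels without running water"
EOF

# -v inverts the match: keep only lines that do NOT start with
# four literal stars followed by a space.
grep -v '^\*\*\*\* ' "$workdir/dump.txt" > "$workdir/clean.txt"

# Show the cleaned file: only the three definition lines remain.
cat "$workdir/clean.txt"
```

Single quotes around the pattern keep the shell from touching the backslashes, so grep receives them unchanged. Writing to a second file rather than redirecting back onto the input avoids truncating the input before grep has read it.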

sed '/^\*\{4\} .* \*\{4\}$/d'

or, a bit looser:

sed '/^\*\{4\}/d'

sed 's/^*.*//g' test | grep .

# awk '!/^\*\*+/' file
(n)hello, hullo, hi, howdy, how-do-you-do (an expression of greeting) "every morning they exchanged polite hellos"
(v)run (move fast by using one's feet, with one foot off the ground at any given time) "Don't run--you'll be out of breath"; "The children ran to the store"
(adj)running ((of fluids) moving or issuing in a stream) "as mountain stream with freely running water"; "hovels without running water"
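To compare the approaches side by side, here is a minimal sketch that runs a grep, a sed, and an awk filter over the same small sample and checks that they agree. The shortened sample lines and the variable names are assumptions made for this example, and each pattern is tightened to "four stars and a space" so that all three match the same lines.

```shell
#!/bin/sh
# Hypothetical, shortened sample: two header lines and two definition lines.
sample=$(mktemp)
printf '%s\n' \
  '**** Noun ****' \
  '(n)hello (an expression of greeting)' \
  '**** Verb ****' \
  '(v)run (move fast)' > "$sample"

# grep: print the lines that do not match the pattern.
grep_out=$(grep -v '^\*\*\*\* ' "$sample")

# sed: delete matching lines; \{4\} is a basic-regex repeat count.
sed_out=$(sed '/^\*\{4\} /d' "$sample")

# awk: print the lines for which the negated match is true.
awk_out=$(awk '!/^\*\*\*\* /' "$sample")

# Report whether the three results are identical, then show one of them.
if [ "$grep_out" = "$sed_out" ] && [ "$sed_out" = "$awk_out" ]; then
  echo "all three agree"
fi
printf '%s\n' "$grep_out"
```

On this sample the script should print "all three agree" followed by the two definition lines. Which tool to pick is mostly a matter of taste; grep -v states the intent most directly when the only goal is to drop matching lines.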

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow