Question

I have a file containing lists and sublists, and I want to extract the longest sublist using command-line tools.

File example:

* Item1
** SubItem1
** ...
** SubItemN

* Item2
** SubItem1
** ...
** SubItemN

* ...
** ...

* ItemN
** SubItem1
** ...
** SubItemN

I would like to know whether this can be done easily with standard tools; otherwise I will write a Perl script.


Solution

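A Perl one-liner that reads the file in paragraph mode (-00) and keeps the block with the most sublist lines:

perl -00 -ne '$n = () = /^\*\*/mg; if ($n > $m) { $m = $n; $max = $_ } END { print $max }' file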

Just using bash:

max=0
# Each line is split into its bullet ("*", "**" or empty) and the remaining text
while read -r bullet thingy; do
    case $bullet in
         "*") item=$thingy; count=0 ;;
        "**") ((count++)) ;;
          "") (( count > max )) && { max_item=$item; max=$count; } ;;
    esac
done < <(cat file; echo)
echo "$max_item $max"

The <(cat file; echo) part ensures that there is a blank line after the last line of the file, so that the last sublist group is also compared against the max.
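
An alternative to appending a blank line is to run the comparison once more after the loop ends. A minimal sketch of that idea, written in portable sh syntax; the function names flush_group and longest_sublist are made up for illustration, and the input file is passed as the first argument:

```shell
# Hypothetical helper: compare the group just finished against the best so far.
flush_group() {
    if [ "$count" -gt "$max" ]; then
        max=$count
        max_item=$item
    fi
}

# Hypothetical wrapper: prints "<item> <count>" for the longest sublist in file $1.
# The body runs in a subshell, so its variables do not leak into the caller.
longest_sublist() (
    item="" count=0 max=0 max_item=""
    while read -r bullet thingy; do
        case $bullet in
             "*") flush_group; item=$thingy; count=0 ;;
            "**") count=$((count + 1)) ;;
        esac
    done < "$1"
    flush_group   # the last group has no following item line to trigger the check
    echo "$max_item $max"
)
```

Here a group is closed when the next item line starts rather than when a blank line is seen, so the blank separators are no longer needed for counting.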

That only keeps the count. To save the items in the biggest sublist:

max=0
while read -r bullet thingy; do
    case $bullet in
         "*") item=$thingy; sublist=() ;;
        "**") sublist+=("$thingy") ;;   # quoted so multi-word subitems stay intact
          "") if (( ${#sublist[@]} > max )); then
                  max=${#sublist[@]}
                  max_item=$item
                  max_sublist=("${sublist[@]}")
              fi
              ;;
    esac
done < <(cat file; echo)
# Print the item name, the sublist length, then one subitem per line
printf "%s\n" "$max_item" "${#max_sublist[@]}" "${max_sublist[@]}"

Using sudo_O's example file, this outputs:

letters
6
a
b
b
d
e
f

Other answers

$ cat file    
* letters
** a
** b
** b
** d
** e
** f

* colors 
** red
** green
** blue

* numbers
** 1
** 2
** 3
** 4
** 5

Show the length of each sublist by reversing the file with tac and counting with awk:

$ tac file | awk '/^\*\*/{c++}/^\*[^*]/{print c,$2;c=0}'
5 numbers
3 colors
6 letters

Print only the length of the largest sublist:

$ tac file | awk '/^\*\*/{c++}/^\*[^*]/{if(c>m){m=c;l=$2}c=0}END{print m,l}'
6 letters
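
tac is part of GNU coreutils and may not be available on every system, so the same count can also be taken in a single forward pass. A sketch, assuming each item name is a single word as in the sample file; the function name longest_count is hypothetical:

```shell
# Hypothetical wrapper: prints "<count> <item>" for the longest sublist in file $1.
longest_count() {
    awk '
        # Item line: settle the previous item, then start counting for this one.
        /^\*[^*]/ { if (c > m) { m = c; l = name }; name = $2; c = 0 }
        # Subitem line: count it.
        /^\*\*/   { c++ }
        # End of input: settle the last item, then report.
        END       { if (c > m) { m = c; l = name }; print m, l }
    ' "$1"
}
```

Called as longest_count file on the sample above, this prints 6 letters. On ties the first item in the file wins, whereas the tac version keeps the last one.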
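
Another approach, using grep, cut, paste and sort with two temporary files:

# Line numbers of the item lines
grep -nE '^\*[^*]' file.txt | cut -d: -f1 > tmp
# Append a sentinel (line count + 2) so the last sublist gets a real length instead of 0;
# this assumes the file does not end with a blank line
{ cat tmp; echo $(( $(wc -l < file.txt) + 2 )); } | awk 'NR==1{s=$1;next} {print $1-s;s=$1}' > tmp2
# Pair each start line with its block length and keep the largest
res=$(paste tmp tmp2 | sort -nrk 2,2 | head -n 1)
line=$(echo "$res" | cut -f 1)
ln=$(echo "$res" | cut -f 2)
tail -n +"$line" file.txt | head -n "$ln"
rm tmp tmp2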

There is definitely a shorter solution :)
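
One shorter possibility is awk in paragraph mode, which treats each blank-line-separated block as a single record. A sketch, assuming blocks are separated by blank lines and each subitem sits on its own line; the function name longest_block is hypothetical:

```shell
# Hypothetical wrapper: prints the block (item line plus subitems) with the most lines in file $1.
longest_block() {
    # An empty RS makes awk read blank-line-separated paragraphs;
    # with FS set to a newline, NF is the number of lines in the block.
    awk -v RS= -F '\n' 'NF > max { max = NF; best = $0 } END { print best }' "$1"
}
```

Since every block has exactly one item line, the block with the most lines is also the one with the longest sublist. No temporary files are needed, and on ties the first block in the file is kept.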

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow