Question

I have data in which the first column is a single capital letter and the remaining columns carry a number and a colon denoting the attribute number. How can I convert the letters to numbers (A to 1, B to 2, ..., Z to 26) and also remove the colon and the number before it in each column? For example:

A,1:2.33,2:3.18
M,1:8.72,2:7.25
Y,1:9.55,2:3.43
C,1:5.78,2:4.32

I want to convert this to

1,2.33,3.18
13,8.72,7.25
25,9.55,3.43
3,5.78,4.32

There is no colon in the first column, and there are no letters after the first column.


Solution

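You can use awk to build a letter-to-number lookup. Splitting a string with an empty separator puts each character into its own array element (this works in gawk and several other implementations, although POSIX leaves the result unspecified):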

split("ABCDEFGHIJKLMNOPQRSTUVWXYZ",a,"")

Putting it together, you get something like this:

awk -F":|," 'BEGIN {split("ABCDEFGHIJKLMNOPQRSTUVWXYZ",a,"");for (i=1;i<=26;i++) x[a[i]]=i} {print x[$1],$3,$5}' OFS=, file
1,2.33,3.18
13,8.72,7.25
25,9.55,3.43
3,5.78,4.32

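The BEGIN block makes an array with a[1]=A, a[2]=B, etc., then inverts it so that x[A]=1, x[B]=2, etc.
The main rule splits each line on : and , so the values land in the odd-numbered fields, converts the first field from letter to number, and prints it with fields 3 and 5.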
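
As a variation on the lookup described above, here is a minimal sketch that skips the array and uses awk's index() function, which returns the 1-based position of a substring. This is a swapped-in technique, not the one from the answer. The function name letter_to_num is made up for illustration, and the sketch assumes the first field is always exactly one capital letter.

```shell
# letter_to_num: hypothetical helper; reads lines such as "A,1:2.33,2:3.18" on stdin.
letter_to_num() {
  awk -F'[:,]' -v OFS=, '{
    # index() gives the 1-based position of $1 in the alphabet string (A -> 1, ..., Z -> 26).
    # Assumption: $1 is exactly one capital letter.
    print index("ABCDEFGHIJKLMNOPQRSTUVWXYZ", $1), $3, $5
  }'
}

printf 'A,1:2.33,2:3.18\nM,1:8.72,2:7.25\n' | letter_to_num
```

This avoids relying on split() with an empty separator, at the cost of not rejecting multi-character first fields.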

See also: How to print ASCII value of a character using basic awk only


To handle any number of columns:

awk -F":|," 'BEGIN {split("ABCDEFGHIJKLMNOPQRSTUVWXYZ",a,"");for (i=1;i<=26;i++) x[a[i]]=i} {printf "%s,",x[$1];for (i=3;i<NF;i+=2) printf "%s,",$i;print $NF}' OFS=, file
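
Another way to cope with a variable number of columns, sketched here under the same assumptions about the input, is to split on commas only and strip the leading "number:" from each field with sub(). The function name strip_prefixes is hypothetical, and the lookup table is filled with substr() rather than split() with an empty separator.

```shell
# strip_prefixes: hypothetical helper; reads comma-separated lines on stdin.
strip_prefixes() {
  awk -F, -v OFS=, '
    BEGIN {
      # Build x["A"]=1 ... x["Z"]=26 one character at a time.
      s = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
      for (i = 1; i <= 26; i++) x[substr(s, i, 1)] = i
    }
    {
      $1 = x[$1]                                          # letter -> number
      for (i = 2; i <= NF; i++) sub(/^[0-9]+:/, "", $i)   # drop the "N:" prefix
      print                                               # $0 is rebuilt with OFS
    }'
}

printf 'Y,1:9.55,2:3.43,3:1.50\n' | strip_prefixes
```

Because each field is edited in place, the loop does not need to know in advance how many attribute columns a line has.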

Other suggestions

Jotne's ABCD...Z solution works, but here is another, hopefully more general, solution:

awk -F '[,:]' -v OFS="," 'BEGIN{for(n=65;n<=90;n++)ord[sprintf("%c",n)]=++x}
{print ord[$1],$3,$5}' file

The trick is in the BEGIN block, which builds the table from character codes. Here the range is limited so that only [A-Z] is accepted, which may be enough for your question. If you need other characters (for example all of ASCII), or a different numbering such as A->5, B->6, ... or the raw codes a->97, b->98, ..., A->65, B->66, ..., just change the range of n and the starting value of x in the one-liner above.
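
To illustrate changing the range and the starting number, here is a sketch that takes them as arguments. The function name map_first and the variable names lo, hi, and base are made up for this example. lo and hi are the first and last character codes, and base is the number assigned to the first character.

```shell
# map_first LO HI BASE: hypothetical helper; maps the character with code LO to BASE,
# LO+1 to BASE+1, and so on up to HI. Reads lines such as "A,1:2.33,2:3.18" on stdin.
map_first() {
  awk -F'[,:]' -v OFS=, -v lo="$1" -v hi="$2" -v base="$3" '
    BEGIN {
      # "+ 0" forces numeric values so that %c yields the character with that code.
      for (n = lo + 0; n <= hi + 0; n++) ord[sprintf("%c", n)] = base + n - lo
    }
    { print ord[$1], $3, $5 }'
}

printf 'A,1:2.33,2:3.18\n' | map_first 65 90 5     # A -> 5, B -> 6, ...
printf 'a,1:2.33,2:3.18\n' | map_first 97 122 97   # a -> 97, b -> 98, ...
```

A first field outside the chosen range has no table entry, so it prints as an empty string.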

Licensed under: CC-BY-SA with attribution