Question

I have data in which the first column is alphabetic and the other columns use a colon to mark the attribute number. How can I convert the letters (single capital letters) to numbers, A to 1, B to 2 ... Z to 26, and also remove the colon and the number before it in each column? For example,

A,1:2.33,2:3.18
M,1:8.72,2:7.25
Y,1:9.55,2:3.43
C,1:5.78,2:4.32

I want to convert this to

1,2.33,3.18
13,8.72,7.25
25,9.55,3.43
3,5.78,4.32

There is no colon in the first column, and there are no letters after the first column.


Solution

You can use awk to convert the letters to numbers. Start by splitting the alphabet into an array of single characters:

split("ABCDEFGHIJKLMNOPQRSTUVWXYZ",a,"")

Putting it together gives something like this:

awk -F":|," 'BEGIN {split("ABCDEFGHIJKLMNOPQRSTUVWXYZ",a,"");for (i=1;i<=26;i++) x[a[i]]=i} {print x[$1],$3,$5}' OFS=, file
1,2.33,3.18
13,8.72,7.25
25,9.55,3.43
3,5.78,4.32

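The BEGIN block builds an array with a[1]=A, a[2]=B, etc., then inverts it so that x[A]=1, x[B]=2, etc.
Each line is then split on : and , so the letter is $1 and the values are $3 and $5. The first column is converted from letter to number through x, and the values are printed after it.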
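
The lookup table above relies on split() with an empty separator, which, as far as I know, POSIX leaves unspecified, although gawk and several other awk implementations do split the string into single characters. A minimal sketch of the same table built with substr() instead, in case that matters for your awk (the function name letters_to_numbers is only an illustrative wrapper, and the input is read from stdin rather than from file):

```shell
# Hypothetical wrapper: reads records on stdin, prints the converted records.
letters_to_numbers() {
    awk -F'[,:]' -v OFS=, '
    BEGIN {
        s = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
        # substr(s, i, 1) is the i-th letter, so x["A"]=1 ... x["Z"]=26
        for (i = 1; i <= length(s); i++) x[substr(s, i, 1)] = i
    }
    { print x[$1], $3, $5 }'
}

# Two of the sample lines from the question
printf '%s\n' 'A,1:2.33,2:3.18' 'M,1:8.72,2:7.25' | letters_to_numbers
```

The field separator is written as the bracket expression [,:], which matches the same two characters as ":|,".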

A quick search also turns up a related question that helps here: How to print ASCII value of a character using basic awk only


To handle any number of columns:

awk -F":|," 'BEGIN {split("ABCDEFGHIJKLMNOPQRSTUVWXYZ",a,"");for (i=1;i<=26;i++) x[a[i]]=i} {printf "%s,",x[$1];for (i=3;i<NF;i+=2) printf "%s,",$i;print $NF}' OFS=, file
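
An alternative sketch for the multi-column case, assuming every field after the first has the form number:value as in the question. It splits on commas only, strips the leading "number:" from each field with sub(), and looks the letter up with index(). This is a different technique from the one above, not a replacement for it, and strip_attrs is just an illustrative name:

```shell
# Hypothetical wrapper: reads records on stdin, prints the converted records.
strip_attrs() {
    awk -F, -v OFS=, '{
        # index() returns the 1-based position of the letter (0 if not found)
        $1 = index("ABCDEFGHIJKLMNOPQRSTUVWXYZ", $1)
        # Remove the attribute number and colon from every remaining field
        for (i = 2; i <= NF; i++)
            sub(/^[0-9]+:/, "", $i)
        print
    }'
}

# A line with three attribute columns
printf '%s\n' 'M,1:8.72,2:7.25,3:0.5' | strip_attrs
```

Because the fields are modified in place, awk rebuilds the record with OFS, so no manual handling of the last column is needed.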

Other answers

Jotne's ABCD...Z solution works, but here is another, hopefully more general, solution:

awk -F '[,:]' -v OFS="," 'BEGIN{for(n=65;n<=90;n++)ord[sprintf("%c",n)]=++x}
{print ord[$1],$3,$5}' file

The trick is in the BEGIN block. Here the loop is limited so that only [A-Z] is accepted, which may be enough for your question. If you have other characters (such as the whole ASCII range), or you want a different numbering, such as A->5, B->6, ... or the ASCII codes themselves (a->97, b->98, ..., A->65, B->66, ...), just change the range of n and the starting value of x in the one-liner above.
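
As an illustration of changing n and x, here is a minimal sketch that maps every printable ASCII character to its own code, so a becomes 97 and A becomes 65. The range 32..126 and the function name ascii_code are choices made for this example, not part of the original answer:

```shell
# Hypothetical wrapper: reads records on stdin, prints the converted records.
ascii_code() {
    awk -F'[,:]' -v OFS=, '
    BEGIN {
        # Map each printable ASCII character (codes 32..126) to its code
        for (n = 32; n <= 126; n++) ord[sprintf("%c", n)] = n
    }
    { print ord[$1], $3, $5 }'
}

printf '%s\n' 'a,1:2.33,2:3.18' 'A,1:8.72,2:7.25' | ascii_code
```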

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow