Question

I'm taking a reverse engineering course at university and have a homework assignment. I was given a .obj file (compiled with Visual Studio 2008), and I have to disassemble it, figure out the control structure, and call it from a small C program.

I used the IDA disassembler; here is the asm code:

_FB3:
  00000000: 55                 push        ebp
  00000001: 56                 push        esi
  00000002: 57                 push        edi
  00000003: 8B 7C 24 10        mov         edi,dword ptr [esp+10h]
  00000007: 83 3F 00           cmp         dword ptr [edi],0
  0000000A: 74 79              je          00000085
  0000000C: 8D 64 24 00        lea         esp,[esp]
  00000010: 8B 2F              mov         ebp,dword ptr [edi]
  00000012: 8B 75 00           mov         esi,dword ptr [ebp]
  00000015: 8B 44 24 14        mov         eax,dword ptr [esp+14h]
  00000019: 8B CE              mov         ecx,esi
  0000001B: EB 03              jmp         00000020
  0000001D: 8D 49 00           lea         ecx,[ecx]
  00000020: 8A 10              mov         dl,byte ptr [eax]
  00000022: 3A 11              cmp         dl,byte ptr [ecx]
  00000024: 75 1A              jne         00000040
  00000026: 84 D2              test        dl,dl
  00000028: 74 12              je          0000003C
  0000002A: 8A 50 01           mov         dl,byte ptr [eax+1]
  0000002D: 3A 51 01           cmp         dl,byte ptr [ecx+1]
  00000030: 75 0E              jne         00000040
  00000032: 83 C0 02           add         eax,2
  00000035: 83 C1 02           add         ecx,2
  00000038: 84 D2              test        dl,dl
  0000003A: 75 E4              jne         00000020
  0000003C: 33 C0              xor         eax,eax
  0000003E: EB 05              jmp         00000045
  00000040: 1B C0              sbb         eax,eax
  00000042: 83 D8 FF           sbb         eax,0FFFFFFFFh
  00000045: 85 C0              test        eax,eax
  00000047: 7D 05              jge         0000004E
  00000049: 8D 7D 0C           lea         edi,[ebp+0Ch]
  0000004C: EB 32              jmp         00000080
  0000004E: 8B 44 24 14        mov         eax,dword ptr [esp+14h]
  00000052: 8B CE              mov         ecx,esi
  00000054: 8A 10              mov         dl,byte ptr [eax]
  00000056: 3A 11              cmp         dl,byte ptr [ecx]
  00000058: 75 1A              jne         00000074
  0000005A: 84 D2              test        dl,dl
  0000005C: 74 12              je          00000070
  0000005E: 8A 50 01           mov         dl,byte ptr [eax+1]
  00000061: 3A 51 01           cmp         dl,byte ptr [ecx+1]
  00000064: 75 0E              jne         00000074
  00000066: 83 C0 02           add         eax,2
  00000069: 83 C1 02           add         ecx,2
  0000006C: 84 D2              test        dl,dl
  0000006E: 75 E4              jne         00000054
  00000070: 33 C0              xor         eax,eax
  00000072: EB 05              jmp         00000079
  00000074: 1B C0              sbb         eax,eax
  00000076: 83 D8 FF           sbb         eax,0FFFFFFFFh
  00000079: 85 C0              test        eax,eax
  0000007B: 7E 1E              jle         0000009B
  0000007D: 8D 7D 08           lea         edi,[ebp+8]
  00000080: 83 3F 00           cmp         dword ptr [edi],0
  00000083: 75 8B              jne         00000010
  00000085: 6A 10              push        10h
  00000087: E8 00 00 00 00     call        _malloc
  0000008C: 83 C4 04           add         esp,4
  0000008F: 89 07              mov         dword ptr [edi],eax
  00000091: 85 C0              test        eax,eax
  00000093: 75 14              jne         000000A9
  00000095: 5F                 pop         edi
  00000096: 5E                 pop         esi
  00000097: 33 C0              xor         eax,eax
  00000099: 5D                 pop         ebp
  0000009A: C3                 ret
  0000009B: 8B C5              mov         eax,ebp
  0000009D: FF 40 04           inc         dword ptr [eax+4]
  000000A0: 5F                 pop         edi
  000000A1: 5E                 pop         esi
  000000A2: B8 01 00 00 00     mov         eax,1
  000000A7: 5D                 pop         ebp
  000000A8: C3                 ret
  000000A9: 8B 74 24 14        mov         esi,dword ptr [esp+14h]
  000000AD: 8B C6              mov         eax,esi
  000000AF: 8D 50 01           lea         edx,[eax+1]
  000000B2: 8A 08              mov         cl,byte ptr [eax]
  000000B4: 40                 inc         eax
  000000B5: 84 C9              test        cl,cl
  000000B7: 75 F9              jne         000000B2
  000000B9: 2B C2              sub         eax,edx
  000000BB: 40                 inc         eax
  000000BC: 50                 push        eax
  000000BD: E8 00 00 00 00     call        _malloc
  000000C2: 8B 0F              mov         ecx,dword ptr [edi]
  000000C4: 89 01              mov         dword ptr [ecx],eax
  000000C6: 8B 07              mov         eax,dword ptr [edi]
  000000C8: 83 C4 04           add         esp,4
  000000CB: 83 38 00           cmp         dword ptr [eax],0
  000000CE: 74 C5              je          00000095
  000000D0: 8B 10              mov         edx,dword ptr [eax]
  000000D2: 8B CE              mov         ecx,esi
  000000D4: 8A 01              mov         al,byte ptr [ecx]
  000000D6: 88 02              mov         byte ptr [edx],al
  000000D8: 41                 inc         ecx
  000000D9: 42                 inc         edx
  000000DA: 84 C0              test        al,al
  000000DC: 75 F6              jne         000000D4
  000000DE: 8B 17              mov         edx,dword ptr [edi]
  000000E0: C7 42 04 01 00 00  mov         dword ptr [edx+4],1
            00
  000000E7: 8B 07              mov         eax,dword ptr [edi]
  000000E9: C7 40 08 00 00 00  mov         dword ptr [eax+8],0
            00
  000000F0: 8B 0F              mov         ecx,dword ptr [edi]
  000000F2: 5F                 pop         edi
  000000F3: 5E                 pop         esi
  000000F4: C7 41 0C 00 00 00  mov         dword ptr [ecx+0Ch],0
            00
  000000FB: B8 01 00 00 00     mov         eax,1
  00000100: 5D                 pop         ebp
  00000101: C3                 ret

IDA also generated a nice control-flow graph for me (image not included).

As you can see the code is something like this:

for(...)
 {
    for1(...){...}
    ...
    for1(...){...}
 }

 malloc
 ....
 for3() ...
 malloc
 ...
 for2(...)
 {
    ...
 }

As far as I know, for1 and for2 have nearly the same structure and differ only in what they do, and the function implemented by for3 is in the same family as for1 and for2. for3 uses the result of the second malloc as a parameter, so I think for2 should be some kind of array copy loop. for1, for2 and for3 are known inlined C standard library implementations.

Can someone help me figure out the purpose of this f3 function?

Second question: how can I use this .obj file in a small sample C program? How can I call its function in VS?

Thanks in advance, any help is appreciated.

UPDATE: Jester: interesting. How did you know the node's structure? I'm still trying to figure this whole thing out (with your help), but no luck yet.

I found out that the IDA disassembler has a pseudocode view. Here is the pseudocode:

signed int __cdecl FB3(int a1, const char *a2)
{
  int v2; // edi@1
  const char **v3; // ebp@2
  void *v4; // eax@7
  signed int result; // eax@8
  int v6; // edx@11
  const char *v7; // ecx@11
  const char v8; // al@12

  v2 = a1;
  while ( *(_DWORD *)v2 )
  {
    v3 = *(const char ***)v2;
    if ( strcmp(a2, **(const char ***)v2) >= 0 )
    {
      if ( strcmp(a2, **(const char ***)v2) <= 0 )
      {
        ++v3[1];
        return 1;
      }
      v2 = (int)(v3 + 2);
    }
    else
    {
      v2 = (int)(v3 + 3);
    }
  }
  v4 = malloc(0x10u);
  *(_DWORD *)v2 = v4;
  if ( v4 && (**(_DWORD **)v2 = malloc(strlen(a2) + 1)) != 0 )
  {
    v6 = **(_DWORD **)v2;
    v7 = a2;
    do
    {
      v8 = *v7;
      *(_BYTE *)v6++ = *v7++;
    }
    while ( v8 );
    *(_DWORD *)(*(_DWORD *)v2 + 4) = 1;
    *(_DWORD *)(*(_DWORD *)v2 + 8) = 0;
    *(_DWORD *)(*(_DWORD *)v2 + 12) = 0;
    result = 1;
  }
  else
  {
    result = 0;
  }
  return result;
}

From this, maybe it counts the occurrences of a number in a string? This pseudocode is a little unclear to me.

I tried to call this function from a sample program, but without success. I declared it as extern signed int fb3(int a1, const char *a2); and then tried to call it, but the linker gives me an "unresolved external symbol _fb3 referenced in function _main" error (so I guess the .obj file contains no fb3 function matching the signature I declared with extern, i.e. the signature is wrong).

Here is the sample program (main.c) I tried to use:

#include <stdio.h>
extern signed int fb3(int a1, const char *a2);

int main(void)
{
    char b[3] = {'e','3','y'};

    signed int i = fb3(3,b);
    printf("%d",i);

    return 0;
}

I've also added f3.obj to the linker input (VS2010).

UPDATE2: I implemented the node struct and used the case-sensitive function name; now I can compile successfully.

The sample program:

#include <stdio.h>
#include <stdlib.h>  /* malloc */

typedef struct node
{
    int count;
    const char * text;
    struct node* right;
    struct node* left;
} node;

extern int FB3(node* root, const char *text);

int main(void)
{
    node* root;
    signed int i;
    int j;

    root = (node*)malloc(sizeof(node));  

    root->count = 0;
    root->text = "textone";
    root->right = NULL;
    root->left = NULL;

    printf("value = %d\n", FB3(root,"v"));
    printf("value = %d\n", FB3(root,"b"));
    printf("value = %d\n", FB3(root,"c"));
    printf("value = %d\n", FB3(root,"3dasf"));
    printf("value = %d\n", FB3(root,"3ssdfs"));
    printf("value = %d\n", FB3(root,"dsda"));
    printf("value = %d\n", FB3(root,"v"));
    printf("value = %d\n", FB3(root,"gsda"));
    printf("value = %d\n", FB3(root,"gsda"));
    printf("value = %d\n", FB3(root,"a"));
    printf("value = %d\n", FB3(root,"ab"));

    return 0;
}

The output is:

 value=1
 value=1
 value=1
 ... (only value=1)

The interesting thing is that the 7th printf should print "value=2", because "v" is already in the tree, no?


Solution

From a quick glance, this seems to be a binary tree used to count string occurrences. A tree node looks like:

const char* text;
int count;
node* left;
node* right;

The function itself is int addstring(node** root, const char* text). First the code checks whether the tree is empty and skips the search if it is. The search starts at 0x10 by doing if (strcmp(current->text, text) > 0) current = current->right; and looping back. This code doesn't look optimized: at 0x4E it does the same comparison again, this time checking for < 0 and going left. At 0x9B is the "found" branch, which increments the counter and returns 1.

If the text is not found, a new node is created at 0x85 and inserted into the tree, and the text is copied into it with the equivalent of strdup (implemented inline as malloc(strlen(text) + 1) followed by strcpy). Both left and right of the new node are set to NULL and the count to 1.
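
Putting the two paragraphs above together, the original source was probably close to the following. This is a reconstruction from the disassembly, not the actual source: the names node, add_string, link and cur are made up, and the field names simply follow the layout guessed above. With that naming, strings that sort before the current node end up under right.

```c
#include <stdlib.h>
#include <string.h>

/* Field order follows the offsets seen in the listing (32-bit build). */
typedef struct node {
    char        *text;   /* offset 0:  malloc'ed copy of the string        */
    int          count;  /* offset 4:  number of times it was added        */
    struct node *left;   /* offset 8:  taken when text sorts after node    */
    struct node *right;  /* offset 12: taken when text sorts before node   */
} node;

/* Hypothetical reconstruction of FB3: returns 1 on success, 0 if malloc fails. */
int add_string(node **link, const char *text)
{
    /* Search loop (0x10..0x83): follow child links until an empty slot. */
    while (*link != NULL) {
        node *cur = *link;
        int cmp = strcmp(text, cur->text);
        if (cmp < 0) {
            link = &cur->right;      /* lea edi,[ebp+0Ch] */
        } else if (cmp > 0) {
            link = &cur->left;       /* lea edi,[ebp+8]   */
        } else {
            cur->count++;            /* 0x9B: found       */
            return 1;
        }
    }

    /* Insert (0x85): allocate the new node directly into the empty slot. */
    *link = malloc(sizeof **link);
    if (*link == NULL)
        return 0;
    (*link)->text = malloc(strlen(text) + 1);
    if ((*link)->text == NULL)
        return 0;                    /* as in the listing, the node is not freed */
    strcpy((*link)->text, text);
    (*link)->count = 1;
    (*link)->left  = NULL;
    (*link)->right = NULL;
    return 1;
}
```

Walking a pointer to the link (node **) instead of a pointer to the node is what lets the same code handle both an empty tree and an insertion below an existing node, which matches the way edi is reused for the store after malloc.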

Update: The node size is 16 bytes, as can be seen from the malloc invocation. Offset 0 is used to compare the text, so that must be the text. Offset 4 is incremented, so that must be the counter. Offsets 8 and 12 are the two child pointers because they are used as such.

The prototype IDA came up with is nonsense: the first argument must be a pointer, otherwise the function will blow up. Also, C is case sensitive, so try FB3 (in capitals). Something like this:
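
The offsets can be checked mechanically. The sketch below models the 32-bit layout with fixed-width fields (pointers are 4 bytes in an x86 build), so it gives the same numbers on any compiler. node32 and layout_matches are names invented for this illustration, and <stdint.h> is assumed to be available, which is not the case for Visual Studio 2008 itself.

```c
#include <stddef.h>
#include <stdint.h>

/* Model of the node as laid out by the 32-bit build: every field is 4 bytes. */
typedef struct node32 {
    uint32_t text;   /* char* in the real struct */
    int32_t  count;
    uint32_t left;   /* node* in the real struct */
    uint32_t right;  /* node* in the real struct */
} node32;

/* Returns 1 when the model agrees with the offsets used in the disassembly. */
int layout_matches(void)
{
    return sizeof(node32) == 0x10            /* push 10h / call _malloc */
        && offsetof(node32, text)  == 0      /* mov esi,[ebp]           */
        && offsetof(node32, count) == 4      /* inc dword ptr [eax+4]   */
        && offsetof(node32, left)  == 8      /* lea edi,[ebp+8]         */
        && offsetof(node32, right) == 0x0C;  /* lea edi,[ebp+0Ch]       */
}
```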

#include <stdio.h>
extern int FB3(void** root, const char *text);

int main(void)
{
    void* root = NULL;
    int i = FB3(&root, "e3y");
    printf("%p %d", root, i);

    return 0;
}

If that works, you can go ahead and add the node struct so you can then traverse and print the tree from C.
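
Once the struct is in place, walking the tree is a short recursive function. The sketch below assumes the field order deduced above; node, print_tree and total_count are hypothetical names. Keep in mind that FB3 returns 1 both when it inserts a new string and when it finds an existing one (both paths end in mov eax,1), so the number of occurrences has to be read from the node's count field, not from the return value.

```c
#include <stdio.h>
#include <stddef.h>

/* Assumed layout: text at offset 0, count at 4, children at 8 and 12. */
typedef struct node {
    char        *text;
    int          count;
    struct node *left;
    struct node *right;
} node;

/* Print every string with its count. With the naming used here, smaller
   strings hang off `right`, so visiting right first gives ascending order. */
void print_tree(const node *n)
{
    if (n == NULL)
        return;
    print_tree(n->right);
    printf("%s: %d\n", n->text, n->count);
    print_tree(n->left);
}

/* Sum of all counts stored in the tree. */
int total_count(const node *n)
{
    if (n == NULL)
        return 0;
    return n->count + total_count(n->left) + total_count(n->right);
}
```

With f3.obj linked in and the function declared as extern int FB3(node **root, const char *text);, the calling side would start from node *root = NULL;, call FB3(&root, "v") twice, and then call print_tree(root); at that point root->count should be 2.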

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow