why addresses of elements in the stack are reversed in ubuntu64?

Question 1

Even though a clear ABI has been defined for both architectures, compilers do not guarantee that this is respected. You might wonder why, the reason is usually performance. Passing variables into the stack is more expensive in terms of speed than using registers since the application needs to access the memory for retrieving them. Another example of this habit is how compilers use EBP/RBP register. EBP/RBP should be the register which contains the frame-pointer, that is, the stack base address. The stack base register allows for local variables to be easily accessible. However, the frame-pointer register is often used as a general register for increasing the performance. This avoids the instructions to save, set up and restore frame pointers; it also makes an extra register available in many functions, particular important in X86_32 architecture, where usually programs are eager of registers. The main drawback is that makes debugging impossible on some machines. For more info check -fomit-frame-pointer option of gcc.

The calling function between x86_32 and x86_64 are rather different. The most relevant difference is that the x86_64 tries to use general registers to pass the function-arguments and only if there is no register available or the arguments is bigger than 80 bytes, it will use the stack.

We start from the x86_32 ABI, I have slightly changed your example :

#include <stdio.h>
#include <stddef.h> 
#include <stdint.h>

#if defined(__i386__)
  #define STACK_POINTER "ESP"
  #define FRAME_POINTER "EBP" 
#elif defined(__x86_64__)
  #define STACK_POINTER "RSP"
  #define FRAME_POINTER "RBP" 
#else 
  #error Architecture not supported yet!!
#endif

void foo(int i,int j,int k)
{
    int a =20;
    uint64_t stack=0, frame_pointer=0; 

    // Retrieve stack 
asm volatile( 
#if defined (__i386__)
                  "mov %%esp, %0\n"
                  "mov %%ebp, %1\n"
#else 
                  "mov %%rsp, %0\n"
                  "mov %%rbp, %1\n"
#endif
                  : "=m"(stack), "=m"(frame_pointer)
                 : 
                 : "memory");
  // retrieve paramters x86_64 
#if defined (__x86_64__)

    int  i_reg=-1, j_reg=-1, k_reg=-1;

asm volatile  ( "mov %%rdi, %0\n"
                "mov %%rsi, %1\n"
                "mov %%rdx, %2\n"
                 : "=m"(i_reg), "=m"(j_reg), "=m"(k_reg)
                 : 
                 : "memory");
#endif

    printf("%s=%p %s=%p\n", STACK_POINTER, (void*)stack, FRAME_POINTER,  (void*)frame_pointer); 
    printf("%d, %d, %d\n", i, j, k);
    printf("%p\n%p\n%p\n%p\n",&i,&j,&k,&a);


#if defined (__i386__)
      // Calling convention c 
      // EBP --> Saved EBP
      char * EBP=(char*)frame_pointer;   
      printf("Function return address : 0x%x  \n",      *(unsigned int*)(EBP +4)); 
      printf("- i=%d &i=%p \n",*(int*)(EBP+8)  ,  EBP+8 );   
      printf("- j=%d &j=%p \n",*(int*)(EBP+ 12),  EBP+12);   
      printf("- k=%d &k=%p \n",*(int*)(EBP+ 16),  EBP+16);  
#else 
      printf("- i=%d &i=%p \n",i_reg, &i  );   
      printf("- j=%d &j=%p \n",j_reg, &j  );   
      printf("- k=%d &k=%p \n",k_reg ,&k  );  
#endif
}

int main()
{
    foo(1,2,3);
    return 0;
}

The ESP register is being used by foo to point to the top of the stack. The EBP register is acting as a "base pointer". All arguments have been pushed in reverse order into the stack. The arguments passed by main to foo and the local variables in foo can all be referenced as an offset from the base pointer. After calling foo the stack should look like : stack fram x86 32bit .

Assuming that the compiler is using the stack pointer, we can access the function arguments by summing an offset of 4 byte to the EBP register. Note the first arguments is located at offset 8 because the call instruction push in the stack the return address of the caller function.

  printf("Function return address : 0x%x  \n",      *(unsigned int*)(EBP +4)); 
  printf("- i=%d &i=%p \n",*(int*)(EBP+8)  ,  EBP+8 );   
  printf("- j=%d &j=%p \n",*(int*)(EBP+ 12),  EBP+12);   
  printf("- k=%d &k=%p \n",*(int*)(EBP+ 16),  EBP+16);

This is more or less how arguments are passed to a function in x86_32.

In x86_64 there are more registers available, it makes sense to use them to pass the parameter of a function. The x86_64 ABI can be found here : http://www.uclibc.org/docs/psABI-x86_64.pdf. The calling convention starts at page 14.

First the parameters are divided into classes. The class of each parameter determines the manner in which it is passed to the called function. Some of the most relevant are :

INTEGER This class consists of integral types that ﬁt into one of the general purpose registers. For example (int, long, bool)
SSE The class consists of types that ﬁts into a SSE register. (float, double)
SSEUP The class consists of types that ﬁt into a SSE register and can be passed and returned in the most signiﬁcant half of it. ( float_128, __m128,__m256)
NO_CLASS This class is used as initializer in the algorithms. It will be used for padding and empty structures and unions.
MEMORY This class consists of types that will be passed and returned in memory via the stack ( structure types)

Once the a parameter is assigned to a class, it is passed to the function according to these rules :

MEMORY, pass the argument on the stack.
INTEGER, the next available register of the sequence %rdi, %rsi, %rdx, %rcx, %r8 and %r9 is used.
SSE, the next available SSE register is used, the registers are taken in the order from %xmm0 to %xmm7.
SSEUP, the eight bytes is passed in the upper half of the last used SSE register.

If there are no registers available for any eightbyte of an argument, the whole argument is passed on the stack. If registers have already been assigned for some eightbytes of such an argument, the assignments get reverted. Once registers are assigned, the arguments passed in memory are pushed on the stack in reversed order.

Since you are passing int variables, the arguments will be inserted into the general purpose registers.

%rdi --> i 
%rsi --> j
%rdx --> k

So you can retrieve them we the following code :

#if defined (__x86_64__)

    int  i_reg=-1, j_reg=-1, k_reg=-1;

asm volatile  ( "mov %%rdi, %0\n"
                "mov %%rsi, %1\n"
                "mov %%rdx, %2\n"
                 : "=m"(i_reg), "=m"(j_reg), "=m"(k_reg)
                 : 
                 : "memory");
#endif

I hope I have been clear.

In conclusion,

why addresses of elements in the stack are reversed in ubuntu64?

Because they are not stored into the stack. The addresses you have retrieved in that manner are the addresses of the local variables of the caller function.

Question 2

There is absolutely no restriction on how arguments are passed to a function, nor where they go on the stack (or in a register, or in shared memory for that matter). It is up to the compiler to instrument passing the variables in such a manner that the caller and callee agree upon. Unless you force a specific calling convention (for linking code that was compiled with different compilers), or unless there is a hardware dictated ABI - there is no guarantee.