Could not understand the behavior of the read() system call
27-10-2019
Question
So this is the code I am trying to run:
#include <fcntl.h>
#include <stdio.h>
#include <errno.h>
#include <string.h>
#include <unistd.h>

int main(){
    int ret;
    ret = read(STDIN_FILENO,(int*)2000,3);
    printf("%d--%s\n",ret,strerror(errno));
    return 0;
}
And this is the output I get at the terminal:
anirudh@anirudh-Aspire-5920:~/Desktop/testing$ gcc test.c
anirudh@anirudh-Aspire-5920:~/Desktop/testing$ ./a.out
lls
-1--Bad address
anirudh@anirudh-Aspire-5920:~/Desktop/testing$ ls
a.out htmlget_ori.c mysocket.cpp Packet Sniffer.c resolutionfinder.c test.c
anirudh@anirudh-Aspire-5920:~/Desktop/testing$
Question 1: When I pass the address 2000 in the call read(STDIN_FILENO,(int*)2000,3); where does that address lie? I think it is the absolute address in RAM that I am trying to access. Am I right, or is it an offset that is added to the stack segment base address? I do not know. The program does not give me a SEGFAULT for the memory violation; instead it gives me Bad address.
Question 2: Okay, so the code crashes when I give the input lls, and bash executes the "ls" part of that "lls". The reason, I think, is that the code crashes after reading the first "l", and the remaining "ls" is executed by bash. But why does bash execute the leftover "ls"? My code has crashed, and even if bash is its parent process, it should not read from the file descriptor (STDIN_FILENO) opened by the code I wrote (I think so).
Thanks for your time.
Solution
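You are running on a CPU with paging. Your OS maintains page tables which translate virtual addresses to physical addresses. The page table for your process doesn't contain anything for virtual address 2000. Because it is the kernel, not your code, that stores the data, no SIGSEGV is raised: read() notices that the buffer is unmapped and fails with EFAULT (it returns -1 and sets errno to EFAULT, which strerror() reports as "Bad address").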
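To see the EFAULT behaviour in isolation, here is a minimal sketch. The helper name errno_from_read_into is made up for this illustration, and it feeds read() from a pipe instead of the terminal so that it does not depend on typed input. It assumes a POSIX system where nothing is mapped at address 2000, which is the usual case on Linux.

```c
#include <errno.h>
#include <stdint.h>
#include <unistd.h>

/*
 * Put 3 bytes into a pipe and ask read() to store them at `addr`.
 * Returns 0 if read() succeeded, otherwise the errno value reported
 * by read() (or by pipe()/write() if the setup itself failed).
 * Hypothetical helper, not part of the original question.
 */
int errno_from_read_into(uintptr_t addr)
{
    int fds[2];
    if (pipe(fds) == -1)
        return errno;

    int result = 0;
    errno = 0;
    if (write(fds[1], "abc", 3) != 3) {
        result = errno ? errno : EIO;      /* setup failed */
    } else {
        /* The kernel copies into the buffer on our behalf; if the
         * address is unmapped, the copy fails and read() returns -1
         * with errno set, instead of the process getting a signal. */
        if (read(fds[0], (void *)addr, 3) == -1)
            result = errno;
    }

    close(fds[0]);
    close(fds[1]);
    return result;
}
```

Called with the address of a real buffer it should return 0; called with 2000 it should return EFAULT, the same error as in the question, and the process keeps running either way.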
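stdin is connected to your terminal device (/dev/tty). Your process inherits that terminal from your shell, and the shell gets it back on process exit. Note that your program did not crash: read() failed, printf() ran, and main() returned normally. The output suggests that the failed read() still consumed the first "l"; the remaining "ls" stayed in the terminal's input queue, so bash read it as its next command.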
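If you do not want unread input to reach the shell after your program exits, one option is to discard it before returning. The sketch below uses the POSIX tcflush() call with TCIFLUSH; the wrapper name discard_unread_input is an assumption made for this example.

```c
#include <termios.h>
#include <unistd.h>

/*
 * Discard input that the terminal has received but that nobody has
 * read yet. Returns 0 on success, -1 if `fd` is not a terminal or if
 * tcflush() fails. Hypothetical helper for illustration only.
 */
int discard_unread_input(int fd)
{
    if (!isatty(fd))
        return -1;                  /* pipes and files have no input queue to flush */
    return tcflush(fd, TCIFLUSH);   /* TCIFLUSH: drop received-but-unread data */
}
```

Calling discard_unread_input(STDIN_FILENO) just before main() returns would be one way to keep leftover characters such as "ls" from being handed to bash.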
OTHER TIPS
The 2000 that you are trying to use as an address is a process-specific virtual address. Chances are good that nothing is mapped into that range; you can add this code to see what your mappings currently are:
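/* needs <stdio.h>, <stdlib.h> and <unistd.h> */
char cmd[32];
snprintf(cmd, sizeof cmd, "pmap -x %d", (int)getpid());
printf("%s\n", cmd);   /* show the command being run */
system(cmd);           /* print this process's memory map */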
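If you really must gain access to physical RAM around address 2000 (and I can't imagine that you do), the usual route on Linux is to mmap(2) /dev/mem as root, not iopl(2), which only changes the I/O port privilege level. Kernels built with restricted /dev/mem access may still refuse. And beware the consequences. :)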
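As an alternative to shelling out to pmap, a process on Linux can check its own mappings by reading /proc/self/maps. The following is a rough sketch under that assumption; the function name address_is_mapped is invented for this example, and only the start and end of each range are parsed.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Returns 1 if `addr` falls inside a mapping listed in /proc/self/maps,
 * 0 if it does not, and -1 if the file cannot be opened (for example
 * on a system that is not Linux). Hypothetical helper for illustration.
 */
int address_is_mapped(uintptr_t addr)
{
    FILE *maps = fopen("/proc/self/maps", "r");
    if (maps == NULL)
        return -1;

    unsigned long start, end;
    int found = 0;

    /* Each line starts with "start-end" in hex; skip the rest of it. */
    while (fscanf(maps, "%lx-%lx%*[^\n]", &start, &end) == 2) {
        if (addr >= start && addr < end) {
            found = 1;
            break;
        }
    }

    fclose(maps);
    return found;
}
```

On a typical Linux system address_is_mapped(2000) should return 0, while the address of any local variable should return 1 because it lies inside the stack mapping.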
As for the rest of the ls behaviour, check that your printf() format string ends in \n (the one shown above does). I've found that not properly flushing output can lead to confusing-looking interaction; perhaps this is just confusing, rather than outright strange. :)
I will only answer question 1. I don't fully understand what you mean by question 2, plus it might take care of itself once you fix your first problem.
To answer your question 1, without being 100% sure, I would wager that (int*)2000 specifies a location in your program's data segment, i.e. that the 2000 is only the offset part. The reason why I think so is that generally, with any modern operating system, you hardly ever have unrestricted access to physical memory. The linker and the OS's program loader handle all the segment-related stuff for you; your program only ever gets to see the offset portion of (virtual, see P.S. below) memory addresses. On x86 Linux the segments are flat, with a base of zero, so that offset is simply the virtual address itself. All things data-related usually happen in the data segment; code-related stuff (such as function calls) is usually bound to the code segment.
As I see it, you have no guarantee that any specific data structure will be located at offset 2000 of your data segment. Your read destination is therefore almost always invalid, as it basically means that you're asking for data to be written to an arbitrary location in memory.
P.S.: By virtual memory address, I mean that your program's memory will possibly be placed at different physical memory addresses by the OS. So address 2000 (for example) will not always mean the same absolute, physical memory address; rather, it is translated through the process's page tables to whatever physical location the OS has chosen, if any.
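One way to convince yourself that an address is private to a process is to fork. The sketch below (the function name is made up for this illustration) lets a child overwrite a variable that sits at the same virtual address in both processes, then reports what the parent still sees.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork, let the child overwrite `value`, and return the value the
 * parent sees afterwards. Returns -1 if fork() fails.
 * Hypothetical helper for illustration only.
 */
int parent_value_after_child_write(void)
{
    volatile int value = 1;   /* volatile so the child's store is not optimised away */

    pid_t pid = fork();
    if (pid == -1)
        return -1;

    if (pid == 0) {
        /* Child: same virtual address as in the parent, but the write
         * lands in the child's own copy of the page. */
        value = 2;
        _exit(0);
    }

    waitpid(pid, NULL, 0);    /* wait until the child has written and exited */
    return value;             /* the parent's copy is untouched */
}
```

The parent should still see 1: the address is identical in both processes, yet after the write it refers to different physical memory in each.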