C Tutorial: I/O Redirection

Redirecting I/O

A powerful capability in the shell is the ability to treat terminal input and output interchangeably with file input/output for most commands. For example, to save the list of all files ending with .c into a file called files.txt, one can run:


find . -iname '*.c' >files.txt

Even though many commands optionally accept one or more file names on the command line, we can also feed them file data as the standard input. For instance, to count the number of file names listed in files.txt:


wc -l <files.txt

Standard input, standard output, and standard error

In UNIX systems (and derivatives), open files and devices are identified by file descriptors. A file descriptor is a small integer that, internally, is simply an index into a per-process table of information about open files. Each process expects to have three open file descriptors upon startup.

The standard input, file descriptor 0, is the default input source. Typically, this is data that the user enters via the keyboard into the virtual terminal (a window running a shell), but it may be redirected to come from another source such as a file (through input redirection) or the output of another program (via a pipe).

The standard output, file descriptor 1, is the default output destination. This, too, is the user's virtual terminal by default.

The standard error, file descriptor 2, is an alternate output destination that is typically used for error messages. By default, it goes to the same place as the standard output. However, if the standard output is redirected, the standard error still goes to the terminal so that the user can see any error messages that the program generates. The user can redirect it in the shell via the 2> redirector:


find . -iname '*.c' >files.txt 2>errors.txt

The user can also send the standard error to the same destination as the standard output, even when the standard output is redirected. The following syntax explicitly redirects the standard error to file descriptor 1. The order matters: the shell processes redirections from left to right, so 2>&1 must come after >files.txt:


find . -iname '*.c' >files.txt 2>&1

What about stdin, stdout, and stderr?

A lot of programs use function calls such as printf, scanf, fprintf, fscanf, fgets, fgetc, getc, puts, fputs, putc, fputc, fread, and fwrite for input and output.

These functions are not system calls; they are part of the stdio package. The stdio package is a set of functions that provides user-level buffering for input and output. Programs that use stdio do not access file descriptors directly, since doing so would bypass any buffering that is done by the library. Instead, the library keeps track of open files with FILE structures. The variables stdin, stdout, and stderr are pointers to FILE structures that correspond to the standard input, output, and error streams. Internally, the stdio library of course uses file descriptors along with the read, write, open, and close system calls, since those are the only facilities provided by the operating system.

dup2: Duplicating a file descriptor

A custom program is welcome to open or create any files it wishes to and use the resultant file descriptor for accessing that file. There is rarely any particular need to change the standard input, standard output, or standard error. However, in cases where other programs need to be run, one's ability to change where data is read from or written to is often limited to redefining the standard input and output streams. As we've seen, the shell uses this facility extensively for providing I/O redirection and pipe capabilities to programs it runs.

The kernel allows us to do this via the dup2 system call. The dup2(int f_orig, int f_new) system call takes two file descriptors as parameters and duplicates the first one (f_orig) onto the second one (f_new). If the second file descriptor was referencing a file, that file is closed. After the system call, both file descriptors are valid references to the file.

Example

This is a really small program that illustrates how dup2 works. We get a file name from the command line and create it as a new file, getting a file descriptor for it. We then write something to the standard output using printf and the stdio library. After that, we use dup2 to copy the file descriptor for the new file (newfd) onto the standard output file descriptor (1). Subsequent printf calls still write to the standard output, but the standard output now refers to the file we just opened.

/*
    output redirection with dup2()
    Super-simple example

    Paul Krzyzanowski
*/

#include <stdlib.h>
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>     /* dup2 */

int
main(int argc, char **argv)
{
    int newfd;  /* new file descriptor */

    if (argc != 2) {
        fprintf(stderr, "usage: %s output_file\n", argv[0]);
        exit(1);
    }
    if ((newfd = open(argv[1], O_CREAT|O_TRUNC|O_WRONLY, 0644)) < 0) {
        perror(argv[1]);    /* open failed */
        exit(1);
    }
    printf("This goes to the standard output.\n");
    printf("Now the standard output will go to \"%s\".\n", argv[1]);

    /* this new file will become the standard output */
    /* standard output is file descriptor 1, so we use dup2 */
    /* to copy the new file descriptor onto file descriptor 1 */
    /* dup2 will close the current standard output */
    dup2(newfd, 1);

    printf("This goes to the standard output too.\n");
    exit(0);
}

Save this program as dup2-a.c.

Compile this program via:

gcc -o dup2-a dup2-a.c

If you don't have gcc, you may need to substitute cc or the name of another compiler for the gcc command.

Run the program:

./dup2-a abc.txt

There's one glitch with the program. If you redirect the output to another file and run:


./dup2-a abc.txt >def.txt

You'll find that all of the output went to the file abc.txt and that def.txt is empty! This is a side effect of the stdio library buffering its output to minimize the number of times it calls the write system call. In the case where we did not redirect the standard output, stdio was smart enough to realize that we were writing to a terminal device and that all data should be flushed out promptly so that the user is not delayed waiting for output. In the case where we redirected the output, the data was not written until after the file descriptor was changed. It was just sitting in a memory buffer. To get the desired behavior, we need to ensure that we flush any pending data with the fflush function (part of the stdio library):

/*
    output redirection with dup2()
    Super-simple example

    Paul Krzyzanowski
*/

#include <stdlib.h>
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>     /* dup2 */

int
main(int argc, char **argv)
{
    int newfd;  /* new file descriptor */

    if (argc != 2) {
        fprintf(stderr, "usage: %s output_file\n", argv[0]);
        exit(1);
    }
    if ((newfd = open(argv[1], O_CREAT|O_TRUNC|O_WRONLY, 0644)) < 0) {
        perror(argv[1]);    /* open failed */
        exit(1);
    }
    printf("This goes to the standard output.\n");
    printf("Now the standard output will go to \"%s\".\n", argv[1]);
    fflush(stdout);     /* write out buffered data before fd 1 changes */

    /* this new file will become the standard output */
    /* standard output is file descriptor 1, so we use dup2 */
    /* to copy the new file descriptor onto file descriptor 1 */
    /* dup2 will close the current standard output */
    dup2(newfd, 1);

    printf("This goes to the standard output too.\n");
    exit(0);
}

Save this program as dup2-b.c.

Compile this program via:

gcc -o dup2-b dup2-b.c

Run the program:

./dup2-b abc.txt >def.txt

Redirection: executing a process after dup2

A more useful application of dup2 is input or output (or both) redirection. Here we get the name of the output file from the command line as before and make that file the standard output, but now we execute a command (ls -al / in this example). The command sends its output to the standard output stream, which is now the file that we created.

/*
    output redirection with dup2()
    send the output of a command to a file of the user's choice.

    Paul Krzyzanowski
*/

#include <stdlib.h>
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>     /* dup2, execvp */

int
main(int argc, char **argv)
{
    int newfd;  /* new file descriptor */
    char *cmd[] = { "/bin/ls", "-al", "/", 0 };

    if (argc != 2) {
        fprintf(stderr, "usage: %s output_file\n", argv[0]);
        exit(1);
    }
    if ((newfd = open(argv[1], O_CREAT|O_TRUNC|O_WRONLY, 0644)) < 0) {
        perror(argv[1]);    /* open failed */
        exit(1);
    }
    printf("writing output of the command %s to \"%s\"\n", cmd[0], argv[1]);
    fflush(stdout);     /* buffered data would be lost when execvp succeeds */

    /* this new file will become the standard output */
    /* standard output is file descriptor 1, so we use dup2 */
    /* to copy the new file descriptor onto file descriptor 1 */
    /* dup2 will close the current standard output */
    dup2(newfd, 1);

    /* now we run the command. It runs in this process and will have */
    /* this process' standard input, output, and error */
    execvp(cmd[0], cmd);
    perror(cmd[0]);     /* execvp failed */
    exit(1);
}

Save this program as dup2-c.c.

Compile this program via:

gcc -o dup2-c dup2-c.c

Run the program:

./dup2-c abc.txt

Redirecting in a new process

The above example uses execvp, which overwrites the current process with the new program. If we want the command to run in a separate process while sending its output to the file that we created, we can simply fork a new process. Note that we perform our dup2 call in the child so that we do not overwrite the standard output of the parent. We also close the standard input since we do not want the forked process trying to read from there (the ls command in this example will not but other programs might). Finally, the parent waits for the child process to terminate before continuing.

/*
    output redirection with dup2()
    send the output of a command to a file of the user's choice.

    Paul Krzyzanowski
*/

#include <stdlib.h>
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>     /* fork, dup2, close, execvp */
#include <sys/wait.h>   /* wait */

void runcmd(int fd, char **cmd);

int
main(int argc, char **argv)
{
    int newfd;  /* new file descriptor */
    char *cmd[] = { "/bin/ls", "-al", "/", 0 };

    if (argc != 2) {
        fprintf(stderr, "usage: %s output_file\n", argv[0]);
        exit(1);
    }
    if ((newfd = open(argv[1], O_CREAT|O_TRUNC|O_WRONLY, 0644)) < 0) {
        perror(argv[1]);    /* open failed */
        exit(1);
    }
    printf("writing output of the command %s to \"%s\"\n", cmd[0], argv[1]);
    fflush(stdout);     /* do not let the child inherit buffered data */

    runcmd(newfd, cmd); /* run the command, sending the std output to newfd */

    printf("all done!\n");
    exit(0);
}

/*
    runcmd(fd, cmd): fork a child process and run the command cmd,
    sending the standard output to the file descriptor fd.
    The standard input is closed.
    The parent waits for the child to terminate.
*/
void
runcmd(int fd, char **cmd)
{
    int status;

    switch (fork()) {
    case 0:     /* child */
        close(0);       /* the child gets no standard input */
        dup2(fd, 1);    /* fd becomes the standard output */
        execvp(cmd[0], cmd);
        perror(cmd[0]);     /* execvp failed */
        exit(1);

    default:    /* parent */
        while (wait(&status) != -1) ;   /* pick up dead children */
        break;

    case -1:    /* error */
        perror("fork");
    }
    return;
}

Save this program as dup2-d.c.

Compile this program via:

gcc -o dup2-d dup2-d.c

Run the program:

./dup2-d abc.txt

For more examples, see the pipe tutorial.

What happened to dup?

This entire discussion focused on the dup2 system call. Given that name, one cannot help but wonder what happened to dup1. There actually is a dup system call, and it predates dup2 by many years. Unlike dup2, dup takes a single parameter: the open file descriptor. It then duplicates it onto the lowest-numbered file descriptor that is currently not used by that process and returns the number of that file descriptor.

A process starts off with the first three file descriptors in use (0, 1, and 2; standard in, standard out, and standard error, respectively). To duplicate a new file descriptor onto, say, the standard output, you would close the standard output file descriptor (1) and then call dup. Since the lowest unused file descriptor is now 1, dup will duplicate the file descriptor to file descriptor 1.


close(1);	/* close std output */
dup(newfd);	/* duplicate newfd to fd 1 */

There are two downsides to using dup. First, you have to close the file descriptor that will be the target of duplication so that it becomes unused. This means that you will be making two system calls instead of one. Second, you cannot rely on the fact that file descriptor 0 is truly in use. If it was closed without your knowledge (say, before a fork and exec) then dup will duplicate onto the wrong file descriptor.

Is there yet another way to accomplish this?

I'm glad you asked. Yes, there's yet another way to duplicate a file descriptor: with the fcntl system call. The fcntl call handles several miscellaneous operations that you can perform on files: get/set file modes, get/set owners/groups, and get/set file locks. Depending on the operating system, it can also do things such as truncate a file, preallocate storage, read bootstrap code, toggle disk caching, and a slew of other file-related operations.

One of the functions that fcntl can perform is F_DUPFD, which duplicates a file descriptor in a manner similar to dup.


#include <fcntl.h>

fcntl(fd, F_DUPFD, 0);

is equivalent to dup(fd). The third argument is the lowest file descriptor number that fcntl is allowed to choose.
