Handle ?

This article discusses "handles" in system programming (beginner level).

"Handle" is a term often used when one interacts with the API provided by the underlying operating system (OS) or a system library. With this handle you can handle (i.e. control, manipulate) the object, hence the name. All system objects, such as files, sockets, and pipes, are "handled" in this way. Thus the life cycle of using a system object is:
  1. Obtain a handle to the object.
  2. Use this handle to manipulate the object.
  3. Dispose of the object, by closing the handle.

For example, one uses the system call open to get a handle to a file. With that handle, operations like reading and writing are possible:
#include <fcntl.h>
#include <unistd.h>   // on Windows, include <io.h> instead

int main(void)
{
    // Open a file by obtaining a handle to it
    int handle_file = open("/etc/passwd", O_RDONLY);
    if (handle_file == -1) {
        return 1; // file could not be opened
    }

    // With the handle, we can perform the operations
    // that the underlying OS or object provides, like
    // reading, writing, seeking, etc.
    char buffer[1024];
    // Read up to the first 1K bytes; read() returns -1 on failure
    if (read(handle_file, buffer, sizeof buffer) == -1) {
        close(handle_file);
        return 1; // file could not be read
    }

    // Finished using the object
    close(handle_file);
    return 0;
}


On Linux-like OSes this program will open a file and read up to the first 1K of it (on Windows, replace unistd.h with io.h). The same scenario applies to all of the objects that the OS provides: create, use, and close.
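
To illustrate that the same three steps apply to other objects, here is a minimal sketch that uses a POSIX pipe instead of a file. The helper name pipe_roundtrip is hypothetical, and the sketch assumes a short message that fits into the pipe buffer, so the write does not block:

```c
#include <string.h>
#include <unistd.h>

// Hypothetical helper: sends `msg` through a fresh pipe and reads it back
// into `out`. Returns the number of bytes read, or -1 on failure.
ssize_t pipe_roundtrip(const char *msg, char *out, size_t out_size)
{
    int ends[2]; // ends[0] is the read handle, ends[1] is the write handle

    // 1. Obtain the handles.
    if (pipe(ends) == -1) {
        return -1;
    }

    // 2. Use the handles.
    ssize_t received = -1;
    size_t len = strlen(msg);
    if (write(ends[1], msg, len) == (ssize_t)len) {
        received = read(ends[0], out, out_size);
    }

    // 3. Close both handles; the pipe goes away with its last handle.
    close(ends[0]);
    close(ends[1]);
    return received;
}
```

Only the call that creates the object differs (pipe instead of open); reading, writing, and closing work the same way as with the file handle.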

So, yes, a handle is like a C/C++ pointer to the underlying object. It does not matter what the type of the handle is (an integer in the example above): the OS/library knows the actual association to the object.
Internally the OS keeps a list of open (i.e. in use) handles. Each handle has a reference to the actual object it represents and, optionally, additional information specific to that instance of the handle (like the current file position). It is not possible for a handle to point to 2 different objects. But the other way around is OK: several different handles can point to the same object. The objects are reference counted: when a handle to an object is opened, the number of references is increased by 1 (starting from 0), and when the handle is closed it is decreased by 1. Thus, when an OS/library object has no more handles pointing to it (i.e. the number of references is 0), the OS will destroy the object.

As an example, picture two programs. Program A has open handles to 2 different files. Program B has a handle (Handle 1) to the same file as Program A (Handle 3), and also has a handle to a socket. If Program B decides to close its handles, the socket object would be destroyed, but the file would remain in use.
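
The bookkeeping described above can be sketched as a toy model. This is not how any particular OS implements it (real kernels have more layers); the names struct object, struct handle_table, handle_open, handle_close and MAX_HANDLES are all made up for illustration, and the handle values here are simply slot indexes:

```c
#include <stddef.h>

#define MAX_HANDLES 8

// A shared system object (file, socket, ...) with a reference count.
struct object {
    const char *name;
    int refs;       // number of handles currently pointing to this object
    int destroyed;  // set to 1 once the last handle is closed
};

// The handle table of one process: the slot index is the handle value.
struct handle_table {
    struct object *slots[MAX_HANDLES];
};

// Stores a reference to `obj` in the first free slot.
// Returns the new handle, or -1 if the table is full.
int handle_open(struct handle_table *t, struct object *obj)
{
    for (int h = 0; h < MAX_HANDLES; h++) {
        if (t->slots[h] == NULL) {
            t->slots[h] = obj;
            obj->refs++;
            return h;
        }
    }
    return -1;
}

// Frees the slot and drops one reference.
// Returns 0 on success, or -1 for an invalid handle.
int handle_close(struct handle_table *t, int h)
{
    if (h < 0 || h >= MAX_HANDLES || t->slots[h] == NULL) {
        return -1;
    }
    struct object *obj = t->slots[h];
    t->slots[h] = NULL;
    if (--obj->refs == 0) {
        obj->destroyed = 1; // a real OS would release the object here
    }
    return 0;
}
```

Replaying the scenario with two tables (one per program): once Program B closes both of its handles, the socket's count drops to 0 and it is marked destroyed, while the shared file still has a count of 1 because Program A keeps its handle.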

Having the above picture in mind, some notes for further investigation:
  • Different libraries and different OSes use different types for the handle. The most common are a simple int or void*. The important thing is not to modify the handle or perform any operations on it (like pointer dereferencing or arithmetic), and to use it consistently across calls.

  • The reason why OSes and system libraries do not provide nice, clean classes to manipulate their objects is that when the OSes were first being created in the 1970s and 1980s, C was the dominant language. And OSes and system libraries are still being developed in it today.

    Yet there are now libraries with clean class hierarchies that wrap the handles inside them, like MFC on Windows, STL, or Boost. And with platforms such as Java (from the 1990s) and .NET (from the early 2000s), the interaction with the OS is built into the platform.

  • Avoid using a handle after it has been closed. In the best case the call will fail with an invalid-handle error; in the worst case you can end up modifying some other object (the handle slot might have been reused for a reference to another object). A good rule of thumb is to assign an appropriate invalid value to the handle variable after closing (like -1 or NULL).

  • Close the handle when you have finished using the object. Failure to do so can lead to process or system memory leaks. Process memory leaks are cleaned up forcibly by the OS once the process terminates, but system leaks may remain until the OS is rebooted (Windows is more robust in this respect than Linux). System leaks can happen with objects that are shared among several processes (shared memory, named pipes).

  • In the 1970s and 1980s, when OSes were first being created, there was a multitude of operating systems, which easily turns into a chaos of incompatible OSes.

    Thus an effort to standardize the OS interface was undertaken: POSIX, the Portable Operating System Interface. So if you want to create your own OS, it is a good idea to make it POSIX compliant, i.e. all the functions that manipulate the objects in your OS must follow a defined interface. Then the above program, which uses the POSIX-compliant open and read, will also compile and work on your OS.

    All dominant OSes today (Windows, Linux, Unix) provide POSIX libraries, at least to some degree, which makes cross-development for different OSes a little smoother than total chaos. The effort to create cross-platform libraries is very sophisticated, with Boost, STL (Standard Template Library), and the "standard C library" being good examples.

  • The handle structure inside the OS/library optionally holds additional information, for example the current file position. There can be several handles to the same file, and each handle obtained by a separate open call will have its own position in the file.

  • Historically, the file handles 0, 1, and 2 have a special meaning when used with the POSIX open/read/write functions: they refer to the input, output, and error streams of the current process. They are open by default when the process is created and can be used right away. For example, a read from the file with handle 0 will actually read from the keyboard, if that is the standard input of the process.

  • Usually the OS/library provides a way to duplicate a handle (POSIX: dup(), Windows: DuplicateHandle()), which creates another entry in the OS handle table that points to the same object. This feature is used in advanced techniques for sharing a handle between processes, or for replacing the standard input/output.

    A handle can also be inherited by a child process.

  • When a process exits, the OS will close all open handles for this process.
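
A few of the notes above (per-handle file position, dup(), and use after close) can be tried out with a small sketch. The three function names are hypothetical, and `path` is assumed to name a readable file with at least one byte in it. Note that under POSIX a handle created with dup() shares its file position with the original, whereas a second open() gets a position of its own:

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

// Opens `path` twice and reads one byte through the first handle.
// Returns the file position seen by the second handle, or -1 on error.
off_t position_after_separate_opens(const char *path)
{
    int first = open(path, O_RDONLY);
    int second = open(path, O_RDONLY);
    off_t pos = -1;
    char byte;
    if (first != -1 && second != -1 && read(first, &byte, 1) == 1) {
        pos = lseek(second, 0, SEEK_CUR); // query position without moving
    }
    if (first != -1) close(first);
    if (second != -1) close(second);
    return pos;
}

// Opens `path` once, duplicates the handle, and reads one byte through
// the original. Returns the position seen by the duplicate, or -1 on error.
off_t position_after_dup(const char *path)
{
    int first = open(path, O_RDONLY);
    if (first == -1) {
        return -1;
    }
    int copy = dup(first);
    off_t pos = -1;
    char byte;
    if (copy != -1 && read(first, &byte, 1) == 1) {
        pos = lseek(copy, 0, SEEK_CUR);
    }
    if (copy != -1) close(copy);
    close(first);
    return pos;
}

// Closes a handle, marks the variable invalid, then deliberately uses a
// stale copy of the old value. Returns the errno of the failed read,
// 0 if the read unexpectedly succeeded, or -1 if open failed.
int errno_after_close(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd == -1) {
        return -1;
    }
    int stale = fd; // kept only to demonstrate the mistake
    close(fd);
    fd = -1;        // rule of thumb: invalidate the variable after closing

    char byte;
    errno = 0;
    if (read(stale, &byte, 1) == -1) {
        return errno;
    }
    return 0;
}
```

With separate open calls the second handle still reports position 0 after the read, while the duplicate reports position 1. The read on the stale handle fails with EBADF here only because nothing else was opened in between; otherwise the slot could have been reused, and the read would have silently hit a different object.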


Do you want to know more? Do you want me to elaborate on some detail? Leave a comment then.
