- The Big Intro
- Compiling
- One Big Program
- Breaking It Apart
- Low-hanging Fruit
- Creating the Opaque Data Type
- Summary
- References
The Big Intro
Opaque data types are a method of information hiding.
For example, it’s good practice to expose a stable API to the world that won’t change, and then behind this interface hide all of the complexity of the implementation of those APIs. The implementations can then frequently change as needed without impacting the user code, because it is using the well-defined and stable public API that does not change.
Object-oriented languages do this through data abstraction and encapsulation: private member variables, methods and classes. The C language does this through an opaque data type, where the details of the data are hidden and the data is only accessible through the subroutines declared in the header file (the interface), which form the public API.
One of the niceties of this pattern is that the application (user code) doesn’t need to be recompiled when the implementation changes, as the implementation is most likely going to be in a shared object.
The canonical example of an opaque data type in C is the FILE type.
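As a minimal sketch of what that looks like from the caller's side, the helper below uses a FILE only through pointers handed out and accepted by the stdio.h functions, never touching whatever the implementation keeps inside it. The name roundtrip is hypothetical and only serves the illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Writes msg to a temporary file and reads it back into out.
 * The FILE object is only ever handled through stdio.h functions.
 * Returns 0 on success, -1 on failure. */
int roundtrip(const char *msg, char *out, size_t outsz) {
    assert(msg != NULL && out != NULL && outsz > 0);

    FILE *fp = tmpfile();              /* we get a pointer, nothing more */
    if (fp == NULL)
        return -1;

    int ok = fputs(msg, fp) != EOF;    /* write through the API */
    rewind(fp);                        /* reposition through the API */
    ok = ok && fgets(out, (int)outsz, fp) != NULL;

    fclose(fp);                        /* release through the API */
    return ok ? 0 : -1;
}
```

Nothing in this code depends on how the C library lays out a FILE, which is exactly the property the rest of this article builds for the person type.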
Just to be clear, if someone is really determined to see what’s behind the opaque data type, they will. It’s a deterrent rather than a guarantee.
Before we dig into some code, let’s take a look at the Makefile that we’ll use to compile the program.
Compiling
Makefile
CC=clang
CFLAGS=-g -Wall
MAIN=main.c
OBJS=person.o
BIN=person

all: $(BIN)

%.o: %.c %.h
	$(CC) $(CFLAGS) -c $< -o $@

%.o: %.c
	$(CC) $(CFLAGS) -c $< -o $@

$(BIN): $(MAIN) $(OBJS)
	$(CC) $(CFLAGS) $^ -o $@

clean:
	$(RM) -r $(OBJS) $(BIN)
Ok, now, let’s see some code!
One Big Program
We’re just going to use a little toy program that is intentionally small, so the ideas are clearly expressed without a lot of extraneous noise.
The program is expecting to get two pieces of data from the user: name and age. It is going to create a local variable to store this information and then print a nice message to the terminal.
The program exposes the following:
- structs
  - person
- functions
  - make_person
  - destroy_person
  - error
  - reverse
  - say_hello
  - slen
Yes, I know that there is a strlen function declared in string.h.
program.c
#include <stdlib.h>
#include <stdio.h>
#include <time.h>

struct person {
    char *name;
    int age;
    int token;
};

int generate_token() {
    srand(time(NULL));
    return rand();
}

struct person* make_person(char *name, int age) {
    struct person *p = malloc(sizeof(struct person));
    p->name = name;
    p->age = age;
    p->token = generate_token();
    return p;
}

void error(char *msg) {
    fprintf(stderr, "%s\n", msg);
    exit(-1);
}

void destroy_person(struct person *p) {
    if (p != NULL)
        free(p);
}

int slen(struct person *p) {
    if (p == NULL)
        error("slen()");

    char *s = p->name;
    int i = 0;
    while (s[i] != '\0')
        ++i;
    return i;
}

int reverse(struct person *p) {
    if (p == NULL)
        error("reverse()");

    char *s = p->name;
    int swap, i, j;
    int l = slen(p);
    for (i = 0, j = l - 1; i < j; ++i, --j) {
        swap = s[i];
        s[i] = s[j];
        s[j] = swap;
    }
    return l;
}

void say_hello(struct person *p) {
    if (p == NULL)
        error("say_hello()");

    printf("Hi %s, you are %d years of age! Your token is %d.\n", p->name, p->age, p->token);
}

int main(int argc, char **argv) {
    if (argc != 3)
        error("Not enough arguments.");

    struct person *p = make_person(argv[1], atoi(argv[2]));
    say_hello(p);
    reverse(p);
    say_hello(p);
    destroy_person(p);
}
$ clang -o program program.c
$ ./program
Not enough arguments.
$ ./program "Fred Moseley" 42
Hi Fred Moseley, you are 42 years of age! Your token is 1595752298.
Hi yelesoM derF, you are 42 years of age! Your token is 1595752298.
Exciting stuff.
The problem with this implementation is that there is no data abstraction. This means that the members of the person struct can be read and modified directly by user code.
For instance, the user could define a main function like this:
int main(int argc, char **argv) {
    if (argc != 3)
        error("Not enough arguments.");

    struct person *p = make_person(argv[1], atoi(argv[2]));
    p->token = 55555555;
    say_hello(p);
    destroy_person(p);
}
$ ./program "Kilgore Trout" 42
Hi Kilgore Trout, you are 42 years of age! Your token is 55555555.
Here the person type is exposed, so the developer can supply their own token, which may or may not be a problem.
Admittedly, this may not be a big deal, but it’s good practice to hide these kinds of implementation details behind an opaque data type. After all, we want users to go through the code paths (the API) that we’ve established, not least because it allows us to change the implementation behind the interface without any disruption to the user (and without having to recompile the program).
Let’s now look at how we can start improving the program.
Breaking It Apart
The Interface
The first change we’ll make is to create a header file that will define the interface. We’ll move the person struct into it, as well as create function prototypes for all of the publicly exposed APIs:
person.h
#ifndef PERSON_H
#define PERSON_H
struct person {
    char *name;
    int age;
    int token;
};
struct person* make_person(char *, int);
void destroy_person(struct person *);
int generate_token(void);
void error(char *);
int reverse(struct person *);
void say_hello(struct person *);
int slen(struct person *);
#endif
The Library
Secondly, we’ll move the implementation into its own file:
person.c
#include <stdlib.h>
#include <stdio.h>
#include <time.h>
#include "person.h"

int generate_token(void) {
    srand(time(NULL));
    return rand();
}

struct person* make_person(char *name, int age) {
    struct person *p = malloc(sizeof(struct person));
    p->name = name;
    p->age = age;
    p->token = generate_token();
    return p;
}

void destroy_person(struct person *p) {
    if (p != NULL)
        free(p);
}

void error(char *msg) {
    fprintf(stderr, "%s\n", msg);
    exit(-1);
}

void say_hello(struct person *p) {
    if (p == NULL)
        error("say_hello()");

    printf("Hi %s, you are %d years of age! Your token is %d.\n", p->name, p->age, p->token);
}

int slen(struct person *p) {
    if (p == NULL)
        error("slen()");

    char *s = p->name;
    int i = 0;
    while (s[i] != '\0')
        ++i;
    return i;
}

int reverse(struct person *p) {
    if (p == NULL)
        error("reverse()");

    char *s = p->name;
    int swap, i, j;
    int l = slen(p);
    for (i = 0, j = l - 1; i < j; ++i, --j) {
        swap = s[i];
        s[i] = s[j];
        s[j] = swap;
    }
    return l;
}
The User Program
Lastly, remove everything from the main program except for the main function entry point:
main.c
#include <stdlib.h>
#include <stdio.h>
#include "person.h"

int main(int argc, char **argv) {
    if (argc != 3)
        error("Not enough arguments.");

    struct person *p = make_person(argv[1], atoi(argv[2]));
    say_hello(p);
    reverse(p);
    say_hello(p);
    destroy_person(p);
}
Looking much better:
$ tree
.
├── main.c
├── person.c
└── person.h
0 directories, 3 files
Low-hanging Fruit
Remove Private Functions from the Interface
In person.h, remove the generate_token and slen function prototypes. The complete header file then looks like the following:
#ifndef PERSON_H
#define PERSON_H
struct person {
    char *name;
    int age;
    int token;
};
struct person* make_person(char *, int);
void destroy_person(struct person *);
void error(char *);
int reverse(struct person *);
void say_hello(struct person *);
#endif
Then, in person.c, add the static keyword to the slen function so that it is only visible within that library file:
static int slen(struct person *p) {
    if (p == NULL)
        error("slen()");

    char *s = p->name;
    int i = 0;
    while (s[i] != '\0')
        ++i;
    return i;
}
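Since the generate_token prototype was removed from the header as well, it seems reasonable to give that function internal linkage too. The following is a sketch under assumptions rather than the article's code: it additionally seeds rand only once, so that two calls within the same second do not return the same value, and fill_tokens is a hypothetical public caller added only so the snippet compiles on its own.

```c
#include <assert.h>
#include <stdlib.h>
#include <time.h>

/* static gives the function internal linkage: other .c files cannot
 * call it, even if they declare a matching prototype themselves. */
static int generate_token(void) {
    static int seeded = 0;          /* assumption: seed once per process */
    if (!seeded) {
        srand((unsigned)time(NULL));
        seeded = 1;
    }
    return rand();
}

/* Hypothetical public function that uses the private helper. */
void fill_tokens(int *out, size_t n) {
    assert(out != NULL);
    for (size_t i = 0; i < n; ++i)
        out[i] = generate_token();
}
```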
Recompile and run:
$ make
clang -g -Wall -c person.c -o person.o
clang -g -Wall main.c person.o -o person
$ ./person "Kilgore Trout" 93
Hi Kilgore Trout, you are 93 years of age!
Hi tuorT erogliK, you are 93 years of age!
Creating a typedef
Let’s give our person struct a typedef, so we can refer to it simply as person instead of always prefacing it with the struct keyword.
So:
struct person {
    char *name;
    int age;
    int token;
};
Becomes:
typedef struct {
    char *name;
    int age;
    int token;
} person;
This allows us to remove the struct keyword everywhere it prefaces person. For instance, the header file will now look like this:
#ifndef PERSON_H
#define PERSON_H
typedef struct {
    char *name;
    int age;
    int token;
} person;
person* make_person(char *, int);
void destroy_person(person *);
void error(char *);
int reverse(person *);
void say_hello(person *);
#endif
In other words, remove the struct keyword before every instance of person, so that the affected lines read:
person *p = make_person(argv[1], atoi(argv[2]));
...
person* make_person(char *name, int age) {
    person *p = malloc(sizeof(person));
...
int reverse(person *p) {
...
Etc.
To save space, I won’t print the other two files, but make sure that you also change main.c and person.c.
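One detail worth a quick sketch: a typedef of an anonymous struct gives the type only one name, whereas a struct with a tag can be referred to by either the tag or the typedef. The tag person_tag and the function same_age below are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* A tagged struct with a typedef: "struct person_tag" and "person"
 * are two spellings of the same type. */
typedef struct person_tag {
    char *name;
    int age;
    int token;
} person;

/* Either spelling is accepted for the parameters. */
int same_age(const person *a, const struct person_tag *b) {
    assert(a != NULL && b != NULL);
    return a->age == b->age;
}
```

Having a tag matters once a type needs to be declared in one place and defined in another, because the tag is what ties the two together.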
Creating the Opaque Data Type
Now that the program has been restructured and improved, creating the opaque data type takes only a couple of simple extra steps.
The first is to move the person struct definition into the person.c library file and give it a tag p:
person.c
#include <stdlib.h>
#include <stdio.h>
#include <time.h>
#include "person.h"
typedef struct p {
    char *name;
    int age;
    int token;
} person;

int generate_token(void) {
    srand(time(NULL));
    return rand();
}
...
Next, in person.h, remove the body of the struct and give it the same tag p that we just used in the library file. This forward declaration is enough for the function prototypes to compile, since they only deal in pointers to person. The tag p could have been anything; it’s not going to affect the functionality of the program.
person.h
#ifndef PERSON_H
#define PERSON_H
typedef struct p person;
person* make_person(char *, int);
void destroy_person(person *);
void error(char *);
int reverse(person *);
void say_hello(person *);
#endif
So, it’s at this point that person can now be considered an opaque data type. It’s also what’s known as an incomplete type, as the compiler knows it’s a type and that it’s a struct, but it can’t know its definition, as there is no struct body.
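To make concrete what an incomplete type does and does not permit, here is a single-file sketch. In the real layout the forward declaration would live in person.h and the struct body in person.c; the functions person_new and person_age are hypothetical and the struct is cut down to one member to keep the example short.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct p person;            /* incomplete: size and members unknown */

/* Allowed: declaring functions that take or return pointers to it. */
person *person_new(int age);
int person_age(const person *who);

/* Not allowed at this point (each would be a compile error):
 *   person local;       the size is unknown
 *   sizeof(person)      the size is unknown
 *   who->age            the members are unknown
 */

struct p {                          /* the type is complete from here on */
    int age;
};

person *person_new(int age) {
    person *who = malloc(sizeof *who);
    if (who != NULL)
        who->age = age;
    return who;
}

int person_age(const person *who) {
    assert(who != NULL);
    return who->age;
}
```

Code that only ever sees the first declaration can store and pass person pointers around, but everything else has to go through functions compiled where the struct body is visible.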
Recompile and run:
$ make
clang -g -Wall -c person.c -o person.o
clang -g -Wall main.c person.o -o person
$ ./person "Kilgore Trout" 93
Hi Kilgore Trout, you are 93 years of age!
Hi tuorT erogliK, you are 93 years of age!
Interestingly, if a developer tries futzing with the internals of the person type as they did before, they’ll get a giant error when compiling:
int main(int argc, char **argv) {
    if (argc != 3)
        error("Not enough arguments.");

    person *p = make_person(argv[1], atoi(argv[2]));
    p->token = 55555555;
    say_hello(p);
    destroy_person(p);
}
$ make
clang -g -Wall -c person.c -o person.o
clang -g -Wall main.c person.o -o person
main.c:10:6: error: incomplete definition of type 'struct p'
    p->token = 55555555;
    ~^
./person.h:4:16: note: forward declaration of 'struct p'
typedef struct p person;
               ^
1 error generated.
make: *** [Makefile:16: person] Error 1
The incomplete type error occurs because main.c only sees the forward declaration: the compiler knows that struct p exists, but not its size or its members, so it cannot resolve token. This forces the user of your library to go through your carefully crafted API.
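One benefit of forcing everything through the API is that the library can validate changes to its own data. The following single-file sketch assumes a hypothetical setter and getter, person_set_age and person_get_age, neither of which is part of the program above; the typedef stands in for person.h and the struct body for person.c.

```c
#include <assert.h>
#include <stddef.h>

typedef struct p person;    /* what person.h would expose */

struct p {                  /* what person.c would keep private */
    char *name;
    int age;
    int token;
};

/* Hypothetical setter: rejects values the library considers invalid.
 * Returns 0 on success, -1 if the age was refused. */
int person_set_age(person *who, int age) {
    assert(who != NULL);
    if (age < 0)
        return -1;
    who->age = age;
    return 0;
}

/* Hypothetical getter: read access without exposing the layout. */
int person_get_age(const person *who) {
    assert(who != NULL);
    return who->age;
}
```

With direct member access, nothing would stop user code from storing a negative age; with the opaque type, the check in the setter is the only way in.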
Weeeeeeeeeeeeeeeeeeeeeeeeeeee
Summary
This is an important summary and should not be skipped.