Data Hiding in C

March 2nd, 2008

Object-oriented programming languages are described as supporting encapsulation, polymorphism, and data hiding. They provide powerful features that allow software components to be designed and implemented for change.

Observing the mechanisms in an object-oriented programming language can potentially lead to a parallel implementation in a procedural language. This allows some object-oriented design knowledge that has been refined through study and experience to be implemented in procedural languages such as C.

A Circle type is a typical classroom example of a user-defined data type. Computing the area and the circumference are common operations performed on instances of a Circle type. In C++, a Circle type may be defined as follows:

class Circle
{
  double radius;

  public:
    Circle()
    {
      // not using init list here to make parallelism
      // with example C code more easily seen
      this->radius = 0;
    }

    void setRadius( double r )
    {
       this->radius = r;
    }

    double getRadius()
    {
       return this->radius;
    }

    double getCircumference()
    {
       return 2 * PI * this->radius;
    }

    double getArea()
    {
       return PI * pow( this->radius, 2 );
    }
};

In object-oriented languages, member functions (functions that are called to operate on objects) tend to receive a reference or pointer, implicit or explicit, to the object on which they operate. A struct, together with functions that accept a pointer to an instance of that struct, provides similar behavior in C. C code that mimics the example C++ code is presented here:

#include <math.h>   /* pow */

/* PI is not defined by standard C; it is defined here so the code compiles. */
#define PI 3.14159265358979323846

typedef struct
{
   double radius;
} Circle;

void circle_construct( Circle *c )
{
  c->radius = 0;
}

void circle_setRadius( Circle *c, double r )
{
  c->radius = r;
}

double circle_getRadius( Circle *c )
{
  return c->radius;
}

double circle_getCircumference( Circle *c )
{
  return 2 * PI * c->radius;
}

double circle_getArea( Circle *c )
{
  return PI * pow( c->radius, 2 );
}

An association between the functions and the data, which is accomplished through encapsulation in the C++ code, is made in the C code through a coding convention. The convention here directs the form of the function signature to be as follows:

returnType typeName_operationName( params )

Data hiding is accomplished by another coding convention. Instead of operating directly on a Circle object’s fields, dependent code shall only call functions that manipulate the object on its behalf.

Code that is dependent on Circle objects as implemented in C is presented below:

Circle c;

circle_construct( &c );
circle_setRadius( &c, 3 );
printf( "%.2f\n", circle_getRadius( &c ) ); /* "3.00" */
printf( "%.2f\n", circle_getCircumference( &c ) ); /* "18.85" */
printf( "%.2f\n", circle_getArea( &c ) ); /* "28.27" */

circle_setRadius( &c, 4 );
printf( "%.2f\n", circle_getRadius( &c ) ); /* "4.00" */
printf( "%.2f\n", circle_getCircumference( &c ) ); /* "25.13" */
printf( "%.2f\n", circle_getArea( &c ) ); /* "50.27" */

The above code is more insulated from change than code that accesses the Circle fields directly. For example, a field for the circle’s color and functions that operate on that field can be introduced, and the above code will most likely need no modification while its behavior remains unchanged. As another example, ignoring the possibility of floating-point approximation errors, the internal representation of the Circle type can be overhauled completely by replacing the radius field with a field containing the circumference of the circle and modifying the associated functions accordingly, and the above code would still be unlikely to need changes. The ability to change the representation of a data type without modifying dependent code is evidence of this technique’s effectiveness in making code resilient to change.
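As a rough sketch of the second example, the Circle type might store the circumference instead of the radius while keeping the same function names and signatures. The value of PI and the choice to compute the area from the derived radius are assumptions made for illustration, not part of the original code.

```c
/* PI is not defined by standard C; this value is an assumption for the sketch. */
#define PI 3.14159265358979323846

/* Alternative representation: the circumference is stored, not the radius. */
typedef struct
{
   double circumference;
} Circle;

void circle_construct( Circle *c )
{
  c->circumference = 0;
}

void circle_setRadius( Circle *c, double r )
{
  /* Convert the radius to the stored representation. */
  c->circumference = 2 * PI * r;
}

double circle_getRadius( Circle *c )
{
  /* Recover the radius from the stored circumference. */
  return c->circumference / ( 2 * PI );
}

double circle_getCircumference( Circle *c )
{
  return c->circumference;
}

double circle_getArea( Circle *c )
{
  /* area = PI * r * r, with r derived from the circumference */
  double r = circle_getRadius( c );
  return PI * r * r;
}
```

Dependent code that calls only these functions compiles and runs against either representation, apart from possible differences in the last floating-point digits.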

Parallels to the idioms and adages of languages like C++ can be implemented in C. By mimicking the natural mechanisms of object-oriented languages, the benefits of those features can be introduced in procedural languages where such mechanisms are neither directly supported nor idiomatic.

Weekend Tech Supporter

February 12th, 2008

I spent the weekend hunting down the reason a computer was running sluggishly. I noticed that the system would wait at the Microsoft Windows XP splash screen for a couple of minutes while starting up. I activated the task manager just as the desktop was being displayed. According to the information that it provided, a single process was responsible for 99% of the processor load.

I deduced that the problem was spyware. I believed it would require some modification to the system. Because the system’s owner expressed concerns about data retention, I decided to perform a raw backup before I took corrective action. I rebooted the computer into a LiveCD version of CentOS 4.4, mounted an SMB share, and used the following command from linuxquestions.org:

dd if=/dev/hda | gzip -c \
| split -b 2000m - /mnt/smb/backup.img.gz.

A write back to the hard drive is performed with the following:

cat /mnt/smb/backup.img.gz.* | gzip -dc | dd of=/dev/hda

I thought that in order to improve performance, I would need to address the offending process. I uninstalled a program called “Spyware Terminator,” the probable owner of the process that was eating up processor resources. I knew that AVG can at times consume 99% of the processing power, as Spyware Terminator did, but I did not remove AVG because of simple brand recognition. Perhaps due to my lack of IT work experience, I had never heard of Spyware Terminator, so it was one of my first targets for removal. It may be an excellent program for all I know.

The system seemed to perform better than it did before Spyware Terminator was removed, though it continued to stall on the Microsoft Windows splash screen. A friend suggested that I run AVG in safe mode. I did not think about doing that, because of my past failed attempts to install AVG in safe mode. I observed the computer stall as the listing of files that were being loaded during boot into safe mode was displayed. It seemed to stall on the same file, hup.sys, so I did a lookup on Google and found something about hanging on hup.sys and possible hardware faults.

While booting up the CentOS LiveCD earlier, I noticed messages regarding an inability to read hard drive sectors. The requests were timing out. I did not think much of it at the time. After all, GNU/Linux is a bit more verbose about problems than Microsoft Windows. I checked the event logger in Microsoft Windows, and it indicated a problem with the hard drive as well. I downloaded a hard drive test utility from the system’s hard drive manufacturer. The short self-test passed, but the long self-test failed.

The cause of the computer’s degraded performance was isolated to a failing hard drive. Microsoft Windows’s drive manager may report the drive as healthy, but a number of tools indicated that the problem was with the hard drive. I swapped out the hard drive, loaded the CentOS LiveCD, used dd to write the backup image of the original hard drive to the new hard drive, and rebooted into Windows.

Success was very apparent upon boot. The splash screen was displayed for seven seconds as opposed to the minute or two that it was displayed when the computer was first brought to me. With a sense of accomplishment, I polished up my work by applying updates, setting up auto-updates, and scheduling virus scans. I used ScanDisk and the disk defragmentation tool, and it was ready for release to the computer’s owner.

I believed that I had eliminated the problem by removing Spyware Terminator, but I was not satisfied with the slight increase in performance. If it were my computer, I would investigate the problem further, and I did so for this computer. Taking ownership of the technical tasks that I perform has helped me bring about quality in my work and satisfaction for the people who use the product of my work. As I take on jobs, I think about how I would want other people to do things for me. I also think about how I would want to do things for myself and apply those thoughts to tasks that are done for others. Not being satisfied with marginally acceptable results led to the diagnosis of a problem that allowed for a significant increase in performance when resolved.

Functions, Parameters, and Global Variables in C

February 4th, 2008

When I define interfaces to the functions that I implement, I try to be explicit about the variables that the functions will examine and modify. From time to time, a global variable is necessary, and I typically employ intermediary functions for accessing such a variable. Avoiding the direct use of global variables within functions is a sound software implementation principle.

An ideal function under the functional programming paradigm operates only on its parameters and returns a single value. Functions that possess these attributes along with good function names are coherent and cohesive. The function is dependent entirely on its parameters. There are no implicit dependencies. As opposed to a function that operates on a global variable, for example, a function that operates only on its parameters can accept different parameters from its caller.
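The difference can be sketched with two versions of the same computation. The names g_taxRate, price_withGlobalTax, and price_withTax are hypothetical and exist only for this illustration.

```c
/* Hypothetical global, present only to illustrate an implicit dependency. */
static double g_taxRate = 0.08;

/* Implicit dependency: the result changes with g_taxRate, which the
   signature does not mention, and the caller cannot supply another rate. */
double price_withGlobalTax( double price )
{
  return price * ( 1.0 + g_taxRate );
}

/* Explicit dependency: the result is determined entirely by the parameters. */
double price_withTax( double price, double taxRate )
{
  return price * ( 1.0 + taxRate );
}
```

A caller can pass any rate to price_withTax, and the function can be tested without arranging global state beforehand.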

There are times when a global variable is required, and introducing accessor functions to provide access to these variables is highly recommended. One benefit that accessor functions provide is the possibility of implementing state validation. In general, using accessor functions allows for preprocessing before access to the global variable is provided. Using an accessor function, even if such a function simply returns the global variable, introduces another level of indirection. It effectively provides an interface to the global variable.
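A minimal sketch of such accessor functions follows. The variable g_volume, the functions volume_get and volume_set, and the valid range of 0 to 100 are all assumptions chosen for illustration. Declaring the variable static at file scope is one way, in C, to keep code in other source files from touching it directly.

```c
/* Hypothetical global; static limits its visibility to this source file. */
static int g_volume = 0;

/* Read accessor: currently a plain return, but it is the single place
   where preprocessing could be added later. */
int volume_get( void )
{
  return g_volume;
}

/* Write accessor with state validation.
   Returns 1 if the value was accepted, 0 if it was rejected. */
int volume_set( int v )
{
  if ( v < 0 || v > 100 )
  {
    return 0; /* out of range: leave the stored value unchanged */
  }
  g_volume = v;
  return 1;
}
```

With this in place, a call such as volume_set( 101 ) is rejected and the previously stored value is preserved.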

Like the good practice of keeping member fields of objects in C++ non-public and preventing dependents of those objects from acting on those objects’ data fields, a level of indirection should be introduced between a global variable and the code that depends on the global variable.

“Rails is 100% magic with 0% design”

January 22nd, 2008

After delving into Ruby and Rails, I have felt the following but never got around to writing about it. I have since abandoned the adoption of Rails, but I recently stumbled upon a newsgroup post that expresses what I think about the framework (but with examples that I would not have produced without more experience):

Newsgroups: comp.lang.lisp
From: Maciej Katafiasz
Date: Mon, 21 Jan 2008 12:47:45 +0000 (UTC)
Local: Mon, Jan 21 2008 7:47 am
Subject: Re: OT: Rails is shitty
[was top down programming in a bottom up language]
Den Mon, 21 Jan 2008 06:08:27 +0000 skrev Sohail Somani:

> On Mon, 21 Jan 2008 04:10:19 +0000,
> Maciej Katafiasz wrote:
>> Let’s not. Rails is a really shitty way of doing things.

> I don’t really care either way, but Rails has gotten
> Ruby usage. This might be because it is quite Mickey
> Mouse… But why is Rails a really shitty way of doing
> things?

First, I completely agree with what Slava said. Now to expand that a bit with my own thoughts:

Rails is 100% magic with 0% design. It sports all the great quality and consistency you’ve come to expect from PHP, except with loads more magic. There’s no overarching design or scheme of things, it’s just a bucket of tools with some glue poured in. This has a couple of important consequences:

– There’s no reasoning about Rails — more familiarity won’t give you better chances of figuring out something new because that’s what follows from the design, only because that’s how it usually ends up being implemented and because you have memorised more things, so your guesses are better. In essence, being better in Rails means being better at grepping the internet.

– There’s no thought given to the general problem to solve, it’s just improved PHP + “ActiveRecord, lol”. This means Rails doesn’t have solutions that are particularly good or scalable or make sense, only hacks that happened to solve someone’s specific problem at a time and were implementable in 5 minutes or less. Rails is heaps better than PHP, but it’s still only good for what PHP is good, and that’s not writing webapps. This permeates Rails all the way down: it’s got hacky modules that only solve particular problems, those modules have hacky functions that only solve particular problems and those functions only do hacky things that solved someone’s particular problem.

Some examples:
* AR’s find family of functions. It’s a horrible hack, for instance, they support the :group clause, which has semantics (“return a collection of groups of things”) incompatible with find’s base semantics (“return a collection of things”). Rails answer? It implicitly relies on MySQL’s retarded interpretation of SQL and the fact that given a table with two columns, id and colour, it will silently interpret “SELECT * FROM table GROUP BY colour” as “SELECT FIRST(id), colour FROM table GROUP BY colour”. End result? A valid combination of clauses in AR will generate incorrect SQL Postgres will (correctly) choke on.

* AR’s find again, it supports :join (which documentation hilariously describes as “rarely needed”), except that it doesn’t return real objects then, but make-believe fake readonly objects that will only have some attributes filled in (because they have no backing with physical rows), but will *still* have the base type and all the methods of the class you queried on! So if you go with that and try to call one of the methods that depend on unfilled attributes, you die horribly.

– Reading and, in general, understanding Rails is horribly difficult, since it’s no design and layers upon layers of magic. Very often you will find 5-7 layers of functions delegating the work deeper and deeper in, until you arrive to a completely undocumented internal function that in turn splits the work to three other, unrelated internal functions. Given that each of those 10 functions takes a hash called “options”, each layer altering it subtly, but none having the complete picture, and that 9 times out of 10 that hash is not documented on any level, figuring out what your choices are is pure hell. It’s made even more fun by the fact that different parts of Rails are accessible at various points of handling the request, and you can’t just jump in and poke things from the console, since it won’t have 99% of the things that only magically spring to life once a request is live.

– As a rule, there’s no quality assurance, and the average quality is very low. Because it’s such a patchwork, it will only have parts of things implemented, some other (not necessarily overlapping) parts documented, and a whole load of missing things just dangling and waiting for you to run into them. For example, the docs for text_field_with_auto_complete discuss how you can use options to make it complete only parts of entries (which was exactly what I needed, to show “foo (123)” in the popup, but only insert “foo” in the text field). What it doesn’t tell you, however, is that none of the stock completion-generating methods will prepare such splittable data for you, and you’re supposed to copy their code and hack it on your own instead. It took me half a day to finally understand that it wasn’t me being stupid and unable to find it, but it was actually Rails documenting interfaces that _weren’t there_.

– And to stress the first point again, Rails never concerns itself with the big-picture problem of “writing webapps”. It only thinks as big as “outputting HTML strings” and “querying the DB for a list of things”. This means the important, actually hard stuff like handling the stateless nature of HTTP, or sanitising and escaping the user input is just not addressed at all, and you only learn about them when one day you discover 84 possible XSS injection points (actual number from a Rails app I’m somewhat familiar with).

I’m a huge fan of innovative frameworks like Weblocks, because they actually stopped and thought about the whole thing for a moment, and try things that will address the problem as a whole. And even if some specific things will turn out to be harder to do than by just churning out raw HTML, and there will be abstraction leaks, it’s still better because they try out new things to find a solution that has a chance to work. Rails doesn’t, because its entire culture seems to be fundamentally incapable of thinking more than 5 minutes into the future.

Cheers,
Maciej

Creating a VNC Connection to Existing X Session

January 3rd, 2008

Either it has been a really slow news week, or I’ve gotten faster at processing news from multiple news aggregators. I’ve been getting to the office early in the morning, and the content of /. and other popular news sites just did not give me my fill of reading for the morning. I began wondering what was going on at Efnet #C++. I had an IRC connection within a continually open X session at home, and I didn’t want to fire up another IRC client, so trying to get VNC to work with an X session was something that was worth attempting.

I did the following for a machine running CentOS 4.4 LiveCD to get it going:

1. Download the x11vnc RPM. I downloaded
   x11vnc-0.9.3-1.el4.rf.i386.rpm.
2. # rpm -ivh x11vnc-0.9.3-1.el4.rf.i386.rpm
3. $ x11vnc -storepasswd
4. $ x11vnc -usepw
Reference: x11vnc HOWTO