Article 4. The Ultimate Data Type

Bruce McKinney

April 18, 1996

Microsoft® Visual Basic® changed the programming world when it introduced the Variant type. The idea that all types can be contained within a single type is a trade of performance, size, and implementation simplicity for user simplicity. A single type that automatically converts between strings, integers, real numbers, and objects is a compelling feature for macro languages. In fact, variants will be the only type in Microsoft's new Internet macro language, Visual Basic, Scripting Edition (VBScript). Variants are less compelling for full-featured languages that aspire to high performance. Still, they offer many features that you can't easily get through intrinsic types--named arguments, optional arguments, parameter arrays, and collections.

Variants have been transformed from a proprietary feature of Visual Basic to a standard OLE type. Delphi® 2.0 from Borland® has an intrinsic Variant type, and the Microsoft Foundation Class Library (MFC) has a COleVariant type that encapsulates some of the Variant functionality.

What Is a VARIANT?

We're going to be talking about three different things in this article: variants, VARIANTs, and Variants.

Lowercase variant is the name of the concept of cramming different subtypes into one supertype.
To a C programmer, a VARIANT is something much more specific: It's a structure containing a union. You have to understand how it works to use it.
To a Visual Basic programmer, a Variant is a type that can contain another type (you don't have to understand how it works). To a C++ programmer, a Variant is like everything else in C++--it's whatever you want it to be. We want to make it into something as close to an intrinsic Variant type as C++ allows, but to make that happen, we'll have to take an inside look at the VARIANT structure.

VARIANT Insides

There's no getting around it. We'll have to look at the VARIANT structure before we can talk about what to do with it.

struct tagVARIANT {
    VARTYPE vt;
    WORD wReserved1;
    WORD wReserved2;
    WORD wReserved3;
    union {
    //  C++ Type      Union Name   Type Tag                Basic Type
    //  --------      ----------   --------                ----------
        long          lVal;        // VT_I4                ByVal Long
        unsigned char bVal;        // VT_UI1               ByVal Byte
        short         iVal;        // VT_I2                ByVal Integer
        float         fltVal;      // VT_R4                ByVal Single
        double        dblVal;      // VT_R8                ByVal Double
        VARIANT_BOOL  boolVal;     // VT_BOOL              ByVal Boolean
        SCODE         scode;       // VT_ERROR
        CY            cyVal;       // VT_CY                ByVal Currency 
        DATE          date;        // VT_DATE              ByVal Date
        BSTR          bstrVal;     // VT_BSTR              ByVal String
        IUnknown      *punkVal;    // VT_UNKNOWN 
        IDispatch     *pdispVal;   // VT_DISPATCH          ByVal Object
        SAFEARRAY     *parray;     // VT_ARRAY|*           ByVal array
        // A bunch of other types that don't matter here...
        VARIANT       *pvarVal;    // VT_BYREF|VT_VARIANT  ByRef Variant
        void          * byref;     // Generic ByRef        
    };
};

The idea is that the union part can contain any one of the supported types. The VARTYPE vt field should contain a type tag (and possibly a tag modifier) indicating what the union contains. Technically, that's all you really need to know. Every time you put a variable into a VARIANT, you also insert its type tag. Every time you want to extract a variable, you first check its type so that you know what to do with it. Actually, OLE provides a group of system functions that help manage these tasks. We'll get to them soon.
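Here is a minimal sketch of that store-the-tag, check-the-tag discipline. It is portable C++ rather than real OLE code: the Tagged structure and the MakeLong, MakeDouble, and ToDouble helpers are hypothetical names invented for illustration, and only the tag values 3 and 5 (VT_I4 and VT_R8) come from the OLE headers.

```cpp
#include <cstdint>
#include <stdexcept>

// Hypothetical cut-down stand-in for VARIANT: one tag, two union members.
struct Tagged {
    std::uint16_t vt;              // type tag, always written with the value
    union {
        std::int32_t lVal;         // valid when vt == kTagI4
        double       dblVal;       // valid when vt == kTagR8
    };
};

const std::uint16_t kTagI4 = 3;    // same value as VT_I4
const std::uint16_t kTagR8 = 5;    // same value as VT_R8

// Storing a value always stores its tag as well.
inline Tagged MakeLong(std::int32_t v) { Tagged t; t.vt = kTagI4; t.lVal = v;   return t; }
inline Tagged MakeDouble(double v)     { Tagged t; t.vt = kTagR8; t.dblVal = v; return t; }

// Extracting a value checks the tag first and refuses anything it doesn't know.
inline double ToDouble(const Tagged& t) {
    switch (t.vt) {
    case kTagI4: return static_cast<double>(t.lVal);
    case kTagR8: return t.dblVal;
    default:     throw std::invalid_argument("unsupported type tag");
    }
}
```

The system functions described later do the same kind of dispatch on vt, just for many more types.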

How big is a VARIANT? The first field is a VARTYPE, which is actually a typedef for an unsigned short. The next three fields are WORD, another typedef for an unsigned short. So VARIANT is 8 bytes plus the size of the union. A union contains enough storage space for its largest member, which in this case is 8 bytes (for a double, currency, or date). So a VARIANT is 16 bytes. That's a healthy chunk of storage if you have an array of variants containing 1 byte each. But of course, you won't be doing that very often.
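If you'd rather have the compiler do that arithmetic, a sketch like the following will check it. MiniVariant is a hypothetical stand-in, not the structure from the OLE headers, and I've left the pointer members out of the union so that the result doesn't depend on the pointer size of the machine you compile on.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in that mirrors the header fields and a few union members.
struct MiniVariant {
    std::uint16_t vt;           // VARTYPE is an unsigned short
    std::uint16_t wReserved1;   // WORD is an unsigned short too
    std::uint16_t wReserved2;
    std::uint16_t wReserved3;
    union {
        unsigned char bVal;     // 1 byte
        std::int32_t  lVal;     // 4 bytes
        double        dblVal;   // 8 bytes, the largest member here
    };
};

// Four 2-byte fields come first, so the union starts at offset 8.
static_assert(offsetof(MiniVariant, dblVal) == 8, "header should be 8 bytes");
// A union is as big as its largest member, giving 8 + 8 bytes in total.
static_assert(sizeof(MiniVariant) == 16, "expected 8-byte header plus 8-byte union");
```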

The most common size of data is 4 bytes (for a long, an error, or any kind of pointer). In this case, 10 of those 16 bytes are just padding: 2 bytes hold the type tag, 4 hold the data, and the rest go unused. It's rumored that VARIANTs may get a new internal data type that will use up some of those reserved fields in some future version of OLE and Visual Basic.

If you look in OLE documentation, you'll sometimes see parameters that look like VARIANTs, but actually have the type VARIANTARG. In fact, VARIANTARG is just a typedef for VARIANT, but just as a BSTR is more than its typedef, a VARIANTARG is more than a VARIANT. To be more exact, a VARIANTARG includes all of a VARIANT, but a VARIANT doesn't include all of a VARIANTARG. Or, to be less confusing, a VARIANTARG is a VARIANT used as an IDispatch argument and it can contain anything in the VARIANT's union, including by-reference arguments. A VARIANT in other contexts isn't supposed to use all those union members marked VT_BYREF in the second part of the union.

The IDispatch interface is a fine thing, but we won't get around to it in this series of articles. You can forget about VARIANTARGs and all the by-reference fields of the VARIANT union. We'll only be talking about the standard OLE types introduced in Article 2, and some of them will get short shrift.

Note: The original name for the boolVal field of the VARIANT union was bool, but bool is a future C++ keyword and Microsoft Visual C++® 4.1 gives a warning message if you use it. Presumably, a future version will actually implement the new bool type. In the meantime, there is a slight discrepancy: Readers on the Visual C++ subscription program who have version 4.1 will see boolVal in their VARIANTs, but those who have version 4.0 or earlier will see bool. I solved this for Visual C++ by changing my code to use boolVal and providing a macro that defines boolVal to bool for earlier versions. You may have to hack further if you have a different compiler.

The VARIANT System Functions

I find the documentation for the VARIANT system functions to be sparse, and somewhat confusing. So here's my version, with a few editorial comments on the side. My function declarations are cleaned up a little from the macro-ized portable versions in the include file.

But before we get into the individual functions, let's talk for a minute about the HRESULT type. It's actually a typedef to long, but not just any old long. An HRESULT is a bit field in which some bits specify status or severity and other bits contain an error number. For now, zero means no error (it is usually represented by the macro NO_ERROR or S_OK) and anything else means some kind of failure. Technically an HRESULT uses positive numbers to indicate different kinds of success and negative numbers to indicate different kinds of failure, but the VARIANT system functions return only one kind of success--represented by zero.
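To make the bit-field idea concrete, here's a small portable sketch that pulls an HRESULT apart. The HresultParts structure and the SplitHresult and IsFailure helpers are hypothetical names of my own; the real headers provide macros for this job. The sample value in the usage note, 0x80020009, is DISP_E_EXCEPTION.

```cpp
#include <cstdint>

// Hypothetical helper types; only the bit positions reflect the HRESULT layout.
struct HresultParts {
    std::uint32_t severity;   // top bit: 0 = success, 1 = failure
    std::uint32_t facility;   // field starting at bit 16
    std::uint32_t code;       // low 16 bits: the error number
};

inline HresultParts SplitHresult(std::uint32_t hr) {
    HresultParts p;
    p.severity = (hr >> 31) & 0x1u;
    p.facility = (hr >> 16) & 0x1FFFu;
    p.code     = hr & 0xFFFFu;
    return p;
}

// A set top bit makes the value negative when viewed as a long,
// which is why "negative means failure."
inline bool IsFailure(std::uint32_t hr) {
    return (hr & 0x80000000u) != 0;
}
```

Passing 0x80020009 to SplitHresult yields severity 1, facility 2, and code 9, while zero is not a failure.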

void VariantInit(VARIANT * pv);

This function initializes a new VARIANT. The vt field of the VARIANT structure is set to VT_EMPTY. The OLE documentation states that the wReserved field is set to zero. If you step into VariantInit in a debugger, however, you'll see that it sets the vt field to zero and leaves the wReserved1, wReserved2, and wReserved3 fields untouched.

Example:

// Create a VARIANT and initialize it to a string.
VARIANT varSrc;
VariantInit(&varSrc); 
// Always store the type tag along with the data.
varSrc.vt = VT_BSTR;
varSrc.bstrVal = SysAllocString(L"The String To End All Strings");

HRESULT VariantClear(VARIANT * pv);

This function destroys a VARIANT. It does the same thing as VariantInit and more. If the variant contains allocated data (a BSTR, SAFEARRAY, or IDispatch pointer), it calls the proper functions to free that data. Technically you could call VariantInit instead if you knew your VARIANT didn't contain allocatable data, but this is bad practice and you'll rarely be in a situation where you know the type at destruction time.

Example:

// Destroy an existing VARIANT.
HRESULT hres = VariantClear(&varSrc);

HRESULT VariantCopy(VARIANT * pvDst, VARIANT * pvSrc);

This function copies one VARIANT to another. It frees any allocatable contents of the destination before the copy. It also allocates copies of any allocatable data in the source and copies the data to the other; the result is two identical VARIANTs. If the source contains a BSTR or SAFEARRAY, the destination will have a separate, identical copy. If the source contains an object (IDispatch or IUnknown pointer), the reference count will be incremented to indicate that both VARIANTs have access to the object. If this doesn't make sense, don't worry about it yet.

Example:

// Create a new VARIANT by copying from an existing variant.
VARIANT varDst;
VariantInit(&varDst);
hres = VariantCopy(&varDst, &varSrc);

HRESULT VariantCopyInd(VARIANT * pvDst, VARIANT * pvSrc);

This function copies one VARIANT to another, in the same manner as VariantCopy, but makes sure the destination contains a by-value copy of the data even if the source contained by-reference data. Because we won't be dealing with by-reference data, we won't need this function.

HRESULT VariantChangeType(VARIANT * pvDst, VARIANT * pvSrc, WORD wFlags, VARTYPE vt);

This function changes the type of a VARIANT without changing its value (if possible). To change a variable in place, make the destination the same as the source. To copy a variable while changing its type, make the source the VARIANT containing the desired value and the destination the VARIANT to receive the copy. Put the desired type into the vt parameter. The only flag you can pass to the wFlags parameter is VARIANT_NOVALUEPROP, but you should only do that for objects, and the reasons why you might want to are beyond the scope of this article. Just pass zero. Be sure to check the return value, because some type conversions don't make sense and are bound to fail.

Example:

// Store a double, remembering to set the type tag.
varSrc.vt = VT_R8;
varSrc.dblVal = 3.1416;
// Copy varSrc to varDst while changing its type to Long (value 3).
hres = VariantChangeType(&varDst, &varSrc, 0, VT_I4);
if (hres) throw hres;
// Change varSrc to a string (value "3.1416").
hres = VariantChangeType(&varSrc, &varSrc, 0, VT_BSTR);
if (hres) throw hres;

HRESULT VariantChangeTypeEx(VARIANT * pvDest, VARIANT * pvSrc, LCID lcid, WORD wFlags, VARTYPE vt);

This function (and the OLE documentation for it) is exactly the same as VariantChangeType except that it has an lcid parameter where you can pass a language ID. Apparently passing a different language ID might result in some sort of language translation for strings, dates, and objects, but you won't be able to figure out, from the OLE documentation or from this article, how such a change works. We won't be using this function.

And Lots More

OLE provides 81 additional functions for converting between the various OLE types. If you've seen one, you've seen them all, but as a bonus, I'll show two.

HRESULT VarI2FromI4(long lIn, short * psOut);
HRESULT VarI4FromI2(short sIn, long * plOut);

You pass the first function a long and it returns the converted short through a reference parameter. The second function will convert that short back to a long. This may look simple, but there are overflow problems that OLE wants to handle itself rather than leaving you to guess the approved way. Besides, some conversions--Boolean from BSTR or dispatch from date--are more difficult. The VariantChangeType function uses a lot of these functions internally, but you can use them directly if you want.
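To show the kind of overflow handling involved, here's a portable sketch of a range-checked long-to-short conversion. I2FromI4Sketch is a hypothetical stand-in written for this illustration, not OLE's implementation; the two constants carry the values of S_OK and DISP_E_OVERFLOW from the OLE headers.

```cpp
#include <cstdint>
#include <limits>

const std::uint32_t kOk       = 0x00000000u;   // value of S_OK
const std::uint32_t kOverflow = 0x8002000Au;   // value of DISP_E_OVERFLOW

// Hypothetical stand-in for a narrowing conversion: the result comes back
// through a pointer and the return value reports success or failure.
inline std::uint32_t I2FromI4Sketch(std::int32_t in, std::int16_t* out) {
    if (in < std::numeric_limits<std::int16_t>::min() ||
        in > std::numeric_limits<std::int16_t>::max()) {
        return kOverflow;                      // *out is left untouched
    }
    *out = static_cast<std::int16_t>(in);
    return kOk;
}
```

A plain C++ cast would silently wrap a value such as 70000; the checked version reports the problem instead, which is why it pays to look at the return value.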

Example:

short i2;
varSrc.vt = VT_I4;
varSrc.lVal = 1234;
// Extract a short from a VARIANT containing a long.
hres = VarI2FromI4(varSrc.lVal, &i2);
// Assign a short to a VARIANT, but make it long.
hres = VarI4FromI2(i2, &varDst.lVal);
varDst.vt = VT_I4;

The Variant Class

The Variant class attempts to hide VARIANT details in the same way that the String class hides BSTR details. In many ways, it does a much better job. Because VARIANT is a structure, you can simply inherit a Variant from it. Internally, a Variant has no private members of its own, just the members of VARIANT. But Variant has one thing that VARIANT doesn't have: member functions including constructors, a destructor, and lots of operator functions.

We'll look briefly at the implementation of Variant later. The important thing here is that Visual Basic can pass a VARIANT to your DLL, but you can receive it as a Variant. Let's take a look.

A Variant API Example

In my book Hardcore Visual Basic I provided a Basic-friendly wrapper function for the SearchPath application programming interface (API) function. My SearchDirs function expected three Long arguments and looked like this:

sFullName = SearchDirs(sEmpty, "vb.exe", sEmpty, iDir, iBase, iExt)

The function would return the full path in the return value and the indexes of the directory, base file, and extension through reference variables. The first and third arguments were for the path to search and the extension, but in most cases you would just pass an empty string (the sEmpty constant from my type library). You could use the returned indexes with the Mid$ function to extract the parts of the full path. This was nice, but you always had to declare and pass those index variables whether you wanted them or not, and you always had to pass empty variables for the path and extension.

The version of SearchDirs shown in this article takes optional Variants instead of Longs for the index variables, and optional Variants for the path and extension arguments. You can find the code for it in WIN32.CPP. The event procedure of the Win32 button in the sample program contains code that exercises SearchDirs. The GetFullPath function in the same location is similar: it expands a filename to a full path specification.

The original SearchDirs tried to maintain a parameter order similar to that of SearchPath. This version orders the arguments logically in order to make it easier to leave off ones that usually aren't needed. Any of the following Basic statements are OK:

sFullName = SearchDirs("vb", ".exe", ".", vBase, vExt, vDir)
sFullName = SearchDirs("vb.exe", , , vBase, vExt)
sFullName = SearchDirs("vb", ".exe", , vBase)
sFullName = SearchDirs("vb.exe")

Notice that the directory parameter has been moved to the end because it is the least likely to be used. The filename parameter has been moved to the start of the list because it is required and is often the only argument needed. You could argue about the handiest order for the path and extension parameters. I chose to make the extension the second parameter and the path the third parameter.

You can find this sample in the Win32.Cpp module. It is used by the event handler of the Win32 button in the Cpp4VB sample program.

Let's start by examining the ODL declarations:

[
entry("SearchDirs"),
usesgetlasterror,
helpstring("Searches sPath for sFile with sExt..."),
]
BSTR WINAPI SearchDirs([in] BSTR bsFile,
                       [in, optional] VARIANT vExt,
                       [in, optional] VARIANT vPath,
                       [in, out, optional] VARIANT * pvFilePart,
                       [in, out, optional] VARIANT * pvExtPart,
                       [in, out, optional] VARIANT * pvDirPart);

Notice that SearchDirs has two kinds of VARIANT parameters--input parameters that will pass in strings and output Variants that will receive integer indexes. The output Variants are pointers so that they can receive return values. Although the index parameters are only used for output, I chose to make them in/out parameters and overwrite any input. This makes it easier for clients to reuse the same variables for multiple calls.

Let's take a look at the C++ code that makes this possible. Actually, SearchDirs looks a lot like the GetTempFile function described in Article 3, except that it's a little more complicated and includes optional arguments. I'm going to ignore the String part (interesting though it may be) and concentrate on the Variant features:

BSTR DLLAPI SearchDirs(
    BSTR bsFileName,
    Variant vExt,
    Variant vPath,
    Variant * pvFilePart,
    Variant * pvExtPart,
    Variant * pvDirPart
    )
{
  try {
    LPTSTR ptchFilePart;
    Long ctch;
    String sFileName = bsFileName;
    if (sFileName.IsEmpty()) throw ERROR_INVALID_PARAMETER;

    // Handle missing or invalid extension or path.
    String sExt;        // Default initialize to empty
    if (!vExt.IsMissing() && vExt.Type() != VT_EMPTY) {
        if (vExt.Type() != VT_BSTR) throw ERROR_INVALID_PARAMETER;
        sExt = vExt;
    }
    String sPath;       // Default initialize to empty.
    if (!vPath.IsMissing() && vPath.Type() != VT_EMPTY) {
        if (vPath.Type() != VT_BSTR) throw ERROR_INVALID_PARAMETER;
        sPath = vPath;
    }
    
    // Get the file (treating empty strings as NULL pointers).
    String sRet(ctchTempMax);
    ctch = SearchPath(sPath.NullIfEmpty(), sFileName, 
                      sExt.NullIfEmpty(), ctchTempMax, 
                      Buffer(sRet), &ptchFilePart);
    ASSERT(ctch <= ctchTempMax);
    Long iDirPart = 0, iFilePart = 0, iExtPart = 0;
    if (ctch == 0) {
        sRet.Nullify();
        // Not finding a file returns zero, but isn't an error.
        Long err = (Long)GetLastError();
        if (err) throw err;
    } else {
        // Calculate the file part offsets.
        iFilePart = ptchFilePart - (LPCTSTR)sRet + 1;
        GetDirExt(sRet, &iDirPart, &iExtPart);
        // Resize must be after calculation because it may move ANSI buffer.
        sRet.Resize(ctch);
    }
    // Cram into variants regardless of missing optional arguments.
    *pvDirPart = iDirPart;
    *pvFilePart = iFilePart;
    *pvExtPart = iExtPart;
    return sRet;
  } catch(Long e) {
    ErrorHandler(e);
    return BNULL;
  }
}

Visual Basic passes in VARIANTs (as shown in the ODL declaration), but C++ receives Variants. The first two Variants in the parameter list are supposed to contain strings, but it's possible that the user might omit the parameter or pass an empty Variant (using Visual Basic's Empty keyword). If so, we'll just use a null String for the parameter (sExt or sPath). If the user passed a valid BSTR in the Variant, we'll assign it to our String variable. This works because the Variant type has an operator BSTR member that converts a Variant to a BSTR, and the String type has a constructor that takes a BSTR argument. There's a lot going on behind the scenes, as you can tell if you step through this code with a debugger. Finally, if someone passes the floating point number 3.1416 as the file extension, we'll throw the bozo out with an "Invalid Parameter" error.

There's not much to the output Variant code. You simply calculate the correct values in integers (such as iFilePart) and then cram them into the Variants for return. There's no reason to test whether optional arguments were actually passed. If an argument was omitted, Basic will simply create a temporary Variant with an error setting. You can fill this variable with whatever you like because it's going to be tossed anyway.
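The checks SearchDirs makes on its optional string arguments boil down to a three-way decision, sketched below in portable C++. This is an assumption-laden illustration: the article doesn't show how Variant::IsMissing is written, so IsMissingSketch simply assumes the usual Automation convention that an omitted argument arrives tagged VT_ERROR with the code DISP_E_PARAMNOTFOUND. MiniArg, ArgState, and ClassifyStringArg are hypothetical names.

```cpp
#include <cstdint>

const std::uint16_t kVtEmpty = 0;                   // value of VT_EMPTY
const std::uint16_t kVtBstr  = 8;                   // value of VT_BSTR
const std::uint16_t kVtError = 10;                  // value of VT_ERROR
const std::uint32_t kParamNotFound = 0x80020004u;   // value of DISP_E_PARAMNOTFOUND

// Hypothetical stand-in holding only the fields this check needs.
struct MiniArg {
    std::uint16_t vt;
    std::uint32_t scode;    // meaningful only when vt == kVtError
};

// Assumed test for "the caller left this argument out."
inline bool IsMissingSketch(const MiniArg& a) {
    return a.vt == kVtError && a.scode == kParamNotFound;
}

enum class ArgState { UseDefault, UseValue, Invalid };

// Mirrors the SearchDirs logic: missing or Empty means "use the default,"
// a string is accepted, and anything else is rejected.
inline ArgState ClassifyStringArg(const MiniArg& a) {
    if (IsMissingSketch(a) || a.vt == kVtEmpty) return ArgState::UseDefault;
    if (a.vt != kVtBstr) return ArgState::Invalid;
    return ArgState::UseValue;
}
```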

Debugging Variants

Before you start developing your own Variant functions (or stepping through mine), users of Microsoft Developer Studio will probably want to make one more change in AUTOEXP.DAT in the \MSDEV\BIN directory. See Article 3 for a review of how we edited this file to handle output of BSTRs and Strings in Unicode format. Handling Variants is a little different. Here's the setting I use:

Variant =vt=<vt,x> short=<iVal> long=<lVal> dbl=<dblVal,g> str=<bstrVal,su>

The result isn't exactly pretty, but it's the best I could do. The debugger shows five of the most common fields of the VARIANT structure and union. You have to look at the value of the vt field (after memorizing the VT constant values) to figure out which of the other fields (if any) is meaningful. The alternative is to expand every Variant as it appears in the locals or watch window--a task that gets old in a hurry.

In a language such as Visual Basic that supports Variant as an intrinsic type, the debugger can look at the vt field and display the appropriate data in the appropriate format automatically. Will Visual C++ ever get an intrinsic __variant type, and will Microsoft Developer Studio ever display formatted __variant values? It could happen, but don't hold your breath.

A Variant Workout

The goal of the C++ Variant type is to work the same as the Visual Basic Variant type. Because Visual Basic invented Variants, whatever it does is the definition of Variant. Of course, C++ is a different language, and some things won't work quite the same.

The following tests are located in the TestVariant function in Test.Cpp. The event procedure of the Variant button in the sample program calls this function. It works a lot like the TestString function discussed in Article 3--writing test output to a string that is returned to the Visual Basic program for display.

Variant Construction

Consider these variable initializations:

Variant vInteger = (Integer)6;
Variant vLong = 9L;
Variant vSingle = 7.87f;
Variant vDouble = -89.2;
Variant vBoolean(True, VT_BOOL);
Variant vString = _W("String");  
Variant vError((Long)DISP_E_EXCEPTION, VT_ERROR);
Variant vCurrency = (Currency)78965;
Variant vDate(2.5, VT_DATE);    // Noon, January 1, 1900.

This isn't the same as Visual Basic because you can't initialize variables in the declarations in Visual Basic. In addition, you'll notice quite a few casts and extra arguments.

When initializing vInteger, the constant 6 must be cast to Integer (short) so that it will go through the right constructor. The variable vBoolean must have an argument specifying its type because an OLE Boolean is actually the same as a short and so they must share the same constructor. The vError and vDate variables must also have type arguments because they share the long and double types respectively. C++ is a finicky language, and you'll have to do a lot of casting to make it understand what you want.
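The casts and type arguments fall out of ordinary C++ overload resolution, as this portable sketch suggests. MiniVar and its Tag enumeration are hypothetical and much smaller than the real Variant class; only the tag numbers match the VT_ constants.

```cpp
// Tag values match VT_I2, VT_I4, VT_R4, VT_R8, VT_DATE, VT_ERROR, and VT_BOOL.
enum Tag { TagI2 = 2, TagI4 = 3, TagR4 = 4, TagR8 = 5,
           TagDate = 7, TagError = 10, TagBool = 11 };

// Hypothetical miniature of a Variant-style class.
class MiniVar {
public:
    // One constructor per C++ type. Types that share a C++ representation
    // (Boolean with short, Error with long, Date with double) are told
    // apart by the optional tag argument.
    MiniVar(short v,  Tag t = TagI2) : tag_(t)     { u_.i2 = v; }
    MiniVar(long v,   Tag t = TagI4) : tag_(t)     { u_.i4 = v; }
    MiniVar(float v)                 : tag_(TagR4) { u_.r4 = v; }
    MiniVar(double v, Tag t = TagR8) : tag_(t)     { u_.r8 = v; }

    Tag type() const { return tag_; }

private:
    Tag tag_;
    union { short i2; long i4; float r4; double r8; } u_;
};

// MiniVar v = 6;   // would not compile: an int converts equally well to
//                  // short, long, float, or double, so the call is ambiguous
```

With these overloads, (short)6 lands in the short constructor and 9L in the long one, while a Boolean or a date needs the extra tag to say what the short or double really means.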

Variant Assignment

Assignments work the same as initialization except that you can't use a type argument with operator=(). Therefore, the Variant class doesn't support direct assignment to the Error, Boolean, and Date types with the assignment operator. Instead, you can use the SetError, SetBoolean, and SetDate methods:

// Assign to types that have no operator=.
vBoolean.SetBoolean(False);
vError.SetError((Long)E_ACCESSDENIED);
vDate.SetDate(3333.125);

It's your responsibility to remember to use the method for these types. Alternatively, you can assign to some other type and change with the Type method:

// Assign to nearest type and then change to desired type.
vDate = 3333.125;
vDate.Type(VT_DATE);

Variant Type Conversion

Variants also have conversion operators so that you can assign a variant to a native type and the conversion will take place automatically:

// Assign double to long (throwing away remainder).
Long i = vDouble;
// Assign a date to a string (appears in date string format).
String s = vDate;
// Assign a numeric string to a numeric variable.
vString = _W("3.1416");
vSingle = vString.CopyAs(VT_R4);

String conversions are particularly convenient because OLE takes care of converting to and from numeric and string arguments. Date variables, for example, come out as strings in the standard date format. The iostream insertion operator (<<) for Variants takes advantage of string conversion to output Variants to an output stream. Here's an example of how the insertion operator displays different Variant types:

vInteger==6
vLong==9
vSingle==7.87
vDouble==-89.2
vBoolean==-1
vString=="String"
vError==Facility:2, Severity:1, Code:9
vCurrency==7.8965
vDate==1/1/00 12:00:00 PM

Notice that although the currency type is stored as a 64-bit integer, OLE translates the string output to a fixed-point number with four decimal places. The date type is stored as a double, but OLE translates it to a formatted date and time string. The error type is stored as a long, and OLE doesn't do anything with it. It's my insertion code in the operator<< function that treats it as a special case and breaks it into the standard parts of an OLE HRESULT. OLE translates Booleans to 0 or -1, but you could easily update the insertion code to output them as "True" and "False."
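The currency case is easy to reproduce by hand: the 64-bit integer is simply the amount multiplied by 10,000. The sketch below is portable C++ with a hypothetical helper name, CurrencyToText; it is not OLE's conversion routine, and unlike OLE's output it always prints all four decimal places.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper: format a currency value stored as a 64-bit integer
// scaled by 10,000 as a fixed-point string with four decimal places.
inline std::string CurrencyToText(std::int64_t raw) {
    const bool negative = raw < 0;
    // Work on the magnitude so that the sign is printed exactly once.
    const std::uint64_t mag = negative
        ? std::uint64_t(0) - static_cast<std::uint64_t>(raw)
        : static_cast<std::uint64_t>(raw);
    char buf[32];
    std::snprintf(buf, sizeof buf, "%s%llu.%04llu",
                  negative ? "-" : "",
                  static_cast<unsigned long long>(mag / 10000),
                  static_cast<unsigned long long>(mag % 10000));
    return std::string(buf);
}
```

Feeding it 78965, the raw value behind vCurrency, gives the same 7.8965 shown in the output above.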

Variant Arithmetic

Of course, any self-respecting Variant class must handle normal arithmetic operations. Here are some addition statements and their output.

vTmp = vInteger + vLong; // 6 + 9 == 15
vTmp += vDouble;         // 15 += -89.2 == -74.2
float flt = 3.25;
flt += (float)vTmp;      // 3.25 += -74.2 == -70.95
vTmp += (short)77;       // -74.2 += 77 == 2.8
vTmp += vInteger++;      // 2.8 += 6 == 8.8
vTmp += ++vInteger;      // 8.8 += 8 == 16.8
vTmp = vString + vLong;  // Stuff + 9 == Stuff9
vTmp += vDouble;         // Stuff9 += -89.2 == Stuff9-89.2

I'll leave you to check similar statements (in the TestVariant function) using subtraction, multiplication, division, and even modulus. I confess that I didn't implement the C++ bitwise operators (&,|, and ~), but you could add them in a few minutes by copying the existing operator code. Of course, bitwise operators (like the modulus operator) only make sense for integer types. Or maybe not. If you can figure out a reasonable meaning for bitwise operators on strings or dates, C++ won't stop you from implementing it.

Variant Comparison

The Variant type also has logical operators.

f = (vDouble == vLong);  // (9 == 9) == 1
f = (vInteger == vLong); // (6 == 9) == 0
f = (vInteger != vLong); // (6 != 9) == 1
f = (vInteger <= vLong); // (6 <= 9) == 1
f = (vInteger < vLong);  // (6 < 9) == 1
f = (vInteger > vLong);  // (6 > 9) == 0
f = (vInteger >= vLong); // (6 >= 9) == 0
f = (vTmp == vDate);     // (2/14/09 3:00:00 AM == 2/14/09 3:00:00 AM) == 1
f = (vDate == vTmp);     // (2/14/09 3:00:00 AM == 2/14/09 3:00:00 AM) == 0
f = (vTmp == vString);   // (2/14/09 3:00:00 AM == Stuff9) == 0
f = (vTmp != vString);   // (2/14/09 3:00:00 AM != Stuff9) == 1

This looks pretty good except for one minor problem--vTmp equals vDate, but vDate doesn't equal vTmp. How can that be? Well, it's kind of a peculiarity of the way OLE works when vTmp is a string variant and vDate is a date. If vTmp comes first, the operator==() code asks OLE to convert vDate to a string and then compares the two strings. Match! If vDate comes first, the code tries to convert vTmp to a date, but unfortunately OLE doesn't know how to convert a date format string to a numeric date. No match! I'm sure you could fix this problem with enough special case code.

You may notice that some arithmetic and logical expressions require typecasts where you might prefer not to see them. You'd need a lot of operator functions to support every possible combination. Consider addition. I provide this function:

friend Variant operator+(Variant& v1, Variant& v2);

To cover all your bases, you'd need these:

friend Variant operator+(Variant& v1, BYTE b2);
friend Variant operator+(BYTE b1, Variant& v2);
friend Variant operator+(Variant& v1, short i2);
friend Variant operator+(short i1, Variant& v2);

And so on. Be my guest if you want to add these. Most of them could be implemented as simple inline functions.
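As a sketch of what such inline functions might look like, here is a tiny portable class with one real operator+ and two forwarding overloads for short. Num is a hypothetical stand-in for Variant; its constructor is marked explicit so that mixed expressions compile only because the forwarding overloads exist.

```cpp
// Hypothetical stand-in for a Variant-like numeric wrapper.
class Num {
public:
    explicit Num(double v) : value_(v) {}
    double value() const { return value_; }

    // The one operator that does the real work.
    friend Num operator+(const Num& a, const Num& b) {
        return Num(a.value_ + b.value_);
    }

private:
    double value_;
};

// Mixed-type overloads just wrap the native operand and forward.
inline Num operator+(const Num& a, short b) { return a + Num(b); }
inline Num operator+(short a, const Num& b) { return Num(a) + b; }
```

Each additional native type costs two more one-line functions per operator, which is why the full set gets long in a hurry.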

Note: The COleVariant class provided by MFC has some nice features (support for COleDateTime and COleCurrency). These were apparently required to meet its primary goal, which I'm told was to fit in with the MFC database classes. I'm not qualified to judge how it fits that goal, but it certainly doesn't meet my goal, which was to match the Visual Basic Variant type. COleVariant has no arithmetic operators and only one logical operator--equality. Unfortunately, its equality operator will indicate that a variant double containing zero is not equal to a variant long containing zero. That's quite a change from Visual Basic.

One More Variant Example

Visual Basic provides the InStr function for searching strings, but it always assumes you want to start at the front of the string and search forward. There's no InStrR for searching backward. I needed a reverse string search function in my book Hardcore Visual Basic, so I wrote a crude InStrR in Basic, but noted in a comment that somebody ought to write an efficient version in C++.

Well, here it is. Check out the code in Tools.Cpp. The String Find button in the sample program tests it. The syntax of InStr is a little unusual, and InStrR has to jump through hoops to work the same way except better. The first and last parameters are optional, as shown in this syntax:

position = InStr([start,] target, search[, compare])

But alas, InStr behavior fails to match this syntax. You can't leave off the first argument if you supply the last argument. The following gives you a syntax error:

i = InStr(sTarget, sFind, 1)

Instead, you have to give the default first parameter (1) explicitly.

i = InStr(1, sTarget, sFind, 1)

I can't guess how Visual Basic implements this function so that the first argument isn't really optional. It's optional in my InStrR. All the parameters have to be Variants in order to make the optional arguments work. Here's how I do it:

Long DLLAPI InStrR(Variant vStart, Variant vTarget, 
                   Variant vFind, Variant vCompare) 
{
  try {

    Long iStart = -1, afCompare = ffReverse;
    String sTarget, sFind;

    // Assign strings depending on whether vStart is given.
    if (vStart.Type() == VT_BSTR) {
        if (!vFind.IsMissing()) {
            afCompare |= ((Long)vFind ? ffIgnoreCase : 0);
        }
        sTarget = vStart;
        sFind = vTarget;

    } else {
        iStart = vStart;
        if (iStart < 1) throw ERROR_INVALID_PARAMETER;
        if (!vCompare.IsMissing()) {
            afCompare |= ((Long)vCompare ? ffIgnoreCase : 0);
        }
        sTarget = vTarget;
        sFind = vFind;
    }
    // Find the string.
    return sTarget.Find(sFind, afCompare, iStart);

  } catch(Long e) {
    ErrorHandler(e);
    return 0;
  }
}

Most of the code deals with optional arguments. Once you figure them out, it takes only one line to find the string.
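That one line hides the actual search in the String class's Find method, which isn't shown here. As a rough idea of what a reverse search involves, here is a portable sketch built on the standard library. ReverseFindSketch is a hypothetical name, and its conventions are my assumptions: positions are 1-based as in Basic, a start of -1 means begin at the end of the string, a match must begin at or before the start position, and 0 means not found. Case-insensitive comparison is left out.

```cpp
#include <string>

// Hypothetical reverse search with Basic-style 1-based positions.
inline long ReverseFindSketch(const std::wstring& target,
                              const std::wstring& find,
                              long start = -1) {
    if (target.empty() || find.empty()) return 0;
    // Translate the 1-based start into a 0-based index; npos means "from the end."
    const std::wstring::size_type from =
        (start < 1) ? std::wstring::npos
                    : static_cast<std::wstring::size_type>(start - 1);
    // rfind returns the last match that begins at or before 'from'.
    const std::wstring::size_type pos = target.rfind(find, from);
    return (pos == std::wstring::npos) ? 0 : static_cast<long>(pos) + 1;
}
```

Searching "C:\dir\file.txt" for a backslash returns 7 under these conventions, and starting the search at position 6 returns 3.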

How Variants Work

You don't need to understand how the Variant class works to use it, but you can never know too much. See if you can find the private members where variant data is stored in the Variant class:

class Variant : VARIANT
{
public:
    
    // Constructors
    Variant();
    Variant(const Variant& vSrc);   
    Variant(BYTE bSrc);   // VT_UI1
.
.
.
private: 
    // Constructor helper
    void VariantCreate(VARTYPE vt = VT_EMPTY);

    // Destructor helper
    void VariantDestroy();
    Boolean IsConstructed();
    void Constructed(Boolean f);
    
    // Look, Ma! No data.
};

If you looked in the private section at the end of the class (where any self-respecting class would store data), you'd be out of luck. Instead the data comes at the very start where the Variant class is inherited from OLE's VARIANT structure. The Variant class gets internal access to all the VARIANT members, but because VARIANT is inherited privately, users of Variant can't see them. That's why Visual Basic can pass you a VARIANT and you can receive it as a Variant. They're the same thing--with a different interface.
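You can check the same-thing-different-interface claim with a few lines of portable C++. RawRecord and Wrapped below are hypothetical stand-ins for VARIANT and Variant; the point is only that a class that adds member functions but no data members to a privately inherited structure keeps that structure's size.

```cpp
#include <cstdint>

// Hypothetical stand-in for the plain C structure.
struct RawRecord {
    std::uint16_t tag;
    std::uint16_t reserved[3];
    double        value;
};

// Inheritance is private by default for a class, so users of Wrapped
// can't reach tag or value directly; they go through member functions.
class Wrapped : RawRecord {
public:
    Wrapped() {
        tag = 0;
        reserved[0] = reserved[1] = reserved[2] = 0;
        value = 0.0;
    }
    explicit Wrapped(double v) : Wrapped() { tag = 5; value = v; }  // 5 == VT_R8

    std::uint16_t Type() const { return tag; }
    double AsDouble() const { return value; }
    // Look, Ma! No data.
};

// Member functions add no storage, so the two types are the same size.
static_assert(sizeof(Wrapped) == sizeof(RawRecord),
              "wrapper must not change the size of the structure");
```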

Variant Construction and Destruction

Let's check out a few constructors.

inline Variant::Variant()
{
    VariantCreate();
}

inline Variant::Variant(const Variant& v)
{
    VariantCreate();
    HRESULT hres = VariantCopy(this, (Variant *)&v);
    if (hres) throw hres;
}

inline Variant::Variant(BYTE nSrc)
{ 
    VariantCreate(VT_UI1); 
    bVal = nSrc; 
}

What is this VariantCreate that seems to be doing the real construction work? Let's take a brief look at the destructor before I reveal the answer.

inline Variant::~Variant()
{
    if (IsConstructed()) {
        VariantDestroy();
    }
}

Implementation Choices

If the variable was received from the host as a parameter, the host owns it and it shouldn't be destroyed. If we created it with a constructor, we need to destroy it. But you can't see how any of this works because I hide the details in the VariantCreate, IsConstructed, and VariantDestroy helper functions.

You're going to have to check the source code to unravel this mystery. I'm not going to reveal the implementation in print because I might have to change it. One way or another, the class needs a flag (just one bit) indicating whether a given Variant object should be deallocated or not. VariantCreate will set this flag, IsConstructed will test it, and VariantDestroy will clear it. But you can't store the flag as a separate member variable, because that would change the size of the Variant class and make it impossible to receive VARIANTs as Variants. I can think of three ways to implement Variant constructors and destructors:

1. Use a bit in one of the wReserved fields as a flag. You know better than to use reserved fields of system data. What if they change the VARIANT type? Remember that I mentioned earlier that there are rumors they'll add a new type that uses part of that reserved data. On the other hand, with six bytes of unused data, surely no one would notice if we used just one itty bitty bit. But which one?
2. Use an unused bit in the vt field as a flag. Some of those bits are used for some IDispatch purposes that don't matter to our limited version of the VARIANT structure. On the other hand, Variant system functions might not like us to use those bits for unrelated things. We might have to save and restore this bit before calling certain system functions.
3. Keep track of each Variant object in a static array of data structures as suggested at the end of Article 3 for the String class. This is the safest way, but it will be a lot more work to implement and will have a performance cost.

I opted for the easiest choice, method 1, even though it's risky. My code works fine for Visual Basic 4.0, but who can say about future versions? You can check the source file for details, and change the implementation of VariantCreate, IsConstructed, and VariantDestroy if you feel uncomfortable with my choice.
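For a feel of what the reserved-field approach could look like, here is a rough sketch. It is not the code in the source file: the structure is a stand-in, and the choice of wReserved1 and of bit 0x0001 is an arbitrary assumption made only for illustration.

```cpp
#include <cassert>

// Hypothetical stand-in for the head of a VARIANT.
struct RawVariant {
    unsigned short vt;
    unsigned short wReserved1;
    unsigned short wReserved2;
    unsigned short wReserved3;
    long lVal;
};

// Assumed flag bit; the field and the bit are picked arbitrarily here.
const unsigned short kOwnedBit = 0x0001;

class Owned : RawVariant {
public:
    Owned() { Create(0); }                        // 0 is the value of VT_EMPTY
    ~Owned() { if (IsConstructed()) Destroy(); }

    // True only for objects built by one of our own constructors.
    bool IsConstructed() const { return (wReserved1 & kOwnedBit) != 0; }

private:
    void Create(unsigned short type) {
        vt = type;
        wReserved1 = kOwnedBit;                   // mark as ours
        wReserved2 = 0;
        wReserved3 = 0;
        lVal = 0;
    }
    void Destroy() {
        // A real version would release owned resources here
        // (for example, by calling VariantClear).
        vt = 0;
        wReserved1 = static_cast<unsigned short>(wReserved1 & ~kOwnedBit);
    }
};

static_assert(sizeof(Owned) == sizeof(RawVariant),
              "the flag must live inside the existing structure");
```

A structure that arrives from the host with its reserved fields zeroed reports IsConstructed() as false, so the destructor leaves it alone. The sketch shares the risk described above: it assumes nobody else ever writes to that bit.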

Variant Operators

Just to give you a feel for how operator overloading of Variants works, here's the prefix version of the operator++ function:

// Prefix
Variant & Variant::operator++()
{
    switch (vt) {
    case VT_UI1:
        ++bVal;
        break;
    case VT_I2:
        ++iVal;
        break;
    case VT_I4:
        ++lVal;
        break;
    case VT_R4:
        ++fltVal;
        break;
    case VT_R8:
        ++dblVal;
        break;
    case VT_CY:
        ++cyVal.int64;
        break;
    default:
        throw DISP_E_TYPEMISMATCH;
    }
    return *this;
}
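The postfix form is conventionally written in terms of the prefix form. The sketch below shows that pattern on a cut-down, hypothetical tagged type (Tagged, with only two subtypes) rather than on the real Variant class, and kTypeMismatch is a placeholder for DISP_E_TYPEMISMATCH.

```cpp
#include <cassert>

const long kTypeMismatch = -1;  // placeholder for DISP_E_TYPEMISMATCH

// Hypothetical, cut-down tagged value with two numeric subtypes.
class Tagged {
public:
    enum Kind { kEmpty, kLong, kDouble };

    Tagged() : kind_(kEmpty) { u_.l = 0; }
    explicit Tagged(long n) : kind_(kLong) { u_.l = n; }
    explicit Tagged(double d) : kind_(kDouble) { u_.d = d; }

    // Prefix: dispatch on the tag and increment the matching member.
    Tagged& operator++() {
        switch (kind_) {
        case kLong:   ++u_.l; break;
        case kDouble: ++u_.d; break;
        default:      throw kTypeMismatch;
        }
        return *this;
    }

    // Postfix: copy the old value, reuse prefix, return the copy by value.
    Tagged operator++(int) {
        Tagged old(*this);
        ++*this;
        return old;
    }

    long AsLong() const { return u_.l; }  // valid only when the tag is kLong

private:
    Kind kind_;
    union { long l; double d; } u_;
};
```

Because postfix has to make a copy, it costs more than prefix for a type like this, which is one reason to prefer ++v over v++ when the old value isn't needed.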

Exception Handling Revisited

You've seen several examples of functions that throw exceptions. The GetTempFile function in Article 3 and the SearchDirs function in this article throw Win32 error constants as exceptions. But if you look at the operator++ example above or at many of the other methods of the String, Variant, and SafeArray classes, you'll see that they throw OLE HRESULT constants. Win32 error codes are never negative. HRESULTs are negative for failures and non-negative for success codes, and they carry specific status bits. Mixing these two kinds of errors may seem random, but there's a method in the madness.

The String, Variant, and SafeArray classes are designed to be used with OLE objects. They happen to also work with the non-object DLLs described in this series. If we were dealing with OLE objects, the ErrorHandler function called in the catch blocks would be raising OLE exceptions and it would be much easier to do so if the errors raised were HRESULT values. The OLE exceptions would be seen in Visual Basic as normal errors, trappable with Basic's On Error statements. This is what you should be doing, and what I originally intended to explain before deadline realities set in.

Unfortunately, you can't generate Basic errors from non-object DLLs. You have to fall back on old-fashioned techniques such as returning error values. The API way of doing this is to use SetLastError to set an error code in the DLL. The DLL client can call GetLastError to get details (although, as mentioned in Article 1, this actually means checking the LastDllError property of the Basic Err object). That's what the ErrorHandler function does in VBUTIL. Here's the code:

void ErrorHandler(Long e)
{
    DWORD err = 0;
    if (e >= 0) {        
        err = (DWORD)e;
    } else {
        err = HResultToErr(e);
    }
    SetLastError((DWORD)err);
}

If the error is an HRESULT, ErrorHandler calls HResultToErr (which is nothing more than a big switch statement) to translate to a Win32 error. It then calls SetLastError to store the error value for any client that wants to check it. Because VBUTIL functions such as SearchDirs are local to this DLL, they won't be used by OLE objects and thus can throw Win32 errors directly to avoid the translation.
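The sketch below shows the general shape such a translation might take. It is not the VBUTIL code: the function names are hypothetical, the three mappings are assumptions chosen for illustration, and the numeric constants are written out from the Windows headers so that the sketch compiles without them.

```cpp
#include <cassert>
#include <cstdint>

typedef std::int32_t  HResult;   // 32-bit signed, like HRESULT
typedef std::uint32_t Win32Err;  // 32-bit unsigned, like DWORD

// Win32 error values, written out so the sketch builds anywhere.
const Win32Err kERROR_INVALID_DATA      = 13;
const Win32Err kERROR_OUTOFMEMORY       = 14;
const Win32Err kERROR_GEN_FAILURE       = 31;
const Win32Err kERROR_INVALID_PARAMETER = 87;

// Hypothetical translator: one case per HRESULT the library can throw.
// Switching on the unsigned bit pattern keeps the case labels readable.
Win32Err HResultToErrSketch(HResult hr) {
    switch (static_cast<std::uint32_t>(hr)) {
    case 0x8007000EU: return kERROR_OUTOFMEMORY;        // E_OUTOFMEMORY
    case 0x80070057U: return kERROR_INVALID_PARAMETER;  // E_INVALIDARG
    case 0x80020005U: return kERROR_INVALID_DATA;       // DISP_E_TYPEMISMATCH
    default:          return kERROR_GEN_FAILURE;        // anything unmapped
    }
}

// Same test as ErrorHandler: non-negative values pass through as Win32
// codes; negative values are failure HRESULTs and get translated.
Win32Err TranslateSketch(HResult e) {
    return e >= 0 ? static_cast<Win32Err>(e) : HResultToErrSketch(e);
}
```

The default case matters. A translator that silently returns zero for an unknown HRESULT would tell the client that nothing went wrong.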

One other point. When you use the OleType static library in your own projects, don't forget about exceptions. The String, Variant, and SafeArray classes can throw exceptions, and you must be ready to catch them. You have two choices. First, you can make sure you never do anything with a String, Variant, or SafeArray that could throw an exception. This is tough, but you could probably manage it in some projects. Your second, better choice is to catch and handle exceptions. VBUTIL provides a model of one simple technique, but there's nothing sacred about its ErrorHandler function. You may want to design a better function for handling exceptions, especially if you use the library in OLE servers or controls.
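In outline, the catch-and-handle choice means wrapping the body of every exported function so that no exception escapes to the caller. Here is a minimal sketch of that pattern. All the names are hypothetical, g_lastError stands in for the storage behind SetLastError, and the catch-all clause is an addition that VBUTIL's own functions may not have.

```cpp
#include <cassert>

static long g_lastError = 0;  // stands in for SetLastError's storage

// Hypothetical handler: record the code for the client to query later.
void ReportError(long e) { g_lastError = e; }

// Stands in for library code that reports failure by throwing.
long MayThrow(long n) {
    if (n < 0) throw 87L;     // 87 is the value of ERROR_INVALID_PARAMETER
    return n * 2;
}

// Exported-function pattern: all the work happens inside the try block,
// and every exit path returns an ordinary value to the caller.
long SafeDouble(long n) {
    try {
        return MayThrow(n);
    } catch (long e) {
        ReportError(e);       // a known error code
        return 0;
    } catch (...) {
        ReportError(31);      // 31 is the value of ERROR_GEN_FAILURE
        return 0;
    }
}
```

The caller sees only a return value plus an error code it can query afterward, which is all a Basic client of a non-object DLL can work with.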

Introduction

Article 1. Stealing Code with Type Libraries

Article 2. Libraries Made Too Easy

Article 3. Strings the OLE Way

Article 5. The Safe OLE Way of Handling Arrays