………… Taking the dive into fuzzy sets immediately elicits the obvious question: just how fuzzy is the data we’re operating on? As discussed in the first two installment of this amateur series of self-tutorials, the raison d’etre of fuzzy set theory is to model imprecise data that is typically expressible in nebulous terms in everyday speech; it addresses a class of data that is represented in ordinals (i.e. categories with orders) but for which it would be helpful to at least approximate with a continuous number scale. It essentially amounts to taking the information left over after we’ve defined ordinary (i.e. “crisp’) sets and putting it to at least some good use, just as recycled aluminum and cork shavings can be used in less lucrative products like paint and corkboard. Of course, once we’ve used some of the procedures outlined in the last two articles to define them using graded membership functions, it’s natural to try to quantify the imprecision we’re modeling in some way. It at least sets a sort of outer boundary to the uncertainty we’re modeling where none existed previously, even if that barrier is not completely accurate; it is the first baby step towards a program of Uncertainty Management. The ability to apply at least some yardstick to previously unquantifiable data is useful for the same reason that the unknown is invariably certain to scare readers of horror fiction. As Stephen King puts it so well,
“You approach the door in the old, deserted house, and you hear something scratching at it. The audience holds its breath along with the protagonist as she/he (more often she) approaches that door. The protagonist throws it open, and there is a ten-foot-tall bug. The audience screams, but this particular scream has an oddly relieved sound to it. ‘A bug ten feet tall is pretty horrible,’ the audience thinks, ‘but I can deal with a ten-foot-tall bug. I was afraid it might be a hundred feet tall.’”[I]………… A good horror writer will of course maximize the unknown in order to scare their customers, but the object of relational database management and data mining is to minimize it, to reassure end users. We know that bugs of an entirely different kind may arise whenever uncertainty is innately woven into the tapestry of the problems software engineers, DBAs and analysts seek to solve, but these underused precision modeling tools include a yardstick with which to measure our gremlins: the fuzzy complement, which allows us to assign a ballpark figure to the amount of fuzz involved. Complements are also one of the simplest types of set relations, which can serve as a bridge into discussion of more complicated topics like fuzzy intersections, unions and joins between two sets.
…………
Another single-set operation that is useful in fuzzy set theory is a partitioning method known as the alpha cut, but I’ll delay discussion of them since they’re more applicable to higher-level topics like possibility theory. Fuzzy complements and a-cuts both lead to peculiar nested subsets, albeit in different ways. In the latter case, it is because they assign to records to multiple, progressively larger subsets within the original fuzzy set, in tandem with the decrease in membership grade; in the former, it is because the various types encompass ever-wider sets of records, depending on what fuzzy complement we choose to apply and how generalized its definition is. As discussed in the taxonomy of fuzzy set types in the last article, the simplicity of these techniques quickly give rise to complexity because there are so many ways of combining these concepts; just as the difficulty in using membership functions consists mainly in the fact that there are so many we can choose from, so too does the challenge with fuzzy complements consist mostly of choosing the right type. This complexity is not an issue in ordinary crisp sets, which can be viewed as highly specific cases of fuzzy set theory. There’s no need to specify their membership functions in code because they either do or do not belong, which amounts to Boolean membership grades of 0 and 1 if we put it in fuzzy set terms. Likewise, there is only one type of complement in ordinary crisp sets, i.e. the sets of all records that don’t belong to them.
Involutive and Non-Involutive Complements………… The whole point of membership functions is to define “belonging” in terms of a grade on a continuous scale, so by logical necessity the concept of “not belonging” that defines complements also has to be a assigned a grade on a continuous scale. The most obvious way to do this is to simply subtract a membership score from 1, assuming that we’re defining our grades on a range of 0 to 1, as is almost always the case in fuzzy set theory. This is known as the standard complement, but we’re not limited to that single choice. The injection of membership grades into set theory raises the possibility of performing further operations to alter them when we perform the inversion operation, which opens up the door to a whole class of new complements that aren’t possible in crisp sets. In fact, we’re only limited by our imagination and the restrictions that the inversion functions must be strictly increasing or strictly decreasing, i.e. the values must move in only one direction, with no repeats. A violation of this principle would appear in a Reporting Services line chart as a flat segment.
………… It is possible to define a really broad set of “non-involutive” fuzzy complements if we don’t require that the inversion function return us to the original value when applied twice[ii], but I won’t expend much time on these “since involutive fuzzy complements play by far the most important role in practical applications.”[iii] Fuzzy Sets and Fuzzy Logic: Theory and Applications , the classic resource George J. Klir and Bo Yuan I’ll be depending upon for much of the math formulas in this series, contains an elegant illustration of how the standard fuzzy complement is contained within the set of involutive complements, which is in turn nested within the set of non-involutives, which are likewise a subset of all possible continuous complements and then to the set of all fuzzy complements, without any restrictions.[iv] Coding the Sugeno and Yager ComplementsTo demonstrate how these types of complements lead to different values, I calculated the complements in Figure 1 on a column in the Duchennes muscular dystrophy data I downloaded a few tutorial series ago from Vanderbilt University’s Department of Biostatistics . As in my last article, I’ll recycle the stored procedure I wrote for Outlier Detection with SQL Server, part 2.1: Z-Scores for double-duty as my membership function, but keep in mind that fuzzy sets have nothing to do with Z-Scores or probability in general unless you specifically add that meaning, as I have done here to essentially turn it into a fuzzy outlier detection method. To illustrate the possibility of combining membership functions in myriad ways, I retrieved Z-Scores for two columns and multiplied them together, but this sample code is much simpler since we’re only acting on one column. The results of the stored procedure are plugged into a table variable identical to the one we used last time around, except for the addition of four columns to hold the complements we’re experimenting with (the GroupRank is part of the original procedure and required for the INSERT EXEC operation, but can be safely ignored). The same @RescalingMax, @RescalingMin and @RescalingRange values are used in conjunction with the ReversedZScore column to normalize the MembershipScores within ranges of 0 to 1.
…………
The key difference is the addition of the @Pi, @LambdaParameter and @OmegaParameter variables, which are needed in the second UPDATE to calculate the non-involutive example[v] and the Sugeno and Yager Complements, which are the two most commonly mentioned ones in the literature. These two are useful when the problem at hand calls for bowing the number line between 0 and 1 in various arcs. When the @LambdaParameter is set to 0, the Sugeno Complement ought to be equal to the standard complement, while the Yager Complement ought to equal the same when its @OmegaParameter is set to 1. As the @LambdaParameter surpasses 0 the line bows towards lower values of both the complement and the membership grade, but at negative values it bows towards higher values of both. The Yager Complement moves in the opposite direction as its parameter is changed, but in different proportions.[vi] Although I have yet to do so in practice, these properties can be leveraged to fit different use cases perhaps in curve or distribution fitting, sensitivity adjustments or instances where an object carries less imprecision than its negation. In essence, the act of parameterization turns them both into whole families of functions that can be adapted as needed.Figure 1: Four Examples of Fuzzy Complements
DECLARE @RescalingMax decimal ( 38 , 6 ), @ RescalingMin decimal ( 38 , 6 ), @ RescalingRange decimal ( 38 , 6)
DECLARE @ LambdaParameter float = 1,
@ OmegaParameter float = 1 ,― ?
@Pi decimal ( 38 , 37 ) = 3.1415926535897932384626433832795028841 ― from http://www.eveandersson.com/pi/digits/100 DECLARE @ ZScoreTabletable
( ID bigint IDENTITY ( 1 , 1),
PrimaryKey sql_variant,
Value decimal ( 38 , 6),
ZScore decimal ( 38 , 6),
ReversedZScore as CAST ( 1 as decimal ( 38 , 6 )) ABS ( ZScore),
MembershipScore decimal ( 38 , 6),
RegularComplementfloat ,
NonInvolutiveComplementExamplefloat ,
SugenoComplement float,
YagerComplement float,
GroupRankbigint
) INSERT INTO @ZScoreTable
( PrimaryKey , Value , ZScore , GroupRank)
EXEC Calculations .ZScoreSP
@DatabaseName = N’DataMiningProjects ‘,
@ SchemaName = N’Health ‘,
@ TableName = N’DuchennesTable ‘,
@ ColumnName = N’LactateDehydrogenase ‘,
@PrimaryKeyName = N’ID’,
@DecimalPrecision = ’38 ,32′,
@OrderByCode = 8― RESCALING
SELECT @ RescalingMax = Max ( ReversedZScore ), @ RescalingMin = Min ( ReversedZScore ) FROM @ZScoreTable
SELECT @ RescalingRange = @ RescalingMax @ RescalingMin UPDATE @ZScoreTable
SET MembershipScore = ( ReversedZScore @ RescalingMin ) / @ RescalingRange UPDATE @ZScoreTable
SET RegularComplement = 1 MembershipScore,
NonInvolutiveComplementExample = 0.5 * ( 1 + Cos ( @Pi * MembershipSCore)),
SugenoComplement = ( 1 MembershipScore ) / CAST ( 1 + ( @ LambdaParameter * MembershipScore ) AS float),
YagerComplement = Power ( 1 Power ( MembershipScore , @ OmegaParameter ), 1 /