## [Math] Set theory explanation via boolean operations

(The following text has been copypasted to the Math Recipes book.)

We are going to put a set in integer variable, where each bit will represent an element of set. Groups of bits - a subset. 'Universe' (all possible elements) is a group of all possible bits within variable.

Cardinality of set is number of elements in it. We can find it just by count all bits or using popcount() function.

### Collecting TV series

int collected=0;

{
assert(element>=0 && element<=8);
collected=collect|(1<<element);
if (collected==0xff)
printf ("all collected\n");
};


Important to note that the add_to_set() function can be called several times with the same element.

### Rock band

#include <assert.h>

#define DRUMS  (1<<0)
#define GUITAR (1<<1)
#define BASS   (1<<2)
#define VOCAL  (1<<3)

// what each rocker can do in our band:
#define JOHN_CAN_PLAY    (GUITAR | BASS)
#define JAKE_CAN_PLAY    (GUITAR | VOCAL)
#define PETER_CAN_PLAY   (GUITAR | DRUMS)
#define MICHAEL_CAN_PLAY (DRUMS | VOCAL)
#define FRANK_CAN_PLAY   (GUITAR | BASS | DRUMS)

// instruments that must be used
// internally, this is 0xf (4 lowest bits)
// this is set universe
#define ROCK_BAND_INSTRUMENTS (GUITAR | BASS | DRUMS | VOCAL)

// we single out one instrument from each rocker using AND
// then we join them all together using OR, this is 'set union'
// note: we 'take' two instruments from JAKE
// internally, this also must be 0xf (4 lowest bits)
#define STAR_ROCK_BAND  (JAKE_CAN_PLAY & GUITAR) | (JAKE_CAN_PLAY & VOCAL) | (JOHN_CAN_PLAY & BASS) | (PETER_CAN_PLAY & DRUMS)

int main()
{
// this equation must hold:
assert(STAR_ROCK_BAND==ROCK_BAND_INSTRUMENTS);
};


The obvious limitation: only 32 elements in set are allowed in 32-bit integer type. But you can switch to 64-bit integer or to bitstring data type.

### Languages

#include <assert.h>
#include <stdio.h>

#define ENGLISH (1<<0)
#define FRENCH  (1<<1)
#define GERMAN  (1<<2)
#define SPANISH (1<<3)
#define RUSSIAN (1<<4)
#define ARABIC  (1<<5)
#define CHINESE (1<<6)
// all languages:
#define UNIVERSE (ENGLISH | FRENCH | GERMAN | SPANISH | RUSSIAN | ARABIC | CHINESE)

#define JOHN_CAN_SPEAK         (ENGLISH | FRENCH)
#define JOHN_CAN_UNDERSTAND    (JOHN_CAN_SPEAK | RUSSIAN | GERMAN)

#define PETER_CAN_SPEAK        (ENGLISH | GERMAN)
#define PETER_CAN_UNDERSTAND   (PETER_CAN_SPEAK | SPANISH | ARABIC)

#define NATASHA_CAN_SPEAK      (ENGLISH | RUSSIAN)
#define NATASHA_CAN_UNDERSTAND (NATASHA_CAN_SPEAK | CHINESE)

#define JOHN_AND_PETER_CAN_UNDERSTAND (JOHN_CAN_UNDERSTAND | PETER_CAN_UNDERSTAND)

int main()
{
// which languages can JOHN and PETER use to talk with each other?
// this is 'set intersection'
assert ((JOHN_CAN_SPEAK & PETER_CAN_SPEAK) == ENGLISH);

// which languages can Peter speak so that John would understand him?
// 'set intersection' again
assert ((PETER_CAN_SPEAK & JOHN_CAN_UNDERSTAND) == (ENGLISH | GERMAN));

// which languages can John speak so that Peter would understand him?
assert ((JOHN_CAN_SPEAK & PETER_CAN_UNDERSTAND) == ENGLISH);

// which languages can John use so that both Peter and Natasha will understand him?
assert ((JOHN_CAN_SPEAK & (PETER_CAN_UNDERSTAND & NATASHA_CAN_UNDERSTAND)) == ENGLISH);

// which languages both JOHN and PETER can't speak/understand?
// this is 'set complement'
assert (((~JOHN_AND_PETER_CAN_UNDERSTAND) & UNIVERSE) == CHINESE);

// which languages JOHN can speak that PETER don't?
// this is 'set subtraction' or 'difference of sets'
// in other words, clear all bits in JOHN_CAN_SPEAK that present in PETER_CAN_SPEAK
assert ((JOHN_CAN_SPEAK & (~PETER_CAN_SPEAK)) == FRENCH);

// which languages PETER can speak that JOHN don't?
assert ((PETER_CAN_SPEAK & (~JOHN_CAN_SPEAK)) == GERMAN);

// how many languages our group of 3 persons can speaks?
// this is 'cardinality' of set
// we're using a popcount() function here to count all bits in ORed values
assert (__builtin_popcount (JOHN_CAN_SPEAK | PETER_CAN_SPEAK | NATASHA_CAN_SPEAK)==4);

// which languages can be spoken/understood only by one person in a pair?
// (a language spoken by both persons is 'eliminated' by XOR operation.)
assert ((JOHN_CAN_UNDERSTAND ^ PETER_CAN_UNDERSTAND) == (FRENCH | RUSSIAN | ARABIC | SPANISH));
};


By the way, an interesting story about applying set/logical operations on paper cards:

To allow a visual check that all cards in a deck were oriented the same way, one corner of each card was beveled, much like Hollerith punched cards. Edge-notched cards, however, were not intended to be read by machines such as IBM card sorters. Instead, they were manipulated by passing one or more slim needles through selected holes in a group of cards. As the needles were lifted, the cards that were notched in the hole positions where the needles were inserted would be left behind as rest of the deck was lifted by the needles. Using two or more needles produced a logical and function. Combining the cards from two different selections produced a logical or. Quite complex manipulations, including sorting were possible using these techniques.


### De Morgan's laws

... is valid for both sets, boolean values and bitstrings of arbitrary size.

You can easily prove this for boolean values. Construct a truth table for all possible two boolean inputs. That would consists of only 4 rows. Q.E.D.

By induction, extend this to bitstrings of arbitrary size.

### Powerset

For example, we have 4 elements in set. Each element for each bit. How can we enumerate all possible combinations? Just by listing all numbers between 0b0000 and 0b1111. That comprises 16 combinations or subsets, including empty set (starting 0b0000).

This is why $2^S$ notation is also used: there are $2^{number of elements}$ subsets. In our case, $2^4=16$.

### Inclusion-exclusion principle

Many set theory textbooks has problems like these:

Suppose a survey revealed that 70% of the population visited
an amusement park and 80% visited a national park.
At least what percentage of the population visited both?

(The "Discrete Structures, Logic and Computability" book by James L. Hein, 4th ed.)

We will represent each 10% using star or dot. How many 'stars' can be shared maximally? I would align these two bars at left:

park1 : *******...
park2 : ********..
both  : *******...


It's 70%.

And shared minimally? I would align one bar at left, another at right:

park1 : ...*******
park2 : ********..
both  : ...*****..


50%.

Observing this, we can deduce the general formula:

Maximal shared = min(park1, park2) Minimal shared = park2 - (100% - park1)

This is what is called 'inclusion-exclusion principle'.

Another solution is in my 'SAT/SMT by Example' book, search for 'two parks'.

Yes, I know about these lousy Disqus ads. Please use adblocker. I would consider to subscribe to 'pro' version of Disqus if the signal/noise ratio in comments would be good enough.