[Math] Injective, surjective, bijective functions

These terms are widespread in math and CS. But sometimes, we, mere mortal programmers, should use them too.

Let's start with this. Possible function inputs - also known as 'domain'. Possible function outputs - also known as 'codomain', 'image', 'range'.

Bijection

Bijective function -- each input mapped to output. There are equal number of possible inputs and outputs.

You can precompute two tables to cache such a function. x=y and y=x for inverse function. (In math writing, if a function denoted as $f$, inverse one denoted as $f^{-1}$. By analogy -- it's like inverse of a $x$ value denoted as $\frac{1}{x}=x^{-1}$.)

AKA "1–1 correspondence".

An isomorphism is a bijection (that is, a 1–1 correspondence) f : S → S

( Modeling and Verification of Parallel Processes, 4 school, MOVEP 2000(LNCS2067, Springer, 2001) )

No input information is lost. An input can be recovered.

All sorts of symmetrial crypto algorithms. DES, AES, etc.

Bijective functions are reversible -- inverse function must exist (but may not be easy to find, like in case of cryptography). Encryption and decryption procedures can be swapped without loss of security.

A function is invertible if and only if it is a bijection

( A FIRST COURSE IN MATHEMATICAL LOGIC AND SET THEORY )

They are called bijections and they are truly symmetric, because they are invertible.
In the category of sets, an isomorphism is the same as a bijection.

( Bartosz Milewski - Category Theory for Programmers )

CRC32 for 32-bit input is bijective -- also, no collisions.

Permutations.

Bit rotations like ROR/ROL/RCR/RCL.

abs(unsigned int) -> unsigned int;

Also note that combining this functionality with "PyGILState_*()" APIs
is delicate, because these APIs assume a bijection between Python
thread states and OS-level threads, an assumption broken by the
presence of sub-interpreters.

( src )

When using data coming from a web browser or some other untrusted
source, a common technique is to check for illegal characters in a
string before using the string in a generated command line or storing
it in a database.  If you're doing this, be careful to check the
decoded string, not the encoded bytes data; some encodings may have
interesting properties, such as not being bijective or not being fully
ASCII-compatible.  This is especially true if the input data also
specifies the encoding, since the attacker can then choose a clever
way to hide malicious text in the encoded bytestream.

( src )

A linear map that is bijective (i.e. one-to-one and onto), is called an isomorphism.

( MATLAB®-based Finite Element Programming in Electromagnetic Modeling )

Often, the mapping between font dimensions and font parameters is bijective,

( WILL ROBERTSON, Philipp Stephani, Joseph Wright, Khaled Hosny, and others -- Experimental Unicode mathematical typesetting: The unicode-math package )


Mathematicians often say that bijective mappings/functions are both injective and surjective.

(Most mathematicians regard bijections as a special case of surjections, i.e. the expression
‘many-one correspondences’ is taken to include one–one correspondences.)

( H. Graham Flegg - From Geometry to Topology-Dover Publications (2001) )

A function f : A → B is a bijection if it is injective and surjective.

( Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, Clifford Stein -- Introduction To Algorithms (Mit Press) )

A function is bijective iff it is both injective and surjective.

( Jean H. Gallier - Logic for computer science_ foundations of automatic theorem proving )

 * A bijection is a mapping which is both an injection and a surjection.
 * Every source has precisely one target, and vice versa.

( src )

Surjection

Many inputs, less outputs.

Information loss. Hence, these functions are irreversible.

Hash functions like sha1(input size > 160 bits), CRC32(input size > 32 bits).

abs(signed int) -> signed int.

Bit shift operations -- bits are dropped if you shift left/right (without rotation).

Functions like tolower()/toupper() -- you couldn't recovered the original input.

 * A surjection is a mapping if every target has at least one source; also
 * known as an 'onto' mapping.

( src )

Injection

Unused outputs -- not all outputs are 'connected' to inputs.

No loss of information. Input can be restored.

CRC32(input size < 32 bits) -- input can be sometimes recovered as I shown in SAT/SMT by example.

Hash functions like sha1(input size < 160 bits) -- input can be recovered via bruteforce.

The Common Lisp standard defines a CHAR-CODE function, which is an injective mapping from the set of characters into the set of integers (the character codes of the corresponding characters). However, there’s not much else that’s said about this mapping in the standard, except for the fact that the ordering on the set of characters induced by < and their character codes has to be consistent with CHAR.
So the CHAR-CODE function is injective, but not surjective.

( Edmund Weitz - Common Lisp Recipes: A Problem-Solution Approach )

In computer science, a perfect hash function h for a set S is a hash function that maps distinct
elements in S to a set of m integers, with no collisions.
In mathematical terms, it is an injective function.

( wikipedia )

See also about perfect hash.

No collisions, but there can be unused outputs.

Mathematicians have a name for noncollapsing functions: they call them injective or one-to-one...

( Bartosz Milewski - Category Theory for Programmers )

...is called injective (also one-to-one, or an embedding)...

( James L. Hein - Discrete Structures, Logic, And Computability (2015) )

 * An injection is a mapping where a target has at most one source; also
 * somewhat confusingly known as a 'one-to-one' mapping.

( src )


List of my other blog posts.

Yes, I know about these lousy Disqus ads. Please use adblocker. I would consider to subscribe to 'pro' version of Disqus if the signal/noise ratio in comments would be good enough.