(It's assumed that the reader is familiar with topological sorting. (Re-)read the "Dependency graphs and topological sorting" in my book. Also with graphs, DAGs, callgraphs, simulated annealing.)
It's a problem with topological sorting --- no cycles in DAG allowed.
I can fix this. I'll find an order for some vertices so that the number of backward references would be minimized and forward references --- maximized. That would be the fitness function. I use simulated annealing, but many other similar algorithms would also work, I believe.
Now I take an old Linux 0.99.15 (modern Linux is too big for that). I find callgraph for it and find a fuzzy topological order.
That results in such list:
unzip isofs_read_super seagate_st0x_detect ext_link leak_check_malloc sys_gettimeofday sysv_file_write main opl3_close con_write ... OUTB inb put_fs_long outw NCR5380_reselect EXT2_INODES_PER_GROUP sound_getconf ; fitness at end: ; fitness_fn() fwd= 11494 ; fitness_fn() back=50
Low-level functions on bottom, high-level functions on top. Funny thing - inb() and outw() (direct access to I/O ports) are low-level functions and con_write() is a high-level function. Indeed, Linus started kernel as a terminal emulator.
From a reverse engineer's point of view, we can quickly see the level of a function. How high or low it is. Level of abstraction. That maybe very useful. Because many reverse engineers often try to understand, what this or that function does --- is it on top of things, or small low-level utility?
Now also Boolector SMT solver:
translate_formula parse_smt2_parser btor_proputils_select_move_prop apply_add_add_4_eq btor_node_create_bv_const set_up_dual_and_collect ... getc abs BTOR_COUNT_STACK ccadical_melt BTOR_SIZE_STACK ccadical_add mpz_get_str ; fitness at end: ; fitness_fn() fwd= 17448 ; fitness_fn() back=101
Here, parse_smt2_parser() is a very high-level function and ccadical_add() is one of at the very low - indeed, this is the function that add term/literal to Cadical SAT solver.
All files, including the utility to find order, in C++.
Further work: use this to sort basic blocks in function, to map better visual layout to execution timeline.
