27-Jul-2011: Dataflow tracker

I have just added a module to my generic tracer, which I call the "dataflow tracker".

This module should be able to answer the question: "where is each byte received from the network RIGHT NOW?"

Its state is far from release quality, so I can't publish it yet.

But how it works is extremely simple. When a function like the socket recv() is called and receives a chunk of data from the network, the dataflow tracker (dt) marks each byte in the memory buffer with a label of the form:

<function_name>_#<call_number>_byte_0x<offset>

Each marked byte is then tracked across the process. When an instruction loads it from memory into a register, that part of the register is marked as "byte from this buffer + offset". When the byte is saved from a register back to memory, that is tracked too. Calls to memcpy() are tracked as well.

Here is an example: I send a typical "service_register_NSGR" packet to the Oracle TNS Listener.

Here is the packet dump:
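
To make the marking scheme concrete, here is a minimal sketch of it in Python. This is not the tracer's real code: the ShadowState class, its method names, and the addresses are all hypothetical, and the real module works on live instructions rather than on explicit calls like these. Only the label format is taken from the description above.

```python
# Hypothetical model of the byte-marking idea; not the tracer's actual code.

def make_label(function_name, call_number, offset):
    # Label format from the post: <function_name>_#<call_number>_byte_0x<offset>
    return f"{function_name}_#{call_number}_byte_0x{offset:x}"


class ShadowState:
    def __init__(self):
        self.memory = {}     # address -> label
        self.registers = {}  # (register name, byte index) -> label

    def on_recv(self, function_name, call_number, buf_addr, length):
        # Mark every received byte with its origin and offset.
        for offset in range(length):
            self.memory[buf_addr + offset] = make_label(
                function_name, call_number, offset)

    def on_load(self, register, addr, size):
        # Memory -> register: each register byte inherits the label (or none).
        for i in range(size):
            label = self.memory.get(addr + i)
            if label is None:
                self.registers.pop((register, i), None)
            else:
                self.registers[(register, i)] = label

    def on_store(self, register, addr, size):
        # Register -> memory: labels follow the bytes to their new address.
        for i in range(size):
            label = self.registers.get((register, i))
            if label is None:
                self.memory.pop(addr + i, None)
            else:
                self.memory[addr + i] = label

    def on_memcpy(self, dst, src, length):
        # Read all source labels first so overlapping copies stay consistent.
        labels = [self.memory.get(src + i) for i in range(length)]
        for i, label in enumerate(labels):
            if label is None:
                self.memory.pop(dst + i, None)
            else:
                self.memory[dst + i] = label


state = ShadowState()
state.on_recv("snttread", 1, 0x1000, 16)  # 16 bytes arrive at 0x1000
state.on_load("eax", 0x100e, 2)           # load packet bytes 0xe..0xf
state.on_store("eax", 0x2000, 2)          # save them elsewhere
state.on_memcpy(0x3000, 0x2000, 2)        # and copy them once more
print(state.memory[0x3001])               # snttread_#1_byte_0xf
```

After a load, a store and a memcpy(), the byte at 0x3001 still carries the label of packet byte 0xf, which is the whole point: the label answers "where is this byte now" at any moment.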

	  "\x02\xDE\x00\x00\x06\x00\x00\x00" # |........|
	  "\x00\x00\x00\x00\x02\xD4\x20\x08" # |........|
	  "\xFF\x03\x01\x00\x12\x34\x34\x34" # |.....444|
	  "\x34\x34\x78\x10\x10\x32\x10\x32" # |44x..2.2|
	  "\x10\x32\x10\x32\x10\x32\x54\x76" # |.2.2.2Tv|
	  "\x00\x78\x10\x32\x54\x76\x44\x00" # |.x.2TvD.|
	  "\x00\x80\x02\x00\x00\x00\x00\x04" # |........|
	  "\x00\x00\x70\xE4\xA5\x09\x90\x00" # |..p.....|
	  "\x23\x00\x00\x00\x42\x45\x43\x37" # |#...BEC7|
	  "\x36\x43\x32\x43\x43\x31\x33\x36" # |6C2CC136|
	  "\x2D\x35\x46\x39\x46\x2D\x45\x30" # |-5F9F-E0|
	  "\x33\x34\x2D\x30\x30\x30\x33\x42" # |34-0003B|
	  "\x41\x31\x33\x37\x34\x42\x33\x03" # |A1374B3.|
	  "\x00\x65\x00\x01\x00\x01\x00\x00" # |.e......|
	  "\x00\x00\x00\x00\x00\x00\x64\x02" # |......d.|
	  "\x00\x80\x05\x00\x00\x00\x00\x04" # |........|
	  "\x00\x00\x00\x00\x00\x00\x01\x00" # |........|
	  "\x00\x00\x10\x00\x00\x00\x02\x00" # |........|
	  "\x00\x00\x84\xC3\xCC\x07\x01\x00" # |........|
	  "\x00\x00\x84\x2F\xA6\x09\x00\x00" # |.../....|
	  "\x00\x00\x44\xA5\xA2\x09\x25\x98" # |..D...%.|
	  "\x18\xE9\x28\x50\x4F\x28\xBB\xAC" # |..(PO(..|
	  "\x15\x56\x8E\x68\x1D\x6D\x05\x00" # |.V.h.m..|
	  "\x00\x00\xFC\xA9\x36\x22\x0F\x00" # |....6"..|
	  "\x00\x00\x60\x30\xA6\x09\x0A\x00" # |..`0....|
	  "\x00\x00\x64\x00\x00\x00\x00\x00" # |..d.....|
	  "\x00\x00\xAA\x00\x00\x00\x00\x01" # |........|
	  "\x00\x00\x17\x00\x00\x00\x78\xC3" # |......x.|
	  "\xCC\x07\x6F\x72\x63\x6C\x00\x28" # |..orcl.(|
	  "\x48\x4F\x53\x54\x3D\x77\x69\x6E" # |HOST=win|
	  "\x32\x30\x30\x33\x29\x00\x01\x00" # |2003)...|
	  "\x00\x00\x09\x00\x00\x00\x01\x00" # |........|
	  "\x00\x00\x50\xC5\x2F\x22\x02\x00" # |..P./"..|
	  "\x00\x00\x34\xC5\x2F\x22\x00\x00" # |..4./"..|
	  "\x00\x00\x9C\xC5\xCC\x07\x6F\x72" # |......or|
	  "\x63\x6C\x5F\x58\x50\x54\x00\x09" # |cl_XPT..|
	  "\x00\x00\x00\x50\xC5\x2F\x22\x04" # |...P./".|
	  "\x00\x00\x00\x00\x00\x00\x00\x00" # |........|
	  "\x00\x00\x00\x00\x00\x00\x00\x34" # |.......4|
	  "\xC5\xCC\x07\x6F\x72\x63\x6C\x5F" # |...orcl_|
	  "\x58\x50\x54\x00\x01\x00\x00\x00" # |XPT.....|
	  "\x05\x00\x00\x00\x01\x00\x00\x00" # |........|
	  "\x84\xC5\x2F\x22\x02\x00\x00\x00" # |../"....|
	  "\x68\xC5\x2F\x22\x00\x00\x00\x00" # |h./"....|
	  "\xA4\xA5\xA2\x09\x6F\x72\x63\x6C" # |....orcl|
	  "\x00\x05\x00\x00\x00\x84\xC5\x2F" # |......./|
	  "\x22\x04\x00\x00\x00\x00\x00\x00" # |".......|
	  "\x00\x00\x00\x00\x00\x00\x00\x00" # |........|
	  "\x00\xFC\xC4\xCC\x07\x6F\x72\x63" # |.....orc|
	  "\x6C\x00\x01\x00\x00\x00\x10\x00" # |l.......|
	  "\x00\x00\x02\x00\x00\x00\xBC\xC3" # |........|
	  "\xCC\x07\x04\x00\x00\x00\xB0\x2F" # |......./|
	  "\xA6\x09\x00\x00\x00\x00\x00\x00" # |........|
	  "\x00\x00\x89\xC0\xB1\xC3\x08\x1D" # |........|
	  "\x46\x6D\xB6\xCF\xD1\xDD\x2C\xA7" # |Fm....,.|
	  "\x66\x6D\x0A\x00\x00\x00\x78\x2B" # |fm....x+|
	  "\xBC\x04\x7F\x00\x00\x00\x64\xA7" # |......d.|
	  "\xA2\x09\x0D\x00\x00\x00\x20\x2C" # |.......,|
	  "\xBC\x04\x11\x00\x00\x00\x95\x00" # |........|
	  "\x00\x00\x02\x20\x00\x80\x03\x00" # |........|
	  "\x00\x00\x98\xC5\x2F\x22\x00\x00" # |..../"..|
	  "\x00\x00\x00\x00\x00\x00\x0A\x00" # |........|
	  "\x00\x00\xB0\xC3\xCC\x07\x44\x45" # |......DE|
	  "\x44\x49\x43\x41\x54\x45\x44\x00" # |DICATED.|
	  "\x28\x41\x44\x44\x52\x45\x53\x53" # |(ADDRESS|
	  "\x3D\x28\x50\x52\x4F\x54\x4F\x43" # |=(PROTOC|
	  "\x4F\x4C\x3D\x42\x45\x51\x29\x28" # |OL=BEQ)(|
	  "\x50\x52\x4F\x47\x52\x41\x4D\x3D" # |PROGRAM=|
	  "\x43\x3A\x5C\x61\x70\x70\x5C\x41" # |C:\app\A|
	  "\x64\x6D\x69\x6E\x69\x73\x74\x72" # |dministr|
	  "\x61\x74\x6F\x72\x5C\x70\x72\x6F" # |ator\pro|
	  "\x64\x75\x63\x74\x5C\x31\x31\x2E" # |duct\11.|
	  "\x31\x2E\x30\x5C\x64\x62\x5F\x31" # |1.0\db_1|
	  "\x5C\x62\x69\x6E\x5C\x6F\x72\x61" # |\bin\ora|
	  "\x63\x6C\x65\x2E\x65\x78\x65\x29" # |cle.exe)|
	  "\x28\x41\x52\x47\x56\x30\x3D\x6F" # |(ARGV0=o|
	  "\x72\x61\x63\x6C\x65\x6F\x72\x63" # |racleorc|
	  "\x6C\x29\x28\x41\x52\x47\x53\x3D" # |l)(ARGS=|
	  "\x27\x28\x4C\x4F\x43\x41\x4C\x3D" # |'(LOCAL=|
	  "\x4E\x4F\x29\x27\x29\x29\x00\x4C" # |NO)')).L|
	  "\x4F\x43\x41\x4C\x20\x53\x45\x52" # |OCAL.SER|
	  "\x56\x45\x52\x00\x68\xC5\x2F\x22" # |VER.h./"|
	  "\x34\xC5\x2F\x22\x00\x00\x00\x00" # |4./"....|
	  "\x05\x00\x00\x00\x84\xC5\x2F\x22" # |....../"|
	  "\x04\x00\x00\x00\x00\x00\x00\x00" # |........|
	  "\x00\x00\x00\x00\x00\x00\x00\x00" # |........|
	  "\xFC\xC4\xCC\x07\x6F\x72\x63\x6C" # |....orcl|
	  "\x00\x09\x00\x00\x00\x50\xC5\x2F" # |.....P./|
	  "\x22\x04\x00\x00\x00\x00\x00\x00" # |".......|
	  "\x00\x00\x00\x00\x00\x00\x00\x00" # |........|
	  "\x00\x34\xC5\xCC\x07\x6F\x72\x63" # |.4...orc|
	  "\x6C\x5F\x58\x50\x54\x00"         # |l_XPT.  |

After processing, I can see how each byte of this packet is used. For example, let's open the oranro11.dll!ncrfgnid() function in IDA (this is in addition to the coloring and commenting functionality I added in gt 0.5beta):

Whoa, we see how packet bytes from 0xe to 0x13 are used! (snttread() is the Oracle function used as a recv() analog.)

Here is also a long dataflow tracker dump showing where EACH byte of the packet is used, by function and offset:

http://conus.info/blogs.conus.info-files/dt_example.txt
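
A dump like this can be condensed into statements such as "bytes 0xe to 0x13 are used in this function". Below is a small sketch of that step. It assumes the uses are already available as (function, packet offset) pairs; the real format of the dump file is not described here, so the parsing is left out, and both helper names are hypothetical.

```python
from collections import defaultdict


def offsets_by_function(uses):
    # uses: iterable of (function name, packet offset) pairs (assumed input shape)
    grouped = defaultdict(set)
    for function, offset in uses:
        grouped[function].add(offset)
    return grouped


def contiguous_ranges(offsets):
    # Collapse offsets into inclusive (first, last) runs of consecutive values.
    ranges = []
    for offset in sorted(offsets):
        if ranges and offset == ranges[-1][1] + 1:
            ranges[-1][1] = offset
        else:
            ranges.append([offset, offset])
    return [(first, last) for first, last in ranges]


# Offsets 0xe..0x13 match the ncrfgnid() example above; 0x20 is made up.
uses = [("ncrfgnid", o) for o in range(0xe, 0x14)] + [("ncrfgnid", 0x20)]
for function, offsets in offsets_by_function(uses).items():
    for first, last in contiguous_ranges(offsets):
        print(f"{function}: 0x{first:x}..0x{last:x}")
```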

If you look carefully, you'll also notice that some parts of the packet are ASCII strings. In the dump we can see that these parts are used in the Oracle generic library, in functions like "string compare", etc. We can find out very quickly which strings a string from the packet is compared against, and a lot more!

It is also possible to list the packet bytes in the order in which they were used. Then we would clearly see that the first part of the packet to be checked is the packet header, especially the packet size field (it is compared against what the recv() analog returned), and then the rest is checked.
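
Listing bytes by time of use is a small step once every use is logged in execution order. Here is a sketch under that assumption. The event list and the function names in it (check_size, check_type, log_header) are invented for illustration and do not come from Oracle or from the real dump.

```python
def first_use_order(events):
    # events: (label, function) pairs in execution order (assumed input shape).
    # A dict keeps insertion order, so the first use of each label wins
    # and the result comes out ordered by time of first use.
    first_seen = {}
    for label, function in events:
        if label not in first_seen:
            first_seen[label] = function
    return list(first_seen.items())


# Made-up trace: the size field is touched first, then the rest of the header.
events = [
    ("snttread_#1_byte_0x0", "check_size"),
    ("snttread_#1_byte_0x1", "check_size"),
    ("snttread_#1_byte_0x4", "check_type"),
    ("snttread_#1_byte_0x0", "log_header"),  # later reuse, ignored
]
for label, function in first_use_order(events):
    print(label, "first used in", function)
```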

Stay tuned!

