Fix imphash issue with empty import names. (#1944)
* Fix imphash issue with empty import names.
If an import has an empty name skip processing it. This is consistent with the
behavior of pefile (https://github.com/erocarrera/pefile/blob/593d094e35198dad92aaf040bef17eb800c8a373/pefile.py#L5871-L5872).
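The skip can be sketched in Python as follows. This is a simplified illustration of the imphash idea, not YARA's or pefile's actual code: `imphash_sketch` and its input shape are hypothetical, and ordinal-only imports are left out entirely.

```python
import hashlib


def imphash_sketch(imports):
    """Hash (dll_name, [function_name, ...]) pairs, imphash-style.

    Hypothetical helper: a simplified model that ignores ordinal imports.
    """
    parts = []
    for dll, names in imports:
        lib = dll.lower()
        # Drop a trailing .dll/.ocx/.sys extension from the library name.
        stem, dot, ext = lib.rpartition(".")
        if dot and ext in ("dll", "ocx", "sys"):
            lib = stem
        for name in names:
            if not name:
                # An empty import name contributes nothing to the hash.
                continue
            parts.append(f"{lib}.{name.lower()}")
    return hashlib.md5(",".join(parts).encode("ascii")).hexdigest()
```

With this sketch, a table whose first import name is empty hashes the same as a table that never had that entry.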
Add a test case which is just the tiny test file with the first import name set
to all NULL bytes. I tested that pefile calculates the imphash of it, which
matches what YARA now calculates too:
```
>>> import pefile
>>> pe = pefile.PE('/Users/wxs/src/yara/tests/data/tiny_empty_import_name')
>>> pe.get_imphash()
'0eff3a0eb037af8c1ef0bada984d6af5'
>>>
```
Fixes #1943
* Add test file forgotten in last commit.
* Handle invalid import names.
If an imported function name contains anything other than a-zA-Z0-9 and a small
subset of special characters, the import is now ignored. This aligns us better
with pefile, which validates import names and skips the invalid ones.
I've also updated the test file to check that these special characters are
handled properly.
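A minimal Python sketch of this kind of check is shown below. The exact set of special characters is an assumption made for illustration (the message only says "a small subset"), and `is_valid_import_name` is a hypothetical name, not a function from YARA or pefile.

```python
import string

# ASSUMPTION: the special characters below are illustrative only; the
# commit message does not list the accepted set.
ALLOWED_BYTES = frozenset(
    (string.ascii_letters + string.digits + "_?@$()<>").encode("ascii")
)


def is_valid_import_name(name: bytes) -> bool:
    """Return True only if name is non-empty and every byte is allowed."""
    return len(name) > 0 and all(byte in ALLOWED_BYTES for byte in name)
```

Under this assumed set, a decorated C++ name such as `b"??0Foo@@QAE@XZ"` passes, while an all-NULL or space-containing name is rejected and would be skipped.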
* Fix test after alignment with pefile.
Turns out that the tiny-idata-5200 file is corrupted to the point that pefile
doesn't parse it. For example, it finds no imports to hash:
```
>>> pe = pefile.PE('tests/data/tiny-idata-5200')
>>> pe.get_imphash()
''
>>>
```
Before these changes we were parsing imports from this file that were incorrect,
so fix the tests to reflect the fact that we no longer parse any imports from
it.
As part of this I've split the checks for the number of parsed imports and the
number of successfully parsed imports into two different counters. We now
accurately reflect the case where we can parse the import table but not the
descriptors in it, while still making sure we don't parse too many, as we have
seen before.
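The two-counter idea can be sketched like this. All names here (`walk_descriptors`, `parse`, `max_descriptors`) are hypothetical and the cap value is arbitrary; the sketch only shows why attempted and successful parses are counted separately.

```python
def walk_descriptors(descriptors, parse, max_descriptors=4):
    """Walk import descriptors with separate attempt and success counts.

    Hypothetical sketch: parse(desc) returns a parsed import or None.
    """
    num_seen = 0      # descriptors we attempted; bounds the total work
    num_parsed = 0    # descriptors that yielded a usable import
    results = []
    for desc in descriptors:
        if num_seen >= max_descriptors:
            break  # stop even if nothing has parsed successfully
        num_seen += 1
        parsed = parse(desc)
        if parsed is not None:
            num_parsed += 1
            results.append(parsed)
    return num_seen, num_parsed, results
```

A file whose import table is readable but whose descriptors are all corrupt then shows up as `num_seen > 0` with `num_parsed == 0`, and the cap still applies to the attempts.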
* Move declarations to align better.
Move these declarations so they are with the rest of the constants. This makes
the output of -D easier to read.
* Fix memory leak when handling corrupt imports.
If we have an invalid import name we need to free the name and continue to the
next thunk and function. While fixing this I noticed that if we fail to alloc an
IMPORT_FUNCTION* we would end up looping endlessly because we were never
incrementing the thunk pointer or function index. Fix it by ALWAYS incrementing
those at the end of the loop and conditionally populating the newly allocated
node.
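The loop shape described above can be modeled in Python. This is only a control-flow sketch with hypothetical names (`collect_functions`, `is_valid`, `make_node`); Python has no manual free, so the leak half of the fix is not modeled, and `make_node` returning `None` stands in for a failed allocation.

```python
def collect_functions(thunk_names, is_valid, make_node):
    """Walk thunks, advancing unconditionally at the end of each pass.

    Hypothetical sketch: make_node(name) returns None to model a failed
    allocation of an IMPORT_FUNCTION node.
    """
    functions = []
    index = 0
    while index < len(thunk_names):
        name = thunk_names[index]
        node = make_node(name) if is_valid(name) else None
        if node is not None:
            # Only populate the list when we actually have a node.
            functions.append(node)
        # ALWAYS advance, whether the name was invalid, the allocation
        # failed, or the node was stored; otherwise the loop never ends.
        index += 1
    return functions
```

Because the increment sits outside every conditional, neither an invalid name nor a failed allocation can keep the loop on the same thunk.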