chunker: fix invalid use of types

With the argument specified as unsigned char *, Cython emits
code in the Python wrapper to convert string-like objects to
unsigned char* (essentially PyBytes_AS_STRING).

Because len(data) is then called on a cdef'd string-ish type,
Cython emits a strlen() call on the result of PyBytes_AS_STRING.

This is not correct: strlen() stops at the first null byte, and
embedded null bytes are entirely possible in the data.
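
The difference can be reproduced from plain Python. The sketch below is
only an illustration, not the code Cython generates: it uses ctypes to
mimic strlen(), and both the helper name c_string_length and the sample
payload are hypothetical.

```python
import ctypes


def c_string_length(data: bytes) -> int:
    """Mimic strlen(): count the bytes before the first null byte."""
    # create_string_buffer copies the data and appends a trailing NUL.
    buf = ctypes.create_string_buffer(data)
    # .value reads the buffer as a C string, so it stops at the first NUL.
    return len(buf.value)


# Hypothetical input with an embedded null byte.
payload = b"abc\x00defgh"

print(c_string_length(payload))  # 3: everything after the NUL is ignored
print(len(payload))              # 9: the real length of the bytes object
```

A hash computed over the strlen() length would cover only the first three
bytes of this payload.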

Incidentally, the code generated by Cython was also not correct,
since the Clang Static Analyzer found a path of execution where
passing arguments in a weird way from Python resulted in strlen(NULL).

With data left untyped and cast only at the call site, Cython emits
essentially:

    c_buzhash(
        PyBytes_AS_STRING(data),
        PyObject_Length(data),
        ...
    )

which is correct.
Marian Beermann 2017-06-14 19:16:36 +02:00
parent 8e477414ee
commit faf2d0b537
1 changed file with 2 additions and 2 deletions

@@ -50,11 +50,11 @@ cdef class Chunker:
         return chunker_process(self.chunker)
-def buzhash(unsigned char *data, unsigned long seed):
+def buzhash(data, unsigned long seed):
     cdef uint32_t *table
     cdef uint32_t sum
     table = buzhash_init_table(seed & 0xffffffff)
-    sum = c_buzhash(data, len(data), table)
+    sum = c_buzhash(<const unsigned char *> data, len(data), table)
     free(table)
     return sum