I need to XOR every byte of a bytearray quickly. The pure-Python version
for i in range(len(str1)): str1[i]=str1[i] ^ 55
is very slow.
So I wrote this module in C.
I know C very poorly; I had never written anything in it before.
With the variant
PyArg_ParseTuple (args, "s", &str))
everything works as expected, but I need to use s* instead of s because the data can contain embedded nulls. When I change s to s*, Python crashes on the call:
PyArg_ParseTuple (args, "s*", &str)) // crash
In case some other beginner wants to use this example as a starting point, here is everything needed to build and run it on Windows.
The format codes are documented in "Parsing arguments and building values": http://docs.python.org/dev/c-api/arg.html
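The embedded-null problem can be seen from Python itself with the standard ctypes module. This is only a small illustration of a NUL-terminated view versus a pointer-plus-length view of the same bytes, written in Python 3 syntax (an assumption; the rest of this post targets Python 2.6). It is not what PyArg_ParseTuple does internally.

```python
import ctypes

data = b"Wo\x00rld"                       # 6 bytes with a NUL in the middle
buf = ctypes.create_string_buffer(data)   # C char array: data plus a trailing NUL

c_string_view = buf.value                 # what NUL-terminated C code sees
sized_view = buf.raw[:len(data)]          # pointer plus explicit length

print(c_string_view)   # b'Wo'
print(sized_view)      # b'Wo\x00rld'
```

Any C code that relies on the terminator alone loses everything after the first null, which is why a format that also reports the length is needed.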
test_xor.c
#include <Python.h>

static PyObject* fast_xor(PyObject* self, PyObject* args)
{
    const char* str ;
    int i;

    if (!PyArg_ParseTuple(args, "s", &str))
        return NULL;
    for(i=0;i<sizeof(str);i++) {str[i]^=55;};
    return Py_BuildValue("s", str);
}

static PyMethodDef fastxorMethods[] =
{
    {"fast_xor", fast_xor, METH_VARARGS, "fast_xor desc"},
    {NULL, NULL, 0, NULL}
};

PyMODINIT_FUNC
initfastxor(void)
{
    (void) Py_InitModule("fastxor", fastxorMethods);
}
test_xor.py
import fastxor
a=fastxor.fast_xor("World") # works with "s"
print a
a=fastxor.fast_xor("Wo\0rld") # does not work with "s"; this is why I want "s*"
compile.bat
rem use http://bellard.org/tcc/
tiny_impdef.exe C:\Python26\python26.dll
tcc -shared test_xor.c python26.def -IC:\Python26\include -LC:\Python26\libs -ofastxor.pyd
test_xor.py
You don't need to build an extension module to do this quickly; you can use NumPy. But to answer your question, you need C code like this:
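#define PY_SSIZE_T_CLEAN  /* make "s#" report the length as Py_ssize_t */
#include <Python.h>
#include <stdlib.h>

static PyObject * fast_xor(PyObject* self, PyObject* args)
{
    const char* str;
    char * buf;
    Py_ssize_t count;
    Py_ssize_t i;
    PyObject * result;

    /* "s#" yields a pointer plus an explicit length, so embedded nulls are fine */
    if (!PyArg_ParseTuple(args, "s#", &str, &count))
    {
        return NULL;
    }
    /* the input string is immutable, so XOR into a scratch copy;
       allocate at least 1 byte so an empty input is not mistaken for failure */
    buf = (char *)malloc(count > 0 ? (size_t)count : 1);
    if (buf == NULL)
    {
        return PyErr_NoMemory();
    }
    for(i=0;i<count;i++)
    {
        buf[i]=str[i] ^ 55;
    }
    result = Py_BuildValue("s#", buf, count);
    free(buf);
    return result;
}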
You can't change the contents of a string object, because strings in Python are immutable. You can use "s#" to get the char * pointer and the buffer length.
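The immutability point is easy to check from the interpreter. A minimal sketch, assuming Python 3 syntax (where the immutable byte string type is bytes): item assignment on the immutable object is rejected, while a bytearray copy can be changed in place.

```python
data = b"World"

try:
    data[0] ^= 55          # bytes objects reject item assignment
    mutated = True
except TypeError:
    mutated = False

buf = bytearray(data)      # mutable copy of the same bytes
buf[0] ^= 55               # fine: modified in place

print(mutated, bytes(buf))  # False b'`orld'
```

This is the same reason the C code writes into a separate buffer instead of through the pointer it received.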
If you can use NumPy:
In [1]: import fastxor
In [2]: a = "abcdsafasf12q423\0sdfasdf"
In [3]: fastxor.fast_xor(a)
Out[3]: 'VUTSDVQVDQ\x06\x05F\x03\x05\x047DSQVDSQ'
In [5]: import numpy as np
In [6]: (np.frombuffer(a, np.int8)^55).tostring()
Out[6]: 'VUTSDVQVDQ\x06\x05F\x03\x05\x047DSQVDSQ'
In [7]: a = a*10000
In [8]: %timeit fastxor.fast_xor(a)
1000 loops, best of 3: 877 us per loop
In [15]: %timeit (np.frombuffer(a, np.int8)^55).tostring()
1000 loops, best of 3: 1.15 ms per loop
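If neither an extension module nor NumPy is available, a lookup table with bytes.translate is another option. This is a different technique from the ones timed above, sketched here in Python 3 syntax; the function name xor_bytes is made up for the illustration, and the key is assumed to be in the range 0 to 255.

```python
def xor_bytes(data, key=55):
    """Return a new bytes object with every byte of data XORed with key."""
    # 256-entry table: position b holds b ^ key
    table = bytes(b ^ key for b in range(256))
    # translate() maps every byte through the table inside the interpreter's C code
    return bytes(data).translate(table)

original = b"Wo\x00rld"
encoded = xor_bytes(original)
print(encoded)                          # b'`X7E[S'
print(xor_bytes(encoded) == original)   # True
```

Embedded nulls are handled like any other byte, and applying the function twice with the same key restores the input. No timing is claimed here; measure it against the other variants on your own data.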
An alternative approach is to use PyObject_GetBuffer. The module below defines fast_xor for any object that supports the buffer protocol, and fast_xor_inplace for objects that have writable buffers, such as bytearray. The in-place version returns None. I also added a 2nd unsigned char argument with a default value of 55.
Example:
>>> s = 'abc'
>>> b = bytearray(s)
>>> fast_xor(s), fast_xor(s, 0x20)
('VUT', 'ABC')
>>> fast_xor_inplace(b, 0x20)
>>> b
bytearray(b'ABC')
>>> fast_xor_inplace(s)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
BufferError: Object is not writable.
>>> fast_xor(b, 256)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
OverflowError: unsigned byte integer is greater than maximum
Source:
#include <Python.h>

static PyObject *fast_xor_inplace(PyObject *self, PyObject *args)
{
    PyObject *arg1;
    unsigned char arg2 = 55;
    Py_buffer buffer;
    char *buf;
    Py_ssize_t i;  /* buffer.len is a Py_ssize_t */

    if (!PyArg_ParseTuple(args, "O|b:fast_xor_inplace", &arg1, &arg2))
        return NULL;
    /* fails with BufferError if the object is read-only */
    if (PyObject_GetBuffer(arg1, &buffer, PyBUF_WRITABLE) < 0)
        return NULL;
    buf = buffer.buf;
    for(i=0; i < buffer.len; i++)
        buf[i] ^= arg2;
    PyBuffer_Release(&buffer);
    Py_INCREF(Py_None);
    return Py_None;
}

static PyObject *fast_xor(PyObject *self, PyObject *args)
{
    PyObject *arg1;
    unsigned char arg2 = 55;
    PyObject *result;
    Py_buffer buffer;
    char *buf, *str;
    Py_ssize_t i;

    if (!PyArg_ParseTuple(args, "O|b:fast_xor", &arg1, &arg2))
        return NULL;
    if (PyObject_GetBuffer(arg1, &buffer, PyBUF_SIMPLE) < 0)
        return NULL;
    /* allocate an uninitialized string of the same length for the output */
    result = PyString_FromStringAndSize(NULL, buffer.len);
    if (result == NULL) {
        PyBuffer_Release(&buffer);  /* do not leak the buffer on failure */
        return NULL;
    }
    buf = buffer.buf;
    str = PyString_AS_STRING(result);
    for(i=0; i < buffer.len; i++)
        str[i] = buf[i] ^ arg2;
    PyBuffer_Release(&buffer);
    return result;
}

static PyMethodDef fastxorMethods[] =
{
    {"fast_xor", fast_xor, METH_VARARGS, "fast xor"},
    {"fast_xor_inplace", fast_xor_inplace, METH_VARARGS, "fast inplace xor"},
    {NULL, NULL, 0, NULL}
};

PyMODINIT_FUNC
initfastxor(void)
{
    Py_InitModule3("fastxor", fastxorMethods, "fast xor functions");
}
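The writable versus read-only distinction that fast_xor_inplace relies on can also be explored from Python through memoryview. The sketch below mirrors the in-place function's behaviour in Python 3 syntax; xor_inplace is a hypothetical name, the object is assumed to expose a flat buffer of unsigned bytes (as bytearray does), and the loop runs at Python speed, so it shows the semantics rather than the performance.

```python
def xor_inplace(obj, key=55):
    """XOR every byte of a writable buffer with key; returns None."""
    view = memoryview(obj)             # request the object's buffer
    if view.readonly:
        raise BufferError("Object is not writable.")
    for i in range(len(view)):
        view[i] ^= key                 # writes through to the underlying object

b = bytearray(b"abc")
xor_inplace(b, 0x20)
print(b)                               # bytearray(b'ABC')

try:
    xor_inplace(b"abc")                # bytes exposes a read-only buffer
except BufferError as exc:
    print(exc)                         # Object is not writable.
```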