Use ca_kit to Rapidly Establish a CA

I began using ca_kit so often that it became inconvenient not having it formally packaged and uploaded to PyPI. So, I’ve built it into a formal package. The following scripts are published into the path, upon install:

  • ck_create_ca: Create CA certificates
  • ck_create: Create regular certificates
  • ck_sign: Sign a regular certificate against the CA certificate
  • ck_verify_ca: Verify that a signed certificate matches the CA certificate

I hope this is as indisposable to you as it is to me.

DEFLATE Socket Compression in Python

Properly-working DEFLATE compression is elusive in Python. Thanks to wtolson, I’ve found such a solution.

This is an example framework for establishing the compression and decompression objects:

def activate_zlib():
    import zlib

    wbits = -zlib.MAX_WBITS

    compress = zlib.compressobj(level, zlib.DEFLATED, wbits)

    compressor = lambda x: compress.compress(x) + \
                            compress.flush(zlib.Z_SYNC_FLUSH)

    decompress = zlib.decompressobj(wbits)
    
    decompressor = lambda x: decompress.decompress(x) + \
                                decompress.flush()
    
    return (compressor, decompressor)

With this example, you’ll pass all of your outgoing data through the compressor, and all of your incoming data to the decompressor. As always, you’ll do a read-loop until you’ve decompressed the expected number of bytes.

Method Overloads in Python 3.4

Python 3.4 added a “singledispatch” decorator to functools, which provides method overloads. This enables you to perform different operations based on the type of the first argument.

By default, it prefers to work with static methods. This mostly comes from the link above:

import functools


class TestClass(object):
    @functools.singledispatch
    def test_method(arg):
        print("Let me just say,", end=" ")
        print(arg)

    @test_method.register(int)
    def _(arg):
        print("Strength in numbers, eh?", end=" ")
        print(arg)

    @test_method.register(list)
    def _(arg):
        print("Enumerate this:")

        for i, elem in enumerate(arg):
            print(i, elem)

if __name__ == '__main__':
    TestClass.test_method(55555)
    TestClass.test_method([33, 22, 11])

However, there is a low-impact way to get overloading on instance-methods, too. We’ll just place our own wrapper around the standard singledispatch wrapper, and hijack the bulk of the functionality:

import functools

def instancemethod_dispatch(func):
    dispatcher = functools.singledispatch(func)
    def wrapper(*args, **kw):
        return dispatcher.dispatch(args[1].__class__)(*args, **kw)
    wrapper.register = dispatcher.register
    functools.update_wrapper(wrapper, func)
    return wrapper


class TestClass2(object):
    @instancemethod_dispatch
    def test_method(self, arg):
        print("2: Let me just say,", end=" ")
        print(arg)

    @test_method.register(int)
    def _(self, arg):
        print("2: Strength in numbers, eh?", end=" ")
        print(arg)

    @test_method.register(list)
    def _(self, arg):
        print("2: Enumerate this:")

        for i, elem in enumerate(arg):
            print(i, elem)

if __name__ == '__main__':
    t = TestClass2()
    t.test_method(55555)
    t.test_method([33, 22, 11])

Aside from superficial changes to the original example, we just added the instancemethod_dispatch function and updated the methods to take a “self” argument.

A special thanks to Zero Piraeus for penning the instancemethod_dispatch method (under the original name of “methdispatch”).

Hidden in Plain Site: The Python print() Statement

The use of print() is so commonplace and thoughtless that it’s easy to forget that it’s still a function. There are parameters often neglected. You may even find yourself using sys.stdout to avoid the automatic newline, which is folly.

This is the signature, as of 3.4:

print(*objects, sep=' ', end='\n', file=sys.stdout, flush=False)

None of the parameters need an explanation. The flush parameter was added in 3.3 .

A Compatible Way of Getting Package-Paths in Python

The Python community is always looking to improve things in the way of consistency, and that’s its best and worst feature because from time to time, one method isn’t finished before another method is begun.

For example, when it comes to something, like packaging, you may need to account for what you need in several different ways, such as identifying non-Python files in both the “package data” clauses of your setup attributes and listing them in a MANIFEST.in (one is considered only when building source distributions, and another is considered only when building binary distributions). However, in order to do this, you also have to embed the non-Python files within one of your actual source directories (the “package” directories), because the package-data files are made to belong to particular packages. We won’t even talk about the complexities of source and binary packages when packaging into a wheel. Such divergences are the topic of many entire series of articles.

With similar compatibility problems, in order to use one of the module loaders, for the purposes of reflection, the recommended package will vary depending on whether you’re running 2.x, 3.2/3.3, and 3.4 .

It’s a pain. For your convenience, this is such a flow, used to determine the path of the package:

_MODULE_NAME = 'module name'
_APP_PATH = None

# Works in 3.4

try:
    import importlib.util
    _ORIGIN = importlib.util.find_spec(_MODULE_NAME).origin
    _APP_PATH = os.path.abspath(os.path.dirname(_ORIGIN))
except:
    pass

# Works in 3.2

if _APP_PATH is None:
    try:
        import importlib
        _INITFILEPATH = importlib.find_loader(_MODULE_NAME).path
        _APP_PATH = os.path.abspath(os.path.dirname(_INITFILEPATH))
    except:
        pass

# Works in 2.x

if _APP_PATH is None:
    import imp
    _APP_PATH = imp.find_module(_MODULE_NAME)[1]

Doing Proper Encryption with Rijndael using [Pure] Python

When it comes to writing code that portably encrypts and decrypts without concern for the platform, you’ll have to overcome a couple of small factors with Python:

  • You’ll have to install a cryptography library
  • You’ll need code that doesn’t need a build environment (no C-code)

I personally prefer Rijndael (also called “AES” when you constrain yourself to certain key-sizes and/or block-sizes) as an affordable, ubiquitous algorithm. There was a remarkable amount of difficulty in finding a pure-Python algorithm that worked in both Python 2 and Python 3.

By mentioning “proper” encryption in the title, I was referring to “key derivation” (also called “key expansion”). It’s not enough to have a password and an algorithm. Unless you already have a sufficiently random key of exactly the right number of bits, you’ll need to submit the password to a key-expansion algorithm in order to generate a key of the right length. This is often done using the PBKDF2 algorithm.

It was difficult to find a Python 2 and 3 version of PBKDF2 as well.

I’ve posted the pprp package, which packages Python 2 and Python 3 versions of Rijndael and PBKDF2 and automatically chooses the right version during module-load. As I implement PKCS7 for padding, I also include a utility function to trim the padding.

The main documentation provides the following as an example:

import io
import os.path
import hashlib

import pprp

def trans(text):
    return text.encode('ASCII') if sys.version_info[0] >= 3 else text

passphrase = trans('password')
salt = trans('salt')
block_size = 16
key_size = 32
data = "this is a test" * 100

key = pprp.pbkdf2(passphrase, salt, key_size)

def source_gen():
    for i in range(0, len(data), block_size):
        block = data[i:i + block_size]
        len_ = len(block)

        if len_ > 0:
            yield block.encode('ASCII')

        if len_ < block_size:
            break

# Pump the encrypter output into the decrypter.
encrypted_gen = pprp.rjindael_encrypt_gen(key, source_gen(), block_size)

# Run, and sink the output into an IO stream. Trim the padding off the last 
# block.

s = io.BytesIO()
ends_at = 0
for block in pprp.rjindael_decrypt_gen(key, encrypted_gen, block_size):
    ends_at += block_size
    if ends_at >= len(data):
        block = pprp.trim_pkcs7_padding(block)

    s.write(block)

decrypted = s.getvalue()
assert data == decrypted.decode('ASCII')

Generating Signatures with 25519 Curves in Python

It’s not very often that you need to do encryption in pure Python. Obviously, you’d want to have this functionality off-loaded to C-routines whenever possible. However, in some situations, often when you require code portability or that there not be any C-compilation phase, a pure-Python implementation can be convenient.

The 25519 elliptic curve is a special curve defined by Bernstein, rather than NIST. It is fairly well-embraced as an alternative to NIST-generated curves, which are assumed to be either influenced or compromised by the NSA. See Aris’ notable 25519-patch contribution to OpenSSH.

There exists a pure-Python implementation of 25519 at this site, but it’s too expensive to use in a high-volume platform (because of the math involved). However, it’s perfect if it’s to be driven by infrequent events, and a delay of a few seconds is fine. The benefits of an EC, if you don’t already know, is that the factors are usually considerably smaller for strength comparable to traditional, non-EC algorithms. Therefore, your trade-off for a couple of more seconds of computation is an enigmatically-shorter signature.

To use it, grab the Python source, included here for posterity:

import hashlib

b = 256
q = 2**255 - 19
l = 2**252 + 27742317777372353535851937790883648493

def H(m):
  return hashlib.sha512(m).digest()

def expmod(b,e,m):
  if e == 0: return 1
  t = expmod(b,e/2,m)**2 % m
  if e & 1: t = (t*b) % m
  return t

def inv(x):
  return expmod(x,q-2,q)

d = -121665 * inv(121666)
I = expmod(2,(q-1)/4,q)

def xrecover(y):
  xx = (y*y-1) * inv(d*y*y+1)
  x = expmod(xx,(q+3)/8,q)
  if (x*x - xx) % q != 0: x = (x*I) % q
  if x % 2 != 0: x = q-x
  return x

By = 4 * inv(5)
Bx = xrecover(By)
B = [Bx % q,By % q]

def edwards(P,Q):
  x1 = P[0]
  y1 = P[1]
  x2 = Q[0]
  y2 = Q[1]
  x3 = (x1*y2+x2*y1) * inv(1+d*x1*x2*y1*y2)
  y3 = (y1*y2+x1*x2) * inv(1-d*x1*x2*y1*y2)
  return [x3 % q,y3 % q]

def scalarmult(P,e):
  if e == 0: return [0,1]
  Q = scalarmult(P,e/2)
  Q = edwards(Q,Q)
  if e & 1: Q = edwards(Q,P)
  return Q

def encodeint(y):
  bits = [(y >> i) & 1 for i in range(b)]
  return ''.join([chr(sum([bits[i * 8 + j] << j for j in range(8)])) for i in range(b/8)])

def encodepoint(P):
  x = P[0]
  y = P[1]
  bits = [(y >> i) & 1 for i in range(b - 1)] + [x & 1]
  return ''.join([chr(sum([bits[i * 8 + j] << j for j in range(8)])) for i in range(b/8)])

def bit(h,i):
  return (ord(h[i/8]) >> (i%8)) & 1

def publickey(sk):
  h = H(sk)
  a = 2**(b-2) + sum(2**i * bit(h,i) for i in range(3,b-2))
  A = scalarmult(B,a)
  return encodepoint(A)

def Hint(m):
  h = H(m)
  return sum(2**i * bit(h,i) for i in range(2*b))

def signature(m,sk,pk):
  h = H(sk)
  a = 2**(b-2) + sum(2**i * bit(h,i) for i in range(3,b-2))
  r = Hint(''.join([h[i] for i in range(b/8,b/4)]) + m)
  R = scalarmult(B,r)
  S = (r + Hint(encodepoint(R) + pk + m) * a) % l
  return encodepoint(R) + encodeint(S)

def isoncurve(P):
  x = P[0]
  y = P[1]
  return (-x*x + y*y - 1 - d*x*x*y*y) % q == 0

def decodeint(s):
  return sum(2**i * bit(s,i) for i in range(0,b))

def decodepoint(s):
  y = sum(2**i * bit(s,i) for i in range(0,b-1))
  x = xrecover(y)
  if x & 1 != bit(s,b-1): x = q-x
  P = [x,y]
  if not isoncurve(P): raise Exception("decoding point that is not on curve")
  return P

def checkvalid(s,m,pk):
  if len(s) != b/4: raise Exception("signature length is wrong")
  if len(pk) != b/8: raise Exception("public-key length is wrong")
  R = decodepoint(s[0:b/8])
  A = decodepoint(pk)
  S = decodeint(s[b/8:b/4])
  h = Hint(encodepoint(R) + pk + m)
  if scalarmult(B,S) != edwards(R,scalarmult(A,h)):
    raise Exception("signature does not pass verification")

These routines are not Python 3 compatible.

For reference, other implementations in other languages can be found at the Wikipedia page, under “External links”.

To implement the Bernstein version above, you must first generate a signing (private) key. This comes from a random 32-bit number with some bits flipped:

import os

signing_key_bytes = [ord(os.urandom(1)) for i in range(32)]

signing_key_bytes[0] &= 248
signing_key_bytes[31] &= 127
signing_key_bytes[31] |= 64

signing_key = ''.join([chr(x) for x in signing_key_bytes])

To produce the signature, and then check it:

import ed25519

message = "some data"

public_key = ed25519.publickey(signing_key)

signature = ed25519.signature(message, signing_key, public_key)
ed25519.checkvalid(signature, message, public_key)

The signing and the ensuing check take about six-seconds, together.

The ed25519 module is the one whose source is displayed above. The name “Ed25519” (the official name of the algorithm) stands for “Edwards Curve”, the mathematical foundation on which Bernstein designed the 25519 curve.