## Featured Post

### Có một Biển Đông trên không gian mạng

Có một Biển Đông trên không gian mạng Thái Dương Mùa hè 2014, giữa lúc người Việt trong nước và hải ngoại đang sôi sục vì Trung Quốc đư...

## Tuesday, April 22, 2014

### Stats

This week

All time

I guess I should blog in English more often.

## Friday, April 18, 2014

### Fairy tales in password hashing with scrypt

Update: the author of scrypt said that he's added a warning on top of scryptenc.h.

TL;DR: scrypt is a password-based key derivation function that is often used as a password hashing scheme. libscrypt is considered the official implementation of scrypt, but it's actually a file encryption API with the scrypt function never exposed directly. If one misuses libscrypt for password hashing they may end up with extremely weak hashes.

Password-based key derivation functions (PBKDF) are algorithms that take a password and some non-secret data (e.g., salt), and output one or more secret keys that can be further used in other crypto constructions. These functions are generally made deliberately slow so as to frustrate brute-force attack or dictionary attack on the input password.

scrypt is considered a state-of-the-art PBKDF, but its most common use is as a password hashing scheme, even though it wasn't originally designed for this purpose. The author of scrypt has never released a standalone library for the scrypt function; he's released only a utility, confusingly named scrypt, that uses the scrypt function to implement an encryption API.

This makes a lot of senses, as scrypt was originally built for Tarsnap, the online file backup service built by the same person. It's also, however, a crypto disaster waiting to happen: if one uses the file encryption API as a password hashing scheme they may end up with extremely weak hashes. Since the introduction of scrypt I've seen a few of such misuses, all of which were made by otherwise competent programmers.

The scrypt encryption API is declared as follows:

/**
* scryptenc_buf(inbuf, inbuflen, outbuf, passwd, passwdlen,
*     maxmem, maxmemfrac, maxtime):
* Encrypt inbuflen bytes from inbuf, writing the resulting inbuflen + 128
* bytes to outbuf.
*/
int scryptenc_buf(const uint8_t *, size_t, uint8_t *,
const uint8_t *, size_t, size_t, double, double);

The intended usage of this function is to derive a key from the password using scrypt (with a randomly generated salt,) then use the derived key to encrypt the input buffer. I've found that many people that want to use scrypt as a password hashing function end up using scryptenc_buf.

In the first case the programmer used scryptenc_buf to encrypt an empty string using the input password as the key, and saved the output as the password hash. There's no badness: although the hash isn't compatible with scrypt anymore, it still has the same strength. It went downhill very fast from here though.

The second programmer used the same function, but she encrypted the input password with a static key. Since scryptenc_buf generates a random salt for each invocation, each password is probably encrypted with a unique key (derived from the random salt and the static key,) but it's still pretty bad: if anyone obtains the static key they can recover all passwords from their hashes. The developer knew that encrypting password is a bad idea, but she was confused by the API. After all the hashes looked random. As the saying goes, bad crypto is usually indistinguishable from good crypto.

The third programmer, for some reason that I've forgotten, wanted to use a single salt for all users, so he modified scryptenc_buf as follows:

scryptenc_buf(const uint8_t * inbuf, size_t inbuflen, uint8_t * outbuf, const uint8_t * passwd,
size_t passwdlen, const uint8_t * salt, size_t saltlen,
size_t maxmem, double maxmemfrac, double maxtime);

Like the second programmer, he encrypted the input password with a static key. With a static salt, all passwords are encrypted under one single key. The encryption algorithm used in scryptenc_buf is AES in counter mode with a zero IV. Did you spot the vulnerability? It's the classic keystream reuse: knowing a single password leads to the recovery of all passwords from their hashes. If one can register as a user they can recover all passwords. How comes hashing password introduces such a deadly vulnerability? I run out of people to blame.

The last case was a group of Java programmers. One of them wrote a JNI wrapper on top of libscrypt. The wrapper accepted a memcost parameter, which should be the same as $$N$$ in the original scrypt paper, but somehow its author wanted it to be $$\log{N}$$. When another programmer called this function he passed, however, $$N$$, so the actual memcost parameter became super big. This mismatch should be caught easily, as libscrypt would return an error code if it couldn't allocate the required memory. Checking return value for errors seems not to be, however, a popular pattern in the Java world.

As a result no password was hashed, and all stored hashes were a series of zero bytes. Anyone could sign in to anybody else's accounts using any passwords. Fortunately, they brought my consulting team in very early in the development process, and I found the bug before they had any real users. Moral of the story? Always look at the output of your crypto.

If I develop a crypto library, I'll conduct user studies like how they do it in usability research. Give developers the library and ask them to conduct a specific task. Rinse and repeat until nobody would misuse it.

## Tuesday, April 15, 2014

### Obama is a liar

Theo Google, một công ty của Mỹ, thì đương kim tổng thống Mỹ là một thằng nói láo, ngu dốt, theo đạo Hồi và cả cộng sản. Mà Google chỉ là dựa vào gợi ý của người dùng Internet thôi nhé: đây là những cụm từ xuất hiện nhiều nhất khi người ta muốn mô tả Obama. Không tin thì bạn thử tìm đi. Thấy chưa? Thấy dân Mỹ chửi Obama như chửi con chưa? Thế mà chẳng thấy ai bị bắt gì cả. Lăng mạ như thế này thì còn thể thống gì nữa. Tự do mà không có khuôn khổ thế này thì loạn mất.

Thế mà nước Mỹ vẫn ổn, vẫn giàu mạnh. Một trăm năm mươi năm nay chuyển giao quyền lực không hề đổ một giọt máu, chỉ có tốn nước miếng thôi.

Trong lúc đó... có một ông tên là Trương Duy Nhất viết blog nói về các "Obama của Việt Nam", chưa dám chửi như chửi con, cũng chẳng dám mắng như chủ mắng đầy tớ, chỉ mới nói nhẹ vài tiếng, nhưng cũng đã được tặng vài cuốn lịch xé chơi qua ngày rồi.

Thử tưởng tượng chúng ta có một Edward Snowden của Việt Nam. Anh ấy vừa chôm được một đống tài liệu bí mật, phải làm gì tiếp theo bây giờ? Chẳng có tờ báo trong nước nào dám đăng. Gửi cho báo nước ngoài thì chính phủ sẽ chặn những tờ báo đó, mà dẫu không chặn thì cũng chẳng mấy ai biết mà tìm đọc. Chắc chỉ còn nước cân ký bán hoặc xé chùi đít dần.

Trong khi đó... Washington Post, tờ báo ở sát nhà Obama, đăng liền tù tì những tài liệu do Snowden gửi đến, rồi đoạt luôn giải Pulitzer, giải thưởng báo chí của chính nước Mỹ. Nghe mỉa mai cứ như là báo Nhân Dân được trao giải báo chí quốc gia vì đã phanh phui những phi vụ đàn áp tự do của chính phủ chúng mình.

Tự do ngôn luận là vậy đó. Rất đơn giản, đâu có gì khó hiểu đâu: tôi được quyền nói, anh được quyền nói, ai cũng được quyền nói và không một ai có quyền kiểm duyệt.

## Friday, April 11, 2014

### An idea to solve the CloudFlare Heartbleed challenge

tl;dr: list of values that if any of them is leaked one can recover the RSA private key: $$p$$, $$q$$, $$d \mod{(p-1)}$$, $$d \mod{(q-1)}$$, $$c^d \mod q$$, and $$c^d \mod p$$, where $$c\ = m^e$$ and $$m$$ is the premaster secret sent by the client.

I plan to take a look at the challenge this weekend, but it seems that someone has solved it in just a few hours. Very nice work. Anyway here's how I think this challenge could be solved:
1. Exploit the fact that if you know any intermediate values in the Chinese Remainder algorithm, you can recover the private key.
2. Search for these intermediate values in the leaked memory.

The private key will be used in two cases:
a) To decrypt the proposed premaster secret from the client.
b) To sign an ephemeral (EC)DH public key.

In both cases you know the value $$c$$ and $$m$$ in the equation $$c^d = m \pmod N$$, where $$d$$ is the private key, and $$N$$ is the modulus. To make it faster $$c^d \pmod N$$ is carried out using Chinese Remainder algorithm in which there are two intermediate values $$x_1$$ and $$x_2$$ that are calculated as follows:
$$c^d = c^{d\mod{q-1}} = x_1 \pmod q$$
$$c^d = c^{d\mod{p-1}} = x_2 \pmod p$$

Note that $$(m - x_1)$$ is a multiple of $$q$$. So you can use $$\gcd(m - x_1, N)$$ to obtain $$q$$, if you happen to know $$x_1$$. Searching for $$x_1$$ looks feasible because it's a bignum half the size of the modulus.

Warning: I haven't verified that this actually works. I just come up with the idea while taking a shower.

Update: I've taken a look at the code, and I think this idea should work. OpenSSL indeed uses Chinese Remainder.

static int RSA_eay_mod_exp(BIGNUM *r0, const BIGNUM *I, RSA *rsa, BN_CTX *ctx)
{
BIGNUM *r1,*m1,*vrfy;
BIGNUM local_dmp1,local_dmq1,local_c,local_r1;
BIGNUM *dmp1,*dmq1,*c,*pr1;
int ret=0;

BN_CTX_start(ctx);
r1 = BN_CTX_get(ctx);
m1 = BN_CTX_get(ctx);
vrfy = BN_CTX_get(ctx);

r1 and m1 are the intermediate values, and they seem to be allocated for every invocation of this function. Their bignum_st structures are re-used from a pool, but the actual values are freshly allocated from the heap, unless I miss something. This is important, because we want these values sitting near the Heartbeat request. Note that if you read this function you may get the impression that p and q are also freshly allocated, but in fact they aren't.

Update: someone pointed out that r1 and m1 will be scrubbed before Heartbeat requests are processed. Too bad :(.

Update: a simple exploit that simply searches for $$p$$ in the leaked memory: https://gist.github.com/epixoip/10570627.

Update: you need to protect pre-calculated CRT parameters stored in the private key as well. See http://lekkertech.net/akamai.txt.

Besides $$d$$, $$p$$, and $$q$$ inside the encoded private key there are also two pre-calcuted CRT parameters: $$d_p = d \pmod{p - 1}$$ and $$d_q = d \pmod{q - 1}$$. If you know $$d_p$$ you can recover the private key as follows:
\begin{aligned} d_p & = d \pmod{p - 1} \\ \rightarrow ed_p & = ed \pmod{p-1} \\ \rightarrow ed_p & = 1 \pmod{p-1}\\ \end{aligned}
That means $$p - 1$$ is a divisor of $$ed_p - 1$$.