# Breaking UC Browser

### Introduction

At the end of March we reported on the hidden potential to download and run unverified code in UC Browser. Today we will examine in detail how it happens and how hackers can use it.

Some time ago, UC Browser was promoted and distributed quite aggressively. It was installed on devices by malware, distributed via websites under the guise of video files (i.e., users thought they were downloading pornography or something, but instead were getting APK files with this browser), advertised using worrisome banners about a user’s browser being outdated or vulnerable. The official UC Browser VK group had a topic where users could complain about false advertising and many users provided examples. In 2016, there was even a commercial in Russian (yes, a commercial of a browser that blocks commercials).

As we write this article, UC Browser has been installed 500,000,000 times from Google Play. This is impressive since only Google Chrome managed to top that. Among the reviews, you can see a lot of user complaints about advertising and being redirected to other applications on Google Play. This was the reason for our study: we wanted to see if UC Browser is doing something wrong. And it is! The application is able to download and run executable code, which violates Google Play’s policy for app publishing. And UC Browser doesn’t only download executable code; it does so insecurely, which opens the door to a MitM attack. Let's see if we can use it this way.

Everything that follows applies to the version of UC Browser that was distributed via Google Play at the time of our study:

package: com.UCMobile.intl
versionName: 12.10.8.1172
versionCode: 10598
sha1 APK-file: f5edb2243413c777172f6362876041eb0c3a928c

### Attack Vector

UC Browser’s manifest contains a service with a telltale name of com.uc.deployment.UpgradeDeployService.

    <service android:exported="false" android:name="com.uc.deployment.UpgradeDeployService" android:process=":deploy" />


When this service launches, the browser makes a POST request to puds.ucweb.com/upgrade/index.xhtml that can be seen in traffic for some time after the launch. In response, the browser may receive a command to download any update or a new module. During our analysis, we never received such commands from the server, but we noticed that when trying to open a PDF file in the browser, it repeats the request to the above address, then downloads a native library. To simulate an attack, we decided to use this feature of UC Browser—the ability to open PDF files using a native library — not present in the APK file, but downloadable from the Internet. Technically, UC Browser can download something without a user’s permission when given an appropriate response to a request sent upon startup. But for this we need to study the interaction protocol with the server in more detail, so we thought it was easier to just hook and edit the response and then replace the library needed to open PDF files with something different.

So when a user wants to open a PDF file directly in the browser, traffic may contain the following requests:

First, there is a POST request to puds.ucweb.com/upgrade/index.xhtml, then the compressed library for viewing PDF files and office documents is downloaded. Logically, we can assume that the first request sends information about the system (at least the architecture, because the server needs to select an appropriate library), and the server responds with some information about the library that needs to be downloaded, like its address and maybe something else. The problem is that this request is encrypted.

Figure: request fragment and response fragment.

The library is compressed in a ZIP file and not encrypted.

### Searching for traffic decryption code

Let’s try and decrypt the server’s response. Take a look at the code of the com.uc.deployment.UpgradeDeployService class: from the onStartCommand method, we navigate to com.uc.deployment.b.x, and then to com.uc.browser.core.d.c.f.e:

    public final void e(l arg9) {
int v4_5;
String v3_1;
byte[] v3;
byte[] v1 = null;
if(arg9 == null) {
v3 = v1;
}
else {
v3_1 = arg9.iGX.ipR;
StringBuilder v4 = new StringBuilder("[");
v4.append(v3_1);
v4.append("]product:");
v4.append(arg9.iGX.ipR);
v4 = new StringBuilder("[");
v4.append(v3_1);
v4.append("]version:");
v4.append(arg9.iGX.iEn);
v4 = new StringBuilder("[");
v4.append(v3_1);
v4.append(arg9.iGX.mMode);
v4 = new StringBuilder("[");
v4.append(v3_1);
v4.append("]force_flag:");
v4.append(arg9.iGX.iEo);
v4 = new StringBuilder("[");
v4.append(v3_1);
v4.append("]silent_mode:");
v4.append(arg9.iGX.iDQ);
v4 = new StringBuilder("[");
v4.append(v3_1);
v4.append("]silent_type:");
v4.append(arg9.iGX.iEr);
v4 = new StringBuilder("[");
v4.append(v3_1);
v4.append("]silent_state:");
v4.append(arg9.iGX.iEp);
v4 = new StringBuilder("[");
v4.append(v3_1);
v4.append("]silent_file:");
v4.append(arg9.iGX.iEq);
v4 = new StringBuilder("[");
v4.append(v3_1);
v4.append("]apk_md5:");
v4.append(arg9.iGX.iEl);
v4 = new StringBuilder("[");
v4.append(v3_1);
v4 = new StringBuilder("[");
v4.append(v3_1);
v4 = new StringBuilder("[");
v4.append(v3_1);
v4.append(arg9.iGH);
v4 = new StringBuilder("[");
v4.append(v3_1);
v4.append("]apollo_child_version:");
v4.append(arg9.iGX.iEx);
v4 = new StringBuilder("[");
v4.append(v3_1);
v4.append("]apollo_series:");
v4.append(arg9.iGX.iEw);
v4 = new StringBuilder("[");
v4.append(v3_1);
v4.append("]apollo_cpu_arch:");
v4.append(arg9.iGX.iEt);
v4 = new StringBuilder("[");
v4.append(v3_1);
v4.append("]apollo_cpu_vfp3:");
v4.append(arg9.iGX.iEv);
v4 = new StringBuilder("[");
v4.append(v3_1);
v4.append("]apollo_cpu_vfp:");
v4.append(arg9.iGX.iEu);
ArrayList v3_2 = arg9.iGX.iEz;
if(v3_2 != null && v3_2.size() != 0) {
Iterator v3_3 = v3_2.iterator();
while(v3_3.hasNext()) {
Object v4_1 = v3_3.next();
StringBuilder v5 = new StringBuilder("[");
v5.append(((au)v4_1).getName());
v5.append("]component_name:");
v5.append(((au)v4_1).getName());
v5 = new StringBuilder("[");
v5.append(((au)v4_1).getName());
v5.append("]component_ver_name:");
v5 = new StringBuilder("[");
v5.append(((au)v4_1).getName());
v5.append("]component_ver_code:");
v5.append(((au)v4_1).gBl);
v5 = new StringBuilder("[");
v5.append(((au)v4_1).getName());
v5.append("]component_req_type:");
v5.append(((au)v4_1).gBq);
}
}

j v3_4 = new j();
m.b(v3_4);
h v4_2 = new h();
m.b(v4_2);
ay v5_1 = new ay();
v3_4.hS("");
v3_4.setImsi("");
v3_4.hV("");
v5_1.bPQ = v3_4;
v5_1.bPP = v4_2;
v5_1.yr(arg9.iGX.ipR);
v5_1.gBF = arg9.iGX.mMode;
v5_1.gBI = arg9.iGX.iEz;
v3_2 = v5_1.gAr;
c.aBh();
String v4_3 = com.uc.b.a.a.c.Pd();
boolean v4_4 = v4_3 == null || !v4_3.contains("neon") ? false : true;
c.aBh();
v4_5 = e.getScreenWidth();
int v6 = e.getScreenHeight();
StringBuilder v7 = new StringBuilder();
v7.append(v4_5);
v7.append("*");
v7.append(v6);
v3_2.add(g.fs("api_level", String.valueOf(Build$VERSION.SDK_INT)));
v3_2.add(g.fs("uc_apk_list", SystemHelper.getUCMobileApks()));
Iterator v4_6 = arg9.iGX.iEA.entrySet().iterator();
while(v4_6.hasNext()) {
Object v6_1 = v4_6.next();
v3_2.add(g.fs(((Map$Entry)v6_1).getKey(), ((Map$Entry)v6_1).getValue()));
}
v3 = v5_1.toByteArray();
}
if(v3 == null) {
this.iGY.iGI.a(arg9, "up_encode", "yes", "fail");
return;
}
v4_5 = this.iGY.iGw ? 0x1F : 0;
if(v3 == null) {
}
else {
v3 = g.i(v4_5, v3);
if(v3 == null) {
}
else {
v1 = new byte[v3.length + 16];
byte[] v6_2 = new byte[16];
Arrays.fill(v6_2, 0);
v6_2[0] = 0x5F;
v6_2[1] = 0;
v6_2[2] = ((byte)v4_5);
v6_2[3] = -50;
System.arraycopy(v6_2, 0, v1, 0, 16);
System.arraycopy(v3, 0, v1, 16, v3.length);
}
}
if(v1 == null) {
this.iGY.iGI.a(arg9, "up_encrypt", "yes", "fail");
return;
}
if(TextUtils.isEmpty(this.iGY.mUpgradeUrl)) {
this.iGY.iGI.a(arg9, "up_url", "yes", "fail");
return;
}
StringBuilder v0 = new StringBuilder("[");
v0.append(arg9.iGX.ipR);
v0.append("]url:");
v0.append(this.iGY.mUpgradeUrl);
com.uc.browser.core.d.c.i v0_1 = this.iGY.iGI;
v3_1 = this.iGY.mUpgradeUrl;
com.uc.base.net.e v0_2 = new com.uc.base.net.e(new com.uc.browser.core.d.c.i$a(v0_1, arg9));
v3_1 = v3_1.contains("?") ? v3_1 + "&dataver=pb" : v3_1 + "?dataver=pb";
n v3_5 = v0_2.uc(v3_1);
m.b(v3_5, false);
v3_5.setMethod("POST");
v3_5.setBodyProvider(v1);
v0_2.b(v3_5);
this.iGY.iGI.a(arg9, "up_null", "yes", "success");
this.iGY.iGI.b(arg9);
}

We can see this is where the POST request is made. Note the 16-byte array: it is filled with zeros, and its first four bytes are then set to 0x5F, 0, 0x1F, and -50 (0xCE as an unsigned byte). These values match the beginning of the request shown above.
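The framing can be restated outside the decompiler. Below is a minimal Python sketch of the header construction seen in the listing. The function name `build_upgrade_request` is ours, not UC Browser's, and the sketch assumes only what the decompiled code shows: a zero-filled 16-byte header with bytes 0, 2, and 3 set, followed by the already encoded body.

```python
def build_upgrade_request(body, mode=0x1F):
    """Prefix an already encoded body with the 16-byte header from the listing."""
    header = bytearray(16)      # Arrays.fill(v6_2, 0)
    header[0] = 0x5F            # v6_2[0] = 0x5F
    header[1] = 0               # v6_2[1] = 0
    header[2] = mode & 0xFF     # 0x1F or 0, depending on the iGw flag in the listing
    header[3] = -50 & 0xFF      # Java's signed byte -50 is 0xCE when read unsigned
    return bytes(header) + bytes(body)


# The body here is a placeholder; the real one is the encoded request payload.
packet = build_upgrade_request(b"\x01\x02\x03")
print(packet[:4].hex())
```

The only non-obvious step is the conversion of -50: Java bytes are signed, so the value that appears on the wire is 0xCE.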

The same class contains a nested class with another interesting method:

        public final void a(l arg10, byte[] arg11) {
f v0 = this.iGQ;
StringBuilder v1 = new StringBuilder("[");
v1.append(arg10.iGX.ipR);
byte[] v1_1 = null;
if(arg11 == null) {
}
else if(arg11.length < 16) {
}
else {
if(arg11[0] != 0x60 && arg11[3] != 0xFFFFFFD0) {
goto label_57;
}
int v3 = 1;
int v5 = arg11[1] == 1 ? 1 : 0;
if(arg11[2] != 1 && arg11[2] != 11) {
if(arg11[2] == 0x1F) {
}
else {
v3 = 0;
}
}
byte[] v7 = new byte[arg11.length - 16];
System.arraycopy(arg11, 16, v7, 0, v7.length);
if(v3 != 0) {
v7 = g.j(arg11[2], v7);
}
if(v7 == null) {
goto label_57;
}
if(v5 != 0) {
v1_1 = g.P(v7);
goto label_57;
}
v1_1 = v7;
}
label_57:
if(v1_1 == null) {
v0.iGY.iGI.a(arg10, "up_decrypt", "yes", "fail");
return;
}
q v11 = g.b(arg10, v1_1);
if(v11 == null) {
v0.iGY.iGI.a(arg10, "up_decode", "yes", "fail");
return;
}
if(v0.iGY.iGt) {
v0.d(arg10);
}
if(v0.iGY.iGo != null) {
v0.iGY.iGo.a(0, ((o)v11));
}
if(v0.iGY.iGs) {
v0.iGY.a(((o)v11));
v0.iGY.iGI.a(v11, "up_silent", "yes", "success");
v0.iGY.iGI.a(v11);
return;
}
v0.iGY.iGI.a(v11, "up_silent", "no", "success");
}
}

This method takes a byte array and checks that byte 0 is 0x60 or byte 3 is 0xD0, and whether byte 2 is 1, 11, or 0x1F. Check out the server response: byte 0 is 0x60, byte 2 is 0x1F, and byte 3 is 0x60. It looks like what we need. Judging by the strings ("up_decrypt", for example), this is where a method is called to decrypt the server response. Now let’s look at the g.j method. Note that its first argument is the byte at offset 2 (that is, 0x1F in our case), and the second is the server response without the first 16 bytes.

    public static byte[] j(int arg1, byte[] arg2) {
        if(arg1 == 1) {
        }
        else if(arg1 == 11) {
            arg2 = m.aF(arg2);
        }
        else if(arg1 != 0x1F) {
        }
        else {
            arg2 = EncryptHelper.decrypt(arg2);
        }
        return arg2;
    }

Obviously, it is selecting the decryption algorithm, and the byte that in our case equals 0x1F indicates one of three possible options.
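The header checks and the mode dispatch from the two listings above can be summarized in a short Python sketch. The function and key names (`parse_upgrade_response`, `describe_mode`, `extra_step`) are hypothetical; the logic follows the decompiled code, and the purpose of g.P is not shown in the listing, so it is only flagged here.

```python
def parse_upgrade_response(data):
    """Mirror the header checks of the nested-class method a(l, byte[])."""
    if data is None or len(data) < 16:
        return None
    # The listing rejects the packet only if BOTH magic bytes are wrong.
    if data[0] != 0x60 and data[3] != 0xD0:
        return None
    mode = data[2]
    return {
        "extra_step": data[1] == 1,            # result is then routed through g.P
        "mode": mode,
        "transformed": mode in (1, 11, 0x1F),  # g.j is called for these modes
        "body": bytes(data[16:]),              # response without the 16-byte header
    }


def describe_mode(mode):
    """Name the branch g.j takes for a given mode byte."""
    if mode == 11:
        return "m.aF"
    if mode == 0x1F:
        return "EncryptHelper.decrypt"
    return "unchanged"  # mode 1 and unknown modes return the input as-is


# Header bytes taken from the text; the body is a placeholder.
sample = bytes([0x60, 0x00, 0x1F, 0x60]) + bytes(12) + b"ciphertext"
info = parse_upgrade_response(sample)
print(describe_mode(info["mode"]))
```

For the observed response, the sketch ends up in the EncryptHelper.decrypt branch, which is the path followed in the rest of the analysis.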

Let’s go back to the code analysis. After a couple of jumps, we get to a method with the telltale name decryptBytesByKey. Here, two more bytes are split off from our response and used to obtain a string. It’s clear this is how the key is selected to decrypt the message.

    private static byte[] decryptBytesByKey(byte[] bytes) {
        byte[] v0 = null;
        if(bytes != null) {
            try {
                if(bytes.length < EncryptHelper.PREFIX_BYTES_SIZE) {
                }
                else if(bytes.length == EncryptHelper.PREFIX_BYTES_SIZE) {
                    return v0;
                }
                else {
                    byte[] prefix = new byte[EncryptHelper.PREFIX_BYTES_SIZE];  // 2 bytes
                    System.arraycopy(bytes, 0, prefix, 0, prefix.length);
                    String keyId = c.ayR().d(ByteBuffer.wrap(prefix).getShort()); // key selection
                    if(keyId == null) {
                        return v0;
                    }
                    else {
                        a v2 = EncryptHelper.ayL();
                        if(v2 == null) {
                            return v0;
                        }
                        else {
                            byte[] enrypted = new byte[bytes.length - EncryptHelper.PREFIX_BYTES_SIZE];
                            System.arraycopy(bytes, EncryptHelper.PREFIX_BYTES_SIZE, enrypted, 0, enrypted.length);
                            return v2.l(keyId, enrypted);
                        }
                    }
                }
            }
            catch(SecException v7_1) {
                EncryptHelper.handleDecryptException(((Throwable)v7_1), v7_1.getErrorCode());
                return v0;
            }
            catch(Throwable v7) {
                EncryptHelper.handleDecryptException(v7, 2);
                return v0;
            }
        }

        return v0;
    }

Jumping ahead a bit, note that at this stage, it’s only the key identifier, not the key itself. Key selection is going to be a little more complicated.
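The prefix handling in decryptBytesByKey can be sketched in a few lines of Python. The function name `split_key_prefix` is ours. The sketch relies on the fact that ByteBuffer.getShort() reads a signed 16-bit value in big-endian order by default; how the resulting number is mapped to a key identifier (the c.ayR().d call) is not reproduced here.

```python
import struct

PREFIX_BYTES_SIZE = 2  # the listing's comment says the prefix is 2 bytes


def split_key_prefix(data):
    """Return (key_index, remaining_bytes), or None if there is no payload."""
    # The listing returns null when the input is no longer than the prefix.
    if data is None or len(data) <= PREFIX_BYTES_SIZE:
        return None
    # Equivalent of ByteBuffer.wrap(prefix).getShort(): signed, big-endian.
    (key_index,) = struct.unpack(">h", data[:PREFIX_BYTES_SIZE])
    return key_index, bytes(data[PREFIX_BYTES_SIZE:])


# Placeholder input: prefix 0x0005 followed by three payload bytes.
print(split_key_prefix(b"\x00\x05abc"))
```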

In the next method, two more parameters are added to the existing ones, so we get a total of four: the magic number 16, the key identifier, the encrypted data, and a string whose purpose is unclear (it is empty in our case).

    public final byte[] l(String keyId, byte[] encrypted) throws SecException {
        return this.ayJ().staticBinarySafeDecryptNoB64(16, keyId, encrypted, "");
    }

After a series of jumps, we arrive at the staticBinarySafeDecryptNoB64 method of the com.alibaba.wireless.security.open.staticdataencrypt.IStaticDataEncryptComponent interface. The main application code has no classes that implement this interface. The implementing class is in the file lib/armeabi-v7a/libsgmain.so, which is not really an .so file but a .jar. The method we are interested in is implemented as follows:

    package com.alibaba.wireless.security.a.i;

    // ...

    public class a implements IStaticDataEncryptComponent {
        private ISecurityGuardPlugin a;
        // ...
        private byte[] a(int mode, int magicInt, int xzInt, String keyId, byte[] encrypted, String magicString) {
            return this.a.getRouter().doCommand(10601, new Object[]{Integer.valueOf(mode), Integer.valueOf(magicInt), Integer.valueOf(xzInt), keyId, encrypted, magicString});
        }
        // ...
        private byte[] b(int magicInt, String keyId, byte[] encrypted, String magicString) {
            return this.a(2, magicInt, 0, keyId, encrypted, magicString);
        }
        // ...
        public byte[] staticBinarySafeDecryptNoB64(int magicInt, String keyId, byte[] encrypted, String magicString) throws SecException {
            if(keyId != null && keyId.length() > 0 && magicInt >= 0 && magicInt < 19 && encrypted != null && encrypted.length > 0) {
                return this.b(magicInt, keyId, encrypted, magicString);
            }

            throw new SecException("", 301);
        }
        //...
    }

Here, our parameter list is supplemented with two more integers: 2 and 0. Apparently, 2 means decryption, matching the value of the Cipher.DECRYPT_MODE constant that is passed to the init method of the javax.crypto.Cipher system class. Then, this information is passed to a certain Router along with the number 10601, which is apparently the command number.

After the next chain of jumps, we find a class that implements the IRouterComponent interface and its doCommand method:

    package com.alibaba.wireless.security.mainplugin;

    import com.alibaba.wireless.security.framework.IRouterComponent;

    public class a implements IRouterComponent {
        public a() {
            super();
        }
        public Object doCommand(int arg2, Object[] arg3) {
            return JNICLibrary.doCommandNative(arg2, arg3);
        }
    }

There is also the JNICLibrary class, where the native doCommandNative method is declared:

    package com.taobao.wireless.security.adapter;

    public class JNICLibrary {
        public static native Object doCommandNative(int arg0, Object[] arg1);
    }

So, we need to find the doCommandNative method in the native code. That’s where the fun begins.

### Machine code obfuscation

There is one native library in the libsgmain.so file (which is actually a .JAR file and, like we said above, implements some encryption related interfaces): libsgmainso-6.4.36.so. We load it in IDA and get a bunch of dialogs with error messages. The problem is that the section header table is invalid. This is done on purpose to complicate the analysis.

But we don’t really need it anyway. The program header table is enough to correctly load the ELF file and analyze it. So we simply delete the section header table, nulling the corresponding fields in the header.
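The header patch is easy to script. The following Python sketch zeroes the section-header fields of a 32-bit little-endian ELF header. The text does not say exactly which fields were nulled, so clearing e_shoff, e_shentsize, e_shnum, and e_shstrndx is our assumption; the function name `strip_section_headers` is also ours. The offsets come from the standard ELF32 header layout.

```python
import struct


def strip_section_headers(elf):
    """Return a copy of an ELF32 little-endian image with section-header fields zeroed."""
    if len(elf) < 52 or elf[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if elf[4] != 1 or elf[5] != 1:
        raise ValueError("this sketch only handles 32-bit little-endian ELF")
    patched = bytearray(elf)
    struct.pack_into("<I", patched, 0x20, 0)          # e_shoff
    struct.pack_into("<HHH", patched, 0x2E, 0, 0, 0)  # e_shentsize, e_shnum, e_shstrndx
    return bytes(patched)


# Tiny synthetic header, used only to exercise the function.
fake = bytearray(b"\xAA" * 52)
fake[:6] = b"\x7fELF\x01\x01"
stripped = strip_section_headers(bytes(fake))
print(stripped[0x20:0x24].hex(), stripped[0x2E:0x34].hex())
```

The program header fields (e_phoff, e_phentsize, e_phnum) are left untouched, since they are what the loader and IDA rely on in this case.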

Then we open the file in IDA again.

We have two ways to tell the Java virtual machine exactly where the native library contains the implementation of the method declared as native in the Java code. The first is to give it a name like this: Java_package_name_ClassName_methodName. The second one is to register it when loading the library (in the JNI_OnLoad function) by calling the RegisterNatives function. In our case, if you use the first method, the name should be like this: Java_com_taobao_wireless_security_adapter_JNICLibrary_doCommandNative. The list of exported functions doesn’t contain this name, which means we need to look for the RegisterNatives. Thus, we go to the JNI_OnLoad function and see the following:

What’s going on here? At first glance, the beginning and end of the function are typical of the ARM architecture. The first instruction pushes the contents of the registers that the function will use to the stack (in this case, R0, R1, and R2), as well as the contents of the LR register with the function’s return address. The last instruction restores the saved registers and puts the return address to the PC register, thus returning from the function. But if we take a closer look, we may notice that the penultimate instruction changes the return address, stored on the stack. Let’s calculate what it will be when the code is executed. The address 0xB130 loads in R1, has 5 subtracted from it, then is moved to R0 and receives an addition of 0x10. In the end, it equals 0xB13B. Thus, IDA thinks that the final instruction performs a normal function return, while in fact, it performs a transfer to the calculated address 0xB13B.

Now let us remind you that ARM processors have two modes and two instruction sets: ARM and Thumb. The low-order bit of the address determines which instruction set the processor will use. That is, the address is actually 0xB13A, while the set low-order bit indicates Thumb mode.

A similar “adapter” and some semantic garbage are added to the beginning of each function in this library. But we won’t dwell on them in detail. Just remember that the real beginning of almost all functions is a little further.
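The address arithmetic for the patched return address can be checked in a few lines of Python. The helper name `split_thumb_bit` is ours; the numbers are the ones quoted in the text.

```python
def split_thumb_bit(value):
    """Return (instruction address, is_thumb) for a branch target value."""
    return value & ~1, bool(value & 1)


r1 = 0xB130            # address loaded into R1
r0 = (r1 - 5) + 0x10   # subtract 5, move to R0, add 0x10
address, is_thumb = split_thumb_bit(r0)
print(hex(r0), hex(address), is_thumb)
```

The value written to the stack is 0xB13B, but execution continues at 0xB13A in Thumb mode.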

Since no explicit transition to 0xB13A in the code exists, IDA cannot recognize that there is code there. For the same reason, it does not recognize most of the code in the library as code, which makes analyzing a bit trickier. So, we tell IDA that there is code, and this is what happens:

Starting from 0xB144, we clearly have the table. But what about sub_494C?

When this function is called, LR holds the address of the above-mentioned table (0xB144), and R0 contains an index into this table. That is, we take the value from the table, add it to LR, and obtain the address we need to go to. Let's try to calculate it: 0xB144 + [0xB144 + 8 * 4] = 0xB144 + 0x120 = 0xB264. We navigate to this address and see a couple of useful instructions, then go to 0xB140:

Now there will be a jump to the offset stored at index 0x20 in the table.

Judging by the size of the table, there will be many such jumps in the code. So we want to handle them automatically rather than calculate addresses by hand. This is where scripts and IDA's ability to patch code come to our rescue:

    # IDAPython script (IDA 7.x API names). The print calls are written so that
    # they behave the same under Python 2 and Python 3.
    from idc import (here, get_wide_word, get_wide_dword, get_operand_type,
                     get_operand_value, patch_word)

    def put_unconditional_branch(source, destination):
        # Branch offset in halfwords, relative to PC (source + 4)
        offset = (destination - source - 4) >> 1
        if offset > 2097151 or offset < -2097152:
            raise RuntimeError("Invalid offset")
        if offset > 1023 or offset < -1024:
            # Does not fit into a 16-bit branch: emit a 32-bit one
            instruction1 = 0xf000 | ((offset >> 11) & 0x7ff)
            instruction2 = 0xb800 | (offset & 0x7ff)
            patch_word(source, instruction1)
            patch_word(source + 2, instruction2)
        else:
            instruction = 0xe000 | (offset & 0x7ff)
            patch_word(source, instruction)

    ea = here()
    if get_wide_word(ea) == 0xb503: #PUSH {R0,R1,LR}
        ea1 = ea + 2
        if get_wide_word(ea1) == 0xbf00: #NOP
            ea1 += 2
        if get_operand_type(ea1, 0) == 1 and get_operand_value(ea1, 0) == 0 and get_operand_type(ea1, 1) == 2:
            index = get_wide_dword(get_operand_value(ea1, 1))
            print("index = %s" % hex(index))
            ea1 += 2
            if get_operand_type(ea1, 0) == 7:
                table = get_operand_value(ea1, 0) + 4
            elif get_operand_type(ea1, 1) == 2:
                table = get_operand_value(ea1, 1) + 4
            else:
                print("Wrong operand type on %s - %s %s" % (hex(ea1), get_operand_type(ea1, 0), get_operand_type(ea1, 1)))
                table = None
            if table is None:
                print("Unable to find table")
            else:
                print("table = %s" % hex(table))
                offset = get_wide_dword(table + (index << 2))
                put_unconditional_branch(ea, table + offset)
        else:
            print("Unknown code %s %s %s" % (get_operand_type(ea1, 0), get_operand_value(ea1, 0), get_operand_type(ea1, 1) == 2))
    else:
        print("Unable to detect first instruction")
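To see what the script computes without opening IDA, here is a standalone Python sketch of the two steps: the table lookup and the branch encoding. The function names `table_target` and `encode_thumb_branch` are ours, and the memory image is synthetic. The entry for index 8 (0x120) is quoted in the text; the entry for index 0x20 (0x36C) is inferred from the destination 0xB4B0 reported for that jump, so treat it as an assumption.

```python
import struct


def table_target(image, image_base, table, index):
    """Return table + dword[table + 4 * index], as the script computes it."""
    (rel,) = struct.unpack_from("<I", image, table - image_base + 4 * index)
    return (table + rel) & 0xFFFFFFFF


def encode_thumb_branch(source, destination):
    """Return the halfwords that put_unconditional_branch would write."""
    offset = (destination - source - 4) >> 1
    if offset > 2097151 or offset < -2097152:
        raise ValueError("Invalid offset")
    if offset > 1023 or offset < -1024:
        return [0xF000 | ((offset >> 11) & 0x7FF), 0xB800 | (offset & 0x7FF)]
    return [0xE000 | (offset & 0x7FF)]


TABLE = 0xB144
image = bytearray(0x21 * 4)                     # synthetic table, zero-filled
struct.pack_into("<I", image, 8 * 4, 0x120)     # 0xB144 + 0x120 = 0xB264
struct.pack_into("<I", image, 0x20 * 4, 0x36C)  # inferred: 0xB4B0 - 0xB144

first = table_target(image, TABLE, TABLE, 8)
second = table_target(image, TABLE, TABLE, 0x20)
patch = encode_thumb_branch(0xB26A, second)
print(hex(first), hex(second), [hex(h) for h in patch])
```

Under these assumptions, the adapter at 0xB26A collapses into a single 16-bit branch to 0xB4B0, which is what makes the control flow visible to IDA.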

We put the cursor on the line at 0xB26A, run the script, and see the jump to 0xB4B0:

Again, IDA does not recognize this place as code. We help it and see another structure there:

Instructions that go after BLX do not appear very meaningful; they are more like some kind of offset. We look at sub_4964:

Indeed, it takes the DWORD at the address in LR, adds it to that address, then reads the value at the resulting address and stores it on the stack. It also adds 4 to LR so that execution skips over this offset after the function returns. Then the POP {R1} instruction takes the resulting value from the stack. Looking at what is located at the address 0xB4BA + 0xEA = 0xB5A4, we can see something similar to an address table:

To patch this structure, we need to get two parameters from the code: the offset and the register number where we want to push the result. We will have to prepare a piece of code in advance for each possible register.

    # IDAPython script (IDA 7.x API names), written to run under Python 2 and 3.
    from idc import (here, get_wide_word, get_wide_byte, get_wide_dword,
                     get_operand_type, get_bytes, patch_byte)

    # Replacement code for each destination register
    patches = {}
    patches[0] = (0x00, 0xbf, 0x01, 0x48, 0x00, 0x68, 0x02, 0xe0)
    patches[1] = (0x00, 0xbf, 0x01, 0x49, 0x09, 0x68, 0x02, 0xe0)
    patches[2] = (0x00, 0xbf, 0x01, 0x4a, 0x12, 0x68, 0x02, 0xe0)
    patches[3] = (0x00, 0xbf, 0x01, 0x4b, 0x1b, 0x68, 0x02, 0xe0)
    patches[4] = (0x00, 0xbf, 0x01, 0x4c, 0x24, 0x68, 0x02, 0xe0)
    patches[5] = (0x00, 0xbf, 0x01, 0x4d, 0x2d, 0x68, 0x02, 0xe0)
    patches[8] = (0x00, 0xbf, 0xdf, 0xf8, 0x06, 0x80, 0xd8, 0xf8, 0x00, 0x80, 0x01, 0xe0)
    patches[9] = (0x00, 0xbf, 0xdf, 0xf8, 0x06, 0x90, 0xd9, 0xf8, 0x00, 0x90, 0x01, 0xe0)
    patches[10] = (0x00, 0xbf, 0xdf, 0xf8, 0x06, 0xa0, 0xda, 0xf8, 0x00, 0xa0, 0x01, 0xe0)
    patches[11] = (0x00, 0xbf, 0xdf, 0xf8, 0x06, 0xb0, 0xdb, 0xf8, 0x00, 0xb0, 0x01, 0xe0)

    ea = here()
    if (get_wide_word(ea) == 0xb082 #SUB SP, SP, #8
        and get_wide_word(ea + 2) == 0xb503): #PUSH {R0,R1,LR}
        if get_operand_type(ea + 4, 0) == 7:
            # bytearray gives integer indexing on both Python 2 and 3
            pop = bytearray(get_bytes(ea + 12, 4, 0))
            if pop[1] == 0xbc:
                register = -1
                r = get_wide_byte(ea + 12)
                for i in range(8):
                    if r == (1 << i):
                        register = i
                        break
                if register == -1:
                    print("Unable to detect register")
                else:
                    address = get_wide_dword(ea + 8) + ea + 8
                    for b in patches[register]:
                        patch_byte(ea, b)
                        ea += 1
                    if ea % 4 != 0:
                        ea += 2
                    # NOTE: `address` is computed above, but the statement that
                    # uses it does not appear in this listing.
            elif pop[:3] == bytearray(b'\x5d\xf8\x04'):
                register = pop[3] >> 4
                if register in patches:
                    address = get_wide_dword(ea + 8) + ea + 8
                    for b in patches[register]:
                        patch_byte(ea, b)
                        ea += 1
            else:
                pass  # the body of this branch is missing from the listing
        else:
            print("Wrong operand type on +4: %s" % get_operand_type(ea + 4, 0))
    else:
        print("Unable to detect first instructions")
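The behavior of sub_4964 that this script replaces can be modeled in plain Python. The function name `load_after_call` and the memory contents are hypothetical: only the offset 0xEA and the addresses 0xB4BA and 0xB5A4 come from the text, while the value stored at 0xB5A4 is made up. The Thumb bit of LR is ignored to keep the sketch short.

```python
def load_after_call(read_dword, lr):
    """Model of sub_4964: fetch the value referenced by the DWORD placed after BLX."""
    rel = read_dword(lr)                         # offset stored right after the call
    value = read_dword((lr + rel) & 0xFFFFFFFF)  # value at LR + offset
    return value, lr + 4                         # execution resumes past the offset


# Synthetic memory: the second entry is a placeholder, not a value from the binary.
memory = {0xB4BA: 0xEA, 0xB5A4: 0x0001F2A0}
value, resume = load_after_call(memory.__getitem__, 0xB4BA)
print(hex(value), hex(resume))
```

This is equivalent to a PC-relative load followed by a dereference, which is why each patch consists of two load instructions and a short branch.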

We put the cursor at the beginning of the structure we want to replace (i.e. 0xB4B2) and run the script:

In addition to the already mentioned structures, the code includes the following:

As in the previous case, there is an offset after the BLX instruction:

We take the offset at the address from LR, add it to LR, and navigate there. 0x72044 + 0xC = 0x72050. The script for this structure is quite simple:

    # IDAPython script (IDA 7.x API names), written to run under Python 2 and 3.
    from idc import here, get_wide_word, get_wide_dword, patch_word

    def put_unconditional_branch(source, destination):
        # Branch offset in halfwords, relative to PC (source + 4)
        offset = (destination - source - 4) >> 1
        if offset > 2097151 or offset < -2097152:
            raise RuntimeError("Invalid offset")
        if offset > 1023 or offset < -1024:
            instruction1 = 0xf000 | ((offset >> 11) & 0x7ff)
            instruction2 = 0xb800 | (offset & 0x7ff)
            patch_word(source, instruction1)
            patch_word(source + 2, instruction2)
        else:
            instruction = 0xe000 | (offset & 0x7ff)
            patch_word(source, instruction)

    ea = here()
    if get_wide_word(ea) == 0xb503: #PUSH {R0,R1,LR}
        ea1 = ea + 6
        if get_wide_word(ea + 2) == 0xbf00: #NOP
            ea1 += 2
        offset = get_wide_dword(ea1)
        put_unconditional_branch(ea, (ea1 + offset) & 0xffffffff)
    else:
        print("Unable to detect first instruction")

The result of executing the script:

After we patch everything in this function, we can point IDA to its real beginning. IDA will collect the entire function code piece by piece, and we’ll be able to decompile it using HexRays.

### Decrypting the strings

We’ve learned how to deal with machine code obfuscation in the libsgmainso-6.4.36.so library from UC Browser and obtained the code of the JNI_OnLoad function.

int __fastcall real_JNI_OnLoad(JavaVM *vm)
{
int result; // r0
jclass clazz; // r0 MAPDST
int v4; // r0
JNIEnv *env; // r4
int v6; // [sp-40h] [bp-5Ch]
int v7; // [sp+Ch] [bp-10h]
v7 = *(_DWORD *)off_8AC00;
if ( !vm )
goto LABEL_39;
sub_7C4F4();
env = (JNIEnv *)sub_7C5B0(0);
if ( !env )
goto LABEL_39;
v4 = sub_72CCC();
sub_73634(v4);
sub_73E24(&unk_83EA6, &v6, 49);
clazz = (jclass)((int (__fastcall *)(JNIEnv *, int *))(*env)->FindClass)(env, &v6);
if ( clazz
&& (sub_9EE4(),
sub_71D68(env),
sub_E7DC(env) >= 0
&& sub_69D68(env) >= 0
&& sub_197B4(env, clazz) >= 0
&& sub_E240(env, clazz) >= 0
&& sub_B8B0(env, clazz) >= 0
&& sub_5F0F4(env, clazz) >= 0
&& sub_70640(env, clazz) >= 0
&& sub_11F3C(env) >= 0
&& sub_21C3C(env, clazz) >= 0
&& sub_2148C(env, clazz) >= 0
&& sub_210E0(env, clazz) >= 0
&& sub_41B58(env, clazz) >= 0
&& sub_27920(env, clazz) >= 0
&& sub_293E8(env, clazz) >= 0
&& sub_208F4(env, clazz) >= 0) )
{
result = (sub_B7B0(env, clazz) >> 31) | 0x10004;
}
else
{
LABEL_39:
result = -1;
}
return result;
}

Let’s look into the following lines:

  sub_73E24(&unk_83EA6, &v6, 49);
clazz = (jclass)((int (__fastcall *)(JNIEnv *, int *))(*env)->FindClass)(env, &v6);

It’s quite clear that the sub_73E24 function decrypts the class name. Its parameters are a pointer to data that looks encrypted, some buffer, and a number. Obviously, the decrypted string ends up in the buffer after the call, since the buffer is then passed to FindClass, which takes the class name as its second parameter. So the number is the size of the buffer or the length of the string. Let’s try to decrypt the name of the class; it should tell us whether we are going in the right direction. Let’s take a closer look at what happens in sub_73E24.

int __fastcall sub_73E56(unsigned __int8 *in, unsigned __int8 *out, size_t size)
{
int v4; // r6
int v7; // r11
int v8; // r9
int v9; // r4
size_t v10; // r5
int v11; // r0
struc_1 v13; // [sp+0h] [bp-30h]
int v14; // [sp+1Ch] [bp-14h]
int v15; // [sp+20h] [bp-10h]
v4 = 0;
v15 = *(_DWORD *)off_8AC00;
v14 = 0;
v7 = sub_7AF78(17);
v8 = sub_7AF78(size);
if ( !v7 )
{
v9 = 0;
goto LABEL_12;
}
(*(void (__fastcall **)(int, const char *, int))(v7 + 12))(v7, "DcO/lcK+h?m3c*q@", 16);
if ( !v8 )
{
LABEL_9:
v4 = 0;
goto LABEL_10;
}
v4 = 0;
if ( !in )
{
LABEL_10:
v9 = 0;
goto LABEL_11;
}
v9 = 0;
if ( out )
{
memset(out, 0, size);
v10 = size - 1;
(*(void (__fastcall **)(int, unsigned __int8 *, size_t))(v8 + 12))(v8, in, v10);
memset(&v13, 0, 0x14u);
v13.field_4 = 3;
v13.field_10 = v7;
v13.field_14 = v8;
v11 = sub_6115C(&v13, &v14);
v9 = v11;
if ( v11 )
{
if ( *(_DWORD *)(v11 + 4) == v10 )
{
qmemcpy(out, *(const void **)v11, v10);
v4 = *(_DWORD *)(v9 + 4);
}
else
{
v4 = 0;
}
goto LABEL_11;
}
goto LABEL_9;
}
LABEL_11:
sub_7B148(v7);
LABEL_12:
if ( v8 )
sub_7B148(v8);
if ( v9 )
sub_7B148(v9);
return v4;
}

The sub_7AF78 function creates a container instance for byte arrays of the specified size (we will not focus on them in detail). Two such containers are created here: one contains the string "DcO/lcK+h?m3c*q@" (it is easy to guess that this is the key), the other holds the encrypted data. Both objects are then placed in a structure, which is passed to the sub_6115C function. Note that this structure also contains a field with the value 3. Let’s see what happens next.

int __fastcall sub_611B4(struc_1 *a1, _DWORD *a2)
{
int v3; // lr
unsigned int v4; // r1
int v5; // r0
int v6; // r1
int result; // r0
int v8; // r0
*a2 = 820000;
if ( a1 )
{
v3 = a1->field_14;
if ( v3 )
{
v4 = a1->field_4;
if ( v4 < 0x19 )
{
switch ( v4 )
{
case 0u:
v8 = sub_6419C(a1->field_0, a1->field_10, v3);
goto LABEL_17;
case 3u:
v8 = sub_6364C(a1->field_0, a1->field_10, v3);
goto LABEL_17;
case 0x10u:
case 0x11u:
case 0x12u:
v8 = sub_612F4(
a1->field_0,
v4,
*(_QWORD *)&a1->field_8,
*(_QWORD *)&a1->field_8 >> 32,
a1->field_10,
v3,
a2);
goto LABEL_17;
case 0x14u:
v8 = sub_63A28(a1->field_0, v3);
goto LABEL_17;
case 0x15u:
sub_61A60(a1->field_0, v3, a2);
return result;
case 0x16u:
v8 = sub_62440(a1->field_14);
goto LABEL_17;
case 0x17u:
v8 = sub_6226C(a1->field_10, v3);
goto LABEL_17;
case 0x18u:
v8 = sub_63530(a1->field_14);
LABEL_17:
v6 = 0;
if ( v8 )
{
*a2 = 0;
v6 = v8;
}
return v6;
default:
LOWORD(v5) = 28032;
goto LABEL_5;
}
}
}
}
LOWORD(v5) = -27504;
LABEL_5:
HIWORD(v5) = 13;
v6 = 0;
*a2 = v5;
return v6;
}

The field that was previously set to 3 is used as the switch parameter. In case 3, the parameters that the previous function added to the structure (i.e., the key and the encrypted data) are passed to the sub_6364C function. If we look closely at sub_6364C, we can recognize the RC4 algorithm.

So we have an algorithm and a key. Let’s try to decrypt the name of the class. This is what we’ve got: com/taobao/wireless/security/adapter/JNICLibrary. Brilliant! We are on the right track.
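For readers who want to repeat this step, here is a minimal Python sketch of RC4 with the key from the listing. It assumes that sub_6364C is textbook RC4 with no modifications, which the analysis suggests but the listing does not prove. The encrypted bytes at unk_83EA6 are not reproduced in the text, so the sketch creates a stand-in by encrypting the known class name and then decrypting it again.

```python
def rc4(key, data):
    """Textbook RC4: key scheduling followed by keystream XOR."""
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) & 0xFF
        s[i], s[j] = s[j], s[i]
    i = j = 0
    out = bytearray()
    for byte in data:
        i = (i + 1) & 0xFF
        j = (j + s[i]) & 0xFF
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) & 0xFF])
    return bytes(out)


KEY = b"DcO/lcK+h?m3c*q@"
CLASS_NAME = b"com/taobao/wireless/security/adapter/JNICLibrary"

blob = rc4(KEY, CLASS_NAME)     # stand-in for the encrypted bytes in the library
print(rc4(KEY, blob).decode())  # RC4 is symmetric, so this restores the name
print(len(CLASS_NAME))
```

The class name is 48 characters long, which agrees with the size argument 49 in the call to sub_73E24: the function processes size - 1 bytes and leaves room for the terminating zero.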

### Command tree

Now we need to find the call to RegisterNatives, which will point us to the doCommandNative function. So we look through the functions called from JNI_OnLoad, and find it in sub_B7B0:

int __fastcall sub_B7F6(JNIEnv *env, jclass clazz)
{
char signature[41]; // [sp+7h] [bp-55h]
char name[16]; // [sp+30h] [bp-2Ch]
JNINativeMethod method; // [sp+40h] [bp-1Ch]
int v8; // [sp+4Ch] [bp-10h]

v8 = *(_DWORD *)off_8AC00;
decryptString((unsigned __int8 *)&unk_83ED9, (unsigned __int8 *)name, 0x10u);// doCommandNative
decryptString((unsigned __int8 *)&unk_83EEA, (unsigned __int8 *)signature, 0x29u);// (I[Ljava/lang/Object;)Ljava/lang/Object;
method.name = name;
method.signature = signature;
method.fnPtr = sub_B69C;
return ((int (__fastcall *)(JNIEnv *, jclass, JNINativeMethod *, int))(*env)->RegisterNatives)(env, clazz, &method, 1) >> 31;
}

And indeed, a native method with the name doCommandNative is registered here. Now we know its address. Let’s have a look at what it does.

int __fastcall doCommandNative(JNIEnv *env, jobject obj, int command, jarray args)
{
int v5; // r5
struc_2 *a5; // r6
int v9; // r1
int v11; // [sp+Ch] [bp-14h]
int v12; // [sp+10h] [bp-10h]
v5 = 0;
v12 = *(_DWORD *)off_8AC00;
v11 = 0;
a5 = (struc_2 *)malloc(0x14u);
if ( a5 )
{
a5->field_0 = 0;
a5->field_4 = 0;
a5->field_8 = 0;
a5->field_C = 0;
v9 = command % 10000 / 100;
a5->field_0 = command / 10000;
a5->field_4 = v9;
a5->field_8 = command % 100;
a5->field_C = env;
a5->field_10 = args;
v5 = sub_9D60(command / 10000, v9, command % 100, 1, (int)a5, &v11);
}
free(a5);
if ( !v5 && v11 )
sub_7CF34(env, v11, &byte_83ED7);
return v5;
}

The name suggests that it is the entry point for all functions the developers moved to the native library. We’re specifically interested in function number 10601.

From the code, we can see that the command number yields three numbers: command / 10000, command % 10000 / 100, and command % 100 (in our case, 1, 6, and 1). These three numbers, together with the pointer to JNIEnv and the arguments passed to the function, are packed into a structure and passed on. With these three numbers (we’ll denote them N1, N2, and N3), a command tree is constructed. Something like this:

The tree is created dynamically in JNI_OnLoad. Three numbers encode the path in the tree. Each leaf of the tree contains the corresponding function’s xorred address. The key is in the parent node. It is quite easy to find a place in the code where the function we need is added to the tree if we understand all the structures there (we won’t spend time describing them in this article).
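The decomposition of the command number and the lookup it drives can be sketched in Python. Everything about the tree layout below (nested dictionaries, the `key` and `leaves` names, and the key value) is hypothetical and only illustrates the idea of a leaf address XORed with a key kept in the parent node; the real structures are not described in this article. The function address 0x5F1AC is the one found for command 10601.

```python
def split_command(command):
    """Split a command number into (N1, N2, N3) the way doCommandNative does."""
    return command // 10000, command % 10000 // 100, command % 100


def lookup(tree, command):
    """Walk the hypothetical tree and un-XOR the leaf with the parent's key."""
    n1, n2, n3 = split_command(command)
    parent = tree[n1][n2]
    return parent["leaves"][n3] ^ parent["key"]


PARENT_KEY = 0x5A5A5A5A  # made-up key, for illustration only
tree = {1: {6: {"key": PARENT_KEY, "leaves": {1: 0x5F1AC ^ PARENT_KEY}}}}

print(split_command(10601), hex(lookup(tree, 10601)))
```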

### More obfuscation

We’ve got the address of the function that is supposed to decrypt the traffic: 0x5F1AC. But it’s still too early to relax — the UC Browser developers have another surprise for us.

After receiving the parameters from an array in the Java code, we go to the function at 0x4D070. Another type of code obfuscation is already waiting.

We then push two indices in R7 and R4:

The first index moves to R11:

We use this index to obtain the address from the table:

After transferring to the first address, we use the second index from R4. The table contains 230 elements.

What do we do with it? We can tell IDA that this is a kind of switch: Edit -> Other -> Specify switch idiom.

The resulting code is horrendous. However, we can see the call to the familiar sub_6115C function in its tangles:

That function contained a switch, with RC4 decryption in case 3. This time, the structure passed to the function is filled with the parameters passed to doCommandNative. Recall that magicInt there had the value 16. We look at the corresponding case and, after several jumps, find the code that helps us identify the algorithm.

It’s AES!

We have the algorithm and only need to get its parameters: the mode, the key, and (possibly) the initialization vector (whether it is needed depends on the AES mode of operation). The structure that contains them should be created somewhere before the call to sub_6115C. But since this part of the code is particularly well obfuscated, we decided to patch the code so that all parameters of the decryption function would be dumped into a file.

### Patch

If you don’t want to write all the patch code in assembly by hand, you can run Android Studio, write a function that receives the same parameters as our decryption function and writes them to a file, and then copy the code generated by the compiler.

Our good friends from the UC Browser team also “ensured” the convenience of adding code. Remember the garbage code at the beginning of each function? It can easily be replaced with any other code. Very convenient :) However, there is not enough room at the beginning of the target function for the code that saves all parameters to a file. We had to split it into parts and use the garbage blocks of neighboring functions. We got four parts in total.

Part one:

On ARM, the first four function parameters are passed in registers R0-R3, while the rest, if any, go via the stack. The LR register holds the return address. We need to save all of this so the function can still work after we dump its parameters. We also need to save all the registers that we use in the process, so we use PUSH.W {R0-R10,LR}. In R7, we get the address of the list of parameters passed to the function via the stack.

We will use fopen to open the /data/local/tmp/aes file in “ab” mode (that is, for appending). We load the address of the file name into R0 and the address of the mode string into R1. This is where the garbage code ends, so we jump to the next function. Since we want that function to keep working, we put a jump to its actual code at the very beginning, before the garbage, and replace the garbage with the rest of the patch.

We then call fopen.

The first three parameters of the aes function are of type int. Since we pushed the registers onto the stack at the beginning, we can simply pass their stack addresses to fwrite.

Next, we have three structures that indicate the size of the data and contain a pointer to the data for the key, the initialization vector, and the encrypted data.

At the end, we close the file, restore the registers, and give control back to the actual aes function.

We build the APK file with the patched library, sign it, install it on a device or emulator, and run it. Now we see that the dump has been created and contains a lot of data. The browser decrypts not only traffic but other data too, and all decryption goes through this function. For some reason, though, we don’t see the data we need, and the request we are expecting is not visible in the traffic. Rather than wait for UC Browser to make this request, let’s take the encrypted server response obtained earlier and patch the application again: we’ll add the decryption call to the onCreate of the main activity.

    const/16 v1, 0x62
    new-array v1, v1, [B
    fill-array-data v1, :encrypted_data
    const/16 v0, 0x1f
    invoke-static {v0, v1}, Lcom/uc/browser/core/d/c/g;->j(I[B)[B
    move-result-object v1
    array-length v2, v1
    invoke-static {v2}, Ljava/lang/String;->valueOf(I)Ljava/lang/String;
    move-result-object v2
    const-string v0, "ololo"
    invoke-static {v0, v2}, Landroid/util/Log;->d(Ljava/lang/String;Ljava/lang/String;)I

We build it, sign, install, and run. This time we get a NullPointerException because the method returns null.

After analyzing the code further, we found a function with rather interesting strings: “META-INF/” and “.RSA”. It looks like the app verifies its certificate, or even generates keys from it. We don’t really want to dig into what is happening with the certificate, so let’s just give it the correct one. We’ll patch the encrypted string so that instead of “META-INF/” we get “BLABLINF/”, create a folder with this name in the APK file, and save the browser’s certificate in it.
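The idea behind the “BLABLINF/” trick can be illustrated with a small Python sketch. It assumes the native code finds the certificate simply by entry name (prefix plus “.RSA” suffix), which is our guess from the strings; the helper name and the demo archive are hypothetical.

```python
import io
import zipfile


def find_signature_entries(apk, prefix="META-INF/", suffix=".RSA"):
    """Return the names of APK entries that match the prefix and suffix."""
    with zipfile.ZipFile(apk) as archive:
        return [name for name in archive.namelist()
                if name.startswith(prefix) and name.endswith(suffix)]


if __name__ == "__main__":
    # Demo archive: a re-signed APK that also carries the original
    # certificate under the patched folder name.
    demo = io.BytesIO()
    with zipfile.ZipFile(demo, "w") as archive:
        archive.writestr("META-INF/CERT.RSA", b"our own signing certificate")
        archive.writestr("BLABLINF/CERT.RSA", b"original browser certificate")
        archive.writestr("classes.dex", b"...")

    print(find_signature_entries(demo))                      # what Android sees
    print(find_signature_entries(demo, prefix="BLABLINF/"))  # what the patched lookup sees
```

Android still validates the real signature in META-INF/, while a lookup using the patched prefix reads the original certificate, so whatever is derived from it stays the same as in the unmodified app.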

We build it, sign, install, and run. Bingo! We have the key!

### MitM

Now we’ve got the key and an initialization vector equal to it. Let’s try to decrypt the server response in CBC mode.

We see the archive URL, something that looks like an MD5 hash, “extract_unzipsize”, and a number. Let’s check. The MD5 of the archive matches, and the size of the unzipped library matches too. Now we’ll try to patch this library and serve it to the browser. To show that our patched library has been loaded, we’ll build an Intent to create a text message saying “PWNED!”. We’ll replace two responses from the server: puds.ucweb.com/upgrade/index.xhtml and the one for the archive download. In the first, we substitute the MD5 (the size after unzipping remains the same); in the second, we send the archive with the patched library.
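The check of the two values from the decrypted response is easy to script. The sketch below assumes the archive is an ordinary ZIP file; the function name is ours.

```python
import hashlib
import io
import zipfile


def describe_archive(data: bytes) -> tuple[str, int]:
    """Return (MD5 hex digest of the archive, total size of its unzipped contents)."""
    md5 = hashlib.md5(data).hexdigest()
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        unzipped_size = sum(info.file_size for info in archive.infolist())
    return md5, unzipped_size


if __name__ == "__main__":
    # Stand-in for the downloaded archive: one "library" of 4096 bytes.
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("libdemo.so", b"\x00" * 4096)

    md5, size = describe_archive(buffer.getvalue())
    print(md5, size)
```

After patching the library, the same function gives the new MD5 to put into the substituted server response. The unzipped size only needs updating if the patch changes the library's length.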

The browser makes several attempts to download the archive, resulting in an error. Apparently, something fishy is happening there. Analyzing this bizarre format, we found that the server also transmits the archive size:

It is LEB128 encoded. The patch slightly changed the size of the compressed library, so the browser decided that the archive had been corrupted during download and displayed an error after several attempts.

So we fix the archive size and… voila! :) See the result in the video.
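For reference, re-encoding the size takes only a few lines of Python. We assume the unsigned variant of LEB128 here, which is the natural choice for a size field; the function names are ours.

```python
def encode_uleb128(value: int) -> bytes:
    """Encode a non-negative integer as unsigned LEB128."""
    if value < 0:
        raise ValueError("unsigned LEB128 cannot encode negative numbers")
    out = bytearray()
    while True:
        byte = value & 0x7F          # take the low 7 bits
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow: set the high bit
        else:
            out.append(byte)         # last byte: high bit clear
            return bytes(out)


def decode_uleb128(data: bytes, offset: int = 0) -> tuple[int, int]:
    """Decode unsigned LEB128 at `offset`; return (value, offset after it)."""
    result = 0
    shift = 0
    while True:
        byte = data[offset]
        offset += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, offset
        shift += 7


if __name__ == "__main__":
    encoded = encode_uleb128(624485)
    print(encoded.hex())             # e58e26
    print(decode_uleb128(encoded))   # (624485, 3)
```

Note that the encoded length depends on the value, so a changed archive size can also change the number of bytes the field occupies in the response.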

### Consequences and developer’s response

In the same way, hackers can use this insecure feature of UC Browser to distribute and launch malicious libraries. These libraries will run in the context of the browser and so receive all the system permissions the browser has. This gives attackers free rein to display phishing windows, as well as access to the browser’s working files, including logins, passwords, and cookies stored in the database.

We contacted the UC Browser developers, informed them about the problem we had found, and tried to point out the vulnerability and its danger, but they refused to discuss the matter. Meanwhile, the browser with the dangerous function remained in plain sight. But once we revealed the details of the vulnerability, it could no longer be ignored. A new version, UC Browser 12.10.9.1193, was released on March 27; it accesses the server via HTTPS (puds.ucweb.com/upgrade/index.xhtml). In addition, between the “bug fix” and the time we wrote this article, an attempt to open a PDF in the browser resulted in an error message with the text “Oops, something is wrong.” No request was sent to the server when trying to open the PDF file. The request was made at browser startup instead, which is a sign that the ability to download executable code in violation of Google Play policies is still present.