Raw SQL queries and why you should care

Almost every android application uses SQLite in the background. It’s pretty powerful if you know what you’re doing. I’m going to show you what well done SQL script does for performance.

This is the old getContactList function:

public List<ContactItem> getContactsList() {
  Cursor senderCursor = db.query(DATABASE_TABLE, new String[] {
      "ProviderID", "Sender", "MAX(MessageID)" },
      "Sender IS NOT NULL AND IsIncoming = 1", null,
      "ProviderID, Sender", null, null);
  List<ContactItem> contactList = new ArrayList<ContactItem>();
  if (senderCursor != null) {
    while (senderCursor.moveToNext()) {   
      ContactItem c = new ContactItem();
      int providerId = senderCursor.getInt(0);
      String sender = senderCursor.getString(1);
      long messageId = senderCursor.getLong(2);

      c.setDisplayName(sender);
      c.setDisplayContactAddress(sender);
      ContactAddress ca = new ContactAddress();
      ca.setDisplayAddress(sender);
      ca.setParsedAddress(sender);
      ca.setIMProviderType(IMProviderTypes.values()[providerId]);
      c.setContactAddresses(new ContactAddress[] { ca });
      c.setLastMessageItem(this.getMessageItem(messageId));

      contactList.add(c);
    }
  }
  senderCursor.close();

  Cursor recipientCursor = db.query(DATABASE_TABLE, new String[] {
      "ProviderID", "Recipient", "MAX(MessageID)" },
      "Recipient IS NOT NULL AND IsIncoming = 0", null,
      "ProviderID, Recipient", null, null);
  if (recipientCursor != null) {
    while (recipientCursor.moveToNext()) {
      ContactItem c = new ContactItem();
      int providerId = recipientCursor.getInt(0);
      String recipient = recipientCursor.getString(1);
      long messageId = recipientCursor.getLong(2);

      c.setDisplayName(recipient);
      c.setDisplayContactAddress(recipient);
      ContactAddress ca = new ContactAddress();
      ca.setDisplayAddress(recipient);
      ca.setParsedAddress(recipient);
      ca.setIMProviderType(IMProviderTypes.values()[providerId]);
      c.setContactAddresses(new ContactAddress[] { ca });

      MessageItem msgItem = null;

      msgItem = this.getMessageItem(messageId);
      c.setLastMessageItem(msgItem);

      int count = contactList.size();
      boolean found = false;
      for (int i = 0; i < count; i++) {
        ContactItem c2 = contactList.get(i);
        if (c2.getContactAddresses()[0].equals(ca)) {
          found = true;
          MessageItem m2 = c2.getLastMessageItem();
          if (m2 == null
              || c2.getLastMessageItem().getMessageId() < messageId) {
            c.setLastMessageItem(msgItem);
            contactList.set(i, c);
            break;
          }
          break;
        }
      }
      if (!found) {
        contactList.add(c);
      }
    }
  }
  recipientCursor.close();
  return contactList;

}

I’ll admit, it’s pretty inefficient and here’s why. This generates a contact list based on your database messages. It first builds a small contact list based on every received message with including the last logged message. It does another SQL search per message to get that message information. So, if you have 100 contacts, we’re at 101 searches (1 for entire recipient list, 100 for each message). This is done via the getMessageItem(messageId) function.

Then it’ll search based on sender, who you’ve sent messages to (because sometimes we have to treat somebody as a contact, even though they’ve never responded). It’ll do the same. 1 search for result, and cross reference the received message list to append or update last message info and contacts.

So, we’re looking at 1 + x + 1 + (something between 0 and x) number of SQL queries requests using this method. If you were use batch the message information requests with an “IN” clause then it’ll be 1 + x + 1. Here, ‘x’ in signifies how many contacts you have. This is all achieved using the basic android/sqlite .query() function.

There’s also the issue where just because the message was written last to the database doesn’t mean it’s the last message received. This is gets screwed up when you start syncing old messages. In other words, instead of using MessageID, we should be using CreationDateTime

So what about raw queries? Well, compare and see. Here’s the optimized implementation:

public List<ContactItem> getContactsList() {
    String rawQuery = ""
        + "SELECT "
        + "  M2.MessageID, "
        + "  M2.ProviderID, "
        + "  M2.MessageStatusID, "
        + "  M2.IsIncoming, "
        + "  M2.Sender, "
        + "  M2.Recipient, "
        + "  M2.CreationDateTime, "
        + "  M2.LastSendAttemptDateTime, "
        + "  M2.CompletionDateTime, "
        + "  M2.MessageText, "
        + "  M2.ExtraData, "
        + "  M2.ExtraDataTypeID, "
        + "  M2.ImportMessageID, "
        + "  M2.ImportConversationID, "
        + "  M2.IsRead, "
        + "  LastMessageData.ProviderID, "
        + "  LastMessageData.Address "
        + "FROM "
        + "  ( "
        + "    SELECT "
        + "      ContactAddresses.ProviderID, "
        + "      ContactAddresses.Address, "
        + "      MAX(CreationDateTime) [LastMessageTime] "
        + "    FROM "
        + "      ( "
        + "        SELECT DISTINCT "
        + "          ProviderID, "
        + "          CASE WHEN IsIncoming = 1 THEN Sender ELSE Recipient END [Address] "
        + "        FROM "
        + "          Messages "
        + "      ) as ContactAddresses "
        + "      LEFT JOIN Messages M1 ON "
        + "        M1.ProviderID = ContactAddresses.ProviderID "
        + "        AND (M1.Sender = ContactAddresses.Address "
        + "          OR M1.Recipient = ContactAddresses.Address) "
        + "    GROUP BY "
        + "      ContactAddresses.ProviderID, "
        + "      ContactAddresses.Address "
        + "  ) AS LastMessageData "
        + "  LEFT JOIN Messages M2 ON  "
        + "    M2.CreationDateTime = LastMessageData.[LastMessageTime] "
        + "    AND (M2.Sender = LastMessageData.Address "
        + "      OR M2.Recipient = LastMessageData.Address) "
        + "ORDER BY                   "
        + "  LastMessageData.ProviderID, "
        + "  LastMessageData.Address,                "
        + "  M2.MessageID";
    Cursor cursor = db.rawQuery(rawQuery, null);
    List<ContactItem> contactList = new ArrayList<ContactItem>();
    if (cursor != null) {
      while (cursor.moveToNext()) {
        ContactItem c = new ContactItem();
        MessageItem m = parseMessageItemCursor(cursor);
        int index = allFields.length;
        int providerId = cursor.getInt(index++);
        String address = cursor.getString(index++);
        c.setDisplayName(address);
        c.setDisplayContactAddress(address);
        ContactAddress ca = new ContactAddress();
        ca.setDisplayAddress(address);
        ca.setParsedAddress(address);
        ca.setIMProviderType(IMProviderTypes.values()[providerId]);
        c.setContactAddresses(new ContactAddress[] { ca });
        c.setLastMessageItem(m);

        //check for extremely rare chance of exact same time
        int count = contactList.size();
        boolean found = false;
        for (int i = 0; i < count; i++) {
          ContactItem c2 = contactList.get(i);
          if (c2.getContactAddresses()[0].equals(ca)) {
            found = true;
            break;
          }
        }
        if (!found) {
          contactList.add(c);
        }

      }
    }
    cursor.close();

    return contactList;
}

Only one SQL query. SQL will return everything I need in a neat little package. The only thing I have to worry about is messages with exactly duplicated SenderID, Address and CreationDateTime. It’s ridiculously rare for a message to have an identical timestamp, but just in case, the ORDER BY clause means when that happens, the most recently inserted message would be on the bottom. I then check if the contact already exists and update it accordingly.

With a proper index on ServiceID and Sender/Recipient, then we can generate and load contact info in less than a second even when there are THOUSANDS of messaging in the database.

2 thoughts on “Raw SQL queries and why you should care

  1. izokronus

    Hey, I understood some of these words!
    No, but really, thanks for the post. This side of things might not be as sexy as UI stuff but it *is* appreciated

    Reply
  2. Nathan

    Just want to say that I’m really enjoying reading these posts. It’s not to say that I’m really understanding it all, but I really appreciate the time you’re taking both to write an application from scratch, but then taking the time to explain some of what you’re working on each day. Really nice dev work going on here. You’re a pro.

    Reply

Leave a Reply

Your email address will not be published.

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>