Saturday, February 4, 2012

Keeping Google App Engine (GAE) instances alive

So you finally finished an application and decided to deploy it to Google's App Engine service, only to find out... it's slow! After a period of inactivity, the JVM application instance gets shut down and takes a whopping 10 seconds to start back up. Of course, Google wants you to pay for an always-on instance, but there are a few tricks for keeping an instance alive.

Google talks about warmup requests and how they can be used to offset the loading request time, but honestly I find this service to be a joke. If you wrote your application in Java, you probably ended up having this feature on by default; why is it still so slow!?

Since each scalable application in Google App Engine uses algorithms to determine how many instances need to be alive based on the number of requests, there are matching algorithms to determine when to stop an instance. The GAE performance documentation states that a Least Recently Used (LRU) algorithm decides which instance should shut down. Through informal experimentation, I found that on average an instance will shut down after 10 minutes of idling.

        10 mins..... meh


<?xml version="1.0" encoding="UTF-8"?>
<cronentries>
  <cron>
    <url>/ping</url>
    <description>Repopulate the cache every 10 minutes</description>
    <schedule>every 10 minutes</schedule>
  </cron>
</cronentries>

I decided to use the GAE cron feature and have my application ping itself once every 10 minutes. After deploying the application, I found that the LRU algorithm was doing its job and still stopped my instance, though only about half as often as I would otherwise expect. I then configured and uploaded the cron to ping every 5 minutes, and again the LRU algorithm shut down my instance after a number of pings.

        5 mins..... better

<?xml version="1.0" encoding="UTF-8"?>
<cronentries>
  <cron>
    <url>/ping</url>
    <description>Repopulate the cache every 5 minutes</description>
    <schedule>every 5 minutes</schedule>
  </cron>
</cronentries>

I decided to try something sneaky and use a "prime schedule" that I thought could fool GAE. I wrote three crons on a 3 minute, 7 minute, and 11 minute schedule. This seemed to work the best out of all my attempts, as the algorithm has a harder time figuring out the mean idle time and shuts down the instance less frequently. If your app gets a decent amount of traffic, I figure this is the best cron to use, since it still allows your instance to shut down during off-hours.

        prime schedule..... best!

<?xml version="1.0" encoding="UTF-8"?>
<cronentries>
  <cron>
    <url>/ping</url>
    <description>Repopulate the cache every 3 minutes</description>
    <schedule>every 3 minutes</schedule>
  </cron>
  <cron>
    <url>/ping</url>
    <description>Repopulate the cache every 7 minutes</description>
    <schedule>every 7 minutes</schedule>
  </cron>
  <cron>
    <url>/ping</url>
    <description>Repopulate the cache every 11 minutes</description>
    <schedule>every 11 minutes</schedule>
  </cron>
</cronentries>
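
To get a feel for how irregular the combined pings are, here is a rough Java sketch that lists the minutes at which any of the three crons fires and measures the longest quiet stretch between pings. It assumes all three crons start together at minute 0 and fire exactly on their interval, which GAE does not promise; the class and method names are made up for illustration.

```java
import java.util.TreeSet;

// Sketch only: assumes the 3, 7 and 11 minute crons all start at minute 0
// and fire exactly on schedule. Real GAE cron timing may drift.
class PrimeScheduleSketch {

    // Collect every minute in [0, horizon) at which at least one cron fires.
    static TreeSet<Integer> pingMinutes(int[] intervals, int horizon) {
        TreeSet<Integer> minutes = new TreeSet<>();
        for (int interval : intervals) {
            for (int t = 0; t < horizon; t += interval) {
                minutes.add(t);
            }
        }
        return minutes;
    }

    // Longest stretch, in minutes, between two consecutive pings.
    static int longestGap(TreeSet<Integer> minutes) {
        int longest = 0;
        Integer previous = null;
        for (int t : minutes) {
            if (previous != null) {
                longest = Math.max(longest, t - previous);
            }
            previous = t;
        }
        return longest;
    }

    public static void main(String[] args) {
        int[] intervals = {3, 7, 11};
        // 3 * 7 * 11 = 231 minutes: after this the combined pattern repeats.
        TreeSet<Integer> minutes = pingMinutes(intervals, 231);
        System.out.println("distinct ping minutes: " + minutes.size());
        System.out.println("longest idle gap: " + longestGap(minutes) + " min");
    }
}
```

Under these assumptions the gaps between pings bounce between 1, 2 and 3 minutes, and the instance never sits idle for longer than the 3 minute cron allows. Whether that irregularity is what actually throws off GAE's scheduler is something the sketch can't tell you.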

But if you absolutely need an always-on instance for free, a single cron scheduled every minute does the trick. Of course, I can't tell you whether Google monitors and suspends accounts for this kind of activity.

        1 min..... dangerous?


<?xml version="1.0" encoding="UTF-8"?>
<cronentries>
  <cron>
    <url>/ping</url>
    <description>Repopulate the cache every 1 minute</description>
    <schedule>every 1 minutes</schedule>
  </cron>
</cronentries>
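
For a side-by-side comparison, here is a small Java sketch that counts how many /ping requests each schedule would generate per day. It assumes every cron fires at minute 0 and then exactly on its interval, and it counts each cron separately even when two happen to fire in the same minute; the class and method names are hypothetical.

```java
// Sketch only: counts cron-triggered requests per day under the assumption
// that each cron fires at minute 0 and then exactly every N minutes.
class CronRequestCount {
    static final int MINUTES_PER_DAY = 24 * 60;

    // Number of times a single cron fires within one day (minutes 0..1439).
    static int firesPerDay(int intervalMinutes) {
        // Ceiling division: a fire at minute 0 plus one per full interval after it.
        return (MINUTES_PER_DAY + intervalMinutes - 1) / intervalMinutes;
    }

    // Total requests per day for a set of crons hitting the same URL.
    static int totalPerDay(int... intervals) {
        int total = 0;
        for (int interval : intervals) {
            total += firesPerDay(interval);
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println("every 10 minutes: " + totalPerDay(10));
        System.out.println("every 5 minutes:  " + totalPerDay(5));
        System.out.println("prime schedule:   " + totalPerDay(3, 7, 11));
        System.out.println("every 1 minute:   " + totalPerDay(1));
    }
}
```

Under those assumptions the 1 minute cron makes 1440 requests a day, compared with 817 for the prime schedule and 144 for the 10 minute cron.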


UPDATE

I figured you guys would like to see some usage statistics over time for the prime schedule, so I got on the App Engine dashboard and took some screenshots.

Here is the past 24 hours of usage. You can see the service shut down a couple of times (during off-hours) and hum along for the majority of the day. You will also notice how random the prime schedule appears to GAE.

24 Hour Usage

Here are the past 30 days of usage. Again, you can see that my instances have been pretty durable.

30 Day Usage

6 comments:

  1. Thanks for the trick. I'm not using the servlet environment, but can you give me your ping file? (/ping)

    Replies
    1. Since I'm using a servlet environment, I actually don't have a /ping file.
      I instead have a servlet mapped to the /ping url that looks something like this:

      @Override
      protected void doGet(HttpServletRequest req, HttpServletResponse resp) throws ServletException, IOException {
          String qState = req.getParameter("st");
          String qDairy = req.getParameter("cd");

          if ( !validateQuery(qState, qDairy) ) {
              respond(resp.getWriter(), new JSONObject(), "Invalid code syntax. Please try a different dairy code.", HttpServletResponse.SC_BAD_REQUEST);
              return;
          }

          // more stuff
      }

      And then I map the url to the servlet in my web.xml:


      <servlet-mapping>
        <servlet-name>dairyPingServlet</servlet-name>
        <url-pattern>/ping.json</url-pattern>
      </servlet-mapping>


      In your case, you could probably have a /ping.html file that has some AJAX to get something from your server in GAE. As long as you have consistent activity in the server side of your app, you shouldn't have any issues with your instance(s) dying.

    2. Can we just write "hello world" to the log file as the ping? It works.

  2. Hi, how should I use this in my servlet app that I want to host on App Engine?

  3. Can you post the full ping servlet here?

  4. Good stuff!! Thank you for saving time and money for the rest of us!
