Loggly

Use the source, Loggly

Posted 29 Mar, 2013 by Philip O'Toole in Code

Here at Loggly we make great use of open-source software to avoid remanufacturing wheels.  It allows us to employ the latest technologies to build exciting, interesting and large-scale systems quickly -- all without worrying that the building blocks will forever remain a black box.  But working with open source at such an accelerated pace requires its own set of skills. Here are five key lessons from our Engineering team that might help you.

1. Assume at your own risk


It's easy to assume that the project developers implemented a certain feature in a certain way, and before you know it that assumption becomes taken for granted. Without realizing it, one is building a system on that assumption. This is insidious and one needs to deliberately check the source and test the system before making any major decisions. Time is short at a small company like ours, but it'll be even shorter if you have to re-implement due to an invalid assumption.
  Also, never make customer adoption assumptions in a vacuum; the Steve Jobs effect is awesome but rare. Leverage your customer community for responsive feedback.

2. Community support


Look for tools that can help debug, deploy, and manage the technology. If the tools ecosystem is healthy, then the open-source technology is probably vibrant too. Your sysadmins will thank you too, if they don't have to build everything from scratch. And take the pulse of the community. Join the mailing lists. Do the same questions come up over and over again? And do they get answered?

3. The project's strength of vision


We at Loggly always feel better when the direction of the project aligns closely with the problems we're trying to solve. For example, ZeroMQ is a technology with a clear purpose. Problems arise when this clear direction is missing, because the items high on your list may not be high on the project maintainers' list.


4. Gambling on the "Killer Feature"


Sometimes you just have to take a gamble because of a "killer feature". ZeroMQ, a technology we chose during its early days, is a good example. Data-driven experiments showed that it was an order of magnitude faster and less resource intensive than the alternatives, but it was a very young project at the time. We chose it for its performance even though it was immature, and because it did have a strong vision.


5. When there are multiple contenders, choose the one that is weak on the things you want to be strong on, and strong on the things you'd rather be weak on.


This is my favorite insight from the Loggly Engineering team -- it's comparative advantage applied to software design. Engineering is about trade-offs. Imagine you need to choose between two competing technologies. One has great algorithm support, but is weaker when it comes to IO. The other technology is not as strong when it comes to algorithms, but has superb IO support. If you're planning to design and implement novel algorithms, you're going to become strong in algorithms -- in fact you want to be strong in that area. So choose the system with the strong IO support, and focus on the algorithms yourself.



Hopefully these pointers help the next time you evaluate an open-source technology.  We believe this approach shows in the adoption of our service, the world's most popular cloud-based log management service.

Even better, why not join us and help choose the technology for Loggly's next generation systems?


The Truth is in the Logs

Posted 14 Feb, 2013 by dave@loggly.com in Business, Code, and Log Management

Last Friday, The New York Times reporter John Broder published a less than rosy picture of a highway trip between Washington D.C. and Boston, cruising in Tesla’s Model S luxury sedan. The purpose of the trip was to range test the car between two new supercharging stations with a "speedy road trip". Broder wrote about his anxiety-ridden stretches between charging stations, when energy consumption was outpacing mileage. Among other complaints, he was unhappy about the need to power off the heat on a cold Northeastern day and to drive slowly, in an effort to conserve battery power.  The trip ended not at the charging station, but on the back of a tow truck, the car having run out of electricity -- a result he visually documented with a feature photo of the car being towed. Ouch.

This wasn’t the PR outcome that Tesla Motors chief executive Elon Musk expected. Avoiding a factless he-said-she-said on Twitter, he gave an interview saying that he was planning to publish the log files of the reporter's vehicle, since they were at odds with details in the story. As reported in Venture Beat:  “Musk claims Broder failed to mention how much he was punching the accelerator early in the ride, a move that Tesla warns its customers will drain the battery faster. Also, Musk says Broder took a detour through Manhattan. And he didn’t fully charge the car before departing.”

While the logs have yet to be published, if they are and if they do support Musk’s claims, it will be advantageous for Tesla -- and for the “logging” community at large, bringing light to the next big category of business intelligence.  "Your honor, I'd like to call the #LogFile as my next witness." Amusing, but true, and it will be happening more and more often inside and outside of business.

As this story demonstrates, software is behind everything these days. Massive machine data is being generated every minute not only from our computers and cell phones, but from the cars we drive, the appliances we use, and virtually anything with a chip, motor or battery run with an operating system. The golden nuggets of truth exist within the log files, holding the detailed data about an event that happened, one that cannot be disputed.  

Such data, when mined and organized properly, provides a wealth of indicators to solve problems, such as why a web transaction timed out, the difference between a server being “on” and actually performing as intended, or to defend the veracity of a product’s claims. The value for IT departments (sysops, techops, devops and product developers) in having real-time access to log data for feedback on product performance is tremendous.  Log files can show where users struggled or took too long to accomplish tasks, or where your applications or hardware let them down.  In trial-by-public cases like these, log file data shines in a new light, far outside of the walls of technology but in the vernacular of the general public.

Logs don’t lie, but the truth can’t come out unless there are affordable and easy ways to release insights from these enormous log databases within the window of time in which they matter.  Moving the conversation forward, suppose Tesla were able to collect and analyze log files from all of its vehicles, receive alerts on issues, determine whether they need addressing or are simply isolated events caused by user error, and send guidance right back to the driver before the situation arose.  That's the end game: taking aggregate user data to find nuggets of wisdom that can then be fed back to the end user with guidance -- seamlessly.

That's where we come in. Loggly’s cloud-based log management service gives companies fast, centralized access to all of their log data, so they can solve issues, identify problems and make customers happy again by understanding and answering "is this a needle in the haystack or the tip of the iceberg?"  In the court of public opinion and social everything, or in the case of a 100% cloud-driven business, anything less than real-time is becoming really-late.

When consumer product CEOs start talking about logs, that's a pretty good sign that log files and log analytics are not just a stream of text and data to throw on the backup server every night -- if you didn’t run out of storage for them already.  Smart log file mining helps companies keep customers happy and their bottom line bigger.  If you run a data-driven business, the more your company can act on that data to improve application/service/product performance and experience, the better off your customers will be.

If Elon Musk is driving the future of Tesla, an automobile brand, off his log files, shouldn't your cloud application-driven business also be harnessing the full power of log intelligence?


Enabling CORS in Django Piston

Posted 5 Dec, 2011 by Ivan Tam in Code

Here at Loggly, one of our goals is to make our API accessible and easy to integrate. By enabling CORS (Cross Origin Resource Sharing) on our API endpoints, we hope more Javascript developers can take advantage of what our product has to offer.
 
CORS is an addition to the browser security model that allows XHR requests to be made from one domain to another. CORS allows Javascript applications to access resources on domains other than the original document's domain, working around the same-origin policy. While Javascript application developers have crafted techniques like JSONP, Flash proxies, XHR receivers, and server-side proxies to circumvent the same-origin policy, CORS makes these hacks unnecessary.
 
To take advantage of CORS both the server and the browser need to support the standard. The browser needs to initiate a negotiation with the server and the server must signal to the browser which domains are allowed to make cross-domain requests. Our current API is implemented in Django Piston, an open-source project that enabled us to quickly build a RESTful API on top of Django. Piston does not support CORS out-of-the-box, but it wasn't hard to write some code to enable it and we'd like to show how it was done.
 
A full explanation of CORS is beyond the scope of this post, but the central idea behind CORS is a negotiation between the browser and server of allowed and disallowed actions. This negotiation is done via HTTP headers. The essential headers are the following:
 
  • Origin: Sent by the browser signifying the originating domain.
  • Access-Control-Allow-Origin: Sent by the server, naming the origin allowed to make requests to the server's domain. Can be a single origin or "*" to allow requests from all domains.
  • Access-Control-Allow-Methods: Sent by the server, listing the HTTP methods the browser is allowed to use in requests to the server.
  • Access-Control-Allow-Headers: Sent by the server, listing the HTTP headers the server is willing to accept from the browser.
Essentially, to enable CORS we need to have Django Piston respond to an OPTIONS request with the server-sent headers and send the requisite headers along with responses.
 
The Resource class is the heart of a Django Piston-built API. The code  that injects the headers into responses lives in a subclass of the base Resource class. We've called this class CORSResource:
 
 
The CORSResource performs two simple tasks. First, it intercepts any OPTIONS method requests to handle the pre-flight negotiation between the browser and the server. Since OPTIONS requests do not have a response body, an empty HttpResponse() is returned along with the requisite headers. Second, CORSResource intercepts responses from the Django Piston handlers (where responses are generated) and decorates them with the CORS headers.
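
As a rough, framework-free sketch of those two tasks (not our actual CORSResource): the header values, the `CORS_HEADERS` name and the `handle_request` function are all assumptions made for illustration, and plain tuples stand in for Django responses.

```python
# Framework-free sketch of the two CORSResource tasks described above.
# Every name and header value here is an illustrative assumption; this is
# not Piston's API and not Loggly's implementation.

CORS_HEADERS = {
    "Access-Control-Allow-Origin": "*",
    "Access-Control-Allow-Methods": "GET, POST, PUT, DELETE, OPTIONS",
    "Access-Control-Allow-Headers": "Content-Type, Authorization",
}


def handle_request(method, handler):
    """Return (status, headers, body) for one request.

    `handler` is any callable returning (status, headers, body); it stands
    in for a Piston handler.
    """
    if method.upper() == "OPTIONS":
        # Task 1: answer the pre-flight request with headers and no body.
        return 200, dict(CORS_HEADERS), ""
    # Task 2: let the handler build the response, then decorate it.
    status, headers, body = handler()
    decorated = dict(headers)
    decorated.update(CORS_HEADERS)
    return status, decorated, body
```

In a real Piston subclass the same two branches would live in the method that dispatches the request, with the headers set on the response object instead of a dict.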
 
To use CORSResource, we simply instantiated our endpoints with the CORSResource sub-class instead of the base Resource class. The change to our API's urls.py file looks like this:
 
 
We hope this post helps other Django Piston API implementors enable CORS in their own APIs. We're planning to release this implementation in the coming weeks and we're looking forward to seeing what Javascript developers are going to do with direct access to our API.
 
Happy hacking!
 
(image from http://blogs.bournemouth.ac.uk/research/2011/09/01/sharing-your-research-data/)


Java Code Coverage With Cobertura and Jenkins

Posted 29 Nov, 2011 by Noah Gift in Code and Startup

Introduction

 
Writing software is tough, so tools and processes that reduce the difficulty are always welcome.  Code Coverage is an interesting developer tool because it tells you how much you don't know.  Not having a test for a particular piece of code doesn't mean that it doesn't work; it just means you aren't as sure as you could be that it works.  In this article, I will be covering getting the open source Java Code Coverage tool Cobertura working with Ant, Jenkins and Github.
 

Cobertura

 
Since getting continuous integration working in a particular language can be complicated, it is a best practice to break the problem down into discrete chunks.  Fortunately, Cobertura makes this easy, because the source code comes with an example ant project.  You can download a binary distribution here:  http://cobertura.sourceforge.net/download.html.  Inside of the download will be a relative path  ..examples/basic.  If you cd into that directory, you can generate a code coverage report on a sample Java project by typing:  
 
ant 
 
If you don't have junit installed, you will get output like the below, on my OS X Lion laptop:
 
 
If you run ant clean it will remove the reports directory, which contains old reports.  One tricky bit with running the example is that the build.xml file is set to look for cobertura, and the cobertura lib directory, in a relative path two directories above.
 
 
 
This can be tricky if you check only the examples directory into your git repository and don't have cobertura installed properly.  What will happen then is that the examples won't build properly and it will be tricky to figure out why.  If you find yourself in this situation, one easy hack is to simply place the cobertura jar files, including the dependencies in the lib, inside of your ant home.  On Ubuntu Linux this is /usr/share/ant/lib.
 

Jenkins

With the basic Cobertura configuration out of the way, the next thing to do is to create a throwaway git repository, on github or your own server.  Next you will want to check in the whole Cobertura binary distribution that you download from their site:   http://cobertura.sourceforge.net/, or only check in the examples root, and install Cobertura properly on your build server.  Inside of jenkins you will need to do the following things:
 
1.  Point Jenkins at your git repository.
2.  Install the Cobertura jenkins plugin:  https://wiki.jenkins-ci.org/display/JENKINS/Cobertura+Plugin
3.  Configure the reports directory correctly.  Because I only checked in the examples directory, my reports configuration looked like this:
 
**/reports/cobertura-xml/coverage.xml
 
4.  Build it.
 
Here is a screenshot of what it looks like when Jenkins builds the example project with code coverage:
 
 

Conclusion

If you were able to follow along at home, and have Jenkins automatically building the example project with code coverage, then the next step is to convert this knowledge to your own code base.  One problem I had when getting Cobertura working with my code base was configuring the junit stanza properly to fork.  Finally, an even simpler way to get code coverage cooking for your Java code base is to install the Eclipse plugin eCobertura.  If you are really stuck getting things to work, this is a nice easy win.  I hope you enjoyed the article, and if you have any Cobertura tricks, I would love to hear about them.
   
 
Ant:  http://ant.apache.org/
Cobertura Plugin Jenkins:  https://wiki.jenkins-ci.org/display/JENKINS/Cobertura+Plugin
Jenkins:  http://jenkins-ci.org/
Github:  https://github.com/
Cobertura:  http://cobertura.sourceforge.net/
Measuring Code Coverage With Cobertura:  http://www.ibm.com/developerworks/java/library/j-cobertura/
eCobertura:  http://ecobertura.johoop.de/
 
 


Automated Integration

Posted 16 Nov, 2011 by Mike Blume in Business, Code, and Startup

One of the fundamental challenges of distributed coding is deciding what/when to integrate. Sure, that patch your colleague just sent you looks good, but is it actually ready to go into master? At Loggly, we've been feeling our way towards a disciplined integration process. A year ago, our frontend developers were all making commits directly to trunk in a single SVN repo. Once every few weeks, we'd run `svn up` on our servers, and hope for the best. Today our code goes through peer review, unit testing, and static analysis before it even touches our master branch.

Like most projects these days, the process starts on github. Fork. Push a feature branch to your repo. Open a pull request. Go through a couple rounds of discussion and revision. Merge. Every change to our code goes through this process. At first we thought it would slow us down, that we'd want pull requests for the nontrivial code and to just push to master for the easy stuff. After just a few days, we found the pull requests were slowing us down not at all, and that we all enjoyed the greater transparency into our colleagues' work.

Once we merge, the automation kicks in -- our default integration branch is 'proposed', so clicking merge doesn't actually get the code into the master branch. Jenkins polls our 'proposed' branch once a minute, then runs a simple preflight script on the code.

Rather than keep that preflight in a jenkins configuration page, we have it checked into the codebase so that any developer can run it too; this way there's no excuse for breaking the build -- you should have seen it break locally =P

Here's our preflight script. Let's go through it line by line.

DIR="$( cd "$( dirname "$0" )" && pwd )"

APP=$DIR/..

First we figure out where we're running, so that we can find the other scripts distributed with the app. 

$DIR/purge_pyc

Next we purge pyc files. This is done because if a user recently switched from a branch which contained files which don't exist in our branch, the pyc files may still be around, and may be found by the interpreter. 
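
Our purge_pyc script itself isn't shown here. A minimal sketch of what such a step could do, assuming it simply deletes every `.pyc` file under the app directory (the function name and return value are hypothetical):

```python
import os


def purge_pyc(root):
    """Delete every .pyc file under `root` and return the removed paths.

    Sketch only: assumes stale bytecode sits next to its source file,
    as with Python 2 style .pyc files.
    """
    removed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(".pyc"):
                path = os.path.join(dirpath, name)
                os.remove(path)
                removed.append(path)
    return removed
```

Source files are left alone, so the interpreter just recompiles whatever still exists on the current branch.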

$DIR/syncenv

Next we run a script to sync our python virtual environment, and ensure all requirements are present. 

$DIR/runtests

Here, of course, we run our unit tests. Each run prints a coverage report, so that as we recover from our testing debt, we can measure our progress.

&& $DIR/lint...

Next, and this is important, we run pylint over the parts of our app that we expect to pass with no warnings. As we clean up our app, we continue to add modules to this list. Pylint does a few useful things for us. It looks for trivial name errors of the kind that could quickly cause code to stacktrace -- using a module without importing it, etc. It also enforces certain kinds of coding discipline. Our functions and modules can't exceed a certain length. The cyclomatic complexity of our functions is limited. 

If all of this passes successfully, Jenkins automatically pushes the checked-out commit to master, which is where we base our development. Thus, we're always basing our development on known-vetted code.

If any of it fails, Jenkins still has a couple more tricks to pull. Here's our on-failure script:

This comes in two parts. The first runs a standard-issue git-bisect between origin/proposed and origin/master. Since origin/master has already been vetted by jenkins (that's how it became master), we know there'll be a regression somewhere between the commits. This goes into the session output, and is e-mailed to the relevant committers. Next, we roll the proposed branch back to the already-vetted master branch. Whatever pull request broke the build will have to be re-made from scratch.
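
The two parts can be sketched as a list of git commands. This is an assumption-laden illustration, not our actual on-failure script: the `on_failure_commands` name, the use of `git bisect run`, and the force-push used for the rollback are all guesses at one reasonable way to do it.

```python
def on_failure_commands(test_cmd, good="origin/master", bad="origin/proposed"):
    """Return the git command lines for the bisect and the rollback.

    `test_cmd` is the command (as a list) that exits non-zero on a bad
    commit, for example the preflight script. Nothing is executed here.
    """
    return [
        # Part 1: find the first bad commit between the vetted and proposed tips.
        ["git", "bisect", "start", bad, good],
        ["git", "bisect", "run"] + list(test_cmd),
        ["git", "bisect", "reset"],
        # Part 2: force the proposed branch back to the vetted master commit.
        ["git", "push", "--force", "origin", good + ":refs/heads/proposed"],
    ]
```

Each entry could then be handed to `subprocess.check_call` in order; keeping the commands as data makes the sequence easy to print into the build log first.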

 


Ajax This!

Posted 7 Nov, 2011 by Brian Schroeder in Code and Log Management

Today’s blog post is more of a tool than a toy.  Lately I’ve been working on a bookmarklet that utilizes Pusher (if you want to learn a little about Pusher, check out my previous blogpost).  The bookmarklet I’m working on is really silly, but these technologies have the potential to be used for some really cool apps.

I needed to figure out a way for javascript to manipulate objects on web pages remotely through ajax calls.   You can already do this with click events and Pusher by sending the class or ID of any element you click on, but what if there is no class or ID?  I also want this to work on any webpage.  As a result, I needed to figure out how to build a selector that was as unique as possible.  One way to do this is  to select using all attributes the object has to offer.

Enter, the custom attribute selector syntax:

This selector will only select objects that have a class value of “whatever” AND a myattribute value of “customattribute”.

Some Stack Overflow and basic Google searching told me I can tease attributes out using a regex (gross).  I knew there had to be a better way so I started fiddling with javascript objects I captured on click and tried to figure out how to uniqify my selector string.  What I discovered is that every object I clicked on contained a ton of info.  The parts that I use to build my unique-ish selector are:

  • localName

The name of the html tag.  In the gist above, this value is customhtml.

  • attribute

All attribute(s) associated with the object (if any). In the gist above, these attributes are class=”whatever” and myattribute=”customattribute”

  • parentNode

This is the object that contains the one clicked on.  Technically my gist has no parent, but I use the parent node to map out a path from the object I clicked on all the way up to the body of the page.  If the object you click on doesn’t have any attributes ( a <p> object for example) you can still select it based on the specific path to that object.

  • textContent

This is any text that’s in the object.  For example <p>writing stuff</p> will have a textContent value of “writing stuff”.

Apart, these elements aren’t bad at selecting the object you want, but together they get pretty close to behaving like a this object.  

 

Here is how I stitch these elements together to create my this-ish selector:

There are a few tricks here:

  • I found that if the object you click on has no attributes the path doesn’t help too much, so if the object I’m building a selector for doesn’t have any attributes I just give it a class with an empty value, and it works really well.  
  • In an attempt to keep my selector as unique as possible I’m also using :contains() to select on specific strings as well as the path.  This helps ensure that you’re manipulating the exact object you want.
  • If you’re clicking on an image tag, I found that contains won’t allow you to select that image, but if we take it out it works, so I added a filter to keep image tag selectors from using :contains(). (although it currently only works in the developer tools javascript console, I haven’t yet figured out why it’s not working for my bookmarklet).
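
The bookmarklet itself is JavaScript, but the stitching logic is easy to sketch in a language-neutral way. Here is a small Python illustration using a hypothetical `Node` class as a stand-in for a DOM element; the ` > ` child combinator and the function names are assumptions, and where the bookmarklet gives the element an empty class, this sketch only reflects that in the selector string.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Node:
    """Minimal stand-in for a DOM element (hypothetical, for illustration)."""
    local_name: str
    attributes: Dict[str, str] = field(default_factory=dict)
    parent: Optional["Node"] = None
    text_content: str = ""


def simple_selector(node, empty_class_fallback=False):
    """Tag name followed by one [name="value"] clause per attribute."""
    attrs = node.attributes
    if not attrs and empty_class_fallback:
        attrs = {"class": ""}  # trick 1: attribute-less target gets class=""
    clauses = "".join('[%s="%s"]' % (k, v) for k, v in attrs.items())
    return node.local_name + clauses


def unique_selector(node):
    """Path from <body> down to `node`, plus a text filter where it helps."""
    parts = [simple_selector(node, empty_class_fallback=True)]
    current = node.parent
    while current is not None:
        parts.append(simple_selector(current))
        if current.local_name == "body":
            break
        current = current.parent
    selector = " > ".join(reversed(parts))
    # Tricks 2 and 3: add :contains() for text, but never for images.
    if node.text_content and node.local_name != "img":
        selector += ':contains("%s")' % node.text_content
    return selector
```

The sketch does not escape quotes inside attribute values or text, which a real implementation would need to handle.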

Generally, the resulting selector ends up looking something like this:

While this method doesn’t currently work on every single element of every single webpage, it has worked on most of the pages I have tried out, including the google search page.

Anyone who has ever tried to explain, on the phone (so 1991), to a grandparent or non-tech savvy friend/family member how to do certain things on the web should be able to see the power of this.  With these selectors it’s now possible to build a bookmarklet that allows two or more people to see what elements are being clicked on.  Now describing webpage elements to your grandfather isn’t needed.  Hook him up with your Pusher powered bookmarklet and all he has to do is go to a URL, click a bookmark button, and let’s say... click on the text you just highlighted in yellow.  On the flip side, you’ll know (almost) exactly what elements on the page he’s clicking on because his clicks will make highlights on the page as well.  Super neat!

Obviously this method isn’t perfected, but if you’re interested in helping me make it more precise, please fork my project on github.

https://github.com/brainTrain/ajaxThis

 


PagerDuty, Loggly, and Alert Birds

Posted 28 Oct, 2011 by David Lanstein in Business and Code

Well, I'm back, and this time I'm here to talk about an awesome product that we use all the time, PagerDuty.  We use it internally for our own alerting (as do a number of Fortune 500 companies along with a million other startups), but we've also integrated it into Alert Birds, which is our alerting tool.  With Alert Birds, you can configure saved searches that run against Loggly, and you'll run those searches over a period of time that you've selected, and Alert Birds will escalate alerts in PagerDuty.  Before you can do any of those things, however, you need to set up the PagerDuty endpoint.

 

After you've done that, the next thing you'll need to do is to configure a saved search, and then configure the alert that you want to run.  The search itself is pretty straightforward: it has a name, a search string, e.g.

(this is why it's cool to send us JSON!), and a list of inputs and devices that you choose - you may want to run a particular search on only your web servers, for instance.  The interesting bit is the alert itself, which runs a search that you choose, but has a number of options as to what conditions constitute an alert, and what the message should be:

 

This is where PagerDuty comes in.  Although you can send a GET or POST request to an endpoint of your choosing with the alert data, triggering an alert in PagerDuty is far more useful, as they can SMS/email/phone you, and they also handle escalations and reporting.  So, in the example above, if my web servers are spewing 500 exceptions, I want my ops folks to get notified, provided there are more than 10 - I don't want to wake anyone up over a little blip!  I'm a nice IT manager like that.  Anyhow, once an alert is in a critical state, it will run your search every minute until you're below the threshold, and once that happens, Alert Birds will automatically resolve your alert in PagerDuty.
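
The trigger-then-resolve behavior boils down to a tiny state machine. This is only a sketch of the idea, not Alert Birds code: the `next_state` name, the state labels and the choice to resolve once the count is no longer above the threshold are assumptions.

```python
def next_state(state, match_count, threshold=10):
    """Return (new_state, action) for one run of the saved search.

    `state` is "ok" or "critical"; `action` is "trigger", "resolve",
    or None when nothing should be sent to the paging service.
    """
    if state == "ok" and match_count > threshold:
        return "critical", "trigger"
    if state == "critical" and match_count <= threshold:
        return "ok", "resolve"
    return state, None
```

While the state stays "critical" the search keeps running every minute and no duplicate alert is raised; the first run that drops back under the limit produces the single "resolve".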

That's pretty much all there is to it!  You can find the docs on Alert Birds here, please do drop me a line at support@loggly.com if you need a hand, and until next time, happy alerting!


Alert Birds, OAuth, and Loggly

Posted 10 Oct, 2011 by David Lanstein in Code

In today's edition of the Loggly Blog, we're going to talk about integrating with Loggly using OAuth.  You can authenticate against Loggly in a couple different ways, but we're going to focus on OAuth today, because that's how Alert Birds is able to run saved searches on a user's behalf.  OAuth can be a little tricky to set up at first, which is a big reason why we've open-sourced Alert Birds, and the code can be found at http://github.com/loggly/alertbirds-community-edition.  

To build an app, the very first thing you'll need to do is to register an app in your Loggly instance, at /account/applications/.  Apps are called 'Consumers' in OAuth parlance, and after your application gets approved, you can get started working with our API using OAuth.

In basic terms, the OAuth flow looks something like this:

1. Get a request token, which is good for a single request

2. Hit the authorization URL, and pass the request token

3. Log into Loggly when redirected

4. Authorize the application with your consumer key to access Loggly on your behalf

5. Get redirected back to your app, and with the OAuth verifier and the request token, get an access token, which is what you'll need to make future requests

This sounds pretty simple, but everything has to be just right for the flow to work.  Query string parameters have to be encoded in the right order, you need to send the correct HTTP method that Loggly is expecting, and you can't lose the OAuth verifier or the request token, because if the access token request fails, you'll have to ask the user to re-authenticate your app - no good!  
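
To see why parameter order matters, here is a standard-library sketch of how a signature is typically computed in a request-token style flow. It assumes OAuth 1.0a HMAC-SHA1 signing, which the steps above suggest but this post does not spell out; the function names are hypothetical, and in practice the wrapper library does this for you.

```python
import base64
import hashlib
import hmac
from urllib.parse import quote


def pct(value):
    """Percent-encode everything except RFC 3986 unreserved characters."""
    return quote(str(value), safe="~")


def signature_base_string(method, url, params):
    """Build the string that gets signed; parameters are sorted first."""
    pairs = sorted((pct(k), pct(v)) for k, v in params.items())
    normalized = "&".join("%s=%s" % pair for pair in pairs)
    return "&".join([method.upper(), pct(url), pct(normalized)])


def sign(method, url, params, consumer_secret, token_secret=""):
    """Return the base64-encoded HMAC-SHA1 signature for the request."""
    key = "%s&%s" % (pct(consumer_secret), pct(token_secret))
    base = signature_base_string(method, url, params)
    digest = hmac.new(key.encode("ascii"), base.encode("ascii"), hashlib.sha1).digest()
    return base64.b64encode(digest).decode("ascii")
```

Because the HTTP method and the sorted, encoded parameters are both part of the signed string, sending a GET where a POST is expected, or encoding a parameter differently, produces a signature the server will reject.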

The OAuth wrapper you'll probably want to use is here:  https://github.com/loggly/alertbirds-community-edition/blob/master/lib/oauth.py, and a working client code implementation is here: http://github.com/loggly/alertbirds-community-edition/blob/master/controllers/main.py.  The wrapper class simply wraps the OAuth 2 Python library, and makes it a little easier to interact with Loggly, and the client code is where you'll find the interesting stuff in terms of hanging onto the request token, handling the return from Loggly, etc.  

A simplified walkthrough of what's happening in main.py is this:

That's pretty much all there is to it.  Using the lib/oauth.py wrapper from Alert Birds should make it considerably easier to get going.  Happy coding!


Get Real with Pusher! Getting Those Alert Birds Squawking...

Posted 4 Oct, 2011 by Brian Schroeder in Code and Startup

 
Pusher rocks!  It gets real with the web in a major way.  Websockets!  I'm the new kid on the Loggly block and I've been working on Loggly's first alerting app, Alert Birds.  Alert Birds uses Pusher because it's fast (real time is kinda hard to beat) and it's easy to use. 
 
I'm giving you guys some code to play around with.  Using Pusher and Sound Manager, we're going to set up a site that makes Loggly's Alert Birds hop and sing on a page whenever anyone clicks on them. That means you see when your buddies click birds, and vice versa.  Just think of it as a bird chatroom.  Trust me, it's awesome.
 
If you want to follow along with the code, clone the github repo I created here:
http://github.com/brainTrain/PusherBirds
 
All you need to do in order to get pusher working is sign up for an account, grab some credentials and get coding.
 
Pusher has two main components:
  •     Listening for events.
  •     Triggering events.
 
Let's start with listening:
 
To listen for a pusher event you only need the app key, channel name and event name.  With these components, listening to a Pusher channel is pretty straightforward and looks something like this:

	
   
 
Triggering is just as straightforward and requires the same elements as listening, plus the app secret and a backend to handle and send data to Pusher.
 
First you'll need some data to send, let's go nuts with ajax!
 
This brings us to the backend component of triggering events, so let's look at the trigger.php file.
 
 
The first thing you need to do is head over to pusher's Publisher Libraries and choose the backend language you heart the most.
 
You can find Pusher's list of Publisher Libraries here: http://pusher.com/docs/rest_libraries
 
Once you do that, simply follow the library author's instructions and go nuts!  In my case I chose the generic php library to demonstrate this simple example. It's all fairly straightforward but I did run into a namespace issue with my host.  To fix it I simply commented out the namespace in Pusher.php.
   
In trigger.php I add Pusher.php using php's require function, put my keys, secrets etc into variables and use Pusher.php's Pusher() class to validate.  The last line triggers the event and grabs data from the button variable I sent to the URL using my ajax call.
 
I should point out that Pusher is somewhat flash dependent as well. For browsers that don't support websockets, they have a solution.  Flash.  Which means it's probably a good idea to handle flashblockers for any pusher app you create. 
 
Here's Pusher's list of supported browsers:  http://pusher.com/docs/browser_compatibility
 
That's it!
 
At this point you should have everything you need to communicate with our Alert Birds.  Go nuts!
 


Creating Heroku Addons: Getting Started

Posted 27 Sep, 2011 by Ivan Tam in Code

Heroku

Recently the Loggly add-on went live for all Heroku users. Creating this add-on involved implementing a few API endpoints in our service for Heroku to call. This is a quick overview of what the development process looks like.
 
Heroku's add-on API specs have a number of calls that fall into the following three categories:
  • Provisioning/deprovisioning: These are calls that ask the add-on service to create or remove resources on behalf of the Heroku user. One thing to remember here is that Heroku will ask the add-on service to create resources per Heroku *app*, not per Heroku user account.
    In most cases, when Heroku makes a provisioning call to the add-on service, a new account is created for a Heroku app.
  • Single-sign-on: Heroku users do not need to provide separate account credentials to access services the add-on provides. As long as they are logged into Heroku, they are able to use or configure the add-on. This is accomplished by forwarding the user to the add-on's URL. A token and a timestamp are forwarded along with this request for authentication.
  • Plan change: Heroku allows add-on providers to charge users based on tiers of service. A Heroku user is free to upgrade or downgrade add-on tiers at any time.
 
Each of these required actions corresponds to a REST endpoint that the add-on service must implement:
  • POST and DELETE to http://<addon_service>/heroku/resources respectively provision and deprovision accounts.
  • GET to http://<addon_service>/heroku/resources/<id>?token=<token>&timestamp=<timestamp> does authentication and any necessary forwarding for SSO.
  • PUT to http://<addon_service>/heroku/resources/<id> will update the account's service tier.
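To make the mapping from calls to actions concrete, here is a minimal routing sketch in Python. It is not Heroku's or kensa's code: the action names are made up for illustration, and it assumes that plan changes arrive as an HTTP PUT and that deprovisioning targets a specific resource id (as the `kensa test deprovision <some account id>` command suggests).

```python
import re

# Paths and verbs follow the endpoint list above; the action names are hypothetical.
RESOURCE_PATH = re.compile(r"^/heroku/resources/(?P<id>[^/]+)$")

# Assumption: plan changes use PUT and deprovisioning addresses one resource id.
RESOURCE_ACTIONS = {"DELETE": "deprovision", "GET": "sso", "PUT": "plan_change"}


def route(method, path):
    """Map an incoming call to (action, resource_id)."""
    if method == "POST" and path == "/heroku/resources":
        return ("provision", None)
    match = RESOURCE_PATH.match(path)
    if match and method in RESOURCE_ACTIONS:
        return (RESOURCE_ACTIONS[method], match.group("id"))
    return ("not_found", None)


print(route("POST", "/heroku/resources"))
print(route("GET", "/heroku/resources/42"))
```

A real service would hang authentication and account handling off each action; the point here is only that the whole integration surface is a handful of verb and path pairs.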
 
Heroku provides an awesome tool called kensa that simulates API calls from Heroku on these endpoints to test the add-on service. 
 
To install kensa open up a terminal and make sure you have Ruby and gems installed and type:
 
sudo gem install kensa

Kensa is used to generate the boilerplate manifest that describes properties of the add-on:

Brasero:blogpost ivan$ kensa init
Initialized new addon manifest in addon-manifest.json

Brasero:blogpost ivan$ cat addon-manifest.json 
{
  "id": "myaddon",
  "api": {
    "config_vars": [ "MYADDON_URL" ],
    "password": "QQMmzfm3pkzzE3Fn",
    "sso_salt": "eqs5xOh7kGqen5hd",
    "production": "https://yourapp.com/",
    "test": "http://localhost:4567/"
  }
}

Edit the id to name the add-on. Edit the test and production URLs to reflect the hostnames of your respective test and production hosts.
 
Heroku doesn't call it as such, but kensa is also a tool that cleverly puts test-driven development into your add-on development process. When you run the tool in test mode, it makes REST calls to the host defined in the test directive of the addon-manifest.json file:
 
Brasero:workspace ivan$ kensa test provision

Testing manifest id key
  Check if exists [PASS]
  Check is a string [PASS]
  Check is not blank [PASS]

Testing manifest api key
  Check if exists [PASS]
  Check is a hash [PASS]
  Check contains password [PASS]
  Check contains test url [PASS]
  Check contains production url [PASS]
  Check production url uses SSL [PASS]
  Check contains config_vars array [PASS]
  Check containst at least one config var [PASS]
  Check all config vars are uppercase strings [PASS]

  Check all config vars are prefixed with the addon id [PASS]
  Check deprecated fields [PASS]
done.


Testing POST /heroku/resources
  Check response [FAIL]
    ! expected 200, got 401
 
The output above is typical from early runs of kensa, but by running the kensa tool early and often the test failures will guide development.
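If you want to sanity-check a manifest before kensa ever runs, the manifest checks listed in that output are easy to approximate. The following is a rough Python sketch, not kensa's implementation: the function name and the problem messages are invented here, and only the checks named in the output above are covered.

```python
import json


def check_manifest(manifest):
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    addon_id = manifest.get("id")
    if not isinstance(addon_id, str) or not addon_id.strip():
        problems.append("id must be a non-blank string")
        addon_id = ""
    api = manifest.get("api")
    if not isinstance(api, dict):
        return problems + ["api must be a hash"]
    for key in ("password", "test", "production"):
        if key not in api:
            problems.append("api is missing " + key)
    if not str(api.get("production", "")).startswith("https://"):
        problems.append("production url must use SSL")
    config_vars = api.get("config_vars")
    if not isinstance(config_vars, list) or not config_vars:
        problems.append("config_vars must be a non-empty array")
    else:
        prefix = addon_id.upper() + "_"
        for var in config_vars:
            if not isinstance(var, str) or var != var.upper():
                problems.append("config var %r must be an uppercase string" % (var,))
            elif not var.startswith(prefix):
                problems.append("config var %r must start with %s" % (var, prefix))
    return problems


# The manifest generated by `kensa init` earlier in this post.
MANIFEST = json.loads("""
{
  "id": "myaddon",
  "api": {
    "config_vars": [ "MYADDON_URL" ],
    "password": "QQMmzfm3pkzzE3Fn",
    "sso_salt": "eqs5xOh7kGqen5hd",
    "production": "https://yourapp.com/",
    "test": "http://localhost:4567/"
  }
}
""")

print(check_manifest(MANIFEST))
```

kensa remains the authority on what Heroku accepts; a check like this just catches the obvious slips while you are editing the file.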
 
After all the provision tests pass, invoke the next set of tests for deprovisioning by running:
 
Brasero:workspace ivan$ kensa test deprovision <some account id>

Follow through with the planchange, and sso tests. When all tests pass, the integration is complete!
 
What remains is to upload addon-manifest.json to your Heroku provider account and to document the add-on for the Heroku Addon Catalog. More information on how to do that is available here.
 
Other tips:
  • Until the add-on leaves Beta, the only plan available to test is the test plan. Be sure your add-on has a test plan during development.
  • Don't go into Beta until the add-on is feature complete. An add-on cannot be taken out of Beta back into Alpha without help from Heroku.
 


Java logging, and our awesome community

Posted 20 Sep, 2011 by jon@loggly.com in Code and Log Management

Since I'm on a roll with the blog posts, I thought I'd quickly cover some of the ways you can log out of your Java code to us. I've been buried in *our* java code for nearly two years now, and we've been talking about improving the quality of the logging that we're doing there recently. As it turns out, there are some pretty interesting projects out there that look like they may be just what we need...

Before talking about them, though, I thought I'd give you a quick run-down on how we're doing things now. We use a very slightly tweaked version of the log4j SyslogAppender (version 1.2.16) - the tweak is that we upped the message size from 1k to 32k. Yep, we're completely ignoring the syslog RFC, but it's been working perfectly fine for us for quite a while, so I'm ok with that. We configure log4j as described on our wiki so that we log to a local syslog-ng (using a different facility for each app) which forwards to the appropriate ports on logs.loggly.com. It's a very simple approach, but has been very reliable for us.

So why would we want to change it?

The main reason is that we set this up before we could send JSON data, and we log a lot of performance data, which is a perfect fit for JSON. We're going to be moving all of our java logging to JSON over the next few months, because it will let us dive deeper into our logs, without all the noise.

A couple of weeks ago, Patrick Lightbody from Neustar emailed us to tell us about a Java class he'd written to send data into Loggly using HTTP. It's an extension of java.util.logging.Handler and is only a couple of hundred lines of nice clean code. He shared it on GitHub and described how to use it in his email.

 
If you're using java.util.logging, give it a whirl!

 

This got me thinking that we've been a little, um, remiss in communicating just how many libraries people have written for us, so I did a search on github for java projects with loggly in the repo name and found some other projects that also look pretty nice...

All of these projects use our HTTP interface, rather than TCP or UDP, which makes sense since we don't currently support JSON except over HTTP. We're planning on fixing that :-)

Expanding the search a little, there are a bunch of projects in github, on top of the ones we've created, that should make it easy to log out of javascript, python, ruby, and C#, as well as the Java stuff I talked about above.

We're obviously pretty happy that so many people are working on making it easier to get data to us, and I want to say thank you to everyone who has been contributing to all of those projects.  I'd like to encourage all of you reading this to jump in and help out if you can. Everyone who is contributing to these projects should drop us a line, and we'll send you one of our X-Ray Beaver tees as a thank you :-)


DreamForce presentation

Posted 7 Sep, 2011 by jon@loggly.com in Code and Log Management

Last week at DreamForce, I did a talk about how magical Loggly is, and a few people have asked to get copies of the slides. So, to make it a bit easier for everyone, here they are...

http://loggly.com/assets/4e67ec60dabe9d63dc002074/dreamworks_2011.pdf

Enjoy!

I was able to demo at DreamForce, so most of the example slides ("Simple search" through "uniq") never saw the light of day there. They're included here, just for the sake of completeness.

Big thanks to Heroku & SalesForce for inviting us to DreamForce to celebrate us becoming an Add-on for Heroku - we had a lot of fun, and everyone seemed to love Hoover. Then again, how could you not?


Automating Pylint Integration With Jenkins

Posted 1 Sep, 2011 by Noah Gift in Code

If you write a lot of Python code it pays to set up ways to automate code quality control.  Jenkins, an open source continuous integration server, is a great platform for automated testing, code coverage reports, and static analysis with Pylint.  In this blog post, I am only going to focus on Pylint, but stay tuned for more Jenkins topics in the future.

Pylint, if you haven't used it before, is the de facto static analysis tool for Python programmers.  It isn't perfect, and it does need a decent amount of configuration and tuning, but it is quite useful.  At Loggly we are using Jenkins to automatically run Pylint on git pushes to Master.  You can see a gist of our Pylint command here:  https://gist.github.com/1175983.

One strategy to get Pylint integrated into an automated build of your software is to set a low bar.  More than likely you will need to turn off many warnings, initially, to get things working on every checkin.  Once you have this automated, you can then get fancier and slowly turn up the "heat" on your code.  If you look at the Pylint command above, you will notice that the following warnings were turned off: W0622,W0611,F0401,R0914,W0221,W0222,W0142,F0010,W0703,R0911.  For example, one of the disabled warnings above is:  R0911: Too many return statements (%s/%s) Used when a function or method has too many return statements, making it hard to follow. Once things are stabilized for a few days under the current Pylint settings, we may look to refactor the code to eliminate some of these return statements and turn the warning back on.
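The "low bar first, then turn up the heat" idea can be sketched as a small helper that builds the Pylint invocation from a list of disabled codes. This is not a copy of the gist linked above: the helper name and the `myproject` package are placeholders, and it assumes a Pylint version that accepts the `--disable` option.

```python
# Warning codes listed in the post as currently turned off.
DISABLED = ["W0622", "W0611", "F0401", "R0914", "W0221",
            "W0222", "W0142", "F0010", "W0703", "R0911"]


def pylint_command(package, disabled):
    """Build the argument list for a Pylint run with some checks disabled."""
    return ["pylint", "--disable=" + ",".join(disabled), package]


def turn_up_heat(disabled, code):
    """Re-enable one check by dropping it from the disabled list."""
    return [item for item in disabled if item != code]


# Today's low bar ("myproject" is a placeholder package name).
print(" ".join(pylint_command("myproject", DISABLED)))

# Later, once the return statements are refactored, R0911 comes back on.
print(" ".join(pylint_command("myproject", turn_up_heat(DISABLED, "R0911"))))
```

Keeping the disabled list in one place makes each tightening step a one-line change that shows up clearly in code review.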

Jenkins has an interesting plugin called Violations, which can reject a build based on a Pylint score threshold.  If you look at the following Pylint output, you will see a Pylint score of 10/10:  https://gist.github.com/1176006.  If the Pylint score dropped below 10 on a git push, a build failure would occur, as shown in the picture below.
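The threshold idea itself is simple enough to sketch outside Jenkins. The snippet below is not how the Violations plugin works internally; it is a standalone illustration that assumes Pylint's report ends with its usual "Your code has been rated at X/10" line, and the function names are made up.

```python
import re

# Matches the rating in Pylint's summary line, e.g. "rated at 9.50/10".
RATING = re.compile(r"rated at (-?\d+(?:\.\d+)?)/10")


def pylint_score(output):
    """Extract the numeric score from Pylint's report, or None if absent."""
    match = RATING.search(output)
    return float(match.group(1)) if match else None


def build_passes(output, threshold=10.0):
    """A build passes only when a score was found and it meets the threshold."""
    score = pylint_score(output)
    return score is not None and score >= threshold


print(build_passes("Your code has been rated at 10.00/10"))
print(build_passes("Your code has been rated at 9.50/10 (previous run: 10.00/10)"))
```

Treating a missing score as a failure is a deliberate choice in this sketch: if Pylint crashed or produced no report, the build should not quietly pass.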

I hope you enjoyed the blog post, and if you have a Jenkins/Pylint setup of your own, we would love to hear from you.

 

References

1.  Writing clean, testable, high quality code in Python:  http://www.ibm.com/developerworks/aix/library/au-cleancode/

2.  Jenkins CI:  http://jenkins-ci.org/

3.  Pylint:  http://www.logilab.org/857


On the Way to Impressive

Posted 6 Jun, 2011 by Kord Campbell in Code and Log Management

I manage our Twitter stream, and for the most part I see an overwhelming amount of positive support for the product. Sometimes though, we get the occasional brutally honest comment indicating we could be doing a better job. One such tweet came in about a week ago from @jakajancar. Jaka tweeted the following regarding Loggly's apparent lack of features:

 

He has a helluva point here. A lot of the things we do for scale and speed aren't readily apparent to the casual observer. It's really only when you start sending us several million events an hour and then start searching across hundreds of millions of events that you realize the scope of what Loggly can do. It's taken the lion's share of our time over the last 6 months to achieve scalability to handle thousands of accounts sending in up to 8GB/day each. Even though not everyone has those types of volumes, there exist accounts that do. We had to make sure we could handle higher volumes before we did something really cool with the rest of the system.

Well, it's about #@!%'ing time to start doing something cool with all this scalability.

Putting the Sexy Time in JSON

About 2 months ago Jon and I were talking to App47 about how they use Loggly. App47 is embedding Loggly in their own app, and requested a feature where we would extract a given field, index on it, and then allow narrowing of searches on only that field. We scoped the work, codenamed the project Argonaut and began cranking on it. The result is a new Loggly input which accepts JSON formatted data and on which partial or ranged searches can be done.

Back on Twitter, I received a reply from Jaka about what exactly he'd like feature-wise in Loggly.

Turns out the stuff we've been working on is very nearly the set of features Jaka asked for. Cue the evil laughter.

Using the New JSON Hotness

Using the new Loggly JSON input is pretty easy. You simply create an HTTP input that is JSON enabled, and then forward a JSON structured text blob to it. Let's start by taking a look at some sample data we generated with a custom Apache logging format:

Now let's search for an event which matches status code 200 across a bunch of these suckers:

Loggly will only return events for this search which have a field named status and a value of 200. The hotness doesn't stop there though - you can also do ranged searches:

By now you are saying to yourself, that's pretty cool, but can I graph my field shizzle with it? Hell yeah you can! You can either use graph or compare with single values, or ranged values to conduct your searches. Here we 'bucket' the status codes and use the compare graphing command to get a breakdown of response codes:

Let's not stop there. You can also do the equivalent of a grep | cut | sort | uniq -c|sort -n if you use the unique command:
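If that shell pipeline isn't second nature, here is a rough Python analogue of what it computes: filter lines, pull out one field, and count how often each value appears, smallest count first. This is only an illustration of the idea on local text, not the Loggly command or its syntax; the function name and sample lines are made up.

```python
from collections import Counter


def tally(lines, field_index, pattern="", sep=" "):
    """Rough analogue of: grep pattern | cut | sort | uniq -c | sort -n."""
    values = (
        line.split(sep)[field_index]      # cut: pick one field
        for line in lines
        if line.strip() and pattern in line  # grep: keep matching lines
    )
    # uniq -c counts each value; sort -n orders by ascending count.
    return sorted(Counter(values).items(), key=lambda item: item[1])


# Made-up sample lines: method, path, status.
sample = ["GET /a 200", "GET /b 404", "GET /c 200", "POST /d 200"]
print(tally(sample, 2, pattern="GET"))
```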

We're not entirely done with all the little options in and around these features, but will be adding them in the coming weeks. If you have suggestions you'd like to throw our way, we're more than happy to pretend the subsequent improvements were attributed to you asking for them! :P

So how do you JSON all your log data?

It's pretty obvious by now we're not planning on directly providing field extraction support in the product. Loggly's approach to log management has always erred on the side of simple, and field extractions are no exception. We'll be partnering with a few other providers in the coming months to get you endpoints for extracting fields from unstructured data, and also cranking out best practices for doing it yourself.

For now, if you log from your own applications, implementing a Loggly JSON input is fairly trivial: You just send us JSON on your JSON enabled HTTP inputs.
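As a rough sketch of what "just send us JSON" can look like from application code, the snippet below serializes an event and prepares an HTTP POST for it. The input URL is a placeholder (use the one shown for your own HTTP input), the Content-Type header is an assumption, and the field names other than status are invented for the example.

```python
import json
import urllib.request

# Placeholder: substitute the URL of your own JSON enabled HTTP input.
INPUT_URL = "https://logs.loggly.com/inputs/YOUR-INPUT-KEY"


def build_request(event, url=INPUT_URL):
    """Serialize one event as JSON and wrap it in a POST request."""
    body = json.dumps(event).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},  # assumed header
        method="POST",
    )


# Hypothetical event; "status" mirrors the field searched on above.
event = {"status": 200, "method": "GET", "path": "/index.html", "response_ms": 42}
request = build_request(event)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))

# To actually send it: urllib.request.urlopen(request)
```

Because every event is a flat set of named fields, anything you log this way becomes searchable by field name without any extraction step.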

For other use-cases, be sure to keep an eye out for a follow-up post by Jordan showing you how to configure your Apache server to serve up custom log formats and using the grok tool to convert other logging formats on the fly to JSON. Grok will even forward the events to your account from a monitored file.

Impressed yet? Just wait for real-realtime feeds and our new alerting app coming out next month!

 


Send Custom Metrics to CloudWatch's API

Posted 10 May, 2011 by Kord Campbell in Business and Code

A few weeks ago I wrote up how to implement simple alerting with Loggly and PagerDuty. This week I’m covering how to do something very similar with the new version of Amazon’s CloudWatch which they recently released.

Amazon doesn’t rely on a monitoring agent to collect the metrics for CW, so it’s literally a few clicks in the AWS interface to start using it. Data is collected by their pre-instrumented hypervisor and then forwarded to the CW service where it can be selected, displayed and alerted on by the user.

With the latest release of CW, Amazon provides new endpoints in the CW API which allow a user to send in custom metrics. These metrics can be used in combination with the hypervisor based metrics to build complex alerts and drive auto-scalability for applications based on EC2.

It’s that new functionality that I’ll be using to send data from Loggly to CloudWatch.

The Code

As always, the code for this post is parked on Loggly’s Github account. The cloudwatch.py file contains the signing bits required for talking to Amazon’s API endpoints, and some basic code for posting to the PutMetricData method. You don’t need the boto library for this, but it won’t hurt if you already have it installed.

The detailed instructions for setting it all up are on the Github project page. Basically all you need to do to get this running is to get syslog-ng forwarding your web logs to Loggly, configure your Loggly credentials, and then enter your AWS_ACCESS_KEY_ID and AWS_PRIVATE_ACCESS_KEY_ID in the code.

You’ll need a few cheese shop libraries installed, including httplib2, simplejson and hoover, the Loggly Python library.

Set up a cronjob file that runs it periodically, preferably on an instance you are monitoring.

*/5 * * * * python ~/loggly-watch/main.py

The Result

The code above conducts a simple search on Loggly for all events being sent to the default input for your account. If all you are sending to that input is combined_access formatted log lines, then you’ll end up with hit counts sampled every 5 minutes from Loggly, offset by one minute to ensure we’ve indexed them properly.

The result is pretty impressive, with so little work involved. You can even do combo graphs containing metrics delivered by the AWS hypervisor.

Alarms

Once the metrics are flowing in, you can set alarms to trigger if they go over (or under) a certain threshold. In the screenshot below I’m monitoring for the term ‘exception’ coming in from my crappy blog which is hosted on AppEngine and which logs with my AppEngine async logging library.

The screenshot above shows where CW triggered an alarm for exceptions, then cleared itself after the threshold dropped below 4.

Monitor the Monitor

With Loggly and CloudWatch alerting, there are a whole host of monitoring and correlation use cases you can tackle with just a little bit of hacking. You can even alarm on the cronjob itself to ensure your monitoring is functioning and healthy. Here’s how.

Start by making sure your local syslog instance is sending data to Loggly, and then change your cronjob to pipe its output to logger:

*/5 * * * * python /home/kord/code/loggly-watch/main.py 2>&1 | logger -t cloudwatch-cron

Next, set up a search in the same main.py file you are calling with cron to search for a successful run of the cronjob that runs the search (that’s so meta it hurts):

Note: I’m keeping this example purposefully simple. In practice you’ll probably want to make this check a little more sophisticated by verifying that the response from the Loggly server is valid and that each search ran successfully.

Finally, create an alarm that triggers if the number of results drops below 1 over a 10 minute period.

Happy alerting!


Using the Loggly JavaScript Bug

Posted 10 May, 2011 by Kord Campbell in Code and Log Management

Update: I added support for HTTPs connections in the code examples below.

JavaScript bugs are all the rage. Dropping a little bit of someone else’s JavaScript in your website allows sites like Google Analytics, LoopFuse, and MixPanel to provide various analytics around the data they collect from people browsing your site.

Loggly already supports sending in logs from your webserver via syslog or HTTP, but it’s usually a bit more involved to set up and configure than installing one of these JS bugs. We figured this was a good enough excuse to get in on the JS bug action, so we whipped up a cool solution for doing this using iframes and cross-site POSTs from your users’ browsers. We stuck the JS file on Amazon’s CloudFront service to ensure we weren’t slowing your site down, and we made it work with our existing HTTP inputs.

Setting it Up

Setting up the Loggly JS bug is pretty straightforward. Start out by creating a Loggly HTTP input which you’ll use to send events. Next you’ll need to include a script tag for loading the loggly.js JavaScript file. Just drop the following snippet into your webpage, at the top or bottom:

Once you have that included, you’ll want to include the following code, editing it slightly to include your HTTP input’s SHA-2 key as it appears on your HTTP input detail page, and setting the default logging level:

If you are using something like jQuery that provides a load event, you can drop the window.onload=function() bit and just stick the code inside whatever JS ready block you already have.

Log Crap Out of JavaScript

Assuming no other changes to your JS code, you’ll end up getting events flowing into your HTTP input which represent accesses to your website by remote clients. We’ll capture IP address and whatever else you send us as key/value pairs that you can use to search on, including browser width and height. Here’s an example of a search we ran through Ivan’s tally command to get the top IP access to our site over the last hour:

If you want to take it a bit further, you can actually log whatever you want out of your site’s JS. Here’s an example of how we could log a user’s subdomain stored in a cookie:

Using the data from your application and the power of Loggly’s search, you should be able to whip up any DIY analytics solution your heart desires.

BTW, if you are using Loggly in a unique way, let us know. We’d love to chat with you, make a video, or even hire you!

Log on fellow beavers.


Getting Started with Loggly Tutorial

Posted 19 Apr, 2011 by inga weizman in Code and Log Management

Kord Campbell, Chief Log Wrangler at Loggly, recently did a webinar for ModelMetrics on getting started with sending logs to Loggly.  Kord steps through the process of creating an account, configuring inputs and servers, testing, and searching data. It’s a great way to get familiar with the system.

Loggly Getting Started Webinar w/ ModelMetrics from Hoover Beaver on Vimeo.

 


Alerting with Loggly and Pagerduty

Posted 25 Mar, 2011 by Kord Campbell in Code and Log Management

I recently wrote a blog post about triggering Woot lights on my desk with some simple Python code and Loggly. While hooking up Loggly to an Arduino with Woot lights can be somewhat interesting and exciting, I really didn’t present a practical solution for monitoring and alerting on events such as exceptions or errors generated by your applications.

In this example I’ll do just that with about 7 lines of Python code and a call to the mighty fine PagerDuty service.

The Setup

This solution uses the Python Hoover library which is available via the cheese shop. You’ll need to install Hoover with the following commands, which may or may not require you to install setuptools first.

Once you have Hoover installed, you’ll need to figure out what you want to alert on. For this example I’m going to use my lame blog which is hosted on code Bret Taylor wrote for Google AppEngine and Tornado. I use my async logging library for logging out of AppEngine to Loggly.

I put a method in my blog which throws an error when you hit this page. This is the result of the exception when viewed in the Loggly shell:

The Code

Now that we see the error, and know what to search on to find it, we need to write some simple code that will run a single bucket facet search to return a numeric count of results over a given period of time and trigger a PagerDuty alert if we find anything. In this example I constrain my search to NOW-6MINUTES to NOW-1MINUTE to ensure the events have been forwarded and indexed by Loggly. Here’s the code:

Ok, so I lied. It’s 10 lines of Python. It’s also one line in your crontab file:

*/5 * * * * /usr/bin/python blogalert.py

In practice you’d want to take the date stamp of the bucket result and use it for PD’s incident key. This would keep any overlap in searches from triggering a double or false alert.
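Here is a minimal sketch of that incident key idea, separate from the alert script itself. It assumes the facet result arrives as a mapping of bucket timestamp to count, and that the PagerDuty trigger payload uses the fields service_key, event_type, incident_key and description; check both against the current API docs before relying on them. The function name and the service key value are placeholders.

```python
def build_alert(service_key, facet_data, description):
    """Return a trigger payload keyed on the bucket stamp, or None if no hits.

    facet_data is assumed to map bucket timestamps to event counts.
    """
    hits = {stamp: count for stamp, count in facet_data.items() if count > 0}
    if not hits:
        return None
    stamp = max(hits)  # ISO 8601 stamps sort chronologically as plain strings
    return {
        "service_key": service_key,   # placeholder value supplied by the caller
        "event_type": "trigger",
        "incident_key": stamp,        # same bucket, same key: no double alert
        "description": "%s (%d events)" % (description, hits[stamp]),
    }


alert = build_alert("YOUR-SERVICE-KEY", {"2011-03-03T02:12:27.715Z": 1}, "blog exception")
print(alert)
```

Two overlapping searches that land on the same bucket produce the same incident key, so the second one attaches to the existing incident instead of paging you again.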

BTW, on a related note my wife keeps calling PagerDuty ‘the girlfriend’, because she texts and calls me at all hours of the night and I have to scamper off to acknowledge her advances. My suggestion to Alex the other day was for PD to implement sexy personas that I could pick that would whisper sweet nothings in my ear at 2AM such as, “Hey baby, web head 12 is down again. Would you like to resolve?” :P

Happy alerting!


Using the Loggly Node.js Library

Posted 20 Mar, 2011 by Kord Campbell in Code and Log Management

Loggly uses a fair amount of Node.js and it’s currently my favorite backend framework in which to develop cool mashups with Loggly. Last month Charlie Robbins of Nodejitsu finished up a fantastic library for doing Loggly searches, facet calls, and even sending in events to Loggly with Node.js. Charlie also posted about node-loggly’s release on Nodejitsu’s blog.

I’ve been talking about getting an analytics app built on top of Loggly for a while now, and figured this would be a good opportunity to try out node-loggly for querying Loggly facet info to drive graphs and charts. I’m already using it extensively for doing logging into Loggly from all my Node.js apps. Here’s what it looks like in action.

The entire example is hosted on Github and includes instructions for getting it set up and going. Notable dependencies include Richard Henry’s awesome Paperboy static file module and Charlie’s library for Loggly.

The Code

The code below pulls a single Loggly facet call for the last week, compacts it into a single bucketed result, and then returns a simple JSON string to the client for handling. The last part logs use of the app itself to Loggly.

Multiple facet searches are made for the terms defined in the index.html file.

Note that I’m not extracting the user-agent fields with these searches! I’m just doing a count of events with the term Safari, Chrome, etc. in them and then graphing the results. A better approach would be to extract a sampling of user-agents over a short time range, then search specifically for those strings.

Lastly, here are the views for the app itself during this post’s first hour of life. This was generated using a ‘graph loggly-node-chart’ in the Loggly shell:

I’ll continue to use Winston for all sorts of stuff moving forward, and will post back more examples here and on our wiki.


Tracking Signups with Woot Lights

Posted 2 Mar, 2011 by Kord Campbell in Code and Log Management

Since our public launch on February 2nd our signups have been steadily increasing. Last Wednesday we signed up a record 50+ free accounts in a single 24 hour period. We use a combination of Loggly itself, Google Analytics, LoopFuse and SalesForce to track our signups, but those data points aren’t readily accessible by the entire office. I thought it would be fun and informative to implement a more real-time monitoring system for our new friends.

Enter my Woot lights and a MP3 of the stock market bell.

The code for all this stuff is on Github. You’ll need to load up your Arduino with the serial.c file, which contains a simple serial protocol that brings a pin high on receiving a ‘Y’ and low on receiving a ‘N’. More information on talking to Arduinos with Python is here.

The ground wire and the trigger pin (port 13) are at the top, but covered up in the diagram above. You’ll need a PNP transistor, readily available from Radio Shack, or Fry’s. You can also order them on Sparkfun, where you can get an Arduino if you don’t have one already.

The server code is pretty straightforward. It just sends a Y or N down the serial line to the Arduino. You can test by starting the server:

python arduinoserver.py &

Next, test the server by going to http://localhost:9090/ You should be able to turn the lights on and off from the page.

Now let’s look at the facet call we’re making on Loggly:

# facet check for signups 2 minutes ago to 1 minute ago
resp, content = h.request("http://%s.loggly.com/api/facets/date/?q='/thankyou/'&from=NOW-2MINUTE&until=NOW-1MINUTE&buckets=1" % subdomain, "GET", headers={'content-type':'text/plain'} )
foo = simplejson.loads(content)
free = foo['data'].items()[0][1]

The search we’re running looks for the landing page for signups, which in our case is called ‘thankyou’. We’re asking Loggly to give us data that came in from 2 minutes ago to 1 minute ago, and put that in a single bucket, or result. Here’s that result as returned from Loggly:

superman-2:logglywoot kord$ python signup.py 
{
    "gmt_offset": "-0800", 
    "data": {
        "2011-03-03T02:12:27.715Z": 1
    }, 
    "numFound": 1, 
    "context": {
        "rows": null, 
        "from": "NOW-2MINUTE", 
        "until": "NOW-1MINUTE", 
        "start": 0, 
        "query": "'/thankyou/'", 
        "order": "desc"
    }, 
    "gap": "+1MINUTES"
}
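The snippet above indexes `items()` directly, which works on Python 2 only. On Python 3, `dict.items()` returns a view that can't be indexed, so a version-neutral way to pull the count out of that response is sketched below. The function name is made up, and the sample is trimmed from the response shown above.

```python
import json

# Trimmed copy of the facet response printed above.
SAMPLE = """
{
    "gmt_offset": "-0800",
    "data": {
        "2011-03-03T02:12:27.715Z": 1
    },
    "numFound": 1,
    "gap": "+1MINUTES"
}
"""


def signup_count(content):
    """Add up the counts across all buckets in a facet response."""
    data = json.loads(content)["data"]
    return sum(data.values())


print(signup_count(SAMPLE))
```

With `buckets=1` there is only one entry to add up, but summing also does the sensible thing if you later ask for more buckets or get none back.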

Be sure to check out the Loggly API documentation on our wiki. There are all sorts of interesting things you can build with them!

Woot on.


Coolcam Your Logs with Geckoboard

Posted 31 Dec, 2010 by Kord Campbell in Code and Log Management

Nothing is more addictive than a good dose of data visualization. Yesterday I stumbled across Geckoboard and I have to confess I’m totally hooked. I’ve managed to bang out a simple AppEngine-based framework to proxy Geckoboard data from Loggly’s APIs. The code is currently feeding a logging dashboard on Geckoboard where we’re tracking 404 errors, signups, wiki edits, and searches from tweets about Apache logs.

To build your own logging mashup you’ll need an AppEngine account, a Geckoboard account, and a Loggly account. As of this writing, both Geckoboard and Loggly are in private beta, but if you tweet “Hey @loggly, I’m a natural @geckoboard hacker!”, I’ll make certain you get a Loggly invite. Maybe the guys over at Geckoboard will do the same!

Getting Started

Once you get your accounts all sorted around, start by creating an HTTP input called ‘appengine’ in your Loggly account. You’ll use this input to log from your Python code on AppEngine to Loggly. Follow the directions on the wiki to create an HTTP input if you need help.

If you run your own webserver, you’ll also need to configure your syslog daemon to send us your web application logs. Again, follow the instructions on the wiki to setup file monitoring. We recommend using the latest version of Syslog-NG to monitor and forward log files to Loggly.

Keep in mind you could do all this from just AppEngine if you are already hosting your app there. Check out the logging from Google AppEngine section on the wiki for more info on logging from AppEngine into Loggly. The current code is asynchronous and non-blocking!

AppEngine Setup

Getting the AppEngine code running is pretty straightforward. Go grab the code from Github, and then save it into your local AppEngine Launcher. Edit the app.yaml file and change the name of the app to something unique. Log into your Google AppEngine account and create a new application of the same name.

Now go edit the main.py file, and edit up a few lines:

 


  • the appengineurl should be your appname + .appspot.com

  • the logglyaccount should be your loggly account name + .loggly.com

  • the instances of username/password should be the same as your Loggly user’s credentials

  • the URL in the logging setup (at the bottom) should be a URL for one of your HTTP inputs

Once you are done editing, start up the AppEngine Launcher and go to File…Add Existing Application and add the new application to the launcher. Feel free to test the app locally first to ensure it’s working correctly. You’ll get a few links to test with, which will return XML.

Once you get all this working, head on over to Geckoboard, and create a new custom line chart widget. Set the widget type to custom and the format to XML. Enter a label for the chart, and then enter a URL with which Geckoboard will fetch the data from Loggly.

Edit the URL a bit to reflect the app name you picked on AppEngine, and the term you are using to search your logs. In our example, our app name is ‘logglygecko’ and we’re looking for ‘404’ across all inputs for the last hour and day.

http://logglygecko.appspot.com/graph?q=404

If you want to include additional search terms, you’ll need to hand encode the URL’s spaces. This example looks for 404 errors on .jpg files:

http://logglygecko.appspot.com/graph?q=404%20AND%20.jpg

The code also supports the meter widget. Just create a meter widget on your dashboard, and go through the same settings as for the chart widgets, except replace graph with meter in the URL.

The Sky’s the Limit

Geckoboard already has a ton of widgets for displaying valuable data from various services like Google Analytics, MailChimp and more. With Geckoboard’s custom widgets, and the data coming in from Loggly’s APIs, you can now quickly hack up a dashboard to display all that gold hidden away in your log files too!


Do you have a logstash?

Posted 16 Nov, 2010 by Jordan Sissel in Code and Log Management

I’m a dev ops guy, and I’ve been talking about logging problems for a long while now. Talking about storing logs. Talking about parsing logs. Talking about searching logs. Talking about reacting to logs. Now that I’m at Loggly, I’m talking about it more than ever.

Today I’m releasing logstash, an Open Source tool to accomplish all that and more. You can read about the release on my blog and then go download the source and get started with it.

If you want to see it in action, I’ve uploaded a demo video on YouTube. Also, Kord and I sat down today and chatted about logstash and its future. That video is below.

Welcome to Logstash from hoover on Vimeo.

 

We’re looking for people to help on the project, so if you are interested, drop me a line.


Big Data Gets Bigger

Posted 13 Oct, 2010 by Kord Campbell in Business, Code, and Log Management

Edited on October 14th, for 2 orders of magnitude bad math.

Big data is big news. Big data is a big problem, and big solutions for it can drive big revenues. Because big money is involved, more and more people are writing and focusing on how big of pack-rats we’ve become. There’s only one fact everyone seems to be missing: Big is relative, after all.

Big Data in the Past

Back in the 70s when I was a kid, my family’s oil business had one of these old clunky Burroughs which my mom not-so-fondly called Maribel. Whenever you wanted to invoice someone, you would load Maribel up with the customer’s account history from paper tape and then manually enter the new invoices. When the existing tape got full, you started a new one. The tapes were yellow, about an inch across and maybe 20 feet long.

We stored these tapes in envelopes, and the envelopes were in turn stored in vertical file cabinets. The hall outside my mom’s office was lined with these file cabinets, and the cabinets were literally overflowing into the kitchen because there was no more room in the hall for them. If you estimated 5 bits per line, 72 lines per foot, and 20 feet of tape, that would give you roughly 1KB of storage on a single tape. Multiply that by thousands of these tapes and I figure we had a total of 1-2MB of data stored in about 100-200 sq ft of space.
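For anyone who wants to check the tape math, here it is as a quick Python sketch. The per-tape figures are the estimates from the paragraph above; the 1,000 to 2,000 tape range is an assumption standing in for "thousands of these tapes".

```python
BITS_PER_LINE = 5
LINES_PER_FOOT = 72
FEET_PER_TAPE = 20

bits_per_tape = BITS_PER_LINE * LINES_PER_FOOT * FEET_PER_TAPE   # 7200 bits
bytes_per_tape = bits_per_tape // 8                               # 900 bytes, roughly 1KB

# Assumed range for the number of tapes in the hallway
for tapes in (1000, 2000):
    megabytes = bytes_per_tape * tapes / 1e6
    print(tapes, "tapes is about", megabytes, "MB")
```

That lands at roughly 0.9MB to 1.8MB, which is where the 1-2MB figure comes from.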

Lots of customers, lots of tape, lots of work, and lots and lots of data. At least lots for 1976.

Your Future Arrived Yesterday

In 1996 my future had arrived. I was running a moderate-sized ISP, and found myself buying a full-height 5 1/4" 8GB drive from Seagate for my news server. It cost me just over $2,000. With that one drive alone, I could have stored nearly 300 football fields’ worth of Maribel’s yellow-tape-based data.

Just last weekend at Lucene Revolution I gave some company my email address in exchange for an 8GB USB drive. I promptly tore it apart and extracted from its guts a sliver of a micro SD card. I could easily fit a few thousand of those cards in the space of that old clunky Seagate drive.

Earlier this year an article in Wired quoted IDC as saying, the size of the information universe in 2009 was 800 Exabytes. IDC went on to say 2020’s information universe was expected to be a staggering 35 Zettabytes; nearly 44 times as much data as there is in existence today.

For reference, one Zettabyte = one thousand Exabytes, one Exabyte = one thousand Petabytes, one Petabyte = one thousand Terabytes, and one Terabyte = one thousand Gigabytes. That means a Zettabyte = a million million Gigabytes!

That’s around 3 × 10^16 times as much data as we had in our office in 1976! If we decided to store it in file cabinets filled with yellow tape, our dystopian future’s 35ZB of data would take up the surface area of 546 earths. Say what?
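The unit conversions are easy to get wrong (see the edit note at the top of this post), so here is a small Python sketch of the arithmetic. It assumes decimal units and takes the low end of the 1-2MB office estimate.

```python
GB = 10**9
EB = 10**18
ZB = 10**21

print(ZB // GB)                    # 1000000000000, a million million

office_1976_bytes = 10**6          # low end of the 1-2MB estimate
universe_2020_bytes = 35 * ZB
print(universe_2020_bytes // office_1976_bytes)   # 35 followed by 15 zeros, about 3.5 x 10^16

print(universe_2020_bytes / (800 * EB))           # 43.75, "nearly 44 times" the 2009 figure
```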

It reminds me of something you’d see in a Douglas Adams novel, where thousands of small, slightly cranky robots named Maribel are forced to shovel and store yellow tape rolls until they collapse into a pile of rust several million years later.

Smell the Data Exhaust

Data exhaust can be defined as the machine events generated when a user accesses data stored on a system connected to the Internet, such as when a user accesses their photos on Flickr. Hadoop Karma indicates Flickr was storing 4 billion photos by the end of 2009. In aggregate, those photos are stored on thousands of servers and are being viewed by millions of users across the globe every day.

In a simple scenario where all the photos on Flickr were viewed once each by a single user, the logs would weigh in at just over 2TB! In reality, Flickr’s log volume probably exceeds a Petabyte a year for just the views of the lightbox pages alone. Facebook’s numbers are even scarier. In one month they’ll store 2.5 billion photos on their system. In turn, all the people viewing those photos will generate an order of magnitude more log data than Flickr even has in all the photos they’ve ever stored.

Even though we’re in private beta at the moment, we’re already seeing combined log volumes of around 3GB a day from 15 customers. A few of our customers, including About.me and Server Density are sending us near the max of what we allow on the private beta right now. We expect those volumes to go up considerably when we launch the public beta in December, where an average customer could be sending us anywhere from 1 to 5GB a day each. It won’t take long to start referring to our data in units of Petabytes stored.

While demand for storing all those logs is accelerating along with all the data being generated, the technology behind the storage and processing of data also continues to accelerate. Within a few months’ time, the technology we are developing at Loggly will provide companies a way to peek into these large volumes of log data – where they couldn’t before – and allow them to see exactly what their users are doing with all that big data.

Loggly’s features for search, reporting and map reducing will make dealing with these huge volumes as trivial as stuffing a yellow punch tape into an envelope, except we don’t need a robot named Maribel to do it.

And so the Universe ended.


Node.js SSL Server Example

Posted 25 Sep, 2010 by Kord Campbell in Code and Log Management

A buddy of mine pinged me today because he saw my name on Silas Sewell’s howto post on doing HTTP/SSL with Node.js. I emailed Silas a few days ago to have him update his cert handling to include toString() on the end of each filesystem read, and he was kind enough to give a shout out to me.

The nut of the problem was that Node.js puts a carriage return or some such cruft on the end when it reads from the filesystem. It was causing me fits with cert validation and I only found the answer by digging through the Node.js IRC channel logs. Logs, heh.

I had already expounded a bit on Silas’s solution because our signing agent uses a key chain. My buddy was also asking me for an example of how to do SSL and listen on multiple ports, so I pastebin’d him up a solution. Figure it was worth posting here too!

I should note that your keys and certs need to be readable by the Node.js server’s user. Obviously.


A Logging Library for Django - How We Log at Loggly

Posted 22 May, 2010 by raffy@loggly.com in Code and Log Management

In my last blog entry, I showed you how you can enable logging in Django 1.2. Now we are going to look at the logging library that we built for Loggly to simplify the task of logging in our own Django application, the Loggly Web interface.

Here is how we log from within our application:

from loggly.logging import *

error({'object':'input','action':'create'})

That’s it. The above code creates the following log entry:

Mar 18 15:34:03 app loggly: severity=ERROR,user=logdog_zrlram,request_id=08BaswoAAQgAADVDG3IAAAAD,object=input,action=create,status=failure

The logging call expects a dict of key-value pairs. This is to enforce key-value based log entries that make it easy for consumers to understand what a specific value means. Without the inclusion of a key, a value is more or less useless. In the example above, note that I only provided two keys: object, and action. However, the log entry contains a number of other data items. Those items are automatically added to the log entries by our logging library without burdening the developer to explicitly include them.

It is probably time to show you Loggly’s logging library:

import logging
import inspect

DEFAULT_LOGGER = 'loggly_web'
logs = None

def logHelper(rest=None, request=None, user=None):
    """Build a 'key=value,key=value' message from the request and the rest dict."""
    global logs
    output = ["user=" + str(user).strip()] if user else []

    # get the logger
    if not logs:
        logs = logging.getLogger(DEFAULT_LOGGER)

    # Loop through all the stack frames until you find the request
    stack = inspect.stack()
    for frame in stack:
        if 'request' in frame[0].f_locals:
            request = frame[0].f_locals['request']
            if request is None:
                continue
            # there is a request object; an explicit user takes precedence
            if not user and hasattr(request,'user') and hasattr(request.user,'username') and len(request.user.username)>0:
                output.append("user="+str(request.user.username).strip())
            if hasattr(request,'META') and 'UNIQUE_ID' in request.META:
                output.append("request_id="+str(request.META['UNIQUE_ID']).strip())
            # we found the request object. Get out of here
            break

    # getting input dictionary and appending
    if rest:
        for key in rest:
            output.append("%s=%s" % (str(key).strip(), str(rest[key]).strip()))

    ret = ",".join(map(str, output))
    return ret

def info(rest=None, user=None):
    # user, if given, overrides the username taken from the request
    msg = logHelper(rest, user=user)
    logs.info(msg)

def error(rest=None, user=None):
    # user, if given, overrides the username taken from the request
    msg = logHelper(rest, user=user)
    logs.error(msg)

Note that this is only an extract. Download the entire library if you want to use it in your own code. Here are some important things the code does:

  • line 17 to 29: This part of the code inspects the call stack to check whether there is an HTTP request object somewhere. The request object contains the username for the session, and that is what we automatically extract. This frees the user from manually adding that information to the logging call. Automation is good!

  • line 26 and 27: We are using UNIQUE_IDs in Apache. In order to track a request from the Apache logs down into our application, we include that same ID into our Django logs. This is a huge win for associating Apache logs with our application logs.

  • line 32 to 34: All the dict entries are added as ‘key=value’ pairs to the log entry. So you can log any key you want.

  • line 39 to 47: These are the calls that you use in your code. Note that you can add a user field, which overwrites the username from the request. In some cases that is necessary and useful.
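If the stack inspection trick is new to you, here is a stripped-down sketch of just that idea, separate from the library. The function names find_local, log_event and view are made up for this illustration, and a plain dict stands in for Django's request object.

```python
import inspect

def find_local(name):
    """Walk up the call stack and return the first non-None local called `name`."""
    for frame_info in inspect.stack()[1:]:      # skip find_local's own frame
        value = frame_info[0].f_locals.get(name)
        if value is not None:
            return value
    return None

def log_event():
    # No request argument here; it is discovered in a caller's frame
    return find_local('request')

def view(request):
    return log_event()

print(view({'user': 'logdog'}))   # {'user': 'logdog'}
```

The convenience comes at a price: inspect.stack() is not cheap, and the lookup depends on callers naming their variable request.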

Let us know if you are using our library. I would love to hear back from you. I will post another blog entry later, where I will be talking about how to patch Django itself to do some more logging. We will be looking at how the authentication methods can be extended.

The links:
Django 1.2 Logging Patch
Loggly Logging Library


Securing your Web Application with httponly cookies OR How Apache.org and Atlassian could have been secured

Posted 14 Apr, 2010 by raffy@loggly.com in Code

Attack

The other day I was reading about the Apache and Atlassian hack. Max wrote a really nice summary of how that attack could have been prevented. One of the points he raised was that they should have used HTTPONLY cookies.


I then realized that we might have the same problem with Loggly. After some traffic dumping of our Web sessions, I realized that Django didn’t support httponly cookies. A quick Google search revealed that someone wrote a djangosnippet to add httponly cookies. I had to slightly rewrite it, so here is the code I am using:


from django.conf import settings

class cookie_httponly:
    def process_response(self, request, response):
        scn = settings.SESSION_COOKIE_NAME or 'sessionid'
        # flag the session cookie so client-side scripts cannot read it
        if scn in response.cookies:
            response.cookies[scn]['httponly'] = True
        return response

Don’t forget to add the middleware right before the SessionMiddleware. If you are using Python 2.6 or higher, you are done. Unfortunately, we are running Python 2.5, which does not support the httponly flag on cookies. A quick patch solved that problem as well:
--- /usr/lib/python2.5/Cookie.py (revision 66233)
+++ /usr/lib/python2.5/Cookie.py (working copy)
@@ -408,6 +408,9 @@
     # For historical reasons, these attributes are also reserved:
     #   expires
     #
+    # This is an extension from Microsoft:
+    #   httponly
+    #
     # This dictionary provides a mapping from the lowercase
     # variant on the left to the appropriate traditional
     # formatting on the right.
@@ -417,6 +420,7 @@
                    "domain"      : "Domain",
                    "max-age" : "Max-Age",
                    "secure"      : "secure",
+                   "httponly"  : "httponly",
                    "version" : "Version",
                    }
 
@@ -499,6 +503,8 @@
                 RA("%s=%d" % (self._reserved[K], V))
             elif K == "secure":
                 RA(str(self._reserved[K]))
+            elif K == "httponly":
+                RA(str(self._reserved[K]))
             else:
                 RA("%s=%s" % (self._reserved[K], V))

Loggly is now more secure against XSS attacks!
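If you want to see what the flag looks like on the wire, the cookie module in the Python standard library is enough; no Django required. This is a sketch for current Python 3, where the module lives at http.cookies (it was called Cookie in the Python 2 versions discussed above), and the session value is a placeholder.

```python
from http.cookies import SimpleCookie

cookie = SimpleCookie()
cookie['sessionid'] = 'abc123'            # placeholder session value
cookie['sessionid']['httponly'] = True    # the same flag the middleware sets
cookie['sessionid']['path'] = '/'

header = cookie.output()
print(header)   # a Set-Cookie header that carries the HttpOnly attribute
```

Browsers that honor the attribute will keep the cookie out of reach of document.cookie, which is what blunts session theft through XSS.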


Visualizing your Data in the Cloud with Loggly and HighCharts

Posted 26 Mar, 2010 by raffy@loggly.com in Code and Log Management

A short while into writing code for the Loggly interface we decided that we needed some eye candy. Given my background in visualization, I was keen on providing our users with an experience that helps them understand their data in an intuitive way.


Over the last few years I’ve been looking into a ton of visualization libraries for the Web. In the past, if you had asked me what library to use for generating charts on your Web site, I would have said, “Use Flash”. While there are a number of interesting Flash libraries out there, the landscape has shifted significantly in the last year. Everyone is moving to JavaScript. After some research, I opted to use a JavaScript charting library called HighCharts. I tried a bunch of other canvas-based libraries, but let me tell you without hesitation, HighCharts rocks.


I am going to show you how we are using HighCharts and how I implemented zooming to dynamically reload more event data on the fly. With any charting library, if you keep zooming in on a chart, it will not progressively load more detailed data. At detailed zoom levels you end up with a small range of data in your graph. Basically if you view a day’s data first, and then zoom into a specific minute, you would only see one data point.


To start, here’s the JavaScript I use to display a chart:

 

var parse_date = function(data) {
    var result = [];
    $.each(data, function(key, value) {
        var re = new RegExp(/(\d+)-(\d+)-(\d+)T(\d+):(\d+):(\d+)(?:\.(\d+))?/);
        var date = re.exec(key);
        if (date[7] == undefined) {date[7]=0;}
        var real_date = Date.UTC(date[1], parseInt(date[2], 10)-1,date[3],date[4],date[5],date[6],date[7]);
        result.push([real_date, value]);                   
    });
    return result; 
}

chart = new Highcharts.Chart({
    credits: { enabled: false },
    chart: {
        renderTo: 'activity',
        defaultSeriesType: 'area',
        margin: [10, 20, 40, 55],
        zoomType: "x",
            events: {
                selection: function(event) {
                    // change the time frame to be searched
                    var start = Highcharts.dateFormat('%Y-%m-%dT%H:%M:%SZ', event.xAxis[0].min);
                    var end = Highcharts.dateFormat('%Y-%m-%dT%H:%M:%SZ', event.xAxis[0].max);
                    $.ajax({ type: "GET", url: "http://subdomain.loggly.com/api/search/?"
                        + "q=inputname:logglyapp&starttime="+start+"&endtime="+end
                        + "&facets=True&buckets=24",
                        success: function(data) {
                             chart.xAxis[0].setExtremes();
                             chart.series[0].setData(parse_date(data)); 
                             // fix the reset zoom button
                             $('.highcharts-toolbar').click(reset_zoom);
                        },
                        error: function(req, text, error) {
                            $("#err").html("Reload error!");
                        }
                    });
                }
        }
    },
    xAxis: { title: { text: 'Time' }, type: 'datetime' },
    yAxis: { title: { text: '# Events' }, min:0, 
        plotLines: [{ value: 0, width: 1, color: '#808080' }]
    },
    tooltip: { formatter: function() {
            return Highcharts.dateFormat('%B %e %Y %H:%M:%S', this.x) + '<br/>'+
            '<b>'+this.y+' Events</b>' }},
    plotOptions: {
        area: {
            dataParser: parse_date,
        }
    },
    series: [{ id: 1, name: 'search', 
        dataURL: 'http://subdomain.loggly.com/api/search/'
            + '?q=inputname:logglyapp&facets=True'}],
    title: { text: 'traffic last 24 hours' }
});

var reset_zoom = function() {
    // requery for the original data:
    $.ajax({ type: "GET", url: "http://subdomain.loggly.com/api/search/"
        + "?q=inputname:logglyapp&facets=True",
        success: function(data) {
           chart.toolbar.remove('zoom');
           chart.xAxis[0].setExtremes();
           chart.get(1).setData(parse_date(data)); 
        },
        error: function(req, text, error) {
            $("#err").html("Loading error!");
        }
    });
}

Let’s have a quick look at the code. There are two things I want to communicate here: 1. The code I used to display a HighCharts graph and 2. The way I am using Loggly’s APIs to query the data.

I mentioned the special zooming that I implemented. Take a look at lines 20 to 39. This is the function that handles zooming, and it is where I am reloading the more detailed data. I set the new start and end dates (lines 23 and 24) and then I am querying the Loggly API with the new timeframe (lines 25 to 27). Upon success – this is important – I am using the chart.series[0].setData() method to set the new data for the chart. The next line overwrites the default button or link that lets the user zoom out again (line 32). Note: because you are implementing your own zoom, the default “reset zoom” button from HighCharts will not work anymore and you have to overwrite it with your own function to reset the chart.

The function dealing with the reset functionality is on lines 59 to 72. It does nothing more than query the Loggly API for the original data (I am passing no time parameters) and set the data just like the previous call. The other thing you have to do is on line 64, where you need to remove the HighCharts default “reset zoom” link, and then reset the extremes (line 65).

 

Moving on, we’ll briefly discuss the way I’m using the Loggly API. If you’d like to use it, you need an account with us. We are currently in private beta, therefore you will need us to give you access to the beta program in order to do so. Email us if you want an account to play around with! Back to the code. Make sure you replace subdomain in the URLs with your actual subdomain. Now that this is out of the way, you can query the API by simply making a GET request to: /api/search. You pass the q parameter with your query. In my example I am getting all the data from my input with the name logglyapp. To get timeline data, you’ll need to pass the parameter facets=True into the call. This will give you counts for time buckets.
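The same query can be put together outside the browser. Here is a rough Python sketch that only builds the URL; the parameter names (q, facets, starttime, endtime, buckets) are the ones used in the JavaScript above, while the helper name search_url is made up. Note that urlencode percent-encodes the colon in the query, which is equivalent to the literal form.

```python
from urllib.parse import urlencode

def search_url(subdomain, query, starttime=None, endtime=None, buckets=None):
    """Build a /api/search URL that asks for facet (time bucket) counts."""
    params = [("q", query), ("facets", "True")]
    if starttime and endtime:
        # only sent when zooming in; without them you get the default range
        params += [("starttime", starttime), ("endtime", endtime)]
    if buckets:
        params.append(("buckets", str(buckets)))
    return "http://%s.loggly.com/api/search/?%s" % (subdomain, urlencode(params))

print(search_url("subdomain", "inputname:logglyapp"))
# http://subdomain.loggly.com/api/search/?q=inputname%3Alogglyapp&facets=True
```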


To make everything work together, you need one more piece: the parse_date function. You need this part because the Loggly API returns the data with human-readable timestamps and HighCharts wants timestamps as milliseconds since the epoch, in UTC. The function on lines 1 to 11 takes care of converting the time for you. Just copy it.
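If you do the conversion on the server instead, the same idea looks roughly like this in Python. It is a sketch that reuses the regular expression from the JavaScript above; the function name to_epoch_millis is invented here, and unlike the JavaScript it treats the optional fractional part as a fraction of a second.

```python
import calendar
import re

TIMESTAMP = re.compile(r'(\d+)-(\d+)-(\d+)T(\d+):(\d+):(\d+)(?:\.(\d+))?')

def to_epoch_millis(stamp):
    """Convert a UTC timestamp such as 2010-03-26T12:30:00 to epoch milliseconds."""
    match = TIMESTAMP.match(stamp)
    if match is None:
        raise ValueError("unrecognized timestamp: %r" % stamp)
    year, month, day, hour, minute, second = (int(g) for g in match.groups()[:6])
    fraction = match.group(7) or "0"
    millis = int(round(float("0." + fraction) * 1000))
    # timegm() interprets the tuple as UTC, like Date.UTC() in JavaScript
    seconds = calendar.timegm((year, month, day, hour, minute, second))
    return seconds * 1000 + millis

print(to_epoch_millis("1970-01-02T00:00:00.500"))   # 86400500
```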


I hope this was useful. Let us know if you are having trouble with any of this. We are looking forward to hearing about your graphing endeavors.

 

If you look at my del.icio.us feed, you’ll find a bunch more visualization and charting links.


How to use RightScale APIs with Python

Posted 17 Mar, 2010 by raffy@loggly.com in Code and Log Management

I have been quiet for long enough on this blog. It’s time for me to share some things that I learned in the last few months while I was working on Loggly’s Application layer. Lately, I spent some quality time with Django and consequently Python.

What I want to focus on today is our integration with RightScale. At Loggly, we use RightScale to manage our AWS instances. Loggly runs three types of servers. (Well, I am simplifying). We have a proxy tier which receives your log messages. The proxy tier, which is basically a bank of machines, forwards the messages to the indexing back end that runs Solr. The third group of machines are the Web or application servers. When a new proxy box comes online, the RightScale management interface knows about the box. I had to know about these proxies on the application tier (i.e., within Django) as well. How do you do that?

The first solution would be to have the proxies register with Django, as soon as they get online. What happens though when they go down or are taken offline? Seems complicated to keep track of that. Another solution would be to periodically poll the proxies from Django. Not very nice either.

My solution is much more elegant. RightScale has two features that helped me out. The first one is machine tags. Each proxy server is labeled as such. (See Machine Tagging). Secondly, I am using the RightScale API to figure out how many proxies I have and what their IPs are. (As a side note, the RightScale APIs are in Beta right now. There might be changes or improvements coming down the pipe.)

I struggled for quite a bit with using the RightScale APIs out of Python. Here are some things that I learned the hard way and you might find helpful:

Using the API to query all your machines in a specific deployment:

curl -H 'X-API-VERSION: 1.0' -u [user@domain.com]:[password] \
https://my.rightscale.com/api/acct/[account]/deployments/[deployment_number]

Note how you have to add the extra header to request version 1.0 of the API.

Here is how you get all the machines that have a specific tag. Note the structure of my tag! I set role:proxy=true. You need to use this hierarchical model!

curl -H 'X-API-VERSION: 1.0' -u [user@domain.com]:[password] -d'resource_type=server' \
-d 'tags[]=role:proxy' https://my.rightscale.com/api/acct/[account]/tags/search.js

Want JSON output instead of XML? Add “&format=js” at the end of your request!

Now, from the response, you would think you could just use that HREF to query an individual server. Wrong. That doesn’t work. You have to add “/settings” in order to make that work:

curl -H 'X-API-VERSION: 1.0' -u [user@domain.com]:[password] \
https://my.rightscale.com/api/acct/[account]/instances/[instance_id]/status

Here is how you set a tag on a server. (Note: if you change the tag in the user interface for a running server, it will not take effect; the tag will only be there once you start a new server of that type. The API call, by contrast, can set a tag on a running machine.)

curl -H 'X-API-VERSION: 1.0' -u [user@domain.com]:[password] \
-d 'resource_href=https://my.rightscale.com/api/acct/[account]/servers/[server_id]' \
-d tags[]=role:proxy=true https://my.rightscale.com/api/acct/[account]/tags/set

The part I struggled with most was how to call the API from within Python. It turns out httplib2 expects the Web server to respond slightly differently than the RightScale server does. If you are using the following code, you will not be able to connect:

h = httplib2.Http()
h.add_credentials(user,password)
response, content = h.request(url, headers=headers)

httplib2 will connect to the Web server without sending the credentials. Only if the server challenges the client to use auth will it then send the authentication headers. And this is precisely what RightScale is not doing. Therefore, you have to do the following in order to include the authentication headers in the first request already:

import base64
import httplib2

h = httplib2.Http()
headers = {'X-API-VERSION': '1.0'}
# build the Basic auth header by hand so it goes out with the very first request
credentials = ('%s:%s' % (user, password)).encode('utf-8')
headers['Authorization'] = "Basic %s" % base64.b64encode(credentials).decode('ascii')
response, content = h.request(url, headers=headers)
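To check that the header is built correctly without touching the network, you can isolate that step. The sketch below uses only the standard library; the helper name basic_auth_headers is made up, and the credentials are the well-known example pair from the HTTP Basic authentication RFC, not real ones.

```python
import base64

def basic_auth_headers(user, password, api_version="1.0"):
    """Return request headers that carry HTTP Basic credentials up front."""
    raw = ("%s:%s" % (user, password)).encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return {"X-API-VERSION": api_version, "Authorization": "Basic " + token}

headers = basic_auth_headers("Aladdin", "open sesame")
print(headers["Authorization"])   # Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==
```

Keep in mind that Basic auth is only an encoding, not encryption, so it should only ever be sent over https as in the curl examples.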

Credentials are an interesting topic. I ended up creating a separate user in the RightScale interface that I am using for the APIs. Don’t be fooled though. These credentials still let that user log into the Web interface. I hope that RightScale will add a capability such that I can have a user that can only use the API.

I hope this helps you get off the ground a bit quicker when using RightScale. Let me know how it goes. You can also find me on Twitter: @zrlram


Django Middleware Munging

Posted 4 Dec, 2009 by Kord Campbell in Code and Log Management

We’ve been hitting the code pretty hard of late at Loggly, and the beta is really starting to take shape on the development servers.  There’s lots to do, of course, so we’ve taken to using Unfuddle to track tickets and host our repository for code commits.  Later on we’ll use Unfuddle’s APIs to help track customers’ feature requests and tickets.  Here’s a screen cap of our latest commit timeline:

commits

One of the things you’ll notice when you use Unfuddle is the presence of a subdomain in the URLs you use on the site.  Our subdomain is ‘loggly’ on Unfuddle, and we  log into our project area by going to http://loggly.unfuddle.com/ (no, you can’t check our code out).  This type of customer segmentation allows for multiple unique usernames per customer, but doesn’t require a unique username site-wide.  For non-SEO sections of the site, this is a perfect solution.

We are taking a similar approach with Loggly, where a user will sign up for an account and define a unique customer identifier (we’re kicking around calling this a “mill”), which will then be mapped to a subdomain on the system.  So, for example, if Foobar, Inc. were to sign up for a Loggly account, they would access the site via http://foobar.loggly.com/, and then could create any number of user/pass combinations they wanted to access their company’s log resources.

The only problem with this approach is that we use Django, and its built-in auth system (which is fantastic, BTW) doesn’t really have facilities for this type of functionality.  While we could certainly hack the Django auth system by writing our own multi-tenant auth module, it would take away from more pressing issues – like launching the beta!

Enter the Middleware Solution

One way to solve this is by munging the subdomain and username together, which provides a unique system-wide username. If, for example, you were to log in as steve under foobar.loggly.com, then we’d stick them together to be something like “foobar_steve”. Obviously we can’t have everyone remembering this long monstrosity for their username, so we’ll need to munge the subdomain off the URL and the username the user types in to get the correct combination to send off to the auth system.

Thankfully Django provides a super-easy way to add middleware to a project. By injecting a small piece of code into the request from the user’s browser, we are able to do our on-the-fly transformation before the auth system takes over. Nobody is the wiser because we can modify the display name code in the profile model to show the “normal” username to the user. Here’s what the result looks like:

settings.py:
...
MIDDLEWARE_CLASSES = (
'loggly.profile.MungeMiddle.MungeForMillMiddleware',
...
)
...

MungeMiddle.py:
class MungeForMillMiddleware:
    def process_request(self, request):
        # only touch requests that carry a username, such as the login form
        if 'username' in request.POST:
            data = request.POST.copy()  # request.POST is immutable, so work on a copy
            user = "%s_%s" % (request.META['HTTP_HOST'].split('.')[0], data['username'])
            data['username'] = user
            request.POST = data

When a request comes in, we pull out the POST data and make a copy of it with .copy(). We then munge up the username with the subdomain out of HTTP_HOST, and then set the POST data to forward on to the rest of the stack. We don’t do this for all requests, just ones with the username set, so it’s lightweight enough for production use. We end up sticking the shortened version of the username into the profile table, and use it for display.
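The munging itself is easy to try out away from Django. Here is a tiny sketch of just the string handling; the function name munge_username is made up for illustration, and the host and username are the ones from the example earlier in this post.

```python
def munge_username(http_host, username):
    """Combine the subdomain from the Host header with the typed username."""
    subdomain = http_host.split('.')[0]
    return "%s_%s" % (subdomain, username)

print(munge_username("foobar.loggly.com", "steve"))   # foobar_steve
```

One thing to watch with this scheme: a subdomain or username that itself contains an underscore can make two different pairs munge to the same string, so it pays to restrict one of them.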

So there you have it. A 5 minute fix for a 5 hour problem. I’m sure there are more elegant solutions to doing subdomain segmentation with Django’s out-of-the-box auth system, but frankly we don’t have time to stop and code them up. We’re bent on getting our beta out as soon as possible, and if it requires hacks like these to do it, then so be it! Release early, release often.


Coolcam for Your Log Files

Posted 11 Sep, 2009 by Kord Campbell in Code and Log Management

I’ve been talking about coolcams for years as a way to help quickly show off a product’s features. Coolcams aren’t meant to be useful – they exist simply to entertain and engage your audience. It’s like an elevator pitch for your product demo.


Last year I did a coolcam mashup based around Poly9’s Flash Globe and events from my web server’s log files. I never did get around to publishing the code, but the URL was accessed hundreds of times by the sales guys I worked with. If it ever was down, or broken, I’d get an email from them in minutes. As it turns out, a lot of them used the globe to start conversations with their customers.

Loggly Globe
Flash forward to present day. I’ve completely rewritten the code and put it up on Loggly’s site to share with everyone. The way the globe works is pretty simple. Using the web.py framework, it starts a web server which does two things. First, it serves up the HTML to your browser, which includes the Poly9 globe object and the jQuery library. Second, it serves up a JSON object to the page which is parsed and sent to the globe object. The code that serves up the JSON object does a few magical things for you:

  • tails your web access log file for visits

  • parses out the ip address, timestamp, etc. from the log event

  • takes the ip address and does a geoip lookup on it

  • removes duplicate visits from a single ip address

  • wraps the whole thing up in a JSON object
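The last two steps are simple enough to sketch in a few lines of Python. This is not the code from globe.py, just an illustration of the idea: the function name, the "visits" key and the sample events (apart from the first IP address, which appears in our log line further down) are made up, and the geoip lookup is left out because it needs a third-party database.

```python
import json

def visits_to_json(events):
    """Keep the first event per IP address and serialize the result as JSON."""
    seen = set()
    unique = []
    for event in events:
        if event["ip"] in seen:
            continue              # duplicate visit from the same address
        seen.add(event["ip"])
        unique.append(event)
    return json.dumps({"visits": unique})

# Made-up sample events for illustration
events = [
    {"ip": "75.101.142.96", "time": "11/Sep/2009:09:19:17 -0700"},
    {"ip": "75.101.142.96", "time": "11/Sep/2009:09:19:18 -0700"},
    {"ip": "192.0.2.10", "time": "11/Sep/2009:09:19:20 -0700"},
]
print(visits_to_json(events))
```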

While Loggly Globe is hard coded to parse our logs, it should be fairly easy to use it yourself. To get started, download the tarball for Globe 1.0, and then extract it somewhere on your server:

kord@loggly> tar xvfz globe_1.0.tar.gz

You may need a couple of Python libraries installed. Assuming you have easy_install installed you can run:

kord@loggly> easy_install web.py

kord@loggly> easy_install httplib2

You’ll want to edit the globe.py file and modify the location of the Apache log file to point to your local log file. You’ll also want to edit the regular expression extractions to match your log file format. Here’s a line out of our logs for reference, and the corresponding extractions, most of which were pulled from Random Encounter. Make any changes you need to match the regex up with your logs.

 

75.101.142.96 - - [11/Sep/2009:09:19:17 -0700] "GET / HTTP/1.1" 200 6196 "-" "collectd/4.4.2" 195546

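# group names follow the field comments; match them to the names globe.py expects
parts = [
   r'(?P<host>\S+)', # host %h
   r'\S+', # indent %l (unused)
   r'(?P<user>\S+)', # user %u
   r'\[(?P<time>.+)\]', # time %t
   r'"(?P<request>.+)"', # request "%r"
   r'(?P<status>[0-9]+)', # status %>s
   r'(?P<size>\S+)', # size %b (careful, can be '-')
   r'"(?P<referer>.*)"', # referer "%{Referer}i"
   r'"(?P<agent>.*)"', # user agent "%{User-agent}i"
   r'(?P<extra>[0-9]+)', # stuff at end (group name assumed)
]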
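A quick way to check your regex changes is to run the pattern against one line of your own log before starting the server. The sketch below does that for the sample line above. The group names (host, user, time and so on) are assumptions chosen to match the field comments, so adjust them to whatever your copy of globe.py expects.

```python
import re

# Group names are assumptions that mirror the field comments
parts = [
    r'(?P<host>\S+)',          # host %h
    r'\S+',                    # ident %l (unused)
    r'(?P<user>\S+)',          # user %u
    r'\[(?P<time>.+)\]',       # time %t
    r'"(?P<request>.+)"',      # request "%r"
    r'(?P<status>[0-9]+)',     # status %>s
    r'(?P<size>\S+)',          # size %b (can be '-')
    r'"(?P<referer>.*)"',      # referer
    r'"(?P<agent>.*)"',        # user agent
    r'(?P<extra>[0-9]+)',      # trailing number in our log format
]
# Fields are separated by whitespace; anchor at the end of the line
pattern = re.compile(r'\s+'.join(parts) + r'\s*\Z')

line = ('75.101.142.96 - - [11/Sep/2009:09:19:17 -0700] "GET / HTTP/1.1" '
        '200 6196 "-" "collectd/4.4.2" 195546')
fields = pattern.match(line).groupdict()
print(fields['host'], fields['status'], fields['agent'])
# 75.101.142.96 200 collectd/4.4.2
```

If match() returns None for your own lines, compare the line field by field against the list and loosen the part that does not fit.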

Now you’ll want to start the server. You can specify a port number to listen on if you want:

kord@loggly> cd globe
kord@loggly:/globe> python globe.py 8001
http://0.0.0.0:8001/

Try hitting http://yourserver:8001/json and see if you get a response back. Here’s an example of what you should see: http://www.loggly.com:8001/json. Here’s the demo again, if you just want to skip to the good stuff. Additional work could be done to integrate the code into a Lighttpd or Apache install to make it more permanent. You can read more about doing that on Web.py’s cookbook page.

Once we get the beta launched, you’ll be able to make mashups like these with your own log files. We’re looking forward to doing more coolcams like this with Loggly!

